Transformers

ExecuTorch

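ExecuTorch is an end-to-end solution for enabling on-device inference on mobile and edge devices, including wearables, embedded devices, and microcontrollers. It is part of the PyTorch ecosystem and supports deploying PyTorch models with a focus on portability, productivity, and performance.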

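ExecuTorch defines clear entry points for model-, device-, or use-case-specific optimizations such as backend delegation, user-defined compiler transformations, and memory planning. The first step in running a PyTorch model on an edge device with ExecuTorch is to export the model. This is done with the PyTorch API torch.export.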

ExecuTorch Integration

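An integration point is being developed so that 🤗 Transformers models can be exported with torch.export. The goal of this integration is not only to enable export, but also to ensure that the exported artifact can be further lowered and optimized to run efficiently in ExecuTorch, with a particular focus on mobile and edge use cases.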

class transformers.TorchExportableModuleWithStaticCache

( model: PreTrainedModel )

A wrapper module designed to make a PreTrainedModel exportable with torch.export, specifically for use with static caching. This module ensures that the exported model is compatible with further lowering and execution in ExecuTorch.

Note: This class is designed specifically to support the export process with torch.export, so that the model can be further lowered and run efficiently in ExecuTorch.

forward

( input_ids: Tensor, cache_position: Tensor ) → torch.Tensor

Parameters

input_ids (torch.Tensor) — Tensor representing current input token id to the module.
cache_position (torch.Tensor) — Tensor representing current input position in the cache.

Returns

torch.Tensor

Logits output from the model.

Forward pass of the module, which is compatible with the ExecuTorch runtime.

This forward adapter serves two primary purposes:

Making the Model torch.export-Compatible: The adapter hides unsupported objects, such as the Cache, from the graph inputs and outputs, enabling the model to be exportable using torch.export without encountering issues.
Ensuring Compatibility with ExecuTorch runtime: The adapter matches the model’s forward signature with that in executorch/extension/llm/runner, ensuring that the exported model can be executed in ExecuTorch out-of-the-box.

transformers.convert_and_export_with_cache

( model: PreTrainedModel, example_input_ids: Tensor = None, example_cache_position: Tensor = None ) → Exported program (torch.export.ExportedProgram)

Parameters

model (PreTrainedModel) — The pretrained model to be exported.
example_input_ids (torch.Tensor) — Example input token id used by torch.export.
example_cache_position (torch.Tensor) — Example current cache position used by torch.export.

Returns

Exported program (torch.export.ExportedProgram)

The exported program generated via torch.export.

Convert a PreTrainedModel into an exportable module and export it using torch.export, ensuring the exported model is compatible with ExecuTorch.
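A usage sketch under stated assumptions: the helper name `export_causal_lm_with_static_cache` is hypothetical, the checkpoint is supplied by the caller, and the generation settings (static cache implementation, `cache_config` with a batch size and maximum cache length) reflect how the wrapper was configured in Transformers releases around the time this integration was introduced. The required configuration may differ in other versions, so treat this as illustrative rather than definitive.

```python
import torch


def export_causal_lm_with_static_cache(model_id: str, max_cache_len: int = 128):
    """Load a causal LM configured for static caching and export it.

    `model_id` is supplied by the caller; no particular checkpoint is assumed.
    """
    # Imported lazily: requires a Transformers release that ships the
    # ExecuTorch integration.
    from transformers import AutoModelForCausalLM, GenerationConfig
    from transformers import convert_and_export_with_cache

    generation_config = GenerationConfig(
        use_cache=True,
        cache_implementation="static",
        max_length=max_cache_len,
        # Assumption: the wrapper reads batch size and cache length from here.
        cache_config={"batch_size": 1, "max_cache_len": max_cache_len},
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        attn_implementation="sdpa",
        generation_config=generation_config,
    ).eval()

    # Example inputs: a single token id and the first cache position.
    return convert_and_export_with_cache(
        model,
        example_input_ids=torch.tensor([[1]], dtype=torch.long),
        example_cache_position=torch.tensor([0], dtype=torch.long),
    )
```

Calling the helper with a model identifier should return a torch.export.ExportedProgram, which can then be passed on to the ExecuTorch lowering flow.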
