Optimum documentation

Reference

INCQuantizer

class optimum.intel.INCQuantizer

( model: typing.Union[transformers.modeling_utils.PreTrainedModel, torch.nn.modules.module.Module],
  eval_fn: typing.Union[typing.Callable[[transformers.modeling_utils.PreTrainedModel], int], NoneType] = None,
  calibration_fn: typing.Union[typing.Callable[[transformers.modeling_utils.PreTrainedModel], int], NoneType] = None,
  task: typing.Optional[str] = None,
  seed: int = 42 )

Handle the Neural Compressor quantization process.
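
For orientation, here is a minimal sketch of post-training dynamic quantization with INCQuantizer (the checkpoint name and save directory are illustrative):

from transformers import AutoModelForSequenceClassification
from neural_compressor.config import PostTrainingQuantConfig
from optimum.intel import INCQuantizer

model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # example checkpoint
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Dynamic quantization derives activation scales at runtime, so no calibration data is needed
quantization_config = PostTrainingQuantConfig(approach="dynamic")
quantizer = INCQuantizer.from_pretrained(model)
quantizer.quantize(quantization_config=quantization_config, save_directory="quantized_model")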

get_calibration_dataset

( dataset_name: str,
  num_samples: int = 100,
  dataset_config_name: typing.Optional[str] = None,
  dataset_split: str = 'train',
  preprocess_function: typing.Optional[typing.Callable] = None,
  preprocess_batch: bool = True,
  use_auth_token: typing.Union[bool, str, NoneType] = None,
  token: typing.Union[bool, str, NoneType] = None )

Parameters

  • dataset_name (str) — The dataset repository name on the Hugging Face Hub, or the path to a local directory containing data files in generic formats (and optionally a dataset script, if some code is required to read the data files).
  • num_samples (int, defaults to 100) — The maximum number of samples composing the calibration dataset.
  • dataset_config_name (str, optional) — The name of the dataset configuration.
  • dataset_split (str, defaults to "train") — Which split of the dataset to use to perform the calibration step.
  • preprocess_function (Callable, optional) — Processing function to apply to each example after loading the dataset.
  • preprocess_batch (bool, defaults to True) — Whether the preprocess_function should be batched.
  • use_auth_token (Optional[Union[bool, str]], defaults to None) — Deprecated. Please use token instead.
  • token (Optional[Union[bool, str]], defaults to None) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running huggingface-cli login (stored in ~/.huggingface).

Create the calibration dataset (a datasets.Dataset) to use for the post-training static quantization calibration step.
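
As a sketch, assuming the quantizer and model_id from the example above, a calibration dataset can be built from the GLUE sst2 train split:

from functools import partial
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_id)  # model_id from the sketch above

def preprocess_function(examples, tokenizer):
    return tokenizer(examples["sentence"], padding="max_length", max_length=128, truncation=True)

calibration_dataset = quantizer.get_calibration_dataset(
    "glue",
    dataset_config_name="sst2",
    dataset_split="train",
    num_samples=100,
    preprocess_function=partial(preprocess_function, tokenizer=tokenizer),
)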

quantize

( quantization_config: 'PostTrainingQuantConfig',
  save_directory: typing.Union[str, pathlib.Path],
  calibration_dataset: Dataset = None,
  batch_size: int = 8,
  data_collator: typing.Optional[DataCollator] = None,
  remove_unused_columns: bool = True,
  file_name: str = None,
  **kwargs )

Parameters

  • quantization_config (PostTrainingQuantConfig) — The configuration containing the parameters related to quantization.
  • save_directory (Union[str, Path]) — The directory where the quantized model should be saved.
  • calibration_dataset (datasets.Dataset, defaults to None) — The dataset to use for the calibration step, needed for post-training static quantization.
  • batch_size (int, defaults to 8) — The number of calibration samples to load per batch.
  • data_collator (DataCollator, defaults to None) — The function to use to form a batch from a list of elements of the calibration dataset.
  • remove_unused_columns (bool, defaults to True) — Whether or not to remove the columns unused by the model forward method.

Quantize a model given the optimization specifications defined in quantization_config.
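
Continuing the sketch above, post-training static quantization consumes the calibration dataset built with get_calibration_dataset:

from neural_compressor.config import PostTrainingQuantConfig

# Static quantization collects activation statistics from the calibration set
quantization_config = PostTrainingQuantConfig(approach="static")
quantizer.quantize(
    quantization_config=quantization_config,
    calibration_dataset=calibration_dataset,
    save_directory="quantized_model",
)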

INCTrainer

class optimum.intel.INCTrainer

( model: typing.Union[transformers.modeling_utils.PreTrainedModel, torch.nn.modules.module.Module] = None,
  args: TrainingArguments = None,
  data_collator: typing.Optional[DataCollator] = None,
  train_dataset: typing.Optional[torch.utils.data.dataset.Dataset] = None,
  eval_dataset: typing.Optional[torch.utils.data.dataset.Dataset] = None,
  processing_class: typing.Union[transformers.tokenization_utils_base.PreTrainedTokenizerBase, transformers.feature_extraction_utils.FeatureExtractionMixin, NoneType] = None,
  model_init: typing.Callable[[], transformers.modeling_utils.PreTrainedModel] = None,
  compute_loss_func: typing.Optional[typing.Callable] = None,
  compute_metrics: typing.Union[typing.Callable[[transformers.trainer_utils.EvalPrediction], typing.Dict], NoneType] = None,
  callbacks: typing.Optional[typing.List[transformers.trainer_callback.TrainerCallback]] = None,
  optimizers: typing.Tuple[torch.optim.optimizer.Optimizer, torch.optim.lr_scheduler.LambdaLR] = (None, None),
  preprocess_logits_for_metrics: typing.Callable[[torch.Tensor, torch.Tensor], torch.Tensor] = None,
  quantization_config: typing.Optional[neural_compressor.config._BaseQuantizationConfig] = None,
  pruning_config: typing.Optional[neural_compressor.config._BaseQuantizationConfig] = None,
  distillation_config: typing.Optional[neural_compressor.config._BaseQuantizationConfig] = None,
  task: typing.Optional[str] = None,
  **kwargs )

INCTrainer enables Intel Neural Compressor quantization-aware training, pruning and distillation.
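
A minimal quantization-aware training sketch; the model, training arguments, datasets, and tokenizer are assumed to be defined as for a regular transformers.Trainer:

from neural_compressor import QuantizationAwareTrainingConfig
from optimum.intel import INCTrainer

quantization_config = QuantizationAwareTrainingConfig()
trainer = INCTrainer(
    model=model,                  # a transformers PreTrainedModel, assumed defined
    quantization_config=quantization_config,
    args=training_args,           # transformers.TrainingArguments, assumed defined
    train_dataset=train_dataset,  # assumed defined
    eval_dataset=eval_dataset,    # assumed defined
    processing_class=tokenizer,   # assumed defined
)
train_result = trainer.train()
trainer.save_model()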

compute_distillation_loss

( student_outputs teacher_outputs )

How the distillation loss is computed given the student and teacher outputs.

compute_loss

( model inputs return_outputs = False num_items_in_batch = None )

How the loss is computed by Trainer. By default, all models return the loss in the first element.

save_model

( output_dir: typing.Optional[str] = None _internal_call: bool = False )

Will save the model, so you can reload it using from_pretrained(). Will only save from the main process.
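
For example, a model saved by the trainer can be reloaded through the matching INCModel class (directory name illustrative):

from optimum.intel import INCModelForSequenceClassification

trainer.save_model("quantized_model")
loaded_model = INCModelForSequenceClassification.from_pretrained("quantized_model")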

INCModel

class optimum.intel.INCModel

( model config: PretrainedConfig = None model_save_dir: typing.Union[str, pathlib.Path, tempfile.TemporaryDirectory, NoneType] = None q_config: typing.Dict = None inc_config: typing.Dict = None **kwargs )

INCModelForSequenceClassification

class optimum.intel.INCModelForSequenceClassification

( model config: PretrainedConfig = None model_save_dir: typing.Union[str, pathlib.Path, tempfile.TemporaryDirectory, NoneType] = None q_config: typing.Dict = None inc_config: typing.Dict = None **kwargs )
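
A sketch of running inference on a quantized checkpoint with the usual transformers pipeline (directory name illustrative):

from transformers import AutoTokenizer, pipeline
from optimum.intel import INCModelForSequenceClassification

# Assumes the tokenizer was saved alongside the model,
# e.g. via tokenizer.save_pretrained("quantized_model")
model = INCModelForSequenceClassification.from_pretrained("quantized_model")
tokenizer = AutoTokenizer.from_pretrained("quantized_model")
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(classifier("This quantized model is fast."))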

INCModelForQuestionAnswering

class optimum.intel.INCModelForQuestionAnswering

( model config: PretrainedConfig = None model_save_dir: typing.Union[str, pathlib.Path, tempfile.TemporaryDirectory, NoneType] = None q_config: typing.Dict = None inc_config: typing.Dict = None **kwargs )

INCModelForTokenClassification

class optimum.intel.INCModelForTokenClassification

( model config: PretrainedConfig = None model_save_dir: typing.Union[str, pathlib.Path, tempfile.TemporaryDirectory, NoneType] = None q_config: typing.Dict = None inc_config: typing.Dict = None **kwargs )

INCModelForMultipleChoice

class optimum.intel.INCModelForMultipleChoice

( model config: PretrainedConfig = None model_save_dir: typing.Union[str, pathlib.Path, tempfile.TemporaryDirectory, NoneType] = None q_config: typing.Dict = None inc_config: typing.Dict = None **kwargs )

INCModelForMaskedLM

class optimum.intel.INCModelForMaskedLM

( model config: PretrainedConfig = None model_save_dir: typing.Union[str, pathlib.Path, tempfile.TemporaryDirectory, NoneType] = None q_config: typing.Dict = None inc_config: typing.Dict = None **kwargs )

INCModelForCausalLM

class optimum.intel.INCModelForCausalLM

( model config: PretrainedConfig = None model_save_dir: typing.Union[str, pathlib.Path, tempfile.TemporaryDirectory, NoneType] = None q_config: typing.Dict = None inc_config: typing.Dict = None **kwargs )
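
The causal LM class exposes the usual text generation API; a sketch under the same assumptions as above (paths illustrative):

from transformers import AutoTokenizer
from optimum.intel import INCModelForCausalLM

model = INCModelForCausalLM.from_pretrained("quantized_model")  # directory produced by a previous quantize() run
tokenizer = AutoTokenizer.from_pretrained("quantized_model")    # assuming the tokenizer was saved alongside
inputs = tokenizer("Quantization reduces model size by", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))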

INCModelForSeq2SeqLM

class optimum.intel.INCModelForSeq2SeqLM

( model config: PretrainedConfig = None model_save_dir: typing.Union[str, pathlib.Path, tempfile.TemporaryDirectory, NoneType] = None q_config: typing.Dict = None inc_config: typing.Dict = None **kwargs )
