SentenceTransformer based on sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a sentence-transformers model finetuned from sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
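For example, a minimal clustering sketch (assuming scikit-learn is installed; the utterances and cluster count below are made up for illustration):

from sklearn.cluster import KMeans
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("philipp-zettl/MiniLM-amazon_massive_intent-similarity")

# Hypothetical utterances to group by intent
utterances = [
    "set an alarm for 7 am",
    "wake me up at seven",
    "play some jazz",
    "put on some relaxing music",
]
embeddings = model.encode(utterances)

# Utterances assigned the same label end up in the same cluster
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)
print(labels)  # e.g. [0 0 1 1]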

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
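As a rough sketch of what this stack computes, the mean pooling can be reproduced by hand with 🤗 Transformers (assuming the repository's transformer weights load via AutoModel, which is how Sentence Transformers stores them):

import torch
from transformers import AutoModel, AutoTokenizer

model_id = "philipp-zettl/MiniLM-amazon_massive_intent-similarity"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

encoded = tokenizer(
    ["vertel my die huidige tyd in ottawa"],
    padding=True, truncation=True, max_length=128, return_tensors="pt",
)
with torch.no_grad():
    token_embeddings = model(**encoded).last_hidden_state  # (batch, seq_len, 384)

# Mean pooling over non-padding tokens (pooling_mode_mean_tokens=True above)
mask = encoded["attention_mask"].unsqueeze(-1).float()
sentence_embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
print(sentence_embeddings.shape)  # torch.Size([1, 384])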

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("philipp-zettl/MiniLM-amazon_massive_intent-similarity")
# Run inference
sentences = [
    'vertel my die huidige tyd in ottawa',
    'query cooking',
    'request definition',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
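A typical follow-up is to rank candidate intent descriptions against a new utterance. A minimal sketch (the intent label "query time" is hypothetical; the other strings come from the example above, and the first is Afrikaans for "tell me the current time in Ottawa"):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("philipp-zettl/MiniLM-amazon_massive_intent-similarity")

utterance = "vertel my die huidige tyd in ottawa"
intents = ["query time", "query cooking", "request definition"]

utterance_emb = model.encode([utterance])
intent_emb = model.encode(intents)

# similarity() returns a [1, 3] tensor of cosine similarities
scores = model.similarity(utterance_emb, intent_emb)
best_idx = int(scores[0].argmax())
print(intents[best_idx], float(scores[0][best_idx]))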

Evaluation

Metrics

Semantic Similarity (MiniLM-dev)

Metric               Value
pearson_cosine       0.8464
spearman_cosine      0.8129
pearson_manhattan    0.8205
spearman_manhattan   0.8070
pearson_euclidean    0.8208
spearman_euclidean   0.8069
pearson_dot          0.7928
spearman_dot         0.8078
pearson_max          0.8464
spearman_max         0.8129

Semantic Similarity (MiniLM-test)

Metric               Value
pearson_cosine       0.9080
spearman_cosine      0.8426
pearson_manhattan    0.8854
spearman_manhattan   0.8389
pearson_euclidean    0.8858
spearman_euclidean   0.8391
pearson_dot          0.8619
spearman_dot         0.8390
pearson_max          0.9080
spearman_max         0.8426
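These correlations are the kind reported by Sentence Transformers' EmbeddingSimilarityEvaluator. A minimal sketch of running such an evaluation yourself (the sentence pairs and gold scores below are placeholders, not the actual evaluation data):

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("philipp-zettl/MiniLM-amazon_massive_intent-similarity")

# Placeholder pairs with gold similarity scores in [0, 1]
evaluator = EmbeddingSimilarityEvaluator(
    sentences1=["vertel my die huidige tyd in ottawa"] * 3,
    sentences2=["query time", "query cooking", "request definition"],
    scores=[1.0, 0.1, 0.0],
    name="example-eval",
)
results = evaluator(model)
print(results)  # pearson/spearman for cosine, euclidean, manhattan, and dot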

Training Details

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates
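
The citation section below lists CoSENTLoss, which suggests it was the training objective. A minimal training sketch under that assumption, wiring the hyperparameters above into the Sentence Transformers 3.x trainer (the tiny dataset here is a placeholder; fp16 assumes a CUDA GPU):

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CoSENTLoss

model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")
loss = CoSENTLoss(model)

# Placeholder data: (sentence1, sentence2, similarity score) triples
train_dataset = Dataset.from_dict({
    "sentence1": ["vertel my die huidige tyd in ottawa", "play some jazz"],
    "sentence2": ["query time", "play music"],
    "score": [1.0, 1.0],
})

args = SentenceTransformerTrainingArguments(
    output_dir="MiniLM-amazon_massive_intent-similarity",
    num_train_epochs=1,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    learning_rate=2e-5,
    warmup_ratio=0.1,
    fp16=True,  # assumes a CUDA GPU
    eval_strategy="steps",
    batch_sampler="no_duplicates",
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=train_dataset,  # placeholder; use a real dev split in practice
    loss=loss,
)
trainer.train()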

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss MiniLM-dev_spearman_cosine MiniLM-test_spearman_cosine
0.0031 100 7.4879 - - -
0.0062 200 6.4531 - - -
0.0093 300 6.4185 - - -
0.0125 400 4.5043 - - -
0.0156 500 5.1274 - - -
0.0187 600 6.0006 - - -
0.0218 700 4.8066 - - -
0.0249 800 3.9536 - - -
0.0280 900 4.7259 - - -
0.0311 1000 3.7583 2.6440 0.6640 -
0.0343 1100 3.9905 - - -
0.0374 1200 4.8914 - - -
0.0405 1300 3.895 - - -
0.0436 1400 3.1582 - - -
0.0467 1500 3.7172 - - -
0.0498 1600 3.6785 - - -
0.0529 1700 3.9632 - - -
0.0561 1800 3.9643 - - -
0.0592 1900 2.829 - - -
0.0623 2000 2.5923 2.3344 0.7459 -
0.0654 2100 3.1617 - - -
0.0685 2200 2.6366 - - -
0.0716 2300 4.3751 - - -
0.0747 2400 3.4732 - - -
0.0779 2500 2.5695 - - -
0.0810 2600 2.7479 - - -
0.0841 2700 2.5274 - - -
0.0872 2800 2.4204 - - -
0.0903 2900 4.1305 - - -
0.0934 3000 4.091 2.0951 0.7426 -
0.0965 3100 3.7972 - - -
0.0997 3200 2.6029 - - -
0.1028 3300 3.2422 - - -
0.1059 3400 3.3747 - - -
0.1090 3500 3.3358 - - -
0.1121 3600 2.8658 - - -
0.1152 3700 2.6436 - - -
0.1183 3800 2.2006 - - -
0.1215 3900 2.0549 - - -
0.1246 4000 2.4642 3.4108 0.7236 -
0.1277 4100 2.9219 - - -
0.1308 4200 2.6581 - - -
0.1339 4300 2.2697 - - -
0.1370 4400 2.7215 - - -
0.1401 4500 2.6023 - - -
0.1433 4600 1.8772 - - -
0.1464 4700 2.6885 - - -
0.1495 4800 2.6005 - - -
0.1526 4900 1.4849 - - -
0.1557 5000 2.4896 3.4860 0.7117 -
0.1588 5100 2.6038 - - -
0.1619 5200 2.0584 - - -
0.1651 5300 1.9156 - - -
0.1682 5400 1.467 - - -
0.1713 5500 0.5799 - - -
0.1744 5600 1.617 - - -
0.1775 5700 1.3764 - - -
0.1806 5800 3.067 - - -
0.1837 5900 2.2463 - - -
0.1869 6000 1.5466 2.5326 0.7721 -
0.1900 6100 1.4097 - - -
0.1931 6200 1.7852 - - -
0.1962 6300 1.2715 - - -
0.1993 6400 2.5585 - - -
0.2024 6500 2.4665 - - -
0.2055 6600 1.7246 - - -
0.2087 6700 1.145 - - -
0.2118 6800 1.614 - - -
0.2149 6900 1.7206 - - -
0.2180 7000 2.6349 2.6824 0.7652 -
0.2211 7100 2.1896 - - -
0.2242 7200 1.9106 - - -
0.2274 7300 1.3783 - - -
0.2305 7400 0.7119 - - -
0.2336 7500 1.5037 - - -
0.2367 7600 1.8365 - - -
0.2398 7700 1.3817 - - -
0.2429 7800 1.7101 - - -
0.2460 7900 1.6716 - - -
0.2492 8000 1.3013 3.5864 0.7401 -
0.2523 8100 1.5131 - - -
0.2554 8200 2.3699 - - -
0.2585 8300 1.6179 - - -
0.2616 8400 1.3 - - -
0.2647 8500 1.5151 - - -
0.2678 8600 2.8703 - - -
0.2710 8700 2.5076 - - -
0.2741 8800 1.9876 - - -
0.2772 8900 1.5823 - - -
0.2803 9000 1.0845 2.4197 0.7833 -
0.2834 9100 1.2871 - - -
0.2865 9200 1.3901 - - -
0.2896 9300 1.1607 - - -
0.2928 9400 2.1171 - - -
0.2959 9500 1.4335 - - -
0.2990 9600 0.801 - - -
0.3021 9700 1.4567 - - -
0.3052 9800 1.7046 - - -
0.3083 9900 1.4378 - - -
0.3114 10000 2.3191 2.3063 0.7903 -
0.3146 10100 1.6518 - - -
0.3177 10200 0.9857 - - -
0.3208 10300 2.2052 - - -
0.3239 10400 2.0443 - - -
0.3270 10500 2.08 - - -
0.3301 10600 2.0009 - - -
0.3332 10700 1.3274 - - -
0.3364 10800 1.0298 - - -
0.3395 10900 1.7127 - - -
0.3426 11000 1.3371 4.0607 0.7211 -
0.3457 11100 2.7555 - - -
0.3488 11200 4.1792 - - -
0.3519 11300 2.0931 - - -
0.3550 11400 2.4591 - - -
0.3582 11500 3.4962 - - -
0.3613 11600 1.9228 - - -
0.3644 11700 2.7295 - - -
0.3675 11800 1.5425 - - -
0.3706 11900 1.1586 - - -
0.3737 12000 1.1336 2.2959 0.7890 -
0.3768 12100 1.572 - - -
0.3800 12200 1.2827 - - -
0.3831 12300 1.6352 - - -
0.3862 12400 1.4708 - - -
0.3893 12500 1.4719 - - -
0.3924 12600 1.4136 - - -
0.3955 12700 1.3969 - - -
0.3986 12800 1.7228 - - -
0.4018 12900 4.2842 - - -
0.4049 13000 3.5861 2.1113 0.7956 -
0.4080 13100 2.9718 - - -
0.4111 13200 3.1554 - - -
0.4142 13300 3.1357 - - -
0.4173 13400 2.8488 - - -
0.4204 13500 3.7433 - - -
0.4236 13600 2.4195 - - -
0.4267 13700 2.1384 - - -
0.4298 13800 2.7965 - - -
0.4329 13900 1.7869 - - -
0.4360 14000 3.0356 2.7234 0.7697 -
0.4391 14100 3.4984 - - -
0.4422 14200 2.4959 - - -
0.4454 14300 2.4615 - - -
0.4485 14400 2.6309 - - -
0.4516 14500 1.9831 - - -
0.4547 14600 3.25 - - -
0.4578 14700 3.3112 - - -
0.4609 14800 1.9912 - - -
0.4640 14900 1.9252 - - -
0.4672 15000 2.4545 2.0730 0.7972 -
0.4703 15100 1.6943 - - -
0.4734 15200 2.2851 - - -
0.4765 15300 2.4327 - - -
0.4796 15400 1.3503 - - -
0.4827 15500 1.1419 - - -
0.4858 15600 1.7906 - - -
0.4890 15700 1.6504 - - -
0.4921 15800 1.6908 - - -
0.4952 15900 3.0954 - - -
0.4983 16000 1.7151 2.0042 0.8044 -
0.5014 16100 1.5165 - - -
0.5045 16200 2.5573 - - -
0.5076 16300 1.3401 - - -
0.5108 16400 2.5464 - - -
0.5139 16500 2.4564 - - -
0.5170 16600 2.1667 - - -
0.5201 16700 1.2402 - - -
0.5232 16800 1.932 - - -
0.5263 16900 1.1811 - - -
0.5294 17000 2.2014 2.0475 0.8062 -
0.5326 17100 2.6535 - - -
0.5357 17200 1.8715 - - -
0.5388 17300 1.9385 - - -
0.5419 17400 2.0398 - - -
0.5450 17500 1.3436 - - -
0.5481 17600 2.0687 - - -
0.5512 17700 1.6224 - - -
0.5544 17800 1.0539 - - -
0.5575 17900 1.1162 - - -
0.5606 18000 1.6334 2.4120 0.7964 -
0.5637 18100 1.247 - - -
0.5668 18200 2.4652 - - -
0.5699 18300 1.8593 - - -
0.5730 18400 1.1875 - - -
0.5762 18500 2.1173 - - -
0.5793 18600 1.7473 - - -
0.5824 18700 2.1865 - - -
0.5855 18800 1.683 - - -
0.5886 18900 1.6522 - - -
0.5917 19000 1.0526 2.0743 0.8033 -
0.5948 19100 1.5001 - - -
0.5980 19200 1.2606 - - -
0.6011 19300 1.0597 - - -
0.6042 19400 1.8603 - - -
0.6073 19500 1.4883 - - -
0.6104 19600 0.6594 - - -
0.6135 19700 0.9557 - - -
0.6166 19800 0.8651 - - -
0.6198 19900 1.0326 - - -
0.6229 20000 1.2785 2.0868 0.8075 -
0.6260 20100 1.2881 - - -
0.6291 20200 0.5919 - - -
0.6322 20300 1.69 - - -
0.6353 20400 1.0285 - - -
0.6385 20500 0.8843 - - -
0.6416 20600 1.3756 - - -
0.6447 20700 0.9646 - - -
0.6478 20800 0.8052 - - -
0.6509 20900 0.8996 - - -
0.6540 21000 1.2207 2.2881 0.8029 -
0.6571 21100 1.3225 - - -
0.6603 21200 1.8101 - - -
0.6634 21300 0.8756 - - -
0.6665 21400 0.9877 - - -
0.6696 21500 1.7329 - - -
0.6727 21600 1.6885 - - -
0.6758 21700 1.2132 - - -
0.6789 21800 1.4888 - - -
0.6821 21900 1.403 - - -
0.6852 22000 0.5995 2.1952 0.8036 -
0.6883 22100 0.9658 - - -
0.6914 22200 1.1485 - - -
0.6945 22300 1.089 - - -
0.6976 22400 1.2719 - - -
0.7007 22500 0.9611 - - -
0.7039 22600 0.9398 - - -
0.7070 22700 0.7931 - - -
0.7101 22800 1.1224 - - -
0.7132 22900 2.032 - - -
0.7163 23000 1.3664 2.1043 0.8075 -
0.7194 23100 0.7777 - - -
0.7225 23200 0.9427 - - -
0.7257 23300 0.8846 - - -
0.7288 23400 1.0039 - - -
0.7319 23500 0.9344 - - -
0.7350 23600 1.3712 - - -
0.7381 23700 0.8039 - - -
0.7412 23800 1.0735 - - -
0.7443 23900 0.9851 - - -
0.7475 24000 1.8673 2.1547 0.8066 -
0.7506 24100 5.5805 - - -
0.7537 24200 4.1286 - - -
0.7568 24300 2.2206 - - -
0.7599 24400 3.6468 - - -
0.7630 24500 2.9307 - - -
0.7661 24600 3.8745 - - -
0.7693 24700 2.2125 - - -
0.7724 24800 2.3844 - - -
0.7755 24900 1.5081 - - -
0.7786 25000 1.5982 1.8491 0.8145 -
0.7817 25100 2.1563 - - -
0.7848 25200 1.8558 - - -
0.7879 25300 2.2087 - - -
0.7911 25400 2.3953 - - -
0.7942 25500 1.4072 - - -
0.7973 25600 1.4637 - - -
0.8004 25700 2.2037 - - -
0.8035 25800 1.6241 - - -
0.8066 25900 1.4882 - - -
0.8097 26000 0.9108 1.9292 0.8115 -
0.8129 26100 0.9198 - - -
0.8160 26200 1.2981 - - -
0.8191 26300 1.0513 - - -
0.8222 26400 1.389 - - -
0.8253 26500 5.8539 - - -
0.8284 26600 3.547 - - -
0.8315 26700 2.3285 - - -
0.8347 26800 2.8112 - - -
0.8378 26900 3.3717 - - -
0.8409 27000 2.5921 1.9430 0.8108 -
0.8440 27100 1.5048 - - -
0.8471 27200 1.5 - - -
0.8502 27300 0.778 - - -
0.8533 27400 0.9557 - - -
0.8565 27500 1.347 - - -
0.8596 27600 1.5882 - - -
0.8627 27700 1.7333 - - -
0.8658 27800 1.5683 - - -
0.8689 27900 0.7698 - - -
0.8720 28000 1.2758 1.9704 0.8127 -
0.8751 28100 1.3248 - - -
0.8783 28200 1.041 - - -
0.8814 28300 1.6066 - - -
0.8845 28400 1.9033 - - -
0.8876 28500 0.8781 - - -
0.8907 28600 0.9345 - - -
0.8938 28700 0.9209 - - -
0.8969 28800 1.1443 - - -
0.9001 28900 0.9522 - - -
0.9032 29000 1.4295 2.0572 0.8111 -
0.9063 29100 0.9005 - - -
0.9094 29200 1.0024 - - -
0.9125 29300 1.3573 - - -
0.9156 29400 1.0805 - - -
0.9187 29500 1.3308 - - -
0.9219 29600 1.4853 - - -
0.9250 29700 2.0785 - - -
0.9281 29800 0.9283 - - -
0.9312 29900 0.8081 - - -
0.9343 30000 0.4223 2.0404 0.8115 -
0.9374 30100 0.8565 - - -
0.9405 30200 0.6674 - - -
0.9437 30300 0.5499 - - -
0.9468 30400 0.3212 - - -
0.9499 30500 0.166 - - -
0.9530 30600 0.1096 - - -
0.9561 30700 0.0382 - - -
0.9592 30800 0.2927 - - -
0.9623 30900 0.4097 - - -
0.9655 31000 0.5554 2.0068 0.8130 -
0.9686 31100 0.5783 - - -
0.9717 31200 0.376 - - -
0.9748 31300 0.3469 - - -
0.9779 31400 0.3043 - - -
0.9810 31500 0.4023 - - -
0.9841 31600 0.1876 - - -
0.9873 31700 0.4473 - - -
0.9904 31800 0.3256 - - -
0.9935 31900 0.4875 - - -
0.9966 32000 0.1807 2.0122 0.8129 -
0.9997 32100 0.3249 - - -
1.0 32109 - - - 0.8426

Framework Versions

  • Python: 3.10.14
  • Sentence Transformers: 3.0.1
  • Transformers: 4.41.2
  • PyTorch: 2.3.1+cu121
  • Accelerate: 0.33.0
  • Datasets: 2.21.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CoSENTLoss

@online{kexuefm-8847,
    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
    author={Su Jianlin},
    year={2022},
    month={Jan},
    url={https://kexue.fm/archives/8847},
}