
SentenceTransformer based on BAAI/bge-m3

This is a sentence-transformers model finetuned from BAAI/bge-m3 on the en-th dataset. It maps sentences & paragraphs to a 512-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-m3
  • Maximum Sequence Length: 544 tokens
  • Output Dimensionality: 512 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • en-th

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 544, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 1024, 'out_features': 512, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
)
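
For reference, an equivalent three-module stack can be assembled by hand with the sentence-transformers models API. A minimal sketch mirroring the printout above (illustrative, not the original training script):

import torch
from sentence_transformers import SentenceTransformer, models

# XLM-RoBERTa backbone of BAAI/bge-m3, truncating inputs at 544 tokens
word_embedding = models.Transformer("BAAI/bge-m3", max_seq_length=544)
# CLS-token pooling over the 1024-dimensional hidden states
pooling = models.Pooling(word_embedding.get_word_embedding_dimension(), pooling_mode="cls")
# Linear projection from 1024 down to the 512-dimensional output space
dense = models.Dense(in_features=1024, out_features=512, bias=False,
                     activation_function=torch.nn.Identity())

model = SentenceTransformer(modules=[word_embedding, pooling, dense])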

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("PetchP/m3-clip-ViT-B-16")
# Run inference
sentences = [
    'what kind of preferences did you have in mind for the italian restaurant you wanna go to?',
    'โดยที่ไม่ทำให้โทรศัพท์ดูใหญ่เทอะทะจนเกินไป แต่เคสนี้ไม่ตอบโจทย์ มันเป็นแค่เคสพลาสติกบางสุดๆ ที่แทบไม่มีอะไรบุกันกระแทก ยกเว้นส่วนขอบบนเครื่อง ส่วนหนึ่งเป็นเพราะเคสพอดีจนแทบไม่มีที่เหลือ แต่หลักๆน่าจะเป็นเพราะเคสไม่ได้เจาะรูไว้สำหรับเสียบสายชาร์จ',
    'ขอให้เป็นวันที่ดีค่ะ',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 512)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
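
Because English and Thai sentences share one embedding space, the model can also be used for cross-lingual semantic search. A small sketch with sentence-transformers' util.semantic_search, reusing model and embeddings from the snippet above (the query string is illustrative):

from sentence_transformers import util

# Rank the three sentences above against a new English query
query_embedding = model.encode("Have a good day!")
hits = util.semantic_search(query_embedding, embeddings, top_k=2)
print(hits[0])
# e.g. [{'corpus_id': 2, 'score': ...}, {'corpus_id': ..., 'score': ...}]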

Evaluation

Metrics

Knowledge Distillation

Metric Value
negative_mse -4.411
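
This is the score reported by sentence-transformers' MSEEvaluator: the mean squared error between student and teacher embeddings, scaled by 100 and negated so that higher is better, so -4.411 corresponds to a raw MSE of roughly 0.044. An equivalent computation on stand-in arrays:

import numpy as np

# Stand-ins for real (n_samples, 512) student and teacher embeddings
student_emb = np.random.rand(8, 512).astype(np.float32)
teacher_emb = np.random.rand(8, 512).astype(np.float32)

# negative_mse = -100 * mean squared error (higher is better)
negative_mse = -100.0 * float(np.mean((student_emb - teacher_emb) ** 2))
print(negative_mse)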

Translation

Metric Value
src2trg_accuracy 0.4276
trg2src_accuracy 0.4026
mean_accuracy 0.4151
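
These scores come from a translation-matching test over the parallel pairs: src2trg_accuracy is the fraction of English sentences whose true Thai counterpart ranks first by cosine similarity among all Thai candidates, trg2src_accuracy is the reverse direction, and mean_accuracy averages the two. A sketch with sentence-transformers' TranslationEvaluator, reusing the model loaded above and two illustrative pairs in place of the real evaluation set:

from sentence_transformers.evaluation import TranslationEvaluator

en_sentences = ["Have a nice day.", "What is that?"]
th_sentences = ["ขอให้เป็นวันที่ดีค่ะ", "นั่นอะไร?"]

evaluator = TranslationEvaluator(en_sentences, th_sentences, name="en-th")
print(evaluator(model))  # reports src2trg, trg2src, and mean accuracy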

Semantic Similarity

Metric Value
pearson_cosine 0.7552
spearman_cosine 0.7964
pearson_manhattan 0.8293
spearman_manhattan 0.8311
pearson_euclidean 0.8146
spearman_euclidean 0.818
pearson_dot 0.2645
spearman_dot 0.2604
pearson_max 0.8293
spearman_max 0.8311
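
Here pearson_max and spearman_max are simply the best scores across the similarity functions (Manhattan, in this case). These correlations measure how well the model's similarity scores agree with human judgements; the training-log column names point to the STS17 en-en test set. A sketch of such an evaluation with sentence-transformers' EmbeddingSimilarityEvaluator, using placeholder data:

from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

# Hypothetical STS-style pairs with gold similarity scores in [0, 1]
sentences1 = ["A man is playing a guitar.", "A woman is cooking."]
sentences2 = ["Someone plays an instrument.", "A plane is taking off."]
gold_scores = [0.8, 0.1]

evaluator = EmbeddingSimilarityEvaluator(
    sentences1, sentences2, gold_scores, name="sts17-en-en-test"
)
print(evaluator(model))  # pearson/spearman for cosine, euclidean, manhattan, dot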

Training Details

Training Dataset

en-th

  • Dataset: en-th
  • Size: 903,970 training samples
  • Columns: english, non_english, and label
  • Approximate statistics based on the first 1000 samples:
    Column       Type    Details
    english      string  min: 5 tokens, mean: 21.5 tokens, max: 91 tokens
    non_english  string  min: 3 tokens, mean: 21.67 tokens, max: 156 tokens
    label        list    size: 512 elements
  • Samples:
    • english: There is obviously a situation, when suddenly and spontaneously decide to go from south to north Cyprus, not preparing for this completely in terms of currency exchange. You do not have anything to worry about because you can very quickly to exchange currency in any of the Cypriot cantors, or such service is offered in many places - shops, hotels and even gas stations.
      non_english: มีสถานการณ์ที่เห็นได้ชัดเมื่อฉับพลันและธรรมชาติตัดสินใจที่จะไปจากทิศใต้ไปทางทิศเหนือไซปรัส, ไม่ได้เตรียมความพร้อมสำหรับนี้อย่างสมบูรณ์ในแง่ของการแลกเปลี่ยนเงินตราต่างประเทศ คุณไม่ได้มีอะไรต้องกังวลเกี่ยวกับเพราะคุณสามารถได้อย่างรวดเร็วเพื่อการแลกเปลี่ยนสกุลเงินในใด ๆ ของ cantors ไซปรัสหรือบริการดังกล่าวจะถูกนำเสนอในหลายสถานที่ -- โรงแรม, ร้านค้าและแม้แต่สถานีบริการน้ำมัน
      label: [0.08994044363498688, -0.16606739163398743, -0.19563019275665283, 0.16979621350765228, 0.36533093452453613, ...]
    • english: Alright. I've booked you for 7:00 this Thursday evening at Giorgio's on Pine. I also mentioned that you are celebrating an anniversary.
      non_english: มันถูกมากเกินกว่าที่จะส่งคืนหรือเปลี่ยนเป็นอันอื่นฉันไม่แนะนำค่ะ ของถูกก็แบบนี้
      label: [0.49537181854248047, 0.06981103122234344, -0.08879007399082184, -0.3542495667934418, -0.1312403678894043, ...]
    • english: s that?
      non_english: นั่นอะไร?
      label: [0.19816944003105164, -0.08889764547348022, 0.06616806238889694, -0.04803535342216492, 0.18784472346305847, ...]
  • Loss: MSELoss
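
The label column holds a 512-dimensional teacher embedding for each pair; MSELoss trains the student so that its embeddings of both the English and the Thai sentence regress onto that vector, following the multilingual knowledge-distillation recipe of Reimers & Gurevych (2020) cited below. A minimal training sketch with the sentence-transformers v3 trainer; the teacher model is only an assumption inferred from the model id (m3-clip-ViT-B-16), and the one-pair inline dataset is purely illustrative:

import torch
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer, models
from sentence_transformers.losses import MSELoss

# Student: bge-m3 with CLS pooling and the 1024 -> 512 projection described above
word = models.Transformer("BAAI/bge-m3", max_seq_length=544)
pool = models.Pooling(word.get_word_embedding_dimension(), pooling_mode="cls")
dense = models.Dense(in_features=1024, out_features=512, bias=False,
                     activation_function=torch.nn.Identity())
student = SentenceTransformer(modules=[word, pool, dense])

# Assumed teacher (any encoder with 512-dim outputs would fit this card)
teacher = SentenceTransformer("clip-ViT-B-16")

english = ["What is that?"]
non_english = ["นั่นอะไร?"]
labels = teacher.encode(english)  # teacher embeddings become the regression targets

train_dataset = Dataset.from_dict({
    "english": english,
    "non_english": non_english,
    "label": [vec.tolist() for vec in labels],
})

# MSELoss pulls the student's embedding of each text column toward "label"
trainer = SentenceTransformerTrainer(
    model=student,
    train_dataset=train_dataset,
    loss=MSELoss(student),
)
trainer.train()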

Evaluation Dataset

en-th

  • Dataset: en-th
  • Size: 5,000 evaluation samples
  • Columns: english, non_english, and label
  • Approximate statistics based on the first 1000 samples:
    Column       Type    Details
    english      string  min: 5 tokens, mean: 22.69 tokens, max: 101 tokens
    non_english  string  min: 4 tokens, mean: 22.05 tokens, max: 165 tokens
    label        list    size: 512 elements
  • Samples:
    • english: 3 medium pizzas, 1 olives and chicken, 1 pepperoni and sausage, and 1 meat lovers.
      non_english: ฉันชอบความจริงที่ว่ามันกะทัดรัดเช่นกัน
      label: [-0.058319706469774246, 0.34078648686408997, -0.21020987629890442, -0.46271052956581116, -0.08354806154966354, ...]
    • english: Yay write super long essay, quite satisfying! See how ba! Still haveto study for exams lol
      non_english: เขียนเรียงความที่โคตรยาว ค่อนข้างพอใจอยู่นะ ดูว่ายังไง ต้องศึกษาเพื่อสอบอ่ะ
      label: [-0.36296623945236206, 0.23631885647773743, -0.10706634074449539, -0.01760946214199066, -0.25405243039131165, ...]
    • english: no problems, how many people and what time?
      non_english: 55.7 ซม.
      label: [0.2116040736436844, -0.050325457006692886, -0.018645018339157104, -0.14866583049297333, 0.18265873193740845, ...]
  • Loss: MSELoss

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 10
  • warmup_ratio: 0.1
  • fp16: True

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch | Step | Training Loss | en-th loss | en-th_mean_accuracy | en-th_negative_mse | sts17-en-en-test_spearman_max
0.0018 100 0.4339 - - - -
0.0035 200 0.4325 - - - -
0.0053 300 0.414 - - - -
0.0071 400 0.3828 - - - -
0.0088 500 0.3478 - - - -
0.0106 600 0.3033 - - - -
0.0124 700 0.2495 - - - -
0.0142 800 0.1806 - - - -
0.0159 900 0.1279 - - - -
0.0177 1000 0.1004 - - - -
0.0195 1100 0.0853 - - - -
0.0212 1200 0.0779 - - - -
0.0230 1300 0.0735 - - - -
0.0248 1400 0.0716 - - - -
0.0265 1500 0.0698 - - - -
0.0283 1600 0.0682 - - - -
0.0301 1700 0.068 - - - -
0.0319 1800 0.0667 - - - -
0.0336 1900 0.0657 - - - -
0.0354 2000 0.0653 - - - -
0.0372 2100 0.0655 - - - -
0.0389 2200 0.0637 - - - -
0.0407 2300 0.0632 - - - -
0.0425 2400 0.0625 - - - -
0.0442 2500 0.0621 - - - -
0.0460 2600 0.0615 - - - -
0.0478 2700 0.0599 - - - -
0.0496 2800 0.0604 - - - -
0.0513 2900 0.0596 - - - -
0.0531 3000 0.0591 - - - -
0.0549 3100 0.0592 - - - -
0.0566 3200 0.0587 - - - -
0.0584 3300 0.0575 - - - -
0.0602 3400 0.0575 - - - -
0.0619 3500 0.0571 - - - -
0.0637 3600 0.0567 - - - -
0.0655 3700 0.057 - - - -
0.0673 3800 0.0569 - - - -
0.0690 3900 0.0567 - - - -
0.0708 4000 0.0571 - - - -
0.0726 4100 0.0564 - - - -
0.0743 4200 0.0561 - - - -
0.0761 4300 0.0557 - - - -
0.0779 4400 0.0563 - - - -
0.0796 4500 0.0558 - - - -
0.0814 4600 0.0551 - - - -
0.0832 4700 0.0555 - - - -
0.0850 4800 0.0554 - - - -
0.0867 4900 0.0553 - - - -
0.0885 5000 0.0545 0.0519 0.0714 -5.5590 0.5033
0.0903 5100 0.055 - - - -
0.0920 5200 0.0552 - - - -
0.0938 5300 0.0539 - - - -
0.0956 5400 0.0537 - - - -
0.0973 5500 0.0537 - - - -
0.0991 5600 0.054 - - - -
0.1009 5700 0.0543 - - - -
0.1027 5800 0.0536 - - - -
0.1044 5900 0.0536 - - - -
0.1062 6000 0.053 - - - -
0.1080 6100 0.0527 - - - -
0.1097 6200 0.0531 - - - -
0.1115 6300 0.0537 - - - -
0.1133 6400 0.0526 - - - -
0.1150 6500 0.0528 - - - -
0.1168 6600 0.0527 - - - -
0.1186 6700 0.052 - - - -
0.1204 6800 0.0527 - - - -
0.1221 6900 0.0521 - - - -
0.1239 7000 0.0513 - - - -
0.1257 7100 0.0517 - - - -
0.1274 7200 0.0514 - - - -
0.1292 7300 0.052 - - - -
0.1310 7400 0.0511 - - - -
0.1327 7500 0.0502 - - - -
0.1345 7600 0.0511 - - - -
0.1363 7700 0.0506 - - - -
0.1381 7800 0.0509 - - - -
0.1398 7900 0.0507 - - - -
0.1416 8000 0.0507 - - - -
0.1434 8100 0.0506 - - - -
0.1451 8200 0.0503 - - - -
0.1469 8300 0.0501 - - - -
0.1487 8400 0.0505 - - - -
0.1504 8500 0.0497 - - - -
0.1522 8600 0.0501 - - - -
0.1540 8700 0.049 - - - -
0.1558 8800 0.0496 - - - -
0.1575 8900 0.0495 - - - -
0.1593 9000 0.0491 - - - -
0.1611 9100 0.0494 - - - -
0.1628 9200 0.0493 - - - -
0.1646 9300 0.049 - - - -
0.1664 9400 0.0484 - - - -
0.1681 9500 0.0493 - - - -
0.1699 9600 0.0491 - - - -
0.1717 9700 0.049 - - - -
0.1735 9800 0.0483 - - - -
0.1752 9900 0.0485 - - - -
0.1770 10000 0.0488 0.0465 0.2097 -5.2368 0.5897
0.1788 10100 0.0477 - - - -
0.1805 10200 0.0477 - - - -
0.1823 10300 0.0485 - - - -
0.1841 10400 0.0477 - - - -
0.1858 10500 0.0481 - - - -
0.1876 10600 0.0475 - - - -
0.1894 10700 0.0471 - - - -
0.1912 10800 0.0478 - - - -
0.1929 10900 0.0468 - - - -
0.1947 11000 0.0474 - - - -
0.1965 11100 0.0471 - - - -
0.1982 11200 0.0475 - - - -
0.2000 11300 0.0467 - - - -
0.2018 11400 0.0467 - - - -
0.2035 11500 0.0472 - - - -
0.2053 11600 0.0468 - - - -
0.2071 11700 0.0466 - - - -
0.2089 11800 0.0463 - - - -
0.2106 11900 0.0464 - - - -
0.2124 12000 0.0456 - - - -
0.2142 12100 0.0467 - - - -
0.2159 12200 0.0466 - - - -
0.2177 12300 0.0462 - - - -
0.2195 12400 0.0466 - - - -
0.2212 12500 0.0465 - - - -
0.2230 12600 0.0456 - - - -
0.2248 12700 0.0454 - - - -
0.2266 12800 0.0456 - - - -
0.2283 12900 0.0451 - - - -
0.2301 13000 0.0458 - - - -
0.2319 13100 0.0458 - - - -
0.2336 13200 0.0456 - - - -
0.2354 13300 0.0449 - - - -
0.2372 13400 0.0458 - - - -
0.2389 13500 0.0448 - - - -
0.2407 13600 0.0452 - - - -
0.2425 13700 0.0453 - - - -
0.2443 13800 0.046 - - - -
0.2460 13900 0.0455 - - - -
0.2478 14000 0.0448 - - - -
0.2496 14100 0.0448 - - - -
0.2513 14200 0.0446 - - - -
0.2531 14300 0.045 - - - -
0.2549 14400 0.0444 - - - -
0.2566 14500 0.0447 - - - -
0.2584 14600 0.0445 - - - -
0.2602 14700 0.0446 - - - -
0.2620 14800 0.0446 - - - -
0.2637 14900 0.0441 - - - -
0.2655 15000 0.0441 0.0426 0.3042 -5.0075 0.6642
0.2673 15100 0.0441 - - - -
0.2690 15200 0.0435 - - - -
0.2708 15300 0.0447 - - - -
0.2726 15400 0.044 - - - -
0.2743 15500 0.0447 - - - -
0.2761 15600 0.0435 - - - -
0.2779 15700 0.043 - - - -
0.2797 15800 0.0434 - - - -
0.2814 15900 0.0433 - - - -
0.2832 16000 0.043 - - - -
0.2850 16100 0.0435 - - - -
0.2867 16200 0.0439 - - - -
0.2885 16300 0.0437 - - - -
0.2903 16400 0.0435 - - - -
0.2920 16500 0.0435 - - - -
0.2938 16600 0.0438 - - - -
0.2956 16700 0.0431 - - - -
0.2974 16800 0.043 - - - -
0.2991 16900 0.0425 - - - -
0.3009 17000 0.0434 - - - -
0.3027 17100 0.0425 - - - -
0.3044 17200 0.0433 - - - -
0.3062 17300 0.0435 - - - -
0.3080 17400 0.0431 - - - -
0.3097 17500 0.0421 - - - -
0.3115 17600 0.043 - - - -
0.3133 17700 0.0429 - - - -
0.3150 17800 0.0426 - - - -
0.3168 17900 0.0423 - - - -
0.3186 18000 0.0424 - - - -
0.3204 18100 0.0428 - - - -
0.3221 18200 0.0417 - - - -
0.3239 18300 0.0428 - - - -
0.3257 18400 0.0421 - - - -
0.3274 18500 0.0424 - - - -
0.3292 18600 0.043 - - - -
0.3310 18700 0.0421 - - - -
0.3327 18800 0.0413 - - - -
0.3345 18900 0.0417 - - - -
0.3363 19000 0.0428 - - - -
0.3381 19100 0.0421 - - - -
0.3398 19200 0.042 - - - -
0.3416 19300 0.0417 - - - -
0.3434 19400 0.042 - - - -
0.3451 19500 0.0416 - - - -
0.3469 19600 0.0413 - - - -
0.3487 19700 0.0415 - - - -
0.3504 19800 0.0415 - - - -
0.3522 19900 0.0418 - - - -
0.3540 20000 0.0412 0.0399 0.3538 -4.8579 0.7194
0.3558 20100 0.041 - - - -
0.3575 20200 0.0414 - - - -
0.3593 20300 0.041 - - - -
0.3611 20400 0.0417 - - - -
0.3628 20500 0.0413 - - - -
0.3646 20600 0.0407 - - - -
0.3664 20700 0.0406 - - - -
0.3681 20800 0.0412 - - - -
0.3699 20900 0.0413 - - - -
0.3717 21000 0.0408 - - - -
0.3735 21100 0.0412 - - - -
0.3752 21200 0.0408 - - - -
0.3770 21300 0.041 - - - -
0.3788 21400 0.0402 - - - -
0.3805 21500 0.0405 - - - -
0.3823 21600 0.04 - - - -
0.3841 21700 0.0398 - - - -
0.3858 21800 0.0409 - - - -
0.3876 21900 0.0408 - - - -
0.3894 22000 0.041 - - - -
0.3912 22100 0.0409 - - - -
0.3929 22200 0.0405 - - - -
0.3947 22300 0.0401 - - - -
0.3965 22400 0.0409 - - - -
0.3982 22500 0.0403 - - - -
0.4000 22600 0.041 - - - -
0.4018 22700 0.041 - - - -
0.4035 22800 0.0408 - - - -
0.4053 22900 0.0396 - - - -
0.4071 23000 0.0403 - - - -
0.4089 23100 0.0402 - - - -
0.4106 23200 0.0393 - - - -
0.4124 23300 0.0402 - - - -
0.4142 23400 0.0404 - - - -
0.4159 23500 0.0406 - - - -
0.4177 23600 0.0398 - - - -
0.4195 23700 0.0398 - - - -
0.4212 23800 0.0394 - - - -
0.4230 23900 0.0394 - - - -
0.4248 24000 0.0398 - - - -
0.4266 24100 0.0399 - - - -
0.4283 24200 0.0396 - - - -
0.4301 24300 0.0401 - - - -
0.4319 24400 0.0396 - - - -
0.4336 24500 0.0403 - - - -
0.4354 24600 0.0394 - - - -
0.4372 24700 0.0403 - - - -
0.4389 24800 0.0393 - - - -
0.4407 24900 0.039 - - - -
0.4425 25000 0.0393 0.0382 0.3921 -4.7803 0.7439
0.4443 25100 0.0389 - - - -
0.4460 25200 0.0396 - - - -
0.4478 25300 0.0391 - - - -
0.4496 25400 0.0393 - - - -
0.4513 25500 0.0395 - - - -
0.4531 25600 0.0396 - - - -
0.4549 25700 0.0392 - - - -
0.4566 25800 0.0386 - - - -
0.4584 25900 0.0389 - - - -
0.4602 26000 0.0381 - - - -
0.4620 26100 0.0393 - - - -
0.4637 26200 0.0389 - - - -
0.4655 26300 0.0388 - - - -
0.4673 26400 0.0391 - - - -
0.4690 26500 0.0387 - - - -
0.4708 26600 0.0391 - - - -
0.4726 26700 0.0389 - - - -
0.4743 26800 0.0383 - - - -
0.4761 26900 0.0389 - - - -
0.4779 27000 0.0395 - - - -
0.4797 27100 0.0388 - - - -
0.4814 27200 0.0393 - - - -
0.4832 27300 0.0392 - - - -
0.4850 27400 0.0383 - - - -
0.4867 27500 0.0383 - - - -
0.4885 27600 0.0385 - - - -
0.4903 27700 0.0386 - - - -
0.4920 27800 0.0389 - - - -
0.4938 27900 0.0393 - - - -
0.4956 28000 0.0385 - - - -
0.4974 28100 0.0391 - - - -
0.4991 28200 0.0383 - - - -
0.5009 28300 0.0391 - - - -
0.5027 28400 0.0385 - - - -
0.5044 28500 0.0378 - - - -
0.5062 28600 0.0382 - - - -
0.5080 28700 0.0387 - - - -
0.5097 28800 0.0378 - - - -
0.5115 28900 0.0383 - - - -
0.5133 29000 0.038 - - - -
0.5151 29100 0.0383 - - - -
0.5168 29200 0.0382 - - - -
0.5186 29300 0.0377 - - - -
0.5204 29400 0.0376 - - - -
0.5221 29500 0.0381 - - - -
0.5239 29600 0.0378 - - - -
0.5257 29700 0.0387 - - - -
0.5274 29800 0.0378 - - - -
0.5292 29900 0.0383 - - - -
0.5310 30000 0.0383 0.0367 0.3935 -4.6879 0.7613
0.5328 30100 0.0369 - - - -
0.5345 30200 0.0378 - - - -
0.5363 30300 0.0391 - - - -
0.5381 30400 0.0382 - - - -
0.5398 30500 0.0383 - - - -
0.5416 30600 0.038 - - - -
0.5434 30700 0.0375 - - - -
0.5451 30800 0.0374 - - - -
0.5469 30900 0.037 - - - -
0.5487 31000 0.0378 - - - -
0.5505 31100 0.0373 - - - -
0.5522 31200 0.0381 - - - -
0.5540 31300 0.038 - - - -
0.5558 31400 0.0381 - - - -
0.5575 31500 0.0375 - - - -
0.5593 31600 0.037 - - - -
0.5611 31700 0.037 - - - -
0.5628 31800 0.0377 - - - -
0.5646 31900 0.0377 - - - -
0.5664 32000 0.0373 - - - -
0.5682 32100 0.0368 - - - -
0.5699 32200 0.0369 - - - -
0.5717 32300 0.037 - - - -
0.5735 32400 0.0382 - - - -
0.5752 32500 0.0372 - - - -
0.5770 32600 0.0372 - - - -
0.5788 32700 0.0373 - - - -
0.5805 32800 0.0371 - - - -
0.5823 32900 0.0369 - - - -
0.5841 33000 0.0371 - - - -
0.5859 33100 0.0374 - - - -
0.5876 33200 0.0376 - - - -
0.5894 33300 0.0373 - - - -
0.5912 33400 0.0375 - - - -
0.5929 33500 0.0366 - - - -
0.5947 33600 0.0368 - - - -
0.5965 33700 0.0374 - - - -
0.5982 33800 0.0366 - - - -
0.6000 33900 0.0372 - - - -
0.6018 34000 0.0379 - - - -
0.6036 34100 0.0362 - - - -
0.6053 34200 0.0365 - - - -
0.6071 34300 0.0374 - - - -
0.6089 34400 0.0369 - - - -
0.6106 34500 0.0372 - - - -
0.6124 34600 0.0366 - - - -
0.6142 34700 0.0366 - - - -
0.6159 34800 0.0368 - - - -
0.6177 34900 0.0367 - - - -
0.6195 35000 0.037 0.0356 0.4148 -4.6403 0.7963
0.6212 35100 0.0365 - - - -
0.6230 35200 0.0361 - - - -
0.6248 35300 0.0367 - - - -
0.6266 35400 0.0362 - - - -
0.6283 35500 0.0365 - - - -
0.6301 35600 0.0374 - - - -
0.6319 35700 0.0369 - - - -
0.6336 35800 0.0371 - - - -
0.6354 35900 0.0367 - - - -
0.6372 36000 0.0371 - - - -
0.6389 36100 0.0371 - - - -
0.6407 36200 0.0366 - - - -
0.6425 36300 0.0358 - - - -
0.6443 36400 0.0374 - - - -
0.6460 36500 0.0368 - - - -
0.6478 36600 0.037 - - - -
0.6496 36700 0.0365 - - - -
0.6513 36800 0.036 - - - -
0.6531 36900 0.036 - - - -
0.6549 37000 0.0365 - - - -
0.6566 37100 0.0362 - - - -
0.6584 37200 0.0371 - - - -
0.6602 37300 0.0366 - - - -
0.6620 37400 0.0366 - - - -
0.6637 37500 0.0361 - - - -
0.6655 37600 0.0357 - - - -
0.6673 37700 0.0378 - - - -
0.6690 37800 0.0363 - - - -
0.6708 37900 0.0365 - - - -
0.6726 38000 0.0363 - - - -
0.6743 38100 0.0367 - - - -
0.6761 38200 0.0359 - - - -
0.6779 38300 0.0365 - - - -
0.6797 38400 0.036 - - - -
0.6814 38500 0.036 - - - -
0.6832 38600 0.0362 - - - -
0.6850 38700 0.036 - - - -
0.6867 38800 0.0362 - - - -
0.6885 38900 0.036 - - - -
0.6903 39000 0.0367 - - - -
0.6920 39100 0.0364 - - - -
0.6938 39200 0.0365 - - - -
0.6956 39300 0.0359 - - - -
0.6974 39400 0.0363 - - - -
0.6991 39500 0.0355 - - - -
0.7009 39600 0.0358 - - - -
0.7027 39700 0.0356 - - - -
0.7044 39800 0.0363 - - - -
0.7062 39900 0.0362 - - - -
0.7080 40000 0.0358 0.0345 0.4029 -4.5699 0.8057
0.7097 40100 0.0359 - - - -
0.7115 40200 0.0363 - - - -
0.7133 40300 0.0357 - - - -
0.7151 40400 0.0356 - - - -
0.7168 40500 0.0356 - - - -
0.7186 40600 0.036 - - - -
0.7204 40700 0.0353 - - - -
0.7221 40800 0.0369 - - - -
0.7239 40900 0.0356 - - - -
0.7257 41000 0.0359 - - - -
0.7274 41100 0.036 - - - -
0.7292 41200 0.0362 - - - -
0.7310 41300 0.0357 - - - -
0.7328 41400 0.0357 - - - -
0.7345 41500 0.0356 - - - -
0.7363 41600 0.0357 - - - -
0.7381 41700 0.0354 - - - -
0.7398 41800 0.0356 - - - -
0.7416 41900 0.035 - - - -
0.7434 42000 0.0345 - - - -
0.7451 42100 0.0355 - - - -
0.7469 42200 0.0354 - - - -
0.7487 42300 0.0353 - - - -
0.7505 42400 0.035 - - - -
0.7522 42500 0.0358 - - - -
0.7540 42600 0.0356 - - - -
0.7558 42700 0.0353 - - - -
0.7575 42800 0.0352 - - - -
0.7593 42900 0.0349 - - - -
0.7611 43000 0.0347 - - - -
0.7628 43100 0.0355 - - - -
0.7646 43200 0.0351 - - - -
0.7664 43300 0.0358 - - - -
0.7682 43400 0.0348 - - - -
0.7699 43500 0.0348 - - - -
0.7717 43600 0.0347 - - - -
0.7735 43700 0.0353 - - - -
0.7752 43800 0.0354 - - - -
0.7770 43900 0.0349 - - - -
0.7788 44000 0.0356 - - - -
0.7805 44100 0.0353 - - - -
0.7823 44200 0.0346 - - - -
0.7841 44300 0.0347 - - - -
0.7859 44400 0.0344 - - - -
0.7876 44500 0.0354 - - - -
0.7894 44600 0.0347 - - - -
0.7912 44700 0.0344 - - - -
0.7929 44800 0.0345 - - - -
0.7947 44900 0.035 - - - -
0.7965 45000 0.0343 0.0337 0.4095 -4.5223 0.8104
0.7982 45100 0.0347 - - - -
0.8000 45200 0.0344 - - - -
0.8018 45300 0.0347 - - - -
0.8036 45400 0.034 - - - -
0.8053 45500 0.0341 - - - -
0.8071 45600 0.0352 - - - -
0.8089 45700 0.0345 - - - -
0.8106 45800 0.0341 - - - -
0.8124 45900 0.0351 - - - -
0.8142 46000 0.0346 - - - -
0.8159 46100 0.0345 - - - -
0.8177 46200 0.0354 - - - -
0.8195 46300 0.0342 - - - -
0.8213 46400 0.0343 - - - -
0.8230 46500 0.0346 - - - -
0.8248 46600 0.0342 - - - -
0.8266 46700 0.0344 - - - -
0.8283 46800 0.0343 - - - -
0.8301 46900 0.0354 - - - -
0.8319 47000 0.035 - - - -
0.8336 47100 0.0345 - - - -
0.8354 47200 0.0347 - - - -
0.8372 47300 0.0336 - - - -
0.8390 47400 0.0345 - - - -
0.8407 47500 0.0344 - - - -
0.8425 47600 0.0345 - - - -
0.8443 47700 0.0345 - - - -
0.8460 47800 0.0348 - - - -
0.8478 47900 0.0347 - - - -
0.8496 48000 0.0343 - - - -
0.8513 48100 0.0347 - - - -
0.8531 48200 0.0351 - - - -
0.8549 48300 0.0339 - - - -
0.8567 48400 0.0344 - - - -
0.8584 48500 0.0348 - - - -
0.8602 48600 0.0345 - - - -
0.8620 48700 0.0343 - - - -
0.8637 48800 0.0343 - - - -
0.8655 48900 0.0343 - - - -
0.8673 49000 0.0344 - - - -
0.8690 49100 0.0342 - - - -
0.8708 49200 0.0344 - - - -
0.8726 49300 0.034 - - - -
0.8744 49400 0.0343 - - - -
0.8761 49500 0.0346 - - - -
0.8779 49600 0.0345 - - - -
0.8797 49700 0.0337 - - - -
0.8814 49800 0.0339 - - - -
0.8832 49900 0.0341 - - - -
0.8850 50000 0.0343 0.0328 0.4145 -4.4765 0.8153
0.8867 50100 0.0341 - - - -
0.8885 50200 0.0344 - - - -
0.8903 50300 0.0342 - - - -
0.8921 50400 0.0344 - - - -
0.8938 50500 0.0336 - - - -
0.8956 50600 0.034 - - - -
0.8974 50700 0.0346 - - - -
0.8991 50800 0.0349 - - - -
0.9009 50900 0.0343 - - - -
0.9027 51000 0.0345 - - - -
0.9044 51100 0.0339 - - - -
0.9062 51200 0.0344 - - - -
0.9080 51300 0.0337 - - - -
0.9098 51400 0.034 - - - -
0.9115 51500 0.0341 - - - -
0.9133 51600 0.0342 - - - -
0.9151 51700 0.0339 - - - -
0.9168 51800 0.0336 - - - -
0.9186 51900 0.0342 - - - -
0.9204 52000 0.0354 - - - -
0.9221 52100 0.0337 - - - -
0.9239 52200 0.0338 - - - -
0.9257 52300 0.0344 - - - -
0.9275 52400 0.0338 - - - -
0.9292 52500 0.0337 - - - -
0.9310 52600 0.0335 - - - -
0.9328 52700 0.0329 - - - -
0.9345 52800 0.0335 - - - -
0.9363 52900 0.0341 - - - -
0.9381 53000 0.0338 - - - -
0.9398 53100 0.0336 - - - -
0.9416 53200 0.0337 - - - -
0.9434 53300 0.0339 - - - -
0.9451 53400 0.0333 - - - -
0.9469 53500 0.0336 - - - -
0.9487 53600 0.034 - - - -
0.9505 53700 0.0334 - - - -
0.9522 53800 0.0338 - - - -
0.9540 53900 0.0324 - - - -
0.9558 54000 0.0333 - - - -
0.9575 54100 0.0331 - - - -
0.9593 54200 0.0331 - - - -
0.9611 54300 0.0332 - - - -
0.9628 54400 0.0339 - - - -
0.9646 54500 0.0337 - - - -
0.9664 54600 0.0338 - - - -
0.9682 54700 0.0335 - - - -
0.9699 54800 0.0337 - - - -
0.9717 54900 0.0336 - - - -
0.9735 55000 0.0337 0.0323 0.4196 -4.4734 0.8236
0.9752 55100 0.0338 - - - -
0.9770 55200 0.0343 - - - -
0.9788 55300 0.0334 - - - -
0.9805 55400 0.0336 - - - -
0.9823 55500 0.0329 - - - -
0.9841 55600 0.0338 - - - -
0.9859 55700 0.0326 - - - -
0.9876 55800 0.0328 - - - -
0.9894 55900 0.0335 - - - -
0.9912 56000 0.0333 - - - -
0.9929 56100 0.0335 - - - -
0.9947 56200 0.0332 - - - -
0.9965 56300 0.0335 - - - -
0.9982 56400 0.0337 - - - -
1.0000 56500 0.0328 - - - -
1.0018 56600 0.0336 - - - -
1.0036 56700 0.0331 - - - -
1.0053 56800 0.0333 - - - -
1.0071 56900 0.0331 - - - -
1.0089 57000 0.0332 - - - -
1.0106 57100 0.0319 - - - -
1.0124 57200 0.0331 - - - -
1.0142 57300 0.0335 - - - -
1.0159 57400 0.0329 - - - -
1.0177 57500 0.0328 - - - -
1.0195 57600 0.0329 - - - -
1.0213 57700 0.033 - - - -
1.0230 57800 0.0328 - - - -
1.0248 57900 0.0335 - - - -
1.0266 58000 0.033 - - - -
1.0283 58100 0.033 - - - -
1.0301 58200 0.0328 - - - -
1.0319 58300 0.0327 - - - -
1.0336 58400 0.0327 - - - -
1.0354 58500 0.0334 - - - -
1.0372 58600 0.0332 - - - -
1.0390 58700 0.0333 - - - -
1.0407 58800 0.0328 - - - -
1.0425 58900 0.033 - - - -
1.0443 59000 0.0331 - - - -
1.0460 59100 0.0331 - - - -
1.0478 59200 0.0321 - - - -
1.0496 59300 0.0329 - - - -
1.0513 59400 0.0323 - - - -
1.0531 59500 0.0326 - - - -
1.0549 59600 0.033 - - - -
1.0567 59700 0.0333 - - - -
1.0584 59800 0.0321 - - - -
1.0602 59900 0.0326 - - - -
1.0620 60000 0.0326 0.0315 0.4151 -4.4110 0.8311

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.42.4
  • PyTorch: 2.3.1+cu121
  • Accelerate: 0.32.1
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MSELoss

@inproceedings{reimers-2020-multilingual-sentence-bert,
    title = "Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2020",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2004.09813",
}