SentenceTransformer based on distilbert/distilbert-base-uncased

This is a sentence-transformers model finetuned from distilbert/distilbert-base-uncased. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: distilbert/distilbert-base-uncased
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DistilBertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
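
The Pooling module is configured with pooling_mode_mean_tokens: True, so a sentence embedding is the mean of DistilBERT's token embeddings over non-padding positions. A minimal sketch of that computation, using the base distilbert weights from transformers rather than this model's fine-tuned ones:

import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
encoder = AutoModel.from_pretrained("distilbert/distilbert-base-uncased")

batch = tokenizer(
    ["T ENGINE TRANS TOP LAT 90 Deg Front 2025 U717 G-S"],
    padding=True, truncation=True, max_length=512, return_tensors="pt",
)
with torch.no_grad():
    token_embeddings = encoder(**batch).last_hidden_state   # [batch, seq_len, 768]

# Mean-pool over real tokens only: zero out padding via the attention mask
mask = batch["attention_mask"].unsqueeze(-1).float()         # [batch, seq_len, 1]
sentence_embedding = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)
print(sentence_embedding.shape)                              # torch.Size([1, 768])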

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("gkudirka/crash_encoder2-sts")
# Run inference
sentences = [
    'T ENGINE TRANS TOP LAT 90 Deg Front 2025 U717 G-S',
    'T R F ACTIVE VENT SQUIB VOLT 90 Deg Front 2021 P702 VOLTS',
    'T ENGINE TRANS TOP LAT 30 Deg Front Angular Left 2020 P558 G-S',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
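
The same two calls support semantic search against a small corpus. A minimal sketch that reuses the example sentences, ranking two of them against the first as a query:

import torch
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("gkudirka/crash_encoder2-sts")
query = 'T ENGINE TRANS TOP LAT 90 Deg Front 2025 U717 G-S'
corpus = [
    'T R F ACTIVE VENT SQUIB VOLT 90 Deg Front 2021 P702 VOLTS',
    'T ENGINE TRANS TOP LAT 30 Deg Front Angular Left 2020 P558 G-S',
]

# Encode query and corpus separately, then score the query against each entry
query_emb = model.encode([query])
corpus_emb = model.encode(corpus)
scores = model.similarity(query_emb, corpus_emb)  # cosine similarities, shape [1, 2]

best = int(torch.argmax(scores))
print(corpus[best], float(scores[0, best]))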

Evaluation

Metrics

Semantic Similarity

| Metric             | Value  |
|:-------------------|:-------|
| pearson_cosine     | 0.4518 |
| spearman_cosine    | 0.4762 |
| pearson_manhattan  | 0.4253 |
| spearman_manhattan | 0.4638 |
| pearson_euclidean  | 0.4262 |
| spearman_euclidean | 0.4652 |
| pearson_dot        | 0.3898 |
| spearman_dot       | 0.3740 |
| pearson_max        | 0.4518 |
| spearman_max       | 0.4762 |

Semantic Similarity

| Metric             | Value  |
|:-------------------|:-------|
| pearson_cosine     | 0.4412 |
| spearman_cosine    | 0.4671 |
| pearson_manhattan  | 0.4156 |
| spearman_manhattan | 0.4560 |
| pearson_euclidean  | 0.4167 |
| spearman_euclidean | 0.4575 |
| pearson_dot        | 0.3753 |
| spearman_dot       | 0.3629 |
| pearson_max        | 0.4412 |
| spearman_max       | 0.4671 |
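
Both tables report the correlation between gold similarity scores and the model's embedding similarities under several distance measures; pearson_max and spearman_max are simply the best value across measures. A minimal sketch of how such numbers are produced with EmbeddingSimilarityEvaluator, using the three sample pairs listed under Training Details as stand-in data:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("gkudirka/crash_encoder2-sts")

sent1 = "T L F DUMMY PELVIS VERT Dynamic Seat Sled Test 2025 U718 G-S"
evaluator = EmbeddingSimilarityEvaluator(
    sentences1=[sent1, sent1, sent1],
    sentences2=[
        "T SCS R2 HY REF 059 R C PLR REF Y SM LAT 90 Deg / Left Side Decel-4g 2020 CX483 G-S",
        "T R F DUMMY PELVIS VERT 75 Deg Oblique Right Side 10 in. Pole 2015 P552 G-S",
        "T SCS L1 HY REF 053 L B PLR REF Y SM LAT 90 Deg Front Bumper Override 2021 CX727 G-S",
    ],
    scores=[0.2113, 0.4973, 0.5701],  # gold similarities from the sample rows
    name="sts-dev",
)
results = evaluator(model)  # dict with keys like 'sts-dev_pearson_cosine', 'sts-dev_spearman_cosine'
print(results)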

Training Details

Training Dataset

Unnamed Dataset

  • Size: 8,081,275 training samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1                                          | sentence2                                          | score                          |
    |:--------|:---------------------------------------------------|:---------------------------------------------------|:-------------------------------|
    | type    | string                                             | string                                             | float                          |
    | details | min: 23 tokens, mean: 31.48 tokens, max: 40 tokens | min: 16 tokens, mean: 30.06 tokens, max: 55 tokens | min: 0.0, mean: 0.44, max: 1.0 |
  • Samples:
    | sentence1 | sentence2 | score |
    |:----------|:----------|:------|
    | T L F DUMMY PELVIS VERT Dynamic Seat Sled Test 2025 U718 G-S | T SCS R2 HY REF 059 R C PLR REF Y SM LAT 90 Deg / Left Side Decel-4g 2020 CX483 G-S | 0.21129386503072142 |
    | T L F DUMMY PELVIS VERT Dynamic Seat Sled Test 2025 U718 G-S | T R F DUMMY PELVIS VERT 75 Deg Oblique Right Side 10 in. Pole 2015 P552 G-S | 0.4972955033248179 |
    | T L F DUMMY PELVIS VERT Dynamic Seat Sled Test 2025 U718 G-S | T SCS L1 HY REF 053 L B PLR REF Y SM LAT 90 Deg Front Bumper Override 2021 CX727 G-S | 0.5701051768787058 |
  • Loss: CoSENTLoss with these parameters (see the sketch after this block):
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
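
For every two training pairs where one has the higher gold score, CoSENTLoss penalizes the model when their embedding cosine similarities are ordered the other way, via log(1 + Σ exp(scale · (cos_lower − cos_higher))). A minimal construction sketch matching the parameters above:

from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer("distilbert/distilbert-base-uncased")
# scale=20.0 as listed above; pairwise cosine similarity is the default similarity_fct
loss = losses.CoSENTLoss(model, scale=20.0)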
    

Evaluation Dataset

Unnamed Dataset

  • Size: 1,726,581 evaluation samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1                                         | sentence2                                          | score                          |
    |:--------|:--------------------------------------------------|:---------------------------------------------------|:-------------------------------|
    | type    | string                                            | string                                             | float                          |
    | details | min: 22 tokens, mean: 25.0 tokens, max: 30 tokens | min: 16 tokens, mean: 31.04 tokens, max: 53 tokens | min: 0.0, mean: 0.44, max: 1.0 |
  • Samples:
    | sentence1 | sentence2 | score |
    |:----------|:----------|:------|
    | T R F ADAPTIVE TETHER VENT SQUIB VOLT 30 Deg Front Angular Right 20xx GENERIC VOLTS | T L F DUMMY T12 LONG 27 Deg Crabbed Left Side NHTSA 214 MDB to vehicle 2015 P552 G-S | 0.6835618484879796 |
    | T R F ADAPTIVE TETHER VENT SQUIB VOLT 30 Deg Front Angular Right 20xx GENERIC VOLTS | T L F DUMMY R FEMUR LONG 90 Deg Front 2022 U553 G-S | 0.666531064739 |
    | T R F ADAPTIVE TETHER VENT SQUIB VOLT 30 Deg Front Angular Right 20xx GENERIC VOLTS | T R F DUMMY NECK UPPER MZ LOAD 90 Deg Front 2019 P375ICA IN-LBS | 0.46391834212079874 |
  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 3e-05
  • num_train_epochs: 4
  • warmup_ratio: 0.1
  • fp16: True
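
These non-default values map directly onto the Sentence Transformers v3 trainer. A minimal sketch of the corresponding arguments object (the output_dir value is hypothetical):

from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="crash_encoder2-sts",  # hypothetical output directory
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    learning_rate=3e-5,
    num_train_epochs=4,
    warmup_ratio=0.1,
    fp16=True,
)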

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 3e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: linear
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 4
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 0
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: False
  • include_tokens_per_second: False
  • neftune_noise_alpha: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss sts-dev_spearman_cosine
0.0317 1000 6.3069 - -
0.0634 2000 6.1793 - -
0.0950 3000 6.1607 - -
0.1267 4000 6.1512 - -
0.1584 5000 6.1456 - -
0.1901 6000 6.1419 - -
0.2218 7000 6.1398 - -
0.2534 8000 6.1377 - -
0.2851 9000 6.1352 - -
0.3168 10000 6.1338 - -
0.3485 11000 6.1332 - -
0.3801 12000 6.1309 - -
0.4118 13000 6.1315 - -
0.4435 14000 6.1283 - -
0.4752 15000 6.129 - -
0.5069 16000 6.1271 - -
0.5385 17000 6.1265 - -
0.5702 18000 6.1238 - -
0.6019 19000 6.1234 - -
0.6336 20000 6.1225 - -
0.6653 21000 6.1216 - -
0.6969 22000 6.1196 - -
0.7286 23000 6.1198 - -
0.7603 24000 6.1178 - -
0.7920 25000 6.117 - -
0.8236 26000 6.1167 - -
0.8553 27000 6.1165 - -
0.8870 28000 6.1149 - -
0.9187 29000 6.1146 - -
0.9504 30000 6.113 - -
0.9820 31000 6.1143 - -
1.0 31567 - 6.1150 0.4829
1.0137 32000 6.1115 - -
1.0454 33000 6.111 - -
1.0771 34000 6.1091 - -
1.1088 35000 6.1094 - -
1.1404 36000 6.1078 - -
1.1721 37000 6.1095 - -
1.2038 38000 6.106 - -
1.2355 39000 6.1071 - -
1.2671 40000 6.1073 - -
1.2988 41000 6.1064 - -
1.3305 42000 6.1047 - -
1.3622 43000 6.1054 - -
1.3939 44000 6.1048 - -
1.4255 45000 6.1053 - -
1.4572 46000 6.1058 - -
1.4889 47000 6.1037 - -
1.5206 48000 6.1041 - -
1.5523 49000 6.1023 - -
1.5839 50000 6.1018 - -
1.6156 51000 6.104 - -
1.6473 52000 6.1004 - -
1.6790 53000 6.1027 - -
1.7106 54000 6.1017 - -
1.7423 55000 6.1011 - -
1.7740 56000 6.1002 - -
1.8057 57000 6.0994 - -
1.8374 58000 6.0985 - -
1.8690 59000 6.0986 - -
1.9007 60000 6.1006 - -
1.9324 61000 6.0983 - -
1.9641 62000 6.0983 - -
1.9958 63000 6.0973 - -
2.0 63134 - 6.1193 0.4828
2.0274 64000 6.0943 - -
2.0591 65000 6.0941 - -
2.0908 66000 6.0936 - -
2.1225 67000 6.0909 - -
2.1541 68000 6.0925 - -
2.1858 69000 6.0932 - -
2.2175 70000 6.0939 - -
2.2492 71000 6.0919 - -
2.2809 72000 6.0932 - -
2.3125 73000 6.0916 - -
2.3442 74000 6.0919 - -
2.3759 75000 6.0919 - -
2.4076 76000 6.0911 - -
2.4393 77000 6.0924 - -
2.4709 78000 6.0911 - -
2.5026 79000 6.0922 - -
2.5343 80000 6.0926 - -
2.5660 81000 6.0911 - -
2.5976 82000 6.0897 - -
2.6293 83000 6.0922 - -
2.6610 84000 6.0908 - -
2.6927 85000 6.0884 - -
2.7244 86000 6.0907 - -
2.7560 87000 6.0904 - -
2.7877 88000 6.0881 - -
2.8194 89000 6.0902 - -
2.8511 90000 6.088 - -
2.8828 91000 6.0888 - -
2.9144 92000 6.0884 - -
2.9461 93000 6.0881 - -
2.9778 94000 6.0896 - -
3.0 94701 - 6.1225 0.4788
3.0095 95000 6.0857 - -
3.0412 96000 6.0838 - -
3.0728 97000 6.0843 - -
3.1045 98000 6.0865 - -
3.1362 99000 6.0827 - -
3.1679 100000 6.0836 - -
3.1995 101000 6.0837 - -
3.2312 102000 6.0836 - -
3.2629 103000 6.0837 - -
3.2946 104000 6.084 - -
3.3263 105000 6.0836 - -
3.3579 106000 6.0808 - -
3.3896 107000 6.0821 - -
3.4213 108000 6.0817 - -
3.4530 109000 6.082 - -
3.4847 110000 6.083 - -
3.5163 111000 6.0829 - -
3.5480 112000 6.0832 - -
3.5797 113000 6.0829 - -
3.6114 114000 6.0837 - -
3.6430 115000 6.082 - -
3.6747 116000 6.0823 - -
3.7064 117000 6.082 - -
3.7381 118000 6.0833 - -
3.7698 119000 6.0831 - -
3.8014 120000 6.0814 - -
3.8331 121000 6.0813 - -
3.8648 122000 6.0797 - -
3.8965 123000 6.0793 - -
3.9282 124000 6.0818 - -
3.9598 125000 6.0806 - -
3.9915 126000 6.08 - -
4.0 126268 - 6.1266 0.4671

Framework Versions

  • Python: 3.10.6
  • Sentence Transformers: 3.0.0
  • Transformers: 4.35.0
  • PyTorch: 2.1.0a0+4136153
  • Accelerate: 0.30.1
  • Datasets: 2.14.1
  • Tokenizers: 0.14.1
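
To approximate this environment for reproduction, the release versions above can be pinned at install time (the listed PyTorch build 2.1.0a0+4136153 is a pre-release container build; the stock 2.1.0 wheel is the closest public equivalent):

pip install sentence-transformers==3.0.0 transformers==4.35.0 accelerate==0.30.1 datasets==2.14.1 tokenizers==0.14.1 torch==2.1.0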

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CoSENTLoss

@online{kexuefm-8847,
    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
    author={Su Jianlin},
    year={2022},
    month={Jan},
    url={https://kexue.fm/archives/8847},
}