SentenceTransformer based on sentence-transformers/multi-qa-mpnet-base-dot-v1

This is a sentence-transformers model finetuned from sentence-transformers/multi-qa-mpnet-base-dot-v1. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
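
The Pooling module above has only pooling_mode_cls_token enabled, so the sentence embedding is the hidden state of the first token rather than an average over tokens. A minimal sketch of that idea in plain Python follows. The helpers cls_pool and mean_pool are hypothetical names, and the toy vectors have 3 dimensions instead of the model's 768.

```python
from typing import List

Vector = List[float]


def cls_pool(token_embeddings: List[Vector]) -> Vector:
    """Return the first token's vector as the sentence embedding (CLS pooling)."""
    if not token_embeddings:
        raise ValueError("need at least one token embedding")
    return list(token_embeddings[0])


def mean_pool(token_embeddings: List[Vector]) -> Vector:
    """Average all token vectors. Shown only for contrast: it is disabled here."""
    n = len(token_embeddings)
    return [sum(column) / n for column in zip(*token_embeddings)]


# Toy stand-in for the Transformer output of one sentence: 3 tokens, 3 dimensions.
tokens = [[1.0, 0.0, 2.0], [3.0, 4.0, 0.0], [2.0, 2.0, 1.0]]
sentence_embedding = cls_pool(tokens)
print(sentence_embedding)
```

With CLS pooling the other token vectors influence the result only through the transformer's attention layers, not through the pooling step itself.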

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("antonkirk/retrieval-mpnet-dot-finetuned-llama3-synthetic-dataset")
# Run inference
sentences = [
    'ear malformations, nipple abnormalities, dental anomalies',
    'A number sign (#) is used with this entry because scalp-ear-nipple syndrome (SENS) is caused by heterozygous mutation in the KCTD1 gene (613420) on chromosome 18q11.\n\nDescription\n\nScalp-ear-nipple syndrome is characterized by aplasia cutis congenita of the scalp, breast anomalies that range from hypothelia or athelia to amastia, and minor anomalies of the external ears. Less frequent clinical characteristics include nail dystrophy, dental anomalies, cutaneous syndactyly of the digits, and renal malformations. Penetrance appears to be high, although there is substantial variable expressivity within families (Marneros et al., 2013).\n\nClinical Features',
    'This article is an orphan, as no other articles link to it. Please introduce links to this page from related articles; try the Find link tool for suggestions. (July 2016)  \n  \nInguinal lymphadenopathy  \nInguinal lymphadenopathy  \n  \nInguinal lymphadenopathy causes swollen lymph nodes in the groin area. It can be a symptom of infective or neoplastic processes. Infective aetiologies include Tuberculosis, HIV, non-specific or reactive lymphadenopathy to recent lower limb infection or groin infections. Another notable infectious cause is Lymphogranuloma venereum, which is a sexually transmitted infection of the lymphatic system. Neoplastic aetiologies include lymphoma, leukaemia and metastatic disease from primary tumours in the lower limb, external genitalia or perianal region and melanoma.\n\n## References[edit]\n\n  * Ferrer R (October 1998). "Lymphadenopathy: differential diagnosis and evaluation". Am Fam Physician. 58 (6): 1313–20. PMID 9803196.\n\n## Further reading[edit]',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
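
The training loss listed in this card scores pairs with dot_score, so a retrieval setup would plausibly rank chunks by the dot product between the query embedding and each chunk embedding. The sketch below illustrates that ranking step only. rank_by_dot is a hypothetical helper, and the toy 3-dimensional vectors stand in for the output of model.encode.

```python
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_by_dot(
    query_vec: Sequence[float], chunk_vecs: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (chunk_index, score) pairs sorted from highest to lowest score."""
    scored = [(i, dot(query_vec, c)) for i, c in enumerate(chunk_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (assumption: real ones would come from model.encode).
query = [1.0, 2.0, 0.0]
chunks = [[0.5, 0.5, 1.0], [2.0, 1.0, 0.0], [0.0, 3.0, 1.0]]
ranking = rank_by_dot(query, chunks)
print(ranking)
# [(2, 6.0), (1, 4.0), (0, 1.5)]
```

Unlike cosine similarity, the dot product is sensitive to vector length, so embeddings should not be normalized before scoring if the goal is to mirror the training objective.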

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.1807
cosine_accuracy@3 0.5427
cosine_accuracy@5 0.7381
cosine_accuracy@10 0.8161
cosine_precision@1 0.1807
cosine_precision@3 0.1809
cosine_precision@5 0.1476
cosine_precision@10 0.0816
cosine_recall@1 0.1807
cosine_recall@3 0.5427
cosine_recall@5 0.7381
cosine_recall@10 0.8161
cosine_ndcg@10 0.4947
cosine_mrr@10 0.3907
cosine_map@100 0.3953
dot_accuracy@1 0.1827
dot_accuracy@3 0.5413
dot_accuracy@5 0.743
dot_accuracy@10 0.8167
dot_precision@1 0.1827
dot_precision@3 0.1804
dot_precision@5 0.1486
dot_precision@10 0.0817
dot_recall@1 0.1827
dot_recall@3 0.5413
dot_recall@5 0.743
dot_recall@10 0.8167
dot_ndcg@10 0.4957
dot_mrr@10 0.3918
dot_map@100 0.3963
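
In the table above, accuracy@k equals recall@k, and precision@k is close to accuracy@k divided by k (for example 0.5427 / 3 ≈ 0.1809). That pattern is consistent with each query having exactly one relevant chunk, although the card does not state this. Under that assumption, the metrics can be sketched as follows. ir_metrics_at_k and the toy rankings are illustrative, not the evaluator used to produce the numbers above.

```python
from typing import Dict, List


def ir_metrics_at_k(
    ranked: Dict[str, List[str]], relevant: Dict[str, str], k: int
) -> Dict[str, float]:
    """Accuracy, precision, recall and MRR at k, assuming one relevant doc per query."""
    hits = 0
    reciprocal_rank_sum = 0.0
    for query_id, docs in ranked.items():
        top_k = docs[:k]
        target = relevant[query_id]
        if target in top_k:
            hits += 1
            reciprocal_rank_sum += 1.0 / (top_k.index(target) + 1)
    n = len(ranked)
    accuracy = hits / n
    return {
        "accuracy": accuracy,       # share of queries with the target in the top k
        "precision": accuracy / k,  # at most one of the k results can be relevant
        "recall": accuracy,         # one relevant doc, so recall equals accuracy
        "mrr": reciprocal_rank_sum / n,
    }


# Four toy queries whose single relevant document is always "a".
ranked = {
    "q1": ["a", "b", "c"],
    "q2": ["b", "a", "c"],
    "q3": ["c", "b", "a"],
    "q4": ["b", "c", "a"],
}
relevant = {"q1": "a", "q2": "a", "q3": "a", "q4": "a"}
metrics = ir_metrics_at_k(ranked, relevant, k=2)
print(metrics)
# {'accuracy': 0.5, 'precision': 0.25, 'recall': 0.5, 'mrr': 0.375}
```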

Training Details

Training Dataset

Unnamed Dataset

  • Size: 98,928 training samples
  • Columns: queries and chunks
  • Approximate statistics based on the first 1000 samples:
    column   type    min       mean           max
    queries  string  7 tokens  17.4 tokens    76 tokens
    chunks   string  5 tokens  159.93 tokens  334 tokens
  • Samples:
    query: fever, malaise, headaches, lymphadenopathy
    chunk: A rare, acquired, self-limiting, infectious disease due to the mite-borne bacteria Rickettsia akari characterized by an asymptomatic, 0.5 to 2 cm in diameter papulovesicle that typically ulcerates and forms an eschar, followed by a generalized papulovesicular rash associating variable constitutional symptoms, such as localized lymphadenopathy, fever, malaise, and headaches. Additonal symptoms may include diaphoresis, myalgia and, less frequently, rhinorrhea, pharyngitis, nausea, vomiting, splenomegaly, conjunctival hyperemia, and abdominal pain. Systemic symtoms resolve within 6-10 days.

    query: rash, papulovesicular, generalized, constitutional symptoms
    chunk: A rare, acquired, self-limiting, infectious disease due to the mite-borne bacteria Rickettsia akari characterized by an asymptomatic, 0.5 to 2 cm in diameter papulovesicle that typically ulcerates and forms an eschar, followed by a generalized papulovesicular rash associating variable constitutional symptoms, such as localized lymphadenopathy, fever, malaise, and headaches. Additonal symptoms may include diaphoresis, myalgia and, less frequently, rhinorrhea, pharyngitis, nausea, vomiting, splenomegaly, conjunctival hyperemia, and abdominal pain. Systemic symtoms resolve within 6-10 days.

    query: myalgia, diaphoresis, nausea, vomiting
    chunk: A rare, acquired, self-limiting, infectious disease due to the mite-borne bacteria Rickettsia akari characterized by an asymptomatic, 0.5 to 2 cm in diameter papulovesicle that typically ulcerates and forms an eschar, followed by a generalized papulovesicular rash associating variable constitutional symptoms, such as localized lymphadenopathy, fever, malaise, and headaches. Additonal symptoms may include diaphoresis, myalgia and, less frequently, rhinorrhea, pharyngitis, nausea, vomiting, splenomegaly, conjunctival hyperemia, and abdominal pain. Systemic symtoms resolve within 6-10 days.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 1,
        "similarity_fct": "dot_score"
    }
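
MultipleNegativesRankingLoss treats each (query, chunk) pair in a batch as a positive and every other chunk in the same batch as a negative, then applies cross-entropy over the scaled similarity scores. The following is a simplified pure-Python sketch of that objective with the parameters listed above (scale 1, dot-product similarity). mnr_loss is a hypothetical helper operating on toy vectors, not the library implementation.

```python
import math
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product, standing in for dot_score."""
    return sum(x * y for x, y in zip(a, b))


def mnr_loss(
    queries: Sequence[Sequence[float]],
    chunks: Sequence[Sequence[float]],
    scale: float = 1.0,
) -> float:
    """Mean cross-entropy where chunks[i] is the positive for queries[i]
    and all other chunks in the batch act as negatives."""
    total = 0.0
    for i, q in enumerate(queries):
        logits = [scale * dot(q, c) for c in chunks]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
        total += log_sum_exp - logits[i]
    return total / len(queries)


# Toy batch of two pairs: aligned pairs should score a lower loss than swapped ones.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(queries, [[1.0, 0.0], [0.0, 1.0]])
swapped = mnr_loss(queries, [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

Because other chunks in the batch serve as negatives, a batch that contains the same chunk twice would penalize a correct match, which is presumably why the hyperparameters in this card use the no_duplicates batch sampler.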
    

Evaluation Dataset

Unnamed Dataset

  • Size: 9,308 evaluation samples
  • Columns: queries and chunks
  • Approximate statistics based on the first 1000 samples:
    column   type    min       mean           max
    queries  string  7 tokens  17.8 tokens    48 tokens
    chunks   string  4 tokens  166.19 tokens  299 tokens
  • Samples:
    query: facial features, overgrowth, learning disabilities, delayed development
    chunk: Sotos syndrome is a condition characterized mainly by distinctive facial features; overgrowth in childhood; and learning disabilities or delayed development. Facial features may include a long, narrow face; a high forehead; flushed (reddened) cheeks; a small, pointed chin; and down-slanting palpebral fissures. Affected infants and children tend to grow quickly; they are significantly taller than their siblings and peers and have a large head. Other signs and symptoms may include intellectual disability; behavioral problems; problems with speech and language; and/or weak muscle tone (hypotonia). Sotos syndrome is usually caused by a mutation in the NSD1 gene and is inherited in an autosomal dominant manner. About 95% of cases are due to a new mutation in the affected person and occur sporadically (are not inherited).

    query: long face, high forehead, flushed cheeks, small chin, down-slanting palpebral fissures
    chunk: Sotos syndrome is a condition characterized mainly by distinctive facial features; overgrowth in childhood; and learning disabilities or delayed development. Facial features may include a long, narrow face; a high forehead; flushed (reddened) cheeks; a small, pointed chin; and down-slanting palpebral fissures. Affected infants and children tend to grow quickly; they are significantly taller than their siblings and peers and have a large head. Other signs and symptoms may include intellectual disability; behavioral problems; problems with speech and language; and/or weak muscle tone (hypotonia). Sotos syndrome is usually caused by a mutation in the NSD1 gene and is inherited in an autosomal dominant manner. About 95% of cases are due to a new mutation in the affected person and occur sporadically (are not inherited).

    query: intellectual disability, behavioral problems, speech and language difficulties, hypotonia
    chunk: Sotos syndrome is a condition characterized mainly by distinctive facial features; overgrowth in childhood; and learning disabilities or delayed development. Facial features may include a long, narrow face; a high forehead; flushed (reddened) cheeks; a small, pointed chin; and down-slanting palpebral fissures. Affected infants and children tend to grow quickly; they are significantly taller than their siblings and peers and have a large head. Other signs and symptoms may include intellectual disability; behavioral problems; problems with speech and language; and/or weak muscle tone (hypotonia). Sotos syndrome is usually caused by a mutation in the NSD1 gene and is inherited in an autosomal dominant manner. About 95% of cases are due to a new mutation in the affected person and occur sporadically (are not inherited).
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 1,
        "similarity_fct": "dot_score"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 2e-05
  • num_train_epochs: 25
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
  • eval_on_start: True
  • batch_sampler: no_duplicates
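
To make the interaction of learning_rate and warmup_ratio concrete, here is a rough sketch of a linear warmup followed by linear decay, which is what a linear scheduler with warmup typically does. linear_warmup_decay_lr is a hypothetical helper, not the Trainer's code. The total of 19300 steps is taken from the last row of the training logs, and warmup_steps=1930 is an assumption (10% of that total).

```python
def linear_warmup_decay_lr(
    step: int,
    base_lr: float = 2e-5,
    total_steps: int = 19300,   # last step shown in the training logs
    warmup_steps: int = 1930,   # assumption: 10% of total_steps
) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


for step in (0, 965, 1930, 10615, 19300):
    print(step, linear_warmup_decay_lr(step))
```

Under these assumptions the peak rate of 2e-05 is reached at step 1930, roughly 2.5 epochs into the 25-epoch run.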

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 25
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: True
  • eval_use_gather_object: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss dot_map@100
0 0 - 1.8701 0.2095
0.1295 100 1.5494 - -
0.2591 200 0.9993 - -
0.3886 300 0.7225 - -
0.5181 400 0.6533 - -
0.6477 500 0.6618 0.5939 0.3722
0.7772 600 0.6454 - -
0.9067 700 0.5568 - -
1.0363 800 0.5435 - -
1.1658 900 0.499 - -
1.2953 1000 0.5386 0.4768 0.3842
1.4249 1100 0.5077 - -
1.5544 1200 0.4929 - -
1.6839 1300 0.5194 - -
1.8135 1400 0.5157 - -
1.9430 1500 0.4337 0.4455 0.3894
2.0725 1600 0.4373 - -
2.2021 1700 0.4569 - -
2.3316 1800 0.4084 - -
2.4611 1900 0.42 - -
2.5907 2000 0.4112 0.4578 0.3886
2.7202 2100 0.4498 - -
2.8497 2200 0.415 - -
2.9793 2300 0.3734 - -
3.1088 2400 0.3359 - -
3.2383 2500 0.3923 0.4339 0.3929
3.3679 2600 0.3345 - -
3.4974 2700 0.3324 - -
3.6269 2800 0.3574 - -
3.7565 2900 0.4078 - -
3.8860 3000 0.3221 0.4293 0.3904
4.0155 3100 0.2895 - -
4.1451 3200 0.2821 - -
4.2746 3300 0.3192 - -
4.4041 3400 0.28 - -
4.5337 3500 0.2716 0.4486 0.3885
4.6632 3600 0.3147 - -
4.7927 3700 0.3565 - -
4.9223 3800 0.2465 - -
5.0518 3900 0.2436 - -
5.1813 4000 0.2297 0.4486 0.3917
5.3109 4100 0.2538 - -
5.4404 4200 0.2448 - -
5.5699 4300 0.2433 - -
5.6995 4400 0.3017 - -
5.8290 4500 0.2958 0.4737 0.3934
5.9585 4600 0.2142 - -
6.0881 4700 0.1939 - -
6.2176 4800 0.2449 - -
6.3472 4900 0.2026 - -
6.4767 5000 0.2006 0.4901 0.3895
6.6062 5100 0.2118 - -
6.7358 5200 0.3064 - -
6.8653 5300 0.2276 - -
6.9948 5400 0.1809 - -
7.1244 5500 0.1782 0.4992 0.3915
7.2539 5600 0.2211 - -
7.3834 5700 0.1728 - -
7.5130 5800 0.1651 - -
7.6425 5900 0.2158 - -
7.7720 6000 0.2864 0.5113 0.3892
7.9016 6100 0.179 - -
8.0311 6200 0.1677 - -
8.1606 6300 0.1517 - -
8.2902 6400 0.1851 - -
8.4197 6500 0.1646 0.5030 0.3933
8.5492 6600 0.1608 - -
8.6788 6700 0.217 - -
8.8083 6800 0.2357 - -
8.9378 6900 0.1404 - -
9.0674 7000 0.1465 0.5153 0.3877
9.1969 7100 0.1791 - -
9.3264 7200 0.1261 - -
9.4560 7300 0.1406 - -
9.5855 7400 0.1626 - -
9.7150 7500 0.223 0.5326 0.3939
9.8446 7600 0.1806 - -
9.9741 7700 0.1289 - -
10.1036 7800 0.1269 - -
10.2332 7900 0.1609 - -
10.3627 8000 0.1279 0.5113 0.3933
10.4922 8100 0.1264 - -
10.6218 8200 0.1453 - -
10.7513 8300 0.2227 - -
10.8808 8400 0.1314 - -
11.0104 8500 0.1192 0.5444 0.3925
11.1399 8600 0.1164 - -
11.2694 8700 0.1418 - -
11.3990 8800 0.1202 - -
11.5285 8900 0.1152 - -
11.658 9000 0.1454 0.529 0.3963
11.7876 9100 0.1952 - -
11.9171 9200 0.1079 - -
12.0466 9300 0.1139 - -
12.1762 9400 0.1067 - -
12.3057 9500 0.1219 0.5257 0.3938
12.4352 9600 0.119 - -
12.5648 9700 0.1195 - -
12.6943 9800 0.158 - -
12.8238 9900 0.156 - -
12.9534 10000 0.0974 0.5434 0.3934
13.0829 10100 0.0928 - -
13.2124 10200 0.1266 - -
13.3420 10300 0.0964 - -
13.4715 10400 0.1007 - -
13.6010 10500 0.112 0.5789 0.3893
13.7306 10600 0.1699 - -
13.8601 10700 0.1084 - -
13.9896 10800 0.0967 - -
14.1192 10900 0.0856 - -
14.2487 11000 0.1142 0.5252 0.3933
14.3782 11100 0.0891 - -
14.5078 11200 0.0911 - -
14.6373 11300 0.1128 - -
14.7668 11400 0.1686 - -
14.8964 11500 0.0874 0.5874 0.3945
15.0259 11600 0.0909 - -
15.1554 11700 0.0778 - -
15.2850 11800 0.1055 - -
15.4145 11900 0.0872 - -
15.5440 12000 0.0884 0.5894 0.3934
15.6736 12100 0.1101 - -
15.8031 12200 0.1354 - -
15.9326 12300 0.0762 - -
16.0622 12400 0.0782 - -
16.1917 12500 0.0936 0.5589 0.3919
16.3212 12600 0.072 - -
16.4508 12700 0.0806 - -
16.5803 12800 0.0929 - -
16.7098 12900 0.1215 - -
16.8394 13000 0.1039 0.6025 0.3926
16.9689 13100 0.0738 - -
17.0984 13200 0.0651 - -
17.2280 13300 0.0943 - -
17.3575 13400 0.0678 - -
17.4870 13500 0.077 0.6002 0.3941
17.6166 13600 0.0839 - -
17.7461 13700 0.1268 - -
17.8756 13800 0.0764 - -
18.0052 13900 0.0686 - -
18.1347 14000 0.0697 0.5898 0.3913
18.2642 14100 0.0871 - -
18.3938 14200 0.0699 - -
18.5233 14300 0.0611 - -
18.6528 14400 0.0872 - -
18.7824 14500 0.1281 0.6087 0.3927
18.9119 14600 0.0583 - -
19.0415 14700 0.0658 - -
19.1710 14800 0.0595 - -
19.3005 14900 0.0816 - -
19.4301 15000 0.0699 0.6078 0.3965
19.5596 15100 0.0729 - -
19.6891 15200 0.0908 - -
19.8187 15300 0.0978 - -
19.9482 15400 0.0585 - -
20.0777 15500 0.0557 0.5861 0.3925
20.2073 15600 0.0787 - -
20.3368 15700 0.061 - -
20.4663 15800 0.0638 - -
20.5959 15900 0.0656 - -
20.7254 16000 0.1003 0.6032 0.3923
20.8549 16100 0.0718 - -
20.9845 16200 0.0625 - -
21.1140 16300 0.0532 - -
21.2435 16400 0.0739 - -
21.3731 16500 0.0552 0.6080 0.3942
21.5026 16600 0.0588 - -
21.6321 16700 0.0716 - -
21.7617 16800 0.1078 - -
21.8912 16900 0.0559 - -
22.0207 17000 0.0596 0.6044 0.3922
22.1503 17100 0.0512 - -
22.2798 17200 0.0716 - -
22.4093 17300 0.0574 - -
22.5389 17400 0.058 - -
22.6684 17500 0.07 0.6117 0.3942
22.7979 17600 0.0965 - -
22.9275 17700 0.0507 - -
23.0570 17800 0.0498 - -
23.1865 17900 0.0524 - -
23.3161 18000 0.0656 0.5936 0.3936
23.4456 18100 0.057 - -
23.5751 18200 0.0619 - -
23.7047 18300 0.0785 - -
23.8342 18400 0.0729 - -
23.9637 18500 0.0541 0.6174 0.3979
24.0933 18600 0.0456 - -
24.2228 18700 0.0696 - -
24.3523 18800 0.048 - -
24.4819 18900 0.0547 - -
24.6114 19000 0.0553 0.6146 0.3962
24.7409 19100 0.0936 - -
24.8705 19200 0.0579 - -
25.0 19300 0.0498 0.5290 0.3963
  • The bold row denotes the saved checkpoint.
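
The log interleaves training-loss rows (every 100 steps) with evaluation rows (every 500 steps), using "-" for values that were not measured. The sketch below parses rows in this layout and compares evaluation points. parse_row and LogRow are illustrative names, and the excerpt holds only a few rows copied from the table, so the results describe the excerpt and say nothing definite about how the checkpoint was selected during training.

```python
from typing import List, NamedTuple, Optional


class LogRow(NamedTuple):
    epoch: float
    step: int
    train_loss: Optional[float]
    eval_loss: Optional[float]
    dot_map_100: Optional[float]


def _maybe_float(cell: str) -> Optional[float]:
    """Turn a table cell into a float, treating '-' as missing."""
    return None if cell == "-" else float(cell)


def parse_row(line: str) -> LogRow:
    epoch, step, train, eval_loss, dot_map = line.split()
    return LogRow(
        float(epoch), int(step),
        _maybe_float(train), _maybe_float(eval_loss), _maybe_float(dot_map),
    )


# A handful of rows copied from the table above (not the full log).
excerpt = """\
0 0 - 1.8701 0.2095
0.6477 500 0.6618 0.5939 0.3722
3.7565 2900 0.4078 - -
3.8860 3000 0.3221 0.4293 0.3904
11.658 9000 0.1454 0.529 0.3963
23.9637 18500 0.0541 0.6174 0.3979
25.0 19300 0.0498 0.5290 0.3963"""

rows: List[LogRow] = [parse_row(line) for line in excerpt.splitlines()]
evaluated = [r for r in rows if r.eval_loss is not None]

lowest_eval_loss = min(evaluated, key=lambda r: r.eval_loss)
highest_map = max(evaluated, key=lambda r: r.dot_map_100)
print(lowest_eval_loss.step, highest_map.step)
```

In this excerpt the training loss keeps falling while the evaluation loss rises after step 3000 and dot_map@100 stays near 0.39, a pattern that usually suggests the later epochs add little retrieval quality.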

Framework Versions

  • Python: 3.11.9
  • Sentence Transformers: 3.0.1
  • Transformers: 4.43.3
  • PyTorch: 2.3.1+cu121
  • Accelerate: 0.30.1
  • Datasets: 2.19.2
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply}, 
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model size: 109M params (Safetensors, tensor type F32)

Model tree for antonkirk/retrieval-mpnet-dot-finetuned-llama3-synthetic-dataset
