RAG_general/rerank/models/intfloat-multilingual-e5-small-ft

This is a sentence-transformers model finetuned from intfloat/multilingual-e5-small. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: intfloat/multilingual-e5-small
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
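
The Pooling and Normalize modules above can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the helper names `mean_pool` and `l2_normalize` are hypothetical, and the toy vectors use 4 dimensions rather than 384. It shows mean pooling over non-padded token vectors followed by L2 normalization.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping padded positions (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            total = [t + v for t, v in zip(total, vec)]
            count += 1
    return [t / max(count, 1) for t in total]

def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / max(norm, eps) for v in vec]

# Toy input: 3 token positions (the last one is padding), 4 dimensions each.
tokens = [[1.0, 2.0, 0.0, 0.0], [3.0, 0.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the final vector has unit length, the dot product of two such embeddings equals their cosine similarity.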

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("rjnClarke/intfloat-multilingual-e5-small-fine-tuned")
# Run inference
sentences = [
    'What is the significance of the tennis balls in the excerpt from the play?',
    "Says that you savour too much of your youth,\n    And bids you be advis'd there's nought in France    That can be with a nimble galliard won;    You cannot revel into dukedoms there.    He therefore sends you, meeter for your spirit,    This tun of treasure; and, in lieu of this,    Desires you let the dukedoms that you claim    Hear no more of you. This the Dauphin speaks.  KING HENRY. What treasure, uncle?  EXETER. Tennis-balls, my liege.  KING HENRY. We are glad the Dauphin is so pleasant with us;    His present and your pains we thank you for.    When we have match'd our rackets to these balls,      We will in France, by God's grace, play a set    Shall strike his father's crown into the hazard.    Tell him he hath made a match with such a wrangler    That all the courts of France will be disturb'd    With chaces. And we understand him well,    How he comes o'er us with our wilder days,    Not measuring what use we made of them.    We never valu'd this poor seat of England;    And therefore, living hence, did give ourself    To barbarous licence; as 'tis ever common    That men are merriest when they are from home.    But tell the Dauphin I will keep my state,    Be like a king, and show my sail of greatness,    When I do rouse me in my throne of France;    For that I have laid by my majesty    And plodded like a man for working-days;    But I will rise there with so full a glory    That I will dazzle all the eyes of France,    Yea, strike the Dauphin blind to look on us.    And tell the pleasant Prince this mock of his      Hath turn'd his balls to gun-stones, and his soul    Shall stand sore charged for the wasteful vengeance\n      That shall fly with them; for many a thousand widows\n",
    "YORK. From Ireland thus comes York to claim his right\n    And pluck the crown from feeble Henry's head:    Ring bells aloud, burn bonfires clear and bright,    To entertain great England's lawful king.    Ah, sancta majestas! who would not buy thee dear?    Let them obey that knows not how to rule;    This hand was made to handle nought but gold.    I cannot give due action to my words    Except a sword or sceptre balance it.\n      A sceptre shall it have, have I a soul\n    On which I'll toss the flower-de-luce of France.\n                         Enter BUCKINGHAM    [Aside] Whom have we here? Buckingham, to disturb me?\n    The King hath sent him, sure: I must dissemble.  BUCKINGHAM. York, if thou meanest well I greet thee well.    YORK. Humphrey of Buckingham, I accept thy greeting.    Art thou a messenger, or come of pleasure?  BUCKINGHAM. A messenger from Henry, our dread liege,    To know the reason of these arms in peace;    Or why thou, being a subject as I am,    Against thy oath and true allegiance sworn,    Should raise so great a power without his leave,    Or dare to bring thy force so near the court.  YORK. [Aside] Scarce can I speak, my choler is so great.    O, I could hew up rocks and fight with flint,    I am so angry at these abject terms;    And now, like Ajax Telamonius,    On sheep or oxen could I spend my fury.    I am far better born than is the King,    More like a king, more kingly in my thoughts;    But I must make fair weather yet awhile,    Till Henry be more weak and I more strong.-    Buckingham, I prithee, pardon me    That I have given no answer all this while;    My mind was troubled with deep melancholy.      The cause why I have brought this army hither    Is to remove proud Somerset from the King,    Seditious to his Grace and to the state.  BUCKINGHAM. That is too much presumption on thy part;    But if thy arms be to no other end,    The King hath yielded unto thy demand:\n      The Duke of Somerset is in the Tower.\n",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
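
For semantic search, the similarity scores are typically used to rank passages against a query. The sketch below assumes the embeddings are already unit-length (as the Normalize module suggests), so a dot product serves as the cosine score. The function `rank_passages` and the 2-dimensional toy vectors are illustrative stand-ins, not part of the library; in practice the vectors would come from `model.encode`.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def rank_passages(query_emb, passage_embs, top_k=3):
    """Return (passage_index, score) pairs sorted by descending score."""
    scored = [(i, dot(query_emb, p)) for i, p in enumerate(passage_embs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy unit-length vectors standing in for real embeddings.
query = [1.0, 0.0]
passages = [[0.0, 1.0], [0.6, 0.8], [1.0, 0.0]]

ranking = rank_passages(query, passages, top_k=2)
print(ranking)
```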

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@3 0.5091
cosine_precision@1 0.3901
cosine_precision@3 0.1697
cosine_precision@5 0.1099
cosine_precision@10 0.0602
cosine_recall@1 0.3901
cosine_recall@3 0.5091
cosine_recall@5 0.5495
cosine_recall@10 0.6017
cosine_ndcg@10 0.494
cosine_mrr@200 0.4654
cosine_map@100 0.4651
dot_accuracy@3 0.5091
dot_precision@1 0.3901
dot_precision@3 0.1697
dot_precision@5 0.1099
dot_precision@10 0.0602
dot_recall@1 0.3901
dot_recall@3 0.5091
dot_recall@5 0.5495
dot_recall@10 0.6017
dot_ndcg@10 0.494
dot_mrr@200 0.4654
dot_map@100 0.4651
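
The dot_* and cosine_* rows are identical, which is consistent with the Normalize module: for unit-length vectors the dot product equals cosine similarity. The precision and recall rows are also related. Assuming each query has exactly one relevant passage (an assumption, though consistent with the anchor/positive pairs and with precision@1 equalling recall@1), precision@k should equal recall@k divided by k. The sketch below checks that arithmetic against the reported values.

```python
# Reported values copied from the table above.
recall = {1: 0.3901, 3: 0.5091, 5: 0.5495, 10: 0.6017}
reported_precision = {1: 0.3901, 3: 0.1697, 5: 0.1099, 10: 0.0602}

# Under the one-relevant-passage assumption: precision@k = recall@k / k.
derived_precision = {k: r / k for k, r in recall.items()}

for k in sorted(recall):
    print(f"k={k}: derived={derived_precision[k]:.4f} "
          f"reported={reported_precision[k]:.4f}")
```

Under the same assumption, accuracy@k coincides with recall@k, which matches the identical accuracy@3 and recall@3 values.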

Training Details

Training Dataset

Unnamed Dataset

  • Size: 10,359 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    • anchor (string): min 10 tokens, mean 25.61 tokens, max 62 tokens
    • positive (string): min 38 tokens, mean 390.39 tokens, max 512 tokens
  • Samples:
    anchor positive
    Who is the general being described in the excerpt? PHILO. Nay, but this dotage of our general's
    O'erflows the measure. Those his goodly eyes, That o'er the files and musters of the war Have glow'd like plated Mars, now bend, now turn, The office and devotion of their view Upon a tawny front. His captain's heart, Which in the scuffles of great fights hath burst
    The buckles on his breast, reneges all temper,
    And is become the bellows and the fan To cool a gipsy's lust.
    Flourish. Enter ANTONY, CLEOPATRA, her LADIES, the train,
    with eunuchs fanning her
    Look where they come!
    Take but good note, and you shall see in him The triple pillar of the world transform'd Into a strumpet's fool. Behold and see. CLEOPATRA. If it be love indeed, tell me how much. ANTONY. There's beggary in the love that can be reckon'd. CLEOPATRA. I'll set a bourn how far to be belov'd. ANTONY. Then must thou needs find out new heaven, new earth.
    Enter a MESSENGER MESSENGER. News, my good lord, from Rome.
    ANTONY. Grates me the sum. CLEOPATRA. Nay, hear them, Antony. Fulvia perchance is angry; or who knows If the scarce-bearded Caesar have not sent His pow'rful mandate to you: 'Do this or this; Take in that kingdom and enfranchise that; Perform't, or else we damn thee.' ANTONY. How, my love? CLEOPATRA. Perchance? Nay, and most like, You must not stay here longer; your dismission Is come from Caesar; therefore hear it, Antony. Where's Fulvia's process? Caesar's I would say? Both? Call in the messengers. As I am Egypt's Queen, Thou blushest, Antony, and that blood of thine Is Caesar's homager. Else so thy cheek pays shame
    When shrill-tongu'd Fulvia scolds. The messengers!
    What is the main conflict highlighted in the excerpt? PHILO. Nay, but this dotage of our general's
    O'erflows the measure. Those his goodly eyes, That o'er the files and musters of the war Have glow'd like plated Mars, now bend, now turn, The office and devotion of their view Upon a tawny front. His captain's heart, Which in the scuffles of great fights hath burst
    The buckles on his breast, reneges all temper,
    And is become the bellows and the fan To cool a gipsy's lust.
    Flourish. Enter ANTONY, CLEOPATRA, her LADIES, the train,
    with eunuchs fanning her
    Look where they come!
    Take but good note, and you shall see in him The triple pillar of the world transform'd Into a strumpet's fool. Behold and see. CLEOPATRA. If it be love indeed, tell me how much. ANTONY. There's beggary in the love that can be reckon'd. CLEOPATRA. I'll set a bourn how far to be belov'd. ANTONY. Then must thou needs find out new heaven, new earth.
    Enter a MESSENGER MESSENGER. News, my good lord, from Rome.
    ANTONY. Grates me the sum. CLEOPATRA. Nay, hear them, Antony. Fulvia perchance is angry; or who knows If the scarce-bearded Caesar have not sent His pow'rful mandate to you: 'Do this or this; Take in that kingdom and enfranchise that; Perform't, or else we damn thee.' ANTONY. How, my love? CLEOPATRA. Perchance? Nay, and most like, You must not stay here longer; your dismission Is come from Caesar; therefore hear it, Antony. Where's Fulvia's process? Caesar's I would say? Both? Call in the messengers. As I am Egypt's Queen, Thou blushest, Antony, and that blood of thine Is Caesar's homager. Else so thy cheek pays shame
    When shrill-tongu'd Fulvia scolds. The messengers!
    The excerpt showcases the tension between Antony's loyalty to Cleopatra and his obligations to Caesar, as well as Cleopatra's influence over him. PHILO. Nay, but this dotage of our general's
    O'erflows the measure. Those his goodly eyes, That o'er the files and musters of the war Have glow'd like plated Mars, now bend, now turn, The office and devotion of their view Upon a tawny front. His captain's heart, Which in the scuffles of great fights hath burst
    The buckles on his breast, reneges all temper,
    And is become the bellows and the fan To cool a gipsy's lust.
    Flourish. Enter ANTONY, CLEOPATRA, her LADIES, the train,
    with eunuchs fanning her
    Look where they come!
    Take but good note, and you shall see in him The triple pillar of the world transform'd Into a strumpet's fool. Behold and see. CLEOPATRA. If it be love indeed, tell me how much. ANTONY. There's beggary in the love that can be reckon'd. CLEOPATRA. I'll set a bourn how far to be belov'd. ANTONY. Then must thou needs find out new heaven, new earth.
    Enter a MESSENGER MESSENGER. News, my good lord, from Rome.
    ANTONY. Grates me the sum. CLEOPATRA. Nay, hear them, Antony. Fulvia perchance is angry; or who knows If the scarce-bearded Caesar have not sent His pow'rful mandate to you: 'Do this or this; Take in that kingdom and enfranchise that; Perform't, or else we damn thee.' ANTONY. How, my love? CLEOPATRA. Perchance? Nay, and most like, You must not stay here longer; your dismission Is come from Caesar; therefore hear it, Antony. Where's Fulvia's process? Caesar's I would say? Both? Call in the messengers. As I am Egypt's Queen, Thou blushest, Antony, and that blood of thine Is Caesar's homager. Else so thy cheek pays shame
    When shrill-tongu'd Fulvia scolds. The messengers!
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
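
To make the loss parameters concrete, here is a rough pure-Python sketch of the idea behind MultipleNegativesRankingLoss: for each anchor, its own positive is the target and every other positive in the batch acts as a negative, with cosine similarities multiplied by `scale` before a softmax cross-entropy. This is an illustration under those assumptions, not the library's code; the function names and toy vectors are hypothetical.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch of two pairs in 2 dimensions.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each positive matches its anchor
swapped = [[0.0, 1.0], [1.0, 0.0]]   # positives paired with the wrong anchor

loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
print(loss_aligned, loss_swapped)
```

A larger `scale` sharpens the softmax, so well-separated pairs drive the loss toward zero faster. The `no_duplicates` batch sampler listed under the hyperparameters is relevant here, since a duplicate text in the batch would otherwise be treated as a negative.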
    

Evaluation Dataset

Unnamed Dataset

  • Size: 2,302 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    • anchor (string): min 11 tokens, mean 25.55 tokens, max 77 tokens
    • positive (string): min 17 tokens, mean 395.63 tokens, max 512 tokens
  • Samples:
    anchor positive
    The excerpt highlights the tension between Antony's loyalty to Cleopatra and his standing in Rome, showcasing the intricate balance of power and love in the play. When shrill-tongu'd Fulvia scolds. The messengers!
    ANTONY. Let Rome in Tiber melt, and the wide arch Of the rang'd empire fall! Here is my space. Kingdoms are clay; our dungy earth alike Feeds beast as man. The nobleness of life Is to do thus [emhracing], when such a mutual pair And such a twain can do't, in which I bind, On pain of punishment, the world to weet We stand up peerless. CLEOPATRA. Excellent falsehood! Why did he marry Fulvia, and not love her? I'll seem the fool I am not. Antony Will be himself. ANTONY. But stirr'd by Cleopatra. Now for the love of Love and her soft hours, Let's not confound the time with conference harsh; There's not a minute of our lives should stretch Without some pleasure now. What sport to-night? CLEOPATRA. Hear the ambassadors. ANTONY. Fie, wrangling queen! Whom everything becomes- to chide, to laugh, To weep; whose every passion fully strives To make itself in thee fair and admir'd. No messenger but thine, and all alone To-night we'll wander through the streets and note The qualities of people. Come, my queen; Last night you did desire it. Speak not to us. Exeunt ANTONY and CLEOPATRA, with the train DEMETRIUS. Is Caesar with Antonius priz'd so slight? PHILO. Sir, sometimes when he is not Antony, He comes too short of that great property Which still should go with Antony. DEMETRIUS. I am full sorry That he approves the common liar, who Thus speaks of him at Rome; but I will hope
    Of better deeds to-morrow. Rest you happy! Exeunt
    What is the significance of the soothsayer in the context of the play? CHARMIAN. Lord Alexas, sweet Alexas, most anything Alexas, almost
    most absolute Alexas, where's the soothsayer that you prais'd so to th' Queen? O that I knew this husband, which you say must charge his horns with garlands! ALEXAS. Soothsayer! SOOTHSAYER. Your will? CHARMIAN. Is this the man? Is't you, sir, that know things? SOOTHSAYER. In nature's infinite book of secrecy A little I can read. ALEXAS. Show him your hand.
    Enter ENOBARBUS ENOBARBUS. Bring in the banquet quickly; wine enough
    Cleopatra's health to drink. CHARMIAN. Good, sir, give me good fortune. SOOTHSAYER. I make not, but foresee. CHARMIAN. Pray, then, foresee me one. SOOTHSAYER. You shall be yet far fairer than you are. CHARMIAN. He means in flesh. IRAS. No, you shall paint when you are old. CHARMIAN. Wrinkles forbid! ALEXAS. Vex not his prescience; be attentive. CHARMIAN. Hush!
    SOOTHSAYER. You shall be more beloving than beloved.
    What is the setting of the scene in which the excerpt takes place? sweet Isis, I beseech thee! And let her die too, and give him a
    worse! And let worse follow worse, till the worst of all follow him laughing to his grave, fiftyfold a cuckold! Good Isis, hear me this prayer, though thou deny me a matter of more weight; good Isis, I beseech thee! IRAS. Amen. Dear goddess, hear that prayer of the people! For, as it is a heartbreaking to see a handsome man loose-wiv'd, so it is a deadly sorrow to behold a foul knave uncuckolded. Therefore, dear Isis, keep decorum, and fortune him accordingly! CHARMIAN. Amen. ALEXAS. Lo now, if it lay in their hands to make me a cuckold, they would make themselves whores but they'ld do't!
    Enter CLEOPATRA ENOBARBUS. Hush! Here comes Antony.
    CHARMIAN. Not he; the Queen. CLEOPATRA. Saw you my lord? ENOBARBUS. No, lady. CLEOPATRA. Was he not here? CHARMIAN. No, madam. CLEOPATRA. He was dispos'd to mirth; but on the sudden A Roman thought hath struck him. Enobarbus! ENOBARBUS. Madam? CLEOPATRA. Seek him, and bring him hither. Where's Alexas? ALEXAS. Here, at your service. My lord approaches.
    Enter ANTONY, with a MESSENGER and attendants CLEOPATRA. We will not look upon him. Go with us.
    Exeunt CLEOPATRA, ENOBARBUS, and the rest MESSENGER. Fulvia thy wife first came into the field. ANTONY. Against my brother Lucius? MESSENGER. Ay.
    But soon that war had end, and the time's state
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • gradient_accumulation_steps: 2
  • num_train_epochs: 7
  • warmup_steps: 50
  • fp16: True
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates
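
These settings determine the number of optimizer steps. The arithmetic below is a sketch that assumes training on a single device (the card does not state the device count); under that assumption it reproduces the 162 steps per epoch and 1134 total steps shown in the training logs.

```python
import math

train_samples = 10_359            # size of the training dataset
per_device_train_batch_size = 32
gradient_accumulation_steps = 2
num_train_epochs = 7

# Assumption: one device, so the effective batch is batch size x accumulation.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps
steps_per_epoch = math.ceil(train_samples / effective_batch_size)
total_steps = steps_per_epoch * num_train_epochs

print(effective_batch_size, steps_per_epoch, total_steps)
```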

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 2
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 7
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 50
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
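
The learning-rate settings above (learning_rate 5e-05, lr_scheduler_type linear, warmup_steps 50) describe a linear warmup followed by a linear decay to zero. The function below is a minimal sketch of that shape, not the trainer's own code; `linear_lr` is a hypothetical helper, and the total of 1134 steps is taken from the training logs.

```python
def linear_lr(step, base_lr=5e-5, warmup_steps=50, total_steps=1134):
    """Learning rate at a given optimizer step: linear warmup, then linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

for step in (0, 25, 50, 592, 1134):
    print(step, linear_lr(step))
```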

Training Logs

Epoch Step Training Loss loss multi-dev_cosine_map@100
1.0 162 - 1.7998 0.4106
2.0 324 - 1.6831 0.4286
3.0 486 - 1.6670 0.4343
3.0864 500 1.7796 - -
4.0 648 - 1.6174 0.4501
5.0 810 - 1.5971 0.4559
6.0 972 - 1.5842 0.4620
6.1728 1000 1.0289 - -
7.0 1134 - 1.5726 0.4651
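  • The saved checkpoint is the final row (epoch 7.0, step 1134): it has the lowest evaluation loss, and its cosine_map@100 of 0.4651 matches the value reported under Evaluation.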

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.43.4
  • PyTorch: 2.3.1+cu121
  • Accelerate: 0.32.1
  • Datasets: 2.19.1
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply}, 
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model size: 118M params (Safetensors, tensor type F32)