base_model: intfloat/multilingual-e5-large
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:198
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: >-
      Najčešći tipovi uključuju iznad/ispod 2.5, ukupno golova, i klađenje na
      broj golova u poluvremenima.
    sentences:
      - Koji su najčešći tipovi klađenja na golove?
      - Koje kladionice u Srbiji nude DNB opciju?
      - Šta je hendikep klađenje?
  - source_sentence: >-
      Facebook grupe posvećene klađenju omogućavaju korisnicima da dobijaju
      savete i predloge od velikih zajednica korisnika i kladioničara.
    sentences:
      - Šta je limit u klađenju?
      - Kako se koristi Facebook za klađenje?
      - Šta je cash-out opcija u uživo klađenju?
  - source_sentence: >-
      Najčešći tipovi uključuju klađenje na konačan ishod, broj gemova, broj
      setova, i klađenje uživo.
    sentences:
      - Koje su prednosti praćenja utakmica uživo?
      - Koji su najčešći tipovi klađenja na tenis?
      - Šta je e-novčanik?
  - source_sentence: >-
      Premijum provizija je dodatna naknada koju berze kvota mogu naplatiti
      igračima za specifične usluge ili dobitke.
    sentences:
      - Šta je premijum provizija?
      - Koje su strategije za uspešno uživo klađenje?
      - Kako funkcioniše klađenje na ukupan broj poena timova?
  - source_sentence: >-
      'Super Jenki' sistem uključuje pet događaja i 26 pojedinačnih opklada,
      takođe poznat kao kanadski sistem.
    sentences:
      - Šta je 'Super Jenki' sistem klađenja?
      - Šta je procena verovatnoće?
      - Kako klađenje uživo funkcioniše u tenisu?
model-index:
  - name: SentenceTransformer based on intfloat/multilingual-e5-large
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 768
          type: dim_768
        metrics:
          - type: cosine_accuracy@1
            value: 0.8260869565217391
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.9565217391304348
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.8260869565217391
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.31884057971014484
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.20000000000000007
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.10000000000000003
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.8260869565217391
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.9565217391304348
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9271072095125116
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.9021739130434783
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.9021739130434783
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 512
          type: dim_512
        metrics:
          - type: cosine_accuracy@1
            value: 0.8695652173913043
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 1
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.8695652173913043
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.3333333333333332
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.20000000000000007
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.10000000000000003
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.8695652173913043
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 1
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9461678046583877
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.9275362318840579
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.9275362318840579
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 256
          type: dim_256
        metrics:
          - type: cosine_accuracy@1
            value: 0.8260869565217391
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 1
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.8260869565217391
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.3333333333333332
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.20000000000000007
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.10000000000000003
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.8260869565217391
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 1
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9301212722049728
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.9057971014492753
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.9057971014492753
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 128
          type: dim_128
        metrics:
          - type: cosine_accuracy@1
            value: 0.782608695652174
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.9565217391304348
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.782608695652174
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.31884057971014484
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.20000000000000007
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.10000000000000003
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.782608695652174
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.9565217391304348
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9091552965878422
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.8782608695652173
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.8782608695652173
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 64
          type: dim_64
        metrics:
          - type: cosine_accuracy@1
            value: 0.8260869565217391
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.9565217391304348
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.9565217391304348
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.8260869565217391
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.31884057971014484
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19130434782608702
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.10000000000000003
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.8260869565217391
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.9565217391304348
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.9565217391304348
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9164054079968976
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.8894927536231884
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.8894927536231884
            name: Cosine Map@100

SentenceTransformer based on intfloat/multilingual-e5-large

This is a sentence-transformers model fine-tuned from intfloat/multilingual-e5-large on the json dataset. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. It was trained with MatryoshkaLoss and evaluated with embeddings truncated to 768, 512, 256, 128, and 64 dimensions.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: intfloat/multilingual-e5-large
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • json

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
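
The Pooling and Normalize stages listed above can be illustrated with a minimal pure-Python sketch. It assumes mean pooling over non-padding tokens followed by L2 normalization, as the configuration indicates; the function names and the toy 4-dimensional vectors are hypothetical stand-ins (the real model uses 1024-dimensional token embeddings), and this is not the library's implementation.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors, skipping padding positions (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [t / count for t in total]

def l2_normalize(vec, eps=1e-12):
    """Rescale a vector to unit length so dot product equals cosine similarity."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / max(norm, eps) for v in vec]

# Toy input: two real tokens and one padding token (hypothetical values).
tokens = [[1.0, 2.0, 0.0, 0.0], [3.0, 0.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the final vector has unit length, cosine similarity between two such embeddings reduces to a plain dot product.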

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("luka023/proba")
# Run inference
sentences = [
    "'Super Jenki' sistem uključuje pet događaja i 26 pojedinačnih opklada, takođe poznat kao kanadski sistem.",
    "Šta je 'Super Jenki' sistem klađenja?",
    'Kako klađenje uživo funkcioniše u tenisu?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
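
Since the model was trained with MatryoshkaLoss, shorter embeddings can be obtained by keeping only the leading components of each vector. The sketch below shows that arithmetic on hypothetical 8-dimensional vectors standing in for the 1024-dimensional outputs; the helper names are illustrative and not part of the library.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(v * v for v in head)) or 1.0
    return [v / norm for v in head]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings; real ones would come from model.encode(...).
answer = [0.5, 0.4, 0.3, 0.2, 0.1, 0.0, -0.1, -0.2]
question = [0.6, 0.3, 0.3, 0.1, 0.0, 0.1, -0.2, -0.1]

for dim in (8, 4, 2):
    a = truncate_and_normalize(answer, dim)
    q = truncate_and_normalize(question, dim)
    print(dim, round(cosine(a, q), 4))
```

With the library itself, passing `truncate_dim` (for example `truncate_dim=256`) to the `SentenceTransformer` constructor is the usual way to request truncated embeddings; the sketch only illustrates what the truncation does.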

Evaluation

Metrics

Information Retrieval (dataset: dim_768)

Metric Value
cosine_accuracy@1 0.8261
cosine_accuracy@3 0.9565
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.8261
cosine_precision@3 0.3188
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.8261
cosine_recall@3 0.9565
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.9271
cosine_mrr@10 0.9022
cosine_map@100 0.9022
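
The figures in the table above are consistent with an evaluation set of 23 queries that each have one relevant passage (for example, 0.8261 ≈ 19/23), although the card does not state the evaluation set size. Under that assumption, the following sketch computes the same kinds of metrics from the rank of the relevant passage for each query. The rank list is hypothetical: it is one distribution that happens to reproduce the reported values, not the actual evaluation output.

```python
import math

def retrieval_metrics(ranks, k):
    """Metrics at cutoff k when every query has exactly one relevant passage.

    `ranks` holds the 1-based position of the relevant passage for each query.
    """
    n = len(ranks)
    hits = sum(1 for r in ranks if r <= k)
    return {
        # With one relevant passage per query, accuracy@k equals recall@k.
        "accuracy": hits / n,
        # At most one of the k retrieved passages can be relevant.
        "precision": hits / (n * k),
        "mrr": sum(1.0 / r for r in ranks if r <= k) / n,
        # The ideal DCG is 1 when there is a single relevant passage.
        "ndcg": sum(1.0 / math.log2(r + 1) for r in ranks if r <= k) / n,
    }

# Hypothetical ranks: 19 queries answered at rank 1, 3 at rank 2, 1 at rank 4.
ranks = [1] * 19 + [2] * 3 + [4]

for k in (1, 3, 5, 10):
    metrics = retrieval_metrics(ranks, k)
    print(k, {name: round(value, 4) for name, value in metrics.items()})
```

This also explains why precision@k falls as k grows while accuracy@k rises: with a single relevant passage, precision@k is simply accuracy@k divided by k.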

Information Retrieval (dataset: dim_512)

Metric Value
cosine_accuracy@1 0.8696
cosine_accuracy@3 1.0
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.8696
cosine_precision@3 0.3333
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.8696
cosine_recall@3 1.0
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.9462
cosine_mrr@10 0.9275
cosine_map@100 0.9275

Information Retrieval (dataset: dim_256)

Metric Value
cosine_accuracy@1 0.8261
cosine_accuracy@3 1.0
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.8261
cosine_precision@3 0.3333
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.8261
cosine_recall@3 1.0
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.9301
cosine_mrr@10 0.9058
cosine_map@100 0.9058

Information Retrieval (dataset: dim_128)

Metric Value
cosine_accuracy@1 0.7826
cosine_accuracy@3 0.9565
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.7826
cosine_precision@3 0.3188
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.7826
cosine_recall@3 0.9565
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.9092
cosine_mrr@10 0.8783
cosine_map@100 0.8783

Information Retrieval (dataset: dim_64)

Metric Value
cosine_accuracy@1 0.8261
cosine_accuracy@3 0.9565
cosine_accuracy@5 0.9565
cosine_accuracy@10 1.0
cosine_precision@1 0.8261
cosine_precision@3 0.3188
cosine_precision@5 0.1913
cosine_precision@10 0.1
cosine_recall@1 0.8261
cosine_recall@3 0.9565
cosine_recall@5 0.9565
cosine_recall@10 1.0
cosine_ndcg@10 0.9164
cosine_mrr@10 0.8895
cosine_map@100 0.8895

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 198 training samples
  • Columns: positive and anchor
  • Approximate statistics based on the first 198 samples:
    • positive (string): min 19 tokens, mean 33.76 tokens, max 53 tokens
    • anchor (string): min 6 tokens, mean 12.87 tokens, max 21 tokens
  • Samples:
    • positive: Klađenje na ukupan broj poena timova podrazumeva predviđanje da li će jedan tim postići više ili manje poena od postavljene granice, nezavisno od konačnog ishoda.
      anchor: Kako funkcioniše klađenje na ukupan broj poena timova?
    • positive: Konačan ishod podrazumeva klađenje na to ko će pobediti u utakmici, pri čemu postoje tri mogućnosti: pobeda domaćina, pobeda gosta ili nerešeno.
      anchor: Šta znači klađenje na konačan ishod?
    • positive: Patent opklada uključuje tri događaja sa ukupno sedam pojedinačnih opklada: tri singl, tri dubl i jedna trostruka opklada.
      anchor: Šta je patent opklada?
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
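
The parameters above describe a weighted sum of the base loss over truncated embedding prefixes. A minimal pure-Python sketch of that idea follows. It assumes the base loss is an in-batch-negatives cross-entropy over scaled cosine similarities (scale 20 is an assumption here), and the function names and toy 8-dimensional vectors are hypothetical; it is not the library's implementation.

```python
import math

def normalize(vec):
    """Rescale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def mnr_loss(anchors, positives, scale=20.0):
    """In-batch-negatives cross-entropy: anchor i should match positive i."""
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [scale * sum(x * y for x, y in zip(anchor, pos)) for pos in p]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(a)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the base loss computed on each truncated prefix."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        truncated_a = [v[:dim] for v in anchors]
        truncated_p = [v[:dim] for v in positives]
        total += weight * mnr_loss(truncated_a, truncated_p)
    return total

# Toy batch of two (anchor, positive) pairs in 8 dimensions.
anchors = [[1.0, 0.0, 0.0, 0.0, 0.2, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0, 0.0, 0.2, 0.0, 0.0]]
positives = [[0.9, 0.1, 0.0, 0.0, 0.1, 0.0, 0.0, 0.0],
             [0.1, 0.9, 0.0, 0.0, 0.0, 0.1, 0.0, 0.0]]

dims, weights = (8, 4, 2), (1, 1, 1)
aligned = matryoshka_loss(anchors, positives, dims, weights)
swapped = matryoshka_loss(anchors, positives[::-1], dims, weights)
print(aligned, swapped)
```

Correctly paired batches give a much lower loss than mismatched ones at every prefix length, which is what pushes the leading dimensions to carry most of the retrieval signal.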
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 16
  • learning_rate: 2e-05
  • num_train_epochs: 4
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • tf32: False
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • batch_sampler: no_duplicates
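
To make the batch and schedule settings above concrete, here is a small sketch of the arithmetic they imply. It assumes a single training device, a warmup length equal to the ceiling of warmup_ratio times the total number of optimizer steps, and a cosine decay to zero afterwards; `cosine_lr` and `total_steps = 100` are hypothetical and chosen only for illustration.

```python
import math

def cosine_lr(step, total_steps, base_lr=2e-05, warmup_ratio=0.1):
    """Linear warmup followed by cosine decay to zero (illustrative formula)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

per_device_train_batch_size = 32
gradient_accumulation_steps = 16
train_samples = 198  # size of the training dataset reported in this card

# Samples contributing to one optimizer update, assuming one device.
effective_batch = per_device_train_batch_size * gradient_accumulation_steps
# Mini-batches needed to see every training sample once.
batches_per_epoch = math.ceil(train_samples / per_device_train_batch_size)
print(effective_batch, batches_per_epoch)

total_steps = 100  # hypothetical, only to show the shape of the schedule
for step in (0, 5, 10, 55, 100):
    print(step, cosine_lr(step, total_steps))
```

Under these assumptions the effective batch (512) is larger than the 198-sample training set, so each optimizer update aggregates gradients from at least a full pass over the data.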

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 16
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: False
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step dim_128_cosine_map@100 dim_256_cosine_map@100 dim_512_cosine_map@100 dim_64_cosine_map@100 dim_768_cosine_map@100
1.0 1 0.6717 0.7663 0.8229 0.5755 0.8242
2.0 2 0.7779 0.8457 0.8638 0.7833 0.8635
3.0 4 0.8410 0.8732 0.8674 0.8167 0.8659
1.0 1 0.8410 0.8732 0.8674 0.8167 0.8659
2.0 2 0.8845 0.8732 0.9022 0.858 0.9022
3.0 4 0.8783 0.9058 0.9275 0.8895 0.9022
  • The final row (epoch 3.0, step 4) is the saved checkpoint; its values match the evaluation metrics reported above.

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.1.0
  • Transformers: 4.44.2
  • PyTorch: 2.4.0+cu121
  • Accelerate: 0.33.0
  • Datasets: 3.0.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}