metadata
base_model: sentence-transformers/all-MiniLM-L6-v2
datasets:
- momo22/eng2nep
language:
- en
- ne
library_name: sentence-transformers
metrics:
- negative_mse
- src2trg_accuracy
- trg2src_accuracy
- mean_accuracy
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:1000
- loss:MSELoss
- dataset_size:5000
- dataset_size:8000
widget:
- source_sentence: |
The aggressive semi-employed religion workshop of Razzak, (EFP).
sentences:
- |
मा ग्रिटर भेट्टाउन सकेन वा GDM प्रयोगकर्ताले कार्यान्वयन गर्न सकेन
- |
रज्जाकको आक्रामक अर्द्धशतक धर्मशाला, (एएफपी)।
- |
त्यसैले मेरो विजयपछि म त्यस्तो अवस्था आउन दिनेछैन।
- source_sentence: >
The authority is being a constitutional body, it was also empowered by
passing the bill from Parliament.
sentences:
- >
अख्तियार संवैधानिक निकाय त हुँदै हो, त्यसमा पनि संसदबाटै विधेयक पास गरेर
अख्तियारलाई अधिकारसम्पन्न पनि गराइएको थियो।
- >
म यहूदाका राजा सिदकियाहलाई र उसका मानिसहरूलाई तिनीहरूका शत्रुहरूकहाँ
सुम्पिनेछु जसले तिनीहरूलाई मार्न चाहन्छन्। ती सेनाहरू यरूशलेमबाट गइसकेका
भएता पनि म तिनीहरूलाई बाबेलका राजाको सेनाहरूकहाँ सुम्पिनेछु।
- |
– संकटकालको असर न्यायिक क्षेत्रमा मात्रै पर्दैन, समग्र मुलुकमै पर्छ।
- source_sentence: >
The two-day conference will participate in investors from China, India,
Japan, the US, European countries, Britain and other countries, the
Federation said.
sentences:
- |
उनीहरूको जनजीविकाको आधार प्राकृतिक स्रोत रहेको छ।
- >
दुई दिनसम्म हुने सम्मेलनमा चीन, भारत, जापान, अमेरिका, युरोपियन देशहरू,
बेलायत लगायत देशबाट लगानीकर्ताको सहभागिता गराउने महासंघले जानकारी दिएको
छ
- |
नयाँ स्न्यापसट लिनका लागि यो बटन क्लिक गर्नुहोस् ।
- source_sentence: >
Mr Sankey issued a "confession" through his solicitor after Shields had
been convicted but then withdrew it.
sentences:
- >
श्री सान्कीले ढालहरू दोषी भएपछि आफ्नो समाधानकर्तामार्फत "स्वीकृति" जारी
गर्नुभयो तर त्यसपछि यसलाई फिर्ता लिनुभयो।
- >
कृत्रिम रुपमा पेट्रोलियम पदार्थको मूल्य स्थिर राख्न अनुदान दिदै जाने हो
भने नेपाली अर्थतन्त्र एकदिन धराशायी हुनेछ।
- >
ओली सरकारले "राष्ट्रियता-राष्ट्रवाद र" आर्थिक सम्ब्रिद्धि "-आर्थिक
विकासलाई यसको प्राथमिकताको रूपमा घोषणा गरेको छ।
- source_sentence: >
We want to use this time to appeal to the American government to see if
they can finally close this chapter.
sentences:
- |
धेरैले घाउ पाए र ओछ्यानमा थिए।
- >
नाम यसको अन्तरराष्ट्रिय हलको अद्वितिय डिजाइनबाट स्पष्ट रूपमा प्राप्त
हुन्छ, जुन शीर्षकनियम स्पेसबाट बनेको छ, जुन ठूलो गहिराइमा उच्च दबाब
बुझ्न सक्षम छ।
- >
हामी अमेरिकी सरकारलाई अपील गर्न यसपटक प्रयोग गर्न चाहन्छौं कि उनीहरूले
अन्त्यमा यो अध्याय बन्द गर्न सक्छन्।
model-index:
- name: SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
results:
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: Unknown
type: unknown
metrics:
- type: negative_mse
value: -0.37439612206071615
name: Negative Mse
- task:
type: translation
name: Translation
dataset:
name: Unknown
type: unknown
metrics:
- type: src2trg_accuracy
value: 0.0186
name: Src2Trg Accuracy
- type: trg2src_accuracy
value: 0.00835
name: Trg2Src Accuracy
- type: mean_accuracy
value: 0.013474999999999999
name: Mean Accuracy
SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2 on the momo22/eng2nep dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-MiniLM-L6-v2
- Maximum Sequence Length: 256 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: momo22/eng2nep
- Languages: en, ne
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
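The last two modules above (mean-token pooling followed by Normalize) can be illustrated with a small pure-Python sketch. This is not the library's implementation; the function names `mean_pool` and `l2_normalize` and the toy vectors are assumptions made for illustration only.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]

def l2_normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(value * value for value in vector))
    return [value / norm for value in vector] if norm > 0 else list(vector)

# Toy input (hypothetical): three token vectors of dimension 2; the last is padding.
tokens = [[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = l2_normalize(mean_pool(tokens, mask))
```

Because the final vector has unit length, comparing two sentence embeddings with a plain dot product gives the same ranking as cosine similarity.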
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("jangedoo/all-MiniLM-L6-v2-nepali")
# Run inference
sentences = [
'We want to use this time to appeal to the American government to see if they can finally close this chapter.\n',
'हामी अमेरिकी सरकारलाई अपील गर्न यसपटक प्रयोग गर्न चाहन्छौं कि उनीहरूले अन्त्यमा यो अध्याय बन्द गर्न सक्छन्।\n',
'नाम यसको अन्तरराष्ट्रिय हलको अद्वितिय डिजाइनबाट स्पष्ट रूपमा प्राप्त हुन्छ, जुन शीर्षकनियम स्पेसबाट बनेको छ, जुन ठूलो गहिराइमा उच्च दबाब बुझ्न सक्षम छ।\n',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
Evaluation
Metrics
Knowledge Distillation
- Evaluated with MSEEvaluator
Metric | Value |
---|---|
negative_mse | -0.3744 |
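As a rough guide to reading this number, the sketch below computes a negative MSE between teacher and student embeddings. It assumes, as I understand MSEEvaluator, that the mean squared error is multiplied by 100 before being negated so that higher is better; the function name `negative_mse` and the toy vectors are hypothetical.

```python
def negative_mse(teacher_embeddings, student_embeddings):
    """Negated mean squared error over all vector components, scaled by 100.

    The factor of 100 is an assumption about how the evaluator reports the score.
    """
    total = 0.0
    count = 0
    for teacher_vec, student_vec in zip(teacher_embeddings, student_embeddings):
        for t, s in zip(teacher_vec, student_vec):
            total += (t - s) ** 2
            count += 1
    mse = total / count
    return -(mse * 100)

# Toy embeddings (hypothetical): one of four components differs by 0.5.
teacher = [[1.0, 0.0], [0.0, 1.0]]
student = [[1.0, 0.0], [0.0, 0.5]]
score = negative_mse(teacher, student)
```

Under that assumption, a score of -0.3744 would correspond to a raw mean squared error of about 0.0037 per component.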
Translation
- Evaluated with TranslationEvaluator
Metric | Value |
---|---|
src2trg_accuracy | 0.0186 |
trg2src_accuracy | 0.0083 |
mean_accuracy | 0.0135 |
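The accuracies above can be understood as nearest-neighbor matching between index-aligned sentence pairs: a source sentence counts as correct when its most similar target embedding is its own translation, and likewise in the other direction. The sketch below is a minimal illustration under the assumption that embeddings are already unit-normalized; `translation_accuracy` and the toy vectors are hypothetical, not the evaluator's code.

```python
def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def translation_accuracy(src_embeddings, trg_embeddings):
    """Return (src2trg, trg2src, mean) accuracy for index-aligned pairs."""
    n = len(src_embeddings)
    src2trg_hits = 0
    trg2src_hits = 0
    for i in range(n):
        # Index of the most similar sentence on the other side.
        best_trg = max(range(n), key=lambda j: dot(src_embeddings[i], trg_embeddings[j]))
        best_src = max(range(n), key=lambda j: dot(trg_embeddings[i], src_embeddings[j]))
        src2trg_hits += best_trg == i
        trg2src_hits += best_src == i
    src2trg = src2trg_hits / n
    trg2src = trg2src_hits / n
    return src2trg, trg2src, (src2trg + trg2src) / 2

# Toy unit vectors (hypothetical): the second target sits closer to the first source.
src = [[1.0, 0.0], [0.0, 1.0]]
trg = [[1.0, 0.0], [0.8, 0.6]]
result = translation_accuracy(src, trg)
```

The mean is the simple average of the two directions, which matches the reported values: (0.0186 + 0.00835) / 2 = 0.013475.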
Training Details
Training Dataset
momo22/eng2nep
- Dataset: momo22/eng2nep at 57da8d4
- Size: 8,000 training samples
- Columns: English, Nepali, and label
- Approximate statistics based on the first 1000 samples:
Property | English | Nepali | label |
---|---|---|---|
type | string | string | list |
details | min: 3 tokens, mean: 26.29 tokens, max: 130 tokens | min: 3 tokens, mean: 65.39 tokens, max: 256 tokens | size: 384 elements |
- Samples:
English | Nepali | label |
---|---|---|
But with the origin of feudal practices in the Middle Ages, the practice of untouchability began, as well as discrimination against women. | तर मध्ययुगमा सामन्ती प्रथाको उद्भव भएसँगै जसरी छुवाछुत प्रथाको शुरुवात भयो, त्यसैगरी नारी प्रति पनि विभेद गरिन थालियो | [-0.05432726442813873, 0.029996933415532112, -0.008532932959496975, -0.035200122743844986, 0.008856767788529396, ...] |
A Pandit was found on the way to Pokhara from Baglung. | वाग्लुङ्गबाट पोखरा आउँदा बाटोमा एकजना पण्डित भेटिए। | [-0.023763148114085197, 0.0959007516503334, -0.11197677254676819, 0.10978179425001144, -0.028137238696217537, ...] |
He went on: "She ate a perfectly normal and healthy diet. | उनी गए: "उनले पूर्ण सामान्य र स्वस्थ आहार खाइन्। | [0.028130479156970978, 0.030386686325073242, -0.012276170775294304, 0.1316223442554474, -0.01928202621638775, ...] |
- Loss: MSELoss
Evaluation Dataset
momo22/eng2nep
- Dataset: momo22/eng2nep at 57da8d4
- Size: 500 evaluation samples
- Columns: English, Nepali, and label
- Approximate statistics based on the first 1000 samples:
Property | English | Nepali | label |
---|---|---|---|
type | string | string | list |
details | min: 4 tokens, mean: 26.71 tokens, max: 213 tokens | min: 3 tokens, mean: 64.1 tokens, max: 256 tokens | size: 384 elements |
- Samples:
English | Nepali | label |
---|---|---|
Chapter 3 | परिच्छेद–३ | [-0.049459926784038544, 0.048675183206796646, 0.016583453863859177, 0.04876156523823738, -0.020754676312208176, ...] |
The capability of MOF would be strengthened to enable it to efficiently play the lead role in donor coordination, and to secure support from all stakeholders in aid coordination activities. | दाताहरूको समन्वयमा नेतृत्वदायीको भूमिका निर्वाह प्रभावकारी ढंगले गर्न अर्थ मन्त्रालयको क्षमता सुदृढ गरिनेछ यसको लागि सबै सरोकारवालाबाट समर्थन प्राप्त गरिनेछ । | [-0.06200315058231354, -0.016507938504219055, -0.029924314469099045, -0.052509162575006485, 0.07746178656816483, ...] |
Polimatrix, Inc. is a system integrator and total solutions provider delivering radiation and nuclear protection and detection. | पोलिमाट्रिक्स, इन्कर्पोरेटिड प्रणाली इन्टिजर र कुल समाधान प्रदायक रेडियो र आणविक संरक्षण र पत्ता लगाउने प्रणाली इन्टिजर र कुल समाधान प्रदायक हो। | [-0.0446796678006649, 0.026428330689668655, -0.09837698936462402, -0.07765442878007889, -0.020364686846733093, ...] |
- Loss: MSELoss
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- learning_rate: 2e-05
- num_train_epochs: 1
- warmup_ratio: 0.1
- bf16: True
- push_to_hub: True
- hub_model_id: jangedoo/all-MiniLM-L6-v2-nepali
- push_to_hub_model_id: all-MiniLM-L6-v2-nepali
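To show how the learning rate, warmup ratio, and batch size above interact, here is a minimal sketch of a linear warmup and decay schedule. It assumes a single device, the 8,000 training samples stated in this card, and that the warmup step count is rounded up; `linear_schedule_lr` is a hypothetical helper, not a library function.

```python
import math

def linear_schedule_lr(step, total_steps, base_lr=2e-05, warmup_ratio=0.1):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # rounding up is an assumption
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# 8,000 samples at batch size 64 for one epoch (single device assumed).
total_steps = math.ceil(8000 / 64)
peak_lr = linear_schedule_lr(13, total_steps)
final_lr = linear_schedule_lr(total_steps, total_steps)
```

With these numbers one epoch is 125 steps, which is consistent with the training logs placing step 50 at epoch 0.4.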
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: True
- resume_from_checkpoint: None
- hub_model_id: jangedoo/all-MiniLM-L6-v2-nepali
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: all-MiniLM-L6-v2-nepali
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | loss | mean_accuracy | negative_mse |
---|---|---|---|---|---|
0.4 | 50 | 0.0021 | 0.0019 | 0.0111 | -0.3837 |
0.8 | 100 | 0.002 | 0.0019 | 0.0123 | -0.3794 |
0.4 | 50 | 0.002 | 0.0019 | 0.0130 | -0.3773 |
0.8 | 100 | 0.002 | 0.0019 | 0.0135 | -0.3744 |
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.42.4
- PyTorch: 2.3.1+cu121
- Accelerate: 0.32.1
- Datasets: 2.21.0
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MSELoss
@inproceedings{reimers-2020-multilingual-sentence-bert,
title = "Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2020",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/2004.09813",
}