---
base_model: microsoft/mpnet-base
datasets: []
language: []
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:24901
  - loss:SoftmaxLoss
widget:
  - source_sentence: >-
      Cryptocurrency holders are being exploited, with whales creating more
      coins and profiting from their value.
    sentences:
      - Buyer purchases cryptocurrency from seller in exchange.
      - >-
        Price fluctuates due to fear and uncertainty, only time will reveal its
        direction.
      - >-
        New user's post removed due to lack of required **karma** and account
        age.
  - source_sentence: >-
      User seeks assistance with retrieving funds from a cryptocurrency
      investment platform.
    sentences:
      - Bot removed post for being too short, resubmit with more characters.
      - People enjoy walking while searching for digital currency in their area.
      - >-
        Cryptocurrency project's legitimacy unlikely due to complexity and
        scrutiny in parachain development and ecosystem interactions.
  - source_sentence: >-
      Large cryptocurrencies' market dominance may change as new projects emerge
      with exceptional utility and marketing.
    sentences:
      - Market experiencing significant decline.
      - >-
        Decentralized concept in crypto is main idea, but most coins are
        centralized.
      - Cryptocurrency users share information.
  - source_sentence: Use XLM for low-cost transactions between exchanges, saving on fees.
    sentences:
      - Exchanges should automate process for increased activity.
      - >-
        Investment taxes vary by country, but generally apply after withdrawal,
        with losses still needing declaration.
      - Use basic version, buy coins with credit card.
  - source_sentence: >-
      New user seeks advice on storing Bitcoin and USDT on WazirX or Binance,
      considering pros and cons.
    sentences:
      - >-
        Buy cryptocurrency directly with credit card, but high fee makes Indian
        exchange a better option.
      - 'Cryptocurrency prices: Bitcoin, Ethereum, and others fluctuate.'
      - Investor has faith in Tezos, making strategic moves.
---

SentenceTransformer based on microsoft/mpnet-base

This is a sentence-transformers model finetuned from microsoft/mpnet-base. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: microsoft/mpnet-base
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: https://www.sbert.net
  • Repository: https://github.com/UKPLab/sentence-transformers
  • Hugging Face: https://huggingface.co/models?library=sentence-transformers

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
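
The Pooling block above performs plain mean pooling over token embeddings. For reference, here is a minimal sketch of the equivalent computation using the transformers library directly; it assumes the Hub repository exposes the underlying MPNet weights at its root, as sentence-transformers repositories normally do (the SentenceTransformer API in the Usage section remains the supported path):

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("pawan2411/crypto_nli")
model = AutoModel.from_pretrained("pawan2411/crypto_nli")

sentences = ["Cryptocurrency users share information."]
encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    token_embeddings = model(**encoded).last_hidden_state  # (batch, seq_len, 768)

# Mean pooling: average the token embeddings, masking out padding positions.
mask = encoded["attention_mask"].unsqueeze(-1).float()
embedding = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
print(embedding.shape)  # torch.Size([1, 768])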

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("pawan2411/crypto_nli")
# Run inference
sentences = [
    'New user seeks advice on storing Bitcoin and USDT on WazirX or Binance, considering pros and cons.',
    'Buy cryptocurrency directly with credit card, but high fee makes Indian exchange a better option.',
    'Investor has faith in Tezos, making strategic moves.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
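
Because the embeddings live in a shared vector space, the same two calls support simple semantic search: encode a corpus once, then rank it against a query by cosine similarity. A small sketch (the query string here is illustrative):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("pawan2411/crypto_nli")

corpus = [
    "Market experiencing significant decline.",
    "Cryptocurrency users share information.",
    "Use basic version, buy coins with credit card.",
]
query = "How do crypto holders exchange news?"

corpus_embeddings = model.encode(corpus)
query_embedding = model.encode([query])

# Rank the corpus by cosine similarity to the query and print the best hit.
scores = model.similarity(query_embedding, corpus_embeddings)[0]
best = scores.argmax().item()
print(corpus[best], scores[best].item())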

Training Details

Training Dataset

Unnamed Dataset

  • Size: 24,901 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence_0: string; min: 6 tokens, mean: 21.86 tokens, max: 61 tokens
    • sentence_1: string; min: 6 tokens, mean: 16.67 tokens, max: 50 tokens
    • label: int; 0: ~83.50%, 1: ~16.50%
  • Samples:
    • sentence_0: User asks about tracing crypto swaps and process of exchanging digital currencies. / sentence_1: "Private cryptocurrency swap can't be traced." / label: 0
    • sentence_0: Cryptocurrency project with weak fundamentals deserves to fail, cherish coins before next market downturn. / sentence_1: "Trust information in this community." / label: 0
    • sentence_0: New user seeks advice on using crypto credit cards in daily life. / sentence_1: User uses digital wallet for cryptocurrency transactions, earning cashback rewards. / label: 1
  • Loss: SoftmaxLoss
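
For reference, a minimal sketch of how a pair dataset with these columns is typically wired up with SoftmaxLoss in sentence-transformers 3.x; the single example row is taken from the samples above, and everything else (full data loading, output paths) is omitted:

from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import SoftmaxLoss

model = SentenceTransformer("microsoft/mpnet-base")

# Pair dataset with the same sentence_0 / sentence_1 / label columns as above.
train_dataset = Dataset.from_dict({
    "sentence_0": ["New user seeks advice on using crypto credit cards in daily life."],
    "sentence_1": ["User uses digital wallet for cryptocurrency transactions, earning cashback rewards."],
    "label": [1],
})

# SoftmaxLoss runs the concatenated pair embeddings through a small classification head.
loss = SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=2,
)

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()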

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • num_train_epochs: 10
  • multi_dataset_batch_sampler: round_robin
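
These non-default values map directly onto SentenceTransformerTrainingArguments; a sketch of how they would be passed to the trainer via its args parameter (output_dir is a placeholder):

from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import MultiDatasetBatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="crypto_nli-checkpoints",  # placeholder output path
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    num_train_epochs=10,
    multi_dataset_batch_sampler=MultiDatasetBatchSamplers.ROUND_ROBIN,
)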

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

The epoch/step cycle below repeats six times with steadily decreasing loss, which suggests the 10-epoch schedule was run in six successive passes.

Epoch   Step  Training Loss
1.2821   500  0.3912
2.5641  1000  0.3157
3.8462  1500  0.2926
5.1282  2000  0.2788
6.4103  2500  0.2599
7.6923  3000  0.2428
8.9744  3500  0.2314

1.2821   500  0.2333
2.5641  1000  0.2292
3.8462  1500  0.1987
5.1282  2000  0.1757
6.4103  2500  0.1578
7.6923  3000  0.1413
8.9744  3500  0.1258

1.2821   500  0.1086
2.5641  1000  0.1048
3.8462  1500  0.0917
5.1282  2000  0.0805
6.4103  2500  0.0712
7.6923  3000  0.0673
8.9744  3500  0.0646

1.2821   500  0.0505
2.5641  1000  0.0511
3.8462  1500  0.0460
5.1282  2000  0.0415
6.4103  2500  0.0396
7.6923  3000  0.0357
8.9744  3500  0.0382

1.2821   500  0.0252
2.5641  1000  0.0290
3.8462  1500  0.0247
5.1282  2000  0.0233
6.4103  2500  0.0228
7.6923  3000  0.0218
8.9744  3500  0.0251

1.2821   500  0.0158
2.5641  1000  0.0184
3.8462  1500  0.0165
5.1282  2000  0.0139
6.4103  2500  0.0145
7.6923  3000  0.0139
8.9744  3500  0.0164

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.42.4
  • PyTorch: 2.3.1+cu121
  • Accelerate: 0.32.1
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1
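
To reproduce this environment, the versions above can be pinned directly (a sketch; the +cu121 PyTorch build comes from the CUDA-specific index, and nearby compatible versions should also work):

pip install torch==2.3.1 sentence-transformers==3.0.1 transformers==4.42.4 accelerate==0.32.1 datasets==2.20.0 tokenizers==0.19.1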

Citation

BibTeX

Sentence Transformers and SoftmaxLoss

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}