metadata
base_model: antoinelouis/colbert-xm
datasets: []
language: []
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
widget: []
SentenceTransformer based on antoinelouis/colbert-xm
This is a sentence-transformers model finetuned from antoinelouis/colbert-xm. It maps sentences & paragraphs to a 128-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: antoinelouis/colbert-xm
- Maximum Sequence Length: 514 tokens
- Output Dimensionality: 128 tokens
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
ColBERT(
(0): Transformer({'max_seq_length': 514, 'do_lower_case': False}) with Transformer model: XmodModel
(1): Dense({'in_features': 768, 'out_features': 128, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
)
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'The weather is lovely today.',
"It's so sunny outside!",
'He drove to the stadium.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 128]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
Training Details
Framework Versions
- Python: 3.12.5
- Sentence Transformers: 3.0.1
- Transformers: 4.45.1
- PyTorch: 2.4.1
- Accelerate: 0.34.2
- Datasets: 3.0.1
- Tokenizers: 0.20.0