SentiCSE
SentiCSE is a RoBERTa-base model trained on the MR dataset and fine-tuned for sentiment analysis tasks. It is intended for English text.
- Reference paper: SentiCSE (COLING 2024, main conference)
- Git repo: https://github.com/nayohan/SentiCSE
import torch
from scipy.spatial.distance import cosine
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("DILAB-HYU/SentiCSE")
model = AutoModel.from_pretrained("DILAB-HYU/SentiCSE")
# Tokenize input texts
texts = [
    "The food is delicious.",
    "The atmosphere of the restaurant is good.",
    "The food at the restaurant is devoid of flavor.",
    "The restaurant lacks a good ambiance."
]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
# Get the embeddings
with torch.no_grad():
    embeddings = model(**inputs, output_hidden_states=True, return_dict=True).pooler_output
# Calculate cosine similarities
# Cosine similarities are in [-1, 1]. Higher means more similar
cosine_sim_0_1 = 1 - cosine(embeddings[0], embeddings[1])
cosine_sim_0_2 = 1 - cosine(embeddings[0], embeddings[2])
cosine_sim_0_3 = 1 - cosine(embeddings[0], embeddings[3])
print("Cosine similarity between \"%s\" and \"%s\" is: %.3f" % (texts[0], texts[1], cosine_sim_0_1))
print("Cosine similarity between \"%s\" and \"%s\" is: %.3f" % (texts[0], texts[2], cosine_sim_0_2))
print("Cosine similarity between \"%s\" and \"%s\" is: %.3f" % (texts[0], texts[3], cosine_sim_0_3))
Output:
Cosine similarity between "The food is delicious." and "The atmosphere of the restaurant is good." is: 0.942
Cosine similarity between "The food is delicious." and "The food at the restaurant is devoid of flavor." is: 0.703
Cosine similarity between "The food is delicious." and "The restaurant lacks a good ambiance." is: 0.656
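When more than a few sentences are compared, it can be convenient to compute every pairwise similarity in one step instead of calling `cosine` per pair. The following is a minimal sketch, not part of the SentiCSE repository: the helper name `pairwise_cosine` is hypothetical, and the small `demo` tensor is a stand-in for the `pooler_output` embeddings so the snippet runs without downloading the model.

```python
import torch
import torch.nn.functional as F


def pairwise_cosine(embeddings: torch.Tensor) -> torch.Tensor:
    """Return an (n, n) matrix of cosine similarities between the rows.

    `embeddings` is assumed to be a 2-D float tensor of shape (n, dim),
    such as the pooler_output of a sentence encoder.
    """
    # Scale every row to unit L2 norm so that a dot product equals the cosine
    normed = F.normalize(embeddings, p=2, dim=1)
    return normed @ normed.T


# Stand-in vectors for illustration only (not real SentiCSE embeddings)
demo = torch.tensor([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
sim = pairwise_cosine(demo)
print(sim)
```

With the `embeddings` tensor from the example above, `pairwise_cosine(embeddings)[0, 1]` should match the first similarity printed in the output, up to floating-point precision.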
BibTeX entry and citation info
Please cite the reference paper if you use this model.
@article{2024SentiCSE,
  title={SentiCSE: A Sentiment-aware Contrastive Sentence Embedding Framework with Sentiment-guided Textual Similarity},
  author={Kim, Jaemin and Na, Yohan and Kim, Kangmin and Lee, Sangrak and Chae, Dong-Kyu},
  journal={Proceedings of the 30th International Conference on Computational Linguistics (COLING)},
  year={2024},
}