blue2959
/

xlm-roberta-base-finetuned-kor-8-emotions_v1.2

Text Classification

Inference Endpoints

Model card Files Files and versions Metrics Training metrics Community

Edit model card

XLM-Roberta-base --> 8emotions!

Label Dictionry

label_dictionary
emo2int = { "기쁨": 0, "당황": 1, "분노": 2, "불안": 3, "상처": 4, "슬픔": 5, "중립": 6 }
kore2en = { "기쁨": "joy", "당황": "surprise", "분노": "anger", "불안": "fear", "상처": "hurt", "슬픔": "sadness", "중립": "neutral" }

Dataset

감성대화말뭉치(AI Hub)

input format(recommendation) - this model is trained by ChatBOT dataset.
ref: https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&dataSetSn=86

한국어 감정 정보가 포함된 연속적 대화 데이터셋(AIHub)

And.. this dataset doesn't have neutral class..
So additional dataset is used.
ref: https://aihub.or.kr/aihubdata/data/view.do?dataSetSn=271
finally I Concatenate 2 Datasets.

Input Format(Please Use Special Tokens [USR], [BOT] to use model API!)

(example) [USR] 안녕. [BOT] 안녕하세요! 무엇을 도와드릴까요? [USR] 별일 없어.
이 두개의 특수 토큰은 반드시 사용해주시길 부탁드립니다.
And these are a part of real data.

Metrics(F1, Accuracy, and Confusion Matrix!)

and confusion matrix like this..
and so on.. F1, Accuracy

Downloads last month: 5

Safetensors

Model size

278M params

Tensor type

F32

·

Inference Examples

Text Classification

This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.