---
license: mit
datasets:
- dair-ai/emotion
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
library_name: transformers
tags:
- emotion-classification
---
# BERT-Base-Uncased Emotion Classification Model

## Model Architecture

- **Base Model:** `bert-base-uncased`
- **Architecture:** Transformer-based model (BERT)
- **Fine-Tuned Task:** Emotion classification
- **Number of Labels:** 6 (sadness, joy, love, anger, fear, surprise)
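
The label layout above can be expressed as a `transformers` model config. This is an illustrative sketch, not the card's training code; it assumes the classification head indexes labels in the dataset's `ClassLabel` order.

```python
from transformers import BertConfig

# Sketch: a BERT config carrying the 6-label classification head.
# Assumption: label ids follow the dataset's ClassLabel order.
EMOTIONS = ["sadness", "joy", "love", "anger", "fear", "surprise"]
config = BertConfig(
    num_labels=len(EMOTIONS),
    id2label={i: name for i, name in enumerate(EMOTIONS)},
    label2id={name: i for i, name in enumerate(EMOTIONS)},
)
print(config.num_labels)  # 6
```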
## Dataset Information

The model was fine-tuned on the `dair-ai/emotion` dataset, which consists of English tweets classified into six emotion categories.

- **Training Dataset Size:** 16,000 examples
- **Validation Dataset Size:** 2,000 examples
- **Test Dataset Size:** 2,000 examples
- **Features:**
  - `text`: The text of the tweet
  - `label`: The emotion label for the text (ClassLabel: `['sadness', 'joy', 'love', 'anger', 'fear', 'surprise']`)
## Training Arguments

The model was trained using the following hyperparameters:

- **Learning Rate:** 2e-5
- **Batch Size:** 16
- **Number of Epochs:** 20 planned (stopped early after 7)
- **Gradient Accumulation Steps:** 2
- **Weight Decay:** 0.01
- **Mixed Precision (FP16):** True
- **Early Stopping:** Enabled (see details below)
- **Logging:** Progress logged every 100 steps
- **Save Strategy:** Checkpoints saved at the end of each epoch, with the 3 most recent checkpoints retained

Early stopping was configured as follows:

- **Patience:** 3 evaluations (training stops if the F1 score does not improve for 3 consecutive evaluations)
- **Best Metric:** F1 score (higher is better)
- **Final Epoch:** Training stopped after 7 of the planned 20 epochs because the evaluation metrics stopped improving.
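
The hyperparameters above map roughly onto a `transformers` `TrainingArguments` plus an `EarlyStoppingCallback`. This is a sketch under stated assumptions, not the card's exact training script; the `output_dir` name is hypothetical, and argument names follow recent `transformers` versions (older releases spell `eval_strategy` as `evaluation_strategy`).

```python
from transformers import TrainingArguments, EarlyStoppingCallback

training_args = TrainingArguments(
    output_dir="bert-base-uncased-emotion",  # hypothetical name
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=20,
    gradient_accumulation_steps=2,
    weight_decay=0.01,
    fp16=True,
    logging_steps=100,
    eval_strategy="epoch",
    save_strategy="epoch",
    save_total_limit=3,          # keep the 3 most recent checkpoints
    load_best_model_at_end=True,
    metric_for_best_model="f1",
    greater_is_better=True,
)

# Passed to Trainer(callbacks=[...]): stop after 3 evaluations without F1 improvement
early_stopping = EarlyStoppingCallback(early_stopping_patience=3)
```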
## Final Training Metrics (After 7 Epochs)

The model achieved the following results on the validation dataset in the final epoch:

- **Accuracy:** 0.9085
- **Precision:** 0.8736
- **Recall:** 0.8962
- **F1 Score:** 0.8824
## Test Set Evaluation

After training, the model was evaluated on a held-out test set. The following are the results on the test dataset:

- **Test Accuracy:** 0.9180
- **Test Precision:** 0.8663
- **Test Recall:** 0.8757
- **Test F1 Score:** 0.8706
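
These precision/recall/F1 numbers are multi-class aggregates over the 6 labels. A minimal sketch of how such scores are typically computed with scikit-learn; macro averaging is an assumption here, since the card does not state the averaging mode:

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(y_true, y_pred):
    """Accuracy plus (assumed) macro-averaged precision/recall/F1."""
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0
    )
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Toy example with label ids 0..5 (not the card's actual predictions)
metrics = compute_metrics([1, 1, 3, 0], [1, 1, 3, 3])
print(metrics["accuracy"])  # 0.75
```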
## Usage

You can load the model and tokenizer for inference using the Hugging Face `transformers` library with the `pipeline` API:

```python
from transformers import pipeline

# Load the emotion classification pipeline
classifier = pipeline(
    "text-classification",
    model="Prikshit7766/bert-base-uncased-emotion",
    return_all_scores=True,  # deprecated in recent transformers; top_k=None is the replacement
)

# Test the classifier with a sample sentence
prediction = classifier("I am feeling great and happy today!")

# Print the predictions
print(prediction)
```
**Output:**

```python
[[{'label': 'sadness', 'score': 0.00010687233589123935},
  {'label': 'joy', 'score': 0.9991187453269958},
  {'label': 'love', 'score': 0.00041500659426674247},
  {'label': 'anger', 'score': 7.090374856488779e-05},
  {'label': 'fear', 'score': 5.2315706852823496e-05},
  {'label': 'surprise', 'score': 0.0002362433006055653}]]
```
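
Because the pipeline returns a score for every label, picking the predicted emotion is a `max` over the nested list. A sketch using the sample output above (scores abbreviated):

```python
# Sample output from the classifier above; the outer list has one entry per input text
prediction = [[
    {"label": "sadness", "score": 0.0001},
    {"label": "joy", "score": 0.9991},
    {"label": "love", "score": 0.0004},
    {"label": "anger", "score": 0.0001},
    {"label": "fear", "score": 0.0001},
    {"label": "surprise", "score": 0.0002},
]]

# Take the highest-scoring label for the first (and only) input
top = max(prediction[0], key=lambda d: d["score"])
print(top["label"])  # joy
```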