BERT Fine-tuned on MRPC
This model is a fine-tuned version of bert-base-uncased on the MRPC (Microsoft Research Paraphrase Corpus) dataset from the GLUE benchmark. It is designed to determine whether two given sentences are semantically equivalent.
Model description
The model uses the BERT base architecture (12 layers, 768 hidden dimensions, 12 attention heads) and has been fine-tuned specifically for the paraphrase identification task. The output layer predicts whether the input sentence pair expresses the same meaning.
Key specifications (see the configuration sketch after this list):
- Base model: bert-base-uncased
- Task type: Binary classification (paraphrase/not paraphrase)
- Training method: Fine-tuning all layers
- Language: English
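The architecture details above can be checked directly from the published checkpoint's configuration. The snippet below is a minimal sketch using the Transformers AutoConfig API; the values in the comments are the ones this card describes.

from transformers import AutoConfig

config = AutoConfig.from_pretrained("real-jiakai/bert-base-uncased-finetuned-mrpc")
print(config.num_hidden_layers)    # 12 transformer layers
print(config.hidden_size)          # 768 hidden dimensions
print(config.num_attention_heads)  # 12 attention heads
print(config.num_labels)           # 2 labels: not paraphrase / paraphrase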
Intended uses & limitations
Intended uses
- Paraphrase detection
- Semantic similarity assessment
- Duplicate question detection
- Content matching
- Automated text comparison
Limitations
- Only works with English text
- Performance may degrade on out-of-domain text
- May struggle with complex or nuanced semantic relationships
- Limited to comparing pairs of sentences (not longer texts)
Training and evaluation data
The model was trained on the Microsoft Research Paraphrase Corpus (MRPC) from the GLUE benchmark (a loading sketch follows the list):
- Training set: 3,667 sentence pairs
- Validation set: 408 sentence pairs
- Each pair is labeled as either paraphrase (1) or non-paraphrase (0)
- Class distribution: approximately 67.4% positive (paraphrase) and 32.6% negative (non-paraphrase)
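For reference, the same splits can be loaded with the Datasets library listed under "Framework versions" below. This is a hedged sketch, not the author's preprocessing code.

from datasets import load_dataset

# Load the GLUE MRPC splits used for fine-tuning and evaluation
mrpc = load_dataset("glue", "mrpc")
print(mrpc)              # DatasetDict with train / validation / test splits
print(mrpc["train"][0])  # fields: sentence1, sentence2, label (1 = paraphrase), idx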
Training procedure
Training hyperparameters
The following hyperparameters were used during training (see the Trainer sketch after the list):
- Learning rate: 3e-05
- Batch size: 8 (train and eval)
- Optimizer: AdamW (betas=(0.9,0.999), epsilon=1e-08)
- LR scheduler: Linear decay
- Number of epochs: 3
- Max sequence length: 512
- Weight decay: 0.01
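The training script itself is not part of this card. As an illustration only, the hyperparameters above would map onto the Transformers Trainer API roughly as follows; the output directory name is a placeholder, and the max sequence length of 512 is applied at tokenization time rather than here.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-base-uncased-finetuned-mrpc",  # placeholder path
    learning_rate=3e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    weight_decay=0.01,
    lr_scheduler_type="linear",  # linear decay
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    eval_strategy="epoch",       # evaluate once per epoch, as in the table below
)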
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     |
|---------------|-------|------|-----------------|----------|--------|
| No log        | 1.0   | 459  | 0.3905          | 0.8382   | 0.8878 |
| 0.5385        | 2.0   | 918  | 0.4275          | 0.8505   | 0.8961 |
| 0.3054        | 3.0   | 1377 | 0.5471          | 0.8652   | 0.9057 |
Framework versions
- Transformers 4.46.2
- PyTorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3
Performance analysis
The model achieves strong performance on the MRPC validation set:
- Accuracy: 86.52%
- F1 Score: 90.57%
These metrics indicate that the model is effective at identifying paraphrases while maintaining a good balance between precision and recall.
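These numbers can be reproduced with a short loop over the MRPC validation split. The sketch below assumes the Hugging Face evaluate library for the GLUE metric and is not the author's original evaluation script.

import torch
import evaluate
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "real-jiakai/bert-base-uncased-finetuned-mrpc"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

validation = load_dataset("glue", "mrpc", split="validation")
metric = evaluate.load("glue", "mrpc")  # reports both accuracy and F1

for example in validation:
    inputs = tokenizer(example["sentence1"], example["sentence2"],
                       return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    metric.add(prediction=logits.argmax(dim=-1).item(), reference=example["label"])

print(metric.compute())  # expected to be close to {'accuracy': 0.865, 'f1': 0.906}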
Example usage
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("real-jiakai/bert-base-uncased-finetuned-mrpc")
model = AutoModelForSequenceClassification.from_pretrained("real-jiakai/bert-base-uncased-finetuned-mrpc")
model.eval()

# Example function
def check_paraphrase(sentence1, sentence2):
    # Tokenize the sentence pair and run it through the classifier
    inputs = tokenizer(sentence1, sentence2, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # Label 1 = paraphrase, label 0 = not paraphrase
    prediction = outputs.logits.argmax(dim=-1).item()
    return "Paraphrase" if prediction == 1 else "Not paraphrase"

# Example usage
sentence1 = "The cat sat on the mat."
sentence2 = "A cat was sitting on the mat."
result = check_paraphrase(sentence1, sentence2)
print(f"Result: {result}")
Model tree for real-jiakai/bert-base-uncased-finetuned-mrpc
- Base model: google-bert/bert-base-uncased
- Dataset used to train: GLUE MRPC
Evaluation results
- Accuracy on GLUE MRPC (self-reported): 0.865
- F1 on GLUE MRPC (self-reported): 0.906