---
license: apache-2.0
datasets:
- yongchao/gptgen_text_detection
metrics:
- accuracy
pipeline_tag: text-classification
---
# BERT-based Classification Model for AI Generated Text Detection
## Model Overview
This BERT-based model is fine-tuned to detect AI-generated text, particularly in a text-to-SQL scenario.
Note that this model is still in the testing phase; its validity has not been fully verified.
## Model Details
- **Architecture**: BERT (bert-base-uncased)
- **Training Data**: The model was trained on a dataset of 2,000 labeled questions, each either human-written or AI-generated.
- **Training Procedure**:
  - **Epochs**: 10
  - **Batch Size**: 16
  - **Learning Rate**: 2e-5
  - **Warmup Steps**: 500
  - **Weight Decay**: 0.01
- **Model Performance**:
  - **Accuracy**: 85.7%
  - **Precision**: 82.4%
  - **Recall**: 91%
  - **F1 Score**: 86.5%
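
The figures listed above can be sanity-checked with a little arithmetic. The sketch below is illustrative only and is not the training script: it assumes all 2,000 examples were used for training (no held-out split) and that no gradient accumulation was applied, neither of which this card states. The function names are hypothetical.

```python
import math

# Values copied from the list above.
NUM_EXAMPLES = 2000
EPOCHS = 10
BATCH_SIZE = 16
WARMUP_STEPS = 500
PRECISION = 0.824
RECALL = 0.91


def total_optimizer_steps(num_examples: int, batch_size: int, epochs: int) -> int:
    """Optimizer steps, ASSUMING every example is trained on and no gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs


def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


total_steps = total_optimizer_steps(NUM_EXAMPLES, BATCH_SIZE, EPOCHS)
warmup_fraction = WARMUP_STEPS / total_steps
f1 = f1_score(PRECISION, RECALL)

print(f"total steps: {total_steps}")
print(f"warmup fraction: {warmup_fraction:.0%}")
print(f"F1 from precision and recall: {f1:.1%}")
```

Under these assumptions, the 500 warmup steps would cover a large share of the run, and the F1 value recomputed from the reported precision and recall agrees with the reported 86.5%. If part of the data was held out for evaluation, the step count would be lower.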
## Limitations and Ethical Considerations
### Limitations
The model may not perform well on text that differs significantly from the training data.
### Ethical Considerations
Be aware of potential biases in the training data that could affect the model's predictions. Ensure that the model is used in a fair and unbiased manner.
## References
- **BERT Paper**: Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.
- **Dataset**: [Link to the dataset](https://huggingface.co/datasets/yongchao/gptgen_text_detection)