---
license: apache-2.0
base_model: microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext
tags:
- generated_from_trainer
metrics:
- precision
- recall
- accuracy
- f1
model-index:
- name: pretoxtm-sentence-classifier
  results: []
datasets:
- javicorvi/pretoxtm-dataset
language:
- en
pipeline_tag: text-classification
---

# pretoxtm-sentence-classifier

This model is a fine-tuned version of [microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext](https://huggingface.co/microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext) on [javicorvi/pretoxtm-dataset](https://huggingface.co/datasets/javicorvi/pretoxtm-dataset).
It achieves the following results on the evaluation set (these values match the epoch 1 row of the training results table):
- Loss: 0.1181
- Precision: 0.9788
- Recall: 0.9800
- Accuracy: 0.9795
- F1: 0.9794

## Model description

PretoxTM Sentence Classifier is a model trained on preclinical toxicology literature, designed to detect sentences that contain treatment-related findings.

## Training and evaluation data

The model was trained on [javicorvi/pretoxtm-dataset](https://huggingface.co/datasets/javicorvi/pretoxtm-dataset). The dataset is divided into train, validation, and test splits.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1.1848183151867784e-05
- train_batch_size: 4
- eval_batch_size: 8
- seed: 1
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3

### Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | Accuracy | F1     |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:--------:|:------:|
| 0.2543        | 1.0   | 514  | 0.1181          | 0.9788    | 0.9800 | 0.9795   | 0.9794 |
| 0.1344        | 2.0   | 1028 | 0.1488          | 0.9767    | 0.9775 | 0.9773   | 0.9771 |
| 0.0419        | 3.0   | 1542 | 0.1520          | 0.9767    | 0.9775 | 0.9773   | 0.9771 |

### Framework versions

- Transformers 4.39.3
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2
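
### Learning rate schedule (illustrative sketch)

The hyperparameters above list `lr_scheduler_type: linear` with 3 epochs, and the training results table shows 514 optimizer steps per epoch (1542 in total). The sketch below illustrates what a linear schedule of that shape looks like in plain Python. It is not the training code used for this model: the helper `linear_lr` and the constants are hypothetical names, and the sketch assumes zero warmup steps (the card lists no warmup setting) and a decay that reaches zero at the final step.

```python
# Values copied from the hyperparameters and training results above.
BASE_LR = 1.1848183151867784e-05
TOTAL_STEPS = 1542  # 3 epochs x 514 optimizer steps per epoch
WARMUP_STEPS = 0    # assumption: the card does not list a warmup setting


def linear_lr(
    step: int,
    base_lr: float = BASE_LR,
    total_steps: int = TOTAL_STEPS,
    warmup_steps: int = WARMUP_STEPS,
) -> float:
    """Learning rate after `step` optimizer steps.

    Rises linearly from 0 to `base_lr` over `warmup_steps`,
    then decays linearly to 0 at `total_steps`.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the learning rate at the start of training and at each epoch boundary.
    for epochs_done, step in enumerate((0, 514, 1028, 1542)):
        print(f"after {epochs_done} epoch(s): step {step:4d}, lr = {linear_lr(step):.3e}")
```

Under these assumptions, the learning rate at the end of the first epoch (step 514) is two thirds of its initial value, roughly 7.9e-06, and it reaches zero at step 1542.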