---
library_name: transformers
license: creativeml-openrail-m
base_model:
- dmis-lab/biobert-v1.1
pipeline_tag: question-answering
tags: [biomedical, question-answering, healthcare]
---
# Model Card for Online Doctor Model
This model is a fine-tuned version of the `dmis-lab/biobert-large-cased-v1.1-squad` model. It answers questions about diseases from symptom descriptions through a question-answering pipeline aimed at healthcare professionals and other users. It was fine-tuned on a custom dataset of diseases and their symptoms.
## Model Details
### Model Description
This is a question-answering model fine-tuned from `BioBERT` and adapted for healthcare-related questions. It extracts answers from disease-symptom text in response to symptoms or queries entered by the user.
- **Developed by:** Ayamba Victor Ndoma
- **Model type:** Question Answering
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** `dmis-lab/biobert-large-cased-v1.1-squad`
### Model Sources
- **Repository:** [Hugging Face Repository](https://huggingface.co/your-username/online-doctor-model)
- **Demo:** [Link to Demo (optional)]
## Uses
This model can be used in the following cases:
### Direct Use
- Answering healthcare-related questions based on symptom input from users.
- Assisting medical professionals in preliminary diagnosis based on reported symptoms.
### Downstream Use
- Further fine-tuning or extension for more specific disease- or symptom-related tasks.
- Integration into chatbot systems for medical consultation services.
### Out-of-Scope Use
- The model is not intended for use in making definitive medical diagnoses without human supervision.
- It is not suitable for questions outside the health domain.
## Bias, Risks, and Limitations
- **Bias:** The model is trained on a custom dataset with potentially limited diversity in disease-symptom pairs.
- **Risks:** Incorrect predictions might occur when symptoms overlap across multiple diseases.
- **Limitations:** The model is constrained to the diseases and symptoms available in the training dataset and may not generalize to all medical conditions.
### Recommendations
This model should be used with caution, and its answers should be reviewed by qualified healthcare professionals.
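One way to support that review step is to use the confidence score returned with each answer to decide which answers a reviewer sees first. The sketch below is illustrative only: the `triage_answer` helper and the `REVIEW_THRESHOLD` value are assumptions and not part of this model. It relies on the `answer` and `score` keys that the Transformers question-answering pipeline returns.

```python
# Hypothetical cutoff, not a value from this model card; tune it on validation data.
REVIEW_THRESHOLD = 0.5


def triage_answer(result, threshold=REVIEW_THRESHOLD):
    """Attach a review priority to one question-answering pipeline result.

    `result` is a dict with at least the keys 'answer' and 'score'.
    Every answer still goes to a reviewer; low scores are simply ranked first.
    """
    score = float(result["score"])
    return {
        "answer": result["answer"],
        "score": score,
        "review_priority": "high" if score < threshold else "normal",
    }


# Hand-written example result, standing in for real pipeline output.
example = {"answer": "increased thirst", "score": 0.31, "start": 10, "end": 26}
print(triage_answer(example)["review_priority"])
```

A low score only signals that the model was unsure about the extracted span; a high score does not make an answer medically correct.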
## How to Get Started with the Model
Use the following code to get started with the model:
```python
from transformers import pipeline

# "your-username/online-doctor-model" is the placeholder repository ID used in this card
qa_pipeline = pipeline("question-answering", model="your-username/online-doctor-model")

# Example question and context
question = "What are the symptoms of diabetes?"
context = "Diabetes: increased thirst, frequent urination, hunger, fatigue, blurred vision."

# The pipeline returns a dict with the keys 'score', 'start', 'end', and 'answer'
result = qa_pipeline(question=question, context=context)
print(result['answer'])
```
## Training Details
### Training Data
The model is fine-tuned on a custom dataset containing diseases and their respective symptoms.
### Training Procedure
- **Preprocessing:** Text cleaning and tokenization were applied to ensure proper context and symptom pairing.
- **Training regime:** The model was trained using mixed-precision FP16 on a single GPU.
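The custom dataset's format is not published, so the following is only a sketch of how a disease and its symptom list could be paired into a SQuAD-style example for extractive question answering. The `to_squad_example` helper, the question template, and the record layout are all assumptions. The context layout mirrors the example in the getting-started section.

```python
def to_squad_example(disease, symptoms, example_id):
    """Build one SQuAD-style example from a disease name and its symptoms.

    The record layout and question wording are assumptions for illustration.
    """
    disease = disease.strip()
    answer_text = ", ".join(s.strip() for s in symptoms)
    prefix = f"{disease}: "
    context = f"{prefix}{answer_text}."
    return {
        "id": example_id,
        "question": f"What are the symptoms of {disease.lower()}?",
        "context": context,
        # SQuAD stores the answer text and its character offset in the context.
        "answers": {"text": [answer_text], "answer_start": [len(prefix)]},
    }


example = to_squad_example(
    "Diabetes", ["increased thirst", "frequent urination", "fatigue"], "0"
)
start = example["answers"]["answer_start"][0]
text = example["answers"]["text"][0]
print(example["context"][start:start + len(text)])
```

Checking that the character offset really points at the answer text, as the last lines do, catches the most common alignment errors before tokenization.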
#### Training Hyperparameters
- **Epochs:** 3
- **Batch size:** 16
- **Learning rate:** 3e-5
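As a rough sketch, these values map onto the Transformers `TrainingArguments` class as shown below. The training script itself is not published, so the `build_training_args` helper and the output directory name are hypothetical. Treating the batch size as per-device assumes a single GPU with no gradient accumulation.

```python
# Values from this model card; the argument names are TrainingArguments fields.
HYPERPARAMS = {
    "num_train_epochs": 3,
    "per_device_train_batch_size": 16,  # assumption: single GPU, no accumulation
    "learning_rate": 3e-5,
    "fp16": True,  # mixed precision; requires a CUDA-capable GPU
}


def build_training_args(output_dir="online-doctor-checkpoints"):
    """Create TrainingArguments from the card's hyperparameters.

    The import is local so the dictionary above can be inspected
    without the transformers library installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

All other settings, such as warmup, weight decay, and maximum sequence length, are left at library defaults here because the card does not state them.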
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
The model was evaluated on a held-out portion of the custom disease-symptom dataset.
#### Factors
- Subpopulation: Various diseases ranging from common illnesses to rare conditions.
- Domains: Medical text and descriptions of symptoms.
#### Metrics
The model was evaluated using the SQuAD metrics, including F1 score and Exact Match (EM).
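For readers unfamiliar with these metrics, the sketch below shows how per-example EM and token-level F1 are computed under SQuAD v1.1-style answer normalization (lowercasing, then stripping punctuation and English articles). It is a minimal reimplementation for illustration, not the evaluation script used for this model.

```python
import re
import string
from collections import Counter

_PUNCTUATION = set(string.punctuation)


def normalize(text):
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Return the token-level F1 between prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # Multiset intersection counts each shared token at most as often as it appears.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Increased thirst.", "increased thirst"))
print(f1_score("increased thirst, fatigue", "increased thirst"))
```

Dataset-level scores are the averages of these per-example values.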
### Results
- **F1 score:** 0.82
- **Exact Match (EM):** 0.78
#### Summary
The model extracts relevant symptoms and disease-related answers well on the held-out set (F1 0.82, EM 0.78). However, its performance is limited to the diseases and symptoms present in the training data.
## Environmental Impact
- **Hardware Type:** Single GPU (NVIDIA Tesla T4)
- **Hours used:** 3
- **Cloud Provider:** Google Cloud
- **Compute Region:** US
- **Carbon Emitted:** Approximately 0.36 kg CO2eq
## Technical Specifications
### Model Architecture and Objective
The model is based on the `BioBERT` architecture fine-tuned for the SQuAD task, with a focus on healthcare question-answering.
### Compute Infrastructure
- **Hardware:** NVIDIA Tesla T4 GPU
- **Software:** PyTorch, Transformers Library
## Citation
If you use this model, please cite it as:
```
@misc{Ndoma2024onlinedoctor,
author = {Ayamba Victor Ndoma},
title = {Online Doctor Model for Disease Prediction},
year = {2024},
howpublished = {\url{https://huggingface.co/your-username/online-doctor-model}},
}
```
## Model Card Authors
- Ayamba Victor Ndoma
## Model Card Contact
For questions or feedback, please contact `[email protected]`.