---
license: apache-2.0
pipeline_tag: question-answering
tags:
- question-answering
- transformers
- generated_from_trainer
datasets:
- squad_v2
- LLukas22/nq-simplified
- newsqa
- LLukas22/NLQuAD
- deepset/germanquad
language:
- en
- de
---
# all-MiniLM-L12-v2-qa-all
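This is an extractive question-answering (QA) model: given a question and a context passage, it returns the span of the context that answers the question.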
It's a fine-tuned version of [all-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2) on the following datasets: [squad_v2](https://huggingface.co/datasets/squad_v2), [LLukas22/nq-simplified](https://huggingface.co/datasets/LLukas22/nq-simplified), [newsqa](https://huggingface.co/datasets/newsqa), [LLukas22/NLQuAD](https://huggingface.co/datasets/LLukas22/NLQuAD), [deepset/germanquad](https://huggingface.co/datasets/deepset/germanquad).
## Usage
You can use the model like this:
```python
from transformers import pipeline

# Build a question-answering pipeline from the fine-tuned checkpoint
model_name = "LLukas22/all-MiniLM-L12-v2-qa-all"
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)

# Make a prediction
QA_input = {
    "question": "What's my name?",
    "context": "My name is Clara and I live in Berkeley."
}
result = nlp(QA_input)
print(result)  # dict with 'score', 'start', 'end' and 'answer' keys
```
Alternatively, you can load the model and tokenizer on their own:
```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer
# Load the model and tokenizer separately
model_name = "LLukas22/all-MiniLM-L12-v2-qa-all"
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
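When the model and tokenizer are loaded on their own, the answer span has to be decoded from the start and end logits by hand. The following is a minimal sketch of one way to do that, not the method used in the thesis code: `best_span` and `answer_question` are hypothetical helper names, the default `max_answer_len` of 30 is an assumed value, and the sketch does not restrict the span to context tokens or handle unanswerable questions.

```python
from typing import List, Tuple


def best_span(start_logits: List[float], end_logits: List[float],
              max_answer_len: int = 30) -> Tuple[int, int, float]:
    """Return (start, end, score) maximising start_logits[s] + end_logits[e]
    subject to s <= e < s + max_answer_len."""
    best = (0, 0, float("-inf"))
    for s, s_logit in enumerate(start_logits):
        for e in range(s, min(s + max_answer_len, len(end_logits))):
            score = s_logit + end_logits[e]
            if score > best[2]:
                best = (s, e, score)
    return best


def answer_question(question: str, context: str,
                    model_name: str = "LLukas22/all-MiniLM-L12-v2-qa-all") -> str:
    """Hypothetical helper: run the model once and decode the best span."""
    # Imported lazily so best_span stays usable without these packages.
    import torch
    from transformers import AutoModelForQuestionAnswering, AutoTokenizer

    model = AutoModelForQuestionAnswering.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    # Encode question and context as one sequence pair.
    inputs = tokenizer(question, context, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    start, end, _ = best_span(outputs.start_logits[0].tolist(),
                              outputs.end_logits[0].tolist())
    # Slice the chosen token ids (end is inclusive) and turn them back into text.
    answer_ids = inputs["input_ids"][0][start:end + 1]
    return tokenizer.decode(answer_ids, skip_special_tokens=True)
```

Calling `answer_question("What's my name?", "My name is Clara and I live in Berkeley.")` downloads the checkpoint on first use and returns the decoded span. For most purposes the pipeline is the simpler choice, since it also handles long contexts and answer offsets.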
## Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2E-05
- per device batch size: 60
- effective batch size: 180
- seed: 42
- optimizer: AdamW with betas (0.9,0.999) and eps 1E-08
- weight decay: 1E-02
- D-Adaptation: False
- Warmup: True
- number of epochs: 15
- mixed_precision_training: bf16
## Training results
| Epoch | Train Loss | Validation Loss |
| ----- | ---------- | --------------- |
| 0 | 3.76 | 3.02 |
| 1 | 2.57 | 2.23 |
| 2 | 2.2 | 2.08 |
| 3 | 2.07 | 2.03 |
| 4 | 1.96 | 1.97 |
| 5 | 1.87 | 1.93 |
| 6 | 1.81 | 1.91 |
| 7 | 1.77 | 1.89 |
| 8 | 1.73 | 1.89 |
| 9 | 1.7 | 1.9 |
| 10 | 1.68 | 1.9 |
| 11 | 1.67 | 1.9 |
## Evaluation results
| Epoch | f1 | exact_match |
| ----- | ----- | ----- |
| 0 | 0.29 | 0.228 |
| 1 | 0.371 | 0.329 |
| 2 | 0.413 | 0.369 |
| 3 | 0.437 | 0.376 |
| 4 | 0.454 | 0.388 |
| 5 | 0.468 | 0.4 |
| 6 | 0.479 | 0.408 |
| 7 | 0.487 | 0.415 |
| 8 | 0.495 | 0.421 |
| 9 | 0.501 | 0.416 |
| 10 | 0.506 | 0.42 |
| 11 | 0.51 | 0.421 |
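
The card does not state how `f1` and `exact_match` were computed. As a rough illustration only, the sketch below follows the common SQuAD-style definitions (token-overlap F1 and normalized string equality); whether the numbers above were produced this way is an assumption. The function names are illustrative, and the normalization only strips English articles, so it would need adapting for the German data.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, then strip punctuation, English articles and extra whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(truth).split()
    if not pred_tokens or not truth_tokens:
        # Two empty answers (an unanswerable question predicted as such) match.
        return float(pred_tokens == truth_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these definitions a prediction of "in Berkeley" against the reference "Berkeley" scores 0 on exact match but about 0.67 on F1, which is why F1 sits above exact match in every row of the table.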
## Framework versions
- Transformers: 4.25.1
- PyTorch: 2.0.0.dev20230210+cu118
- PyTorch Lightning: 1.8.6
- Datasets: 2.7.1
- Tokenizers: 0.13.1
- Sentence Transformers: 2.2.2
## Additional Information
This model was trained as part of my Master's Thesis **'Evaluation of transformer based language models for use in service information systems'**. The source code is available on [GitHub](https://github.com/LLukas22/Retrieval-Augmented-QA).