Update README.md
README.md CHANGED
@@ -8,6 +8,12 @@ metrics:
 model-index:
 - name: Llama-2-7b-hf-IDMGSP
   results: []
 ---

 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -15,19 +21,70 @@ should probably proofread and complete it, then remove this comment. -->

 # Llama-2-7b-hf-IDMGSP

-This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on the
-It achieves the following results on the evaluation
 - Loss: 0.1450
 - Accuracy: {'accuracy': 0.9759036144578314}
 - F1: {'f1': 0.9758125472411187}

 ## Model description

-

 ## Intended uses & limitations

-More information needed

 ## Training and evaluation data

@@ -37,6 +94,12 @@ More information needed

 ### Training hyperparameters

 The following hyperparameters were used during training:
 - learning_rate: 0.0001
 - train_batch_size: 32
@@ -65,4 +128,4 @@ The following hyperparameters were used during training:
 - Transformers 4.35.0
 - Pytorch 2.0.1
 - Datasets 2.14.6
-- Tokenizers 0.14.1
model-index:
- name: Llama-2-7b-hf-IDMGSP
  results: []
license: mit
datasets:
- tum-nlp/IDMGSP
language:
- da
library_name: transformers
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# Llama-2-7b-hf-IDMGSP

This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on the [tum-nlp/IDMGSP](https://huggingface.co/datasets/tum-nlp/IDMGSP) dataset.
It achieves the following results on the evaluation split:
- Loss: 0.1450
- Accuracy: {'accuracy': 0.9759036144578314}
- F1: {'f1': 0.9758125472411187}
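
The accuracy and F1 values appear as dictionaries, which suggests they were returned directly by the `evaluate` library during the `Trainer` evaluation loop. The following is only a minimal sketch of a `compute_metrics` function that would produce values in this shape; the function and variable names are illustrative, not taken from the original training code:

```python
import numpy as np
import evaluate

# evaluate metrics return dicts such as {'accuracy': ...} and {'f1': ...},
# which is why the values above appear wrapped in dictionaries.
accuracy_metric = evaluate.load("accuracy")
f1_metric = evaluate.load("f1")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_metric.compute(predictions=predictions, references=labels),
        "f1": f1_metric.compute(predictions=predictions, references=labels),
    }
```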

## Model description

The base model was fine-tuned in 4-bit quantization mode using LoRA.
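
As a rough sketch (not the exact training script used for this checkpoint), a 4-bit + LoRA fine-tuning setup of this kind is typically assembled with `peft` along the following lines, reusing the `BitsAndBytesConfig` and `LoraConfig` values from the inference example in the next section:

```python
import transformers
from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training

base_model = "meta-llama/Llama-2-7b-hf"

# 4-bit NF4 quantization, mirroring the inference example below
bnb_config = transformers.BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="bfloat16",
)

model = transformers.LlamaForSequenceClassification.from_pretrained(
    base_model,
    num_labels=2,
    quantization_config=bnb_config,
    device_map="auto",
)

# Make the quantized model trainable and attach LoRA adapters;
# only the adapter weights are updated, the 4-bit base weights stay frozen.
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS, r=8, lora_alpha=16, lora_dropout=0.05, bias="none"
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```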

## Intended uses & limitations

Labels: `0` = non-AI generated, `1` = AI generated.

The model is intended for classifying AI-generated text. Example code to run inference:

```python
import torch
import transformers
from peft import LoraConfig, TaskType

id2label = {0: "non-AI generated", 1: "AI generated"}

class Model:
    def __init__(self, name) -> None:
        self.name = name

        # Hyperparameters
        self.lr = 1e-4
        self.epochs = 5
        self.weight_decay = 0.01
        self.dropout = 0.1
        self.sequence_length = 512
        self.batch_size = 32

        # Tokenizer
        self.tokenizer = transformers.LlamaTokenizer.from_pretrained(self.name)
        self.tokenizer.pad_token = self.tokenizer.eos_token
        print(f"Tokenizer: {self.tokenizer.eos_token}; Pad {self.tokenizer.pad_token}")

        # Model: load the weights in 4-bit NF4 quantization via bitsandbytes
        bnb_config = transformers.BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_use_double_quant=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype="bfloat16",
        )
        # LoRA configuration used for the sequence-classification fine-tuning
        self.peft_config = LoraConfig(
            task_type=TaskType.SEQ_CLS, r=8, lora_alpha=16, lora_dropout=0.05, bias="none"
        )
        self.model = transformers.LlamaForSequenceClassification.from_pretrained(
            self.name,
            num_labels=2,
            quantization_config=bnb_config,
            device_map="auto",
        )
        self.model.config.pad_token_id = self.model.config.eos_token_id

    def predict(self, text):
        inputs = self.tokenizer(
            text, return_tensors="pt", truncation=True, max_length=self.sequence_length
        ).to(self.model.device)
        with torch.no_grad():
            outputs = self.model(**inputs)
        logits = outputs.logits
        predictions = torch.argmax(logits, dim=-1)
        return id2label[predictions.item()]
```
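
A hypothetical usage example; the checkpoint identifier below is a placeholder, so substitute the actual repository id of this fine-tuned model:

```python
# Placeholder checkpoint id: replace with the actual repository id of this model.
classifier = Model("<your-namespace>/Llama-2-7b-hf-IDMGSP")
print(classifier.predict("Excerpt from a paper introduction to classify..."))
# -> "non-AI generated" or "AI generated"
```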

## Training and evaluation data

### Training hyperparameters

BitsAndBytes and LoRA config parameters:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/638f0f9ab0525fa370479467/XI1imFyXmzFjCGCkBYClc.png)

GPU Consumption during training:

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32

### Framework versions

- Transformers 4.35.0
- Pytorch 2.0.1
- Datasets 2.14.6
- Tokenizers 0.14.1