ernlavr committed
Commit fec4831
1 Parent(s): 23baba2

Update README.md

Files changed (1)
  1. README.md +68 -5
README.md CHANGED
@@ -8,6 +8,12 @@ metrics:
 model-index:
 - name: Llama-2-7b-hf-IDMGSP
   results: []
 ---

 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -15,19 +21,70 @@ should probably proofread and complete it, then remove this comment. -->

 # Llama-2-7b-hf-IDMGSP

- This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on the None dataset.
- It achieves the following results on the evaluation set:
 - Loss: 0.1450
 - Accuracy: {'accuracy': 0.9759036144578314}
 - F1: {'f1': 0.9758125472411187}

 ## Model description

- More information needed

 ## Intended uses & limitations

- More information needed

 ## Training and evaluation data

@@ -37,6 +94,12 @@ More information needed

 ### Training hyperparameters

 The following hyperparameters were used during training:
 - learning_rate: 0.0001
 - train_batch_size: 32
@@ -65,4 +128,4 @@ The following hyperparameters were used during training:
 - Transformers 4.35.0
 - Pytorch 2.0.1
 - Datasets 2.14.6
- - Tokenizers 0.14.1
 
 model-index:
 - name: Llama-2-7b-hf-IDMGSP
   results: []
+ license: mit
+ datasets:
+ - tum-nlp/IDMGSP
+ language:
+ - da
+ library_name: transformers
 ---

 <!-- This model card has been generated automatically according to the information the Trainer had access to. You

 # Llama-2-7b-hf-IDMGSP

+ This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on the [tum-nlp/IDMGSP](https://huggingface.co/datasets/tum-nlp/IDMGSP) dataset.
+ It achieves the following results on the evaluation split:
 - Loss: 0.1450
 - Accuracy: {'accuracy': 0.9759036144578314}
 - F1: {'f1': 0.9758125472411187}
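
The dictionary-shaped accuracy and F1 values above match what the `evaluate` library returns from `compute()`. A minimal sketch of a metric hook that would produce output in this shape, assuming `evaluate` was used (the card does not state this explicitly):

```python
import numpy as np
import evaluate

# Metric implementations from the evaluate library
accuracy_metric = evaluate.load("accuracy")
f1_metric = evaluate.load("f1")

def compute_metrics(eval_pred):
    # Returns nested dicts, e.g. {"accuracy": {"accuracy": 0.975...}, "f1": {"f1": 0.975...}}
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_metric.compute(predictions=predictions, references=labels),
        "f1": f1_metric.compute(predictions=predictions, references=labels),
    }
```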

 ## Model description

+ The base model was loaded in 4-bit quantization mode and fine-tuned using LoRA.
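
As an illustration of that setup, below is a sketch of how the 4-bit quantized base model could be wrapped with LoRA adapters before training. The quantization and LoRA values are taken from the inference code further down; the use of `prepare_model_for_kbit_training` is an assumption, not something the card states:

```python
import transformers
from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization, mirroring the inference code below
bnb_config = transformers.BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="bfloat16",
)

model = transformers.LlamaForSequenceClassification.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    num_labels=2,
    quantization_config=bnb_config,
    device_map="auto",
)

# Prepare the quantized model for training and attach LoRA adapters (assumed setup)
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS, r=8, lora_alpha=16, lora_dropout=0.05, bias="none"
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```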

 ## Intended uses & limitations
+ Labels: `0` = non-AI-generated, `1` = AI-generated.
+
+ Intended for classifying AI-generated text. Example code for running inference:

```python
import torch
import transformers
from peft import LoraConfig, TaskType

# Mapping from predicted class index to a human-readable label
id2label = {0: "non-AI generated", 1: "AI generated"}


class Model():
    def __init__(self, name) -> None:
        self.name = name

        # Hyperparams
        self.lr = 1e-4
        self.epochs = 5
        self.weight_decay = 0.01
        self.dropout = 0.1
        self.sequence_length = 512
        self.batch_size = 32

        # Tokenizer: LLaMA has no pad token, so reuse the EOS token
        self.tokenizer = transformers.LlamaTokenizer.from_pretrained(self.name)
        self.tokenizer.pad_token = self.tokenizer.eos_token
        print(f"Tokenizer: {self.tokenizer.eos_token}; Pad {self.tokenizer.pad_token}")

        # Model: load in 4-bit NF4 quantization via bitsandbytes
        bnb_config = transformers.BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_use_double_quant=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype="bfloat16",
        )
        # LoRA configuration used for the sequence-classification fine-tuning
        self.peft_config = LoraConfig(
            task_type=TaskType.SEQ_CLS, r=8, lora_alpha=16, lora_dropout=0.05, bias="none"
        )
        self.model = transformers.LlamaForSequenceClassification.from_pretrained(
            self.name,
            num_labels=2,
            quantization_config=bnb_config,
            device_map="auto",
        )
        self.model.config.pad_token_id = self.model.config.eos_token_id

    def predict(self, text):
        # Tokenize, run a forward pass and map the argmax class index to its label
        inputs = self.tokenizer(
            text,
            return_tensors="pt",
            truncation=True,
            max_length=self.sequence_length,
        ).to(self.model.device)
        with torch.no_grad():
            outputs = self.model(**inputs)
        predictions = torch.argmax(outputs.logits, dim=-1)
        return id2label[predictions.item()]
```
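
A minimal usage sketch of the class above; the repository id passed to the constructor is an assumption and is not confirmed elsewhere in this card:

```python
# Hypothetical usage; "ernlavr/Llama-2-7b-hf-IDMGSP" is an assumed repository id
classifier = Model("ernlavr/Llama-2-7b-hf-IDMGSP")
label = classifier.predict("We propose a novel transformer-based approach to ...")
print(label)  # "non-AI generated" or "AI generated"
```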

 ## Training and evaluation data

 ### Training hyperparameters

+ BitsAndBytes and LoRA config parameters:
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/638f0f9ab0525fa370479467/XI1imFyXmzFjCGCkBYClc.png)
+
+ GPU Consumption during training:
+
 The following hyperparameters were used during training:
 - learning_rate: 0.0001
 - train_batch_size: 32
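
For orientation, a sketch of how these values might map onto `transformers.TrainingArguments`; the epoch count and weight decay are taken from the code above, and the remaining arguments are illustrative assumptions:

```python
import transformers

training_args = transformers.TrainingArguments(
    output_dir="llama2-7b-idmgsp",   # hypothetical output directory
    learning_rate=1e-4,              # learning_rate: 0.0001
    per_device_train_batch_size=32,  # train_batch_size: 32
    num_train_epochs=5,              # epochs, from the code above
    weight_decay=0.01,               # weight decay, from the code above
    evaluation_strategy="epoch",     # assumption, not stated in the card
)
```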
 
 - Transformers 4.35.0
 - Pytorch 2.0.1
 - Datasets 2.14.6
+ - Tokenizers 0.14.1