avramesh committed
Commit 21dcd12
1 Parent(s): f926b29

avramesh/ft-attempt1
README.md CHANGED
@@ -1,9 +1,9 @@
 ---
-license: apache-2.0
+license: llama3
 library_name: peft
 tags:
 - generated_from_trainer
-base_model: TheBloke/Mistral-7B-Instruct-v0.2-GPTQ
+base_model: meta-llama/Meta-Llama-3-8B-Instruct
 model-index:
 - name: shawgpt-ft
   results: []
@@ -14,9 +14,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 # shawgpt-ft
 
-This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
+This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.7323
+- Loss: 2.3855
 
 ## Model description
 
@@ -49,22 +49,24 @@ The following hyperparameters were used during training:
 
 ### Training results
 
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 4.1166        | 0.8   | 1    | 3.3662          |
-| 4.0433        | 1.6   | 2    | 3.3124          |
-| 3.8655        | 2.4   | 3    | 3.1827          |
-| 1.807         | 4.0   | 5    | 2.9565          |
-| 3.6058        | 4.8   | 6    | 2.8779          |
-| 3.2678        | 5.6   | 7    | 2.8191          |
-| 3.0625        | 6.4   | 8    | 2.7742          |
-| 1.5177        | 8.0   | 10   | 2.7323          |
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 4.0204        | 0.9231 | 3    | 3.4582          |
+| 3.8317        | 1.8462 | 6    | 3.2753          |
+| 3.6194        | 2.7692 | 9    | 3.1305          |
+| 2.5725        | 4.0    | 13   | 2.9379          |
+| 3.2435        | 4.9231 | 16   | 2.7861          |
+| 3.0346        | 5.8462 | 19   | 2.6554          |
+| 2.8818        | 6.7692 | 22   | 2.5453          |
+| 2.0646        | 8.0    | 26   | 2.4340          |
+| 2.6678        | 8.9231 | 29   | 2.3907          |
+| 1.8536        | 9.2308 | 30   | 2.3855          |
 
 
 ### Framework versions
 
 - PEFT 0.11.1
-- Transformers 4.40.2
+- Transformers 4.42.3
 - Pytorch 2.1.0+cu121
-- Datasets 2.19.1
+- Datasets 2.20.0
 - Tokenizers 0.19.1
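For a rough sense of what the eval-loss change means: if the reported "Validation Loss" is the mean per-token cross-entropy (as the `transformers` Trainer logs for causal LM fine-tuning), perplexity is its exponential, so the two revisions of this card can be compared directly. A minimal sketch, assuming the loss is indeed per-token cross-entropy:

```python
import math

# Final eval losses from the two revisions of the model card.
old_loss = 2.7323  # TheBloke/Mistral-7B-Instruct-v0.2-GPTQ base
new_loss = 2.3855  # meta-llama/Meta-Llama-3-8B-Instruct base

# Perplexity = exp(mean token cross-entropy), assuming that is what
# the Trainer logged as "Validation Loss".
old_ppl = math.exp(old_loss)  # ~15.37
new_ppl = math.exp(new_loss)  # ~10.86

print(f"perplexity: {old_ppl:.2f} -> {new_ppl:.2f}")
```

So the switch of base model corresponds to an eval perplexity drop from roughly 15.4 to roughly 10.9, under that assumption about what the logged loss measures.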
adapter_config.json CHANGED
@@ -1,7 +1,7 @@
 {
   "alpha_pattern": {},
   "auto_mapping": null,
-  "base_model_name_or_path": "TheBloke/Mistral-7B-Instruct-v0.2-GPTQ",
+  "base_model_name_or_path": "meta-llama/Meta-Llama-3-8B-Instruct",
   "bias": "none",
   "fan_in_fan_out": false,
   "inference_mode": true,
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:030e1d37378d340e34dbd6baa413a02581eed78e0cb60597e6a44296e86e4c78
+oid sha256:e6333943ac3447250b6c80e0a602d45b7693b00aa9dc2a91c75d766b761a5c17
 size 8397056
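Note that the adapter file is 8,397,056 bytes in both revisions, which bounds the number of trainable LoRA parameters. A rough sketch of the arithmetic — the small safetensors header is ignored, and the weight dtype is an assumption, since the card does not state it:

```python
size_bytes = 8_397_056  # adapter_model.safetensors, per the LFS pointer above

# Approximate trainable-parameter count, ignoring the safetensors header.
# The dtype is not stated in the card, so both common cases are shown:
params_half = size_bytes // 2    # if weights are fp16/bf16: ~4.2M params
params_float = size_bytes // 4   # if weights are fp32: ~2.1M params

print(params_half, params_float)
```

Either way, the adapter trains only a few million parameters on top of the 8B-parameter base model, which is the point of the PEFT/LoRA setup.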
runs/Jul05_16-10-38_palomino/events.out.tfevents.1720195839.palomino.761712.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:82015e951b15a2ab636be789b0efa80d3ab81e9d2db522304ccf432f28705280
+size 10554
runs/Jul05_16-15-38_palomino/events.out.tfevents.1720196138.palomino.761985.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:914727c60c45db8dff41858cb980f8e2be558b9888f07018550a84c9ae98ba46
+size 9983
runs/Jul05_16-18-14_palomino/events.out.tfevents.1720196294.palomino.762255.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:019bd5401837fb3d8733bdd38b642f0a2d11a065f3f94c7996198f3dee1c9aaa
+size 9983
runs/Jul05_16-19-54_palomino/events.out.tfevents.1720196394.palomino.762468.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:27173df435006d1ed8e4dd4fa2a15c6dfba4b42ae0270316d08180b00c583d69
+size 9983
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0e3ebe66cf45512e676193ad16f662f3110c343362676da008ab8fa00c866d8f
-size 4984
+oid sha256:613400c88a1864f1cadcdaa20e9a7b355cd6ac9da324e91f927acc5455dc1fbe
+size 5112
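The binary files in this commit (`adapter_model.safetensors`, `training_args.bin`, the tfevents logs) are stored as Git LFS pointer files — the `version` / `oid` / `size` triples shown above — and the real blobs are fetched by their SHA-256 digest. A minimal sketch of parsing such a pointer:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The pointer stored for training_args.bin in this commit.
pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:613400c88a1864f1cadcdaa20e9a7b355cd6ac9da324e91f927acc5455dc1fbe\n"
    "size 5112\n"
)

info = parse_lfs_pointer(pointer)
print(info["oid"], info["size"])  # the actual blob is resolved via this digest
```

This is why the diff for a changed binary file shows only a new `oid` and `size`: the pointer in git changes, while the content lives in LFS storage.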