End of training

Files changed (6)

README.md +27 -41
adapter_config.json +2 -2
adapter_model.safetensors +2 -2
emissions.csv +1 -1
runs/Jul24_22-44-22_msc-modeltrain-pod/events.out.tfevents.1721861066.msc-modeltrain-pod.678.0 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.7658
 ## Model description
@@ -32,20 +32,6 @@ More information needed
 ## Training procedure
-The following `bitsandbytes` quantization config was used during training:
-- quant_method: bitsandbytes
-- _load_in_8bit: False
-- _load_in_4bit: True
-- llm_int8_threshold: 6.0
-- llm_int8_skip_modules: None
-- llm_int8_enable_fp32_cpu_offload: False
-- llm_int8_has_fp16_weight: False
-- bnb_4bit_quant_type: nf4
-- bnb_4bit_use_double_quant: True
-- bnb_4bit_compute_dtype: bfloat16
-- load_in_4bit: True
-- load_in_8bit: False
 ### Training hyperparameters
 The following hyperparameters were used during training:
@@ -64,37 +50,37 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 3.7986        | 1.36  | 10   | 3.3486          |
-| 2.781         | 2.71  | 20   | 1.9851          |
-| 1.6096        | 4.07  | 30   | 1.3075          |
-| 1.2107        | 5.42  | 40   | 1.1210          |
-| 1.0597        | 6.78  | 50   | 1.0222          |
-| 0.9672        | 8.14  | 60   | 0.9562          |
-| 0.8924        | 9.49  | 70   | 0.9131          |
-| 0.8189        | 10.85 | 80   | 0.8582          |
-| 0.7393        | 12.2  | 90   | 0.7907          |
-| 0.6355        | 13.56 | 100  | 0.7136          |
-| 0.5683        | 14.92 | 110  | 0.7013          |
-| 0.533         | 16.27 | 120  | 0.7011          |
-| 0.5155        | 17.63 | 130  | 0.7049          |
-| 0.4965        | 18.98 | 140  | 0.7194          |
-| 0.4826        | 20.34 | 150  | 0.7222          |
-| 0.4617        | 21.69 | 160  | 0.7294          |
-| 0.453         | 23.05 | 170  | 0.7347          |
-| 0.439         | 24.41 | 180  | 0.7418          |
-| 0.4333        | 25.76 | 190  | 0.7473          |
-| 0.4261        | 27.12 | 200  | 0.7600          |
-| 0.4238        | 28.47 | 210  | 0.7580          |
-| 0.4163        | 29.83 | 220  | 0.7646          |
-| 0.4158        | 31.19 | 230  | 0.7659          |
-| 0.4137        | 32.54 | 240  | 0.7662          |
-| 0.4131        | 33.9  | 250  | 0.7658          |
 ### Framework versions
 - PEFT 0.4.0
 - Transformers 4.38.2
-- Pytorch 2.3.1+cu121
 - Datasets 2.13.1
 - Tokenizers 0.15.2

 This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.7124
 ## Model description
 ## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 3.8363        | 1.36  | 10   | 3.5698          |
+| 3.2454        | 2.71  | 20   | 2.7356          |
+| 2.2867        | 4.07  | 30   | 1.7205          |
+| 1.4623        | 5.42  | 40   | 1.2840          |
+| 1.1723        | 6.78  | 50   | 1.0982          |
+| 1.0295        | 8.14  | 60   | 0.9766          |
+| 0.9085        | 9.49  | 70   | 0.8723          |
+| 0.784         | 10.85 | 80   | 0.7651          |
+| 0.717         | 12.2  | 90   | 0.7394          |
+| 0.6745        | 13.56 | 100  | 0.7235          |
+| 0.6402        | 14.92 | 110  | 0.7157          |
+| 0.6251        | 16.27 | 120  | 0.7089          |
+| 0.5961        | 17.63 | 130  | 0.7100          |
+| 0.5871        | 18.98 | 140  | 0.7042          |
+| 0.5714        | 20.34 | 150  | 0.7070          |
+| 0.5582        | 21.69 | 160  | 0.7062          |
+| 0.5457        | 23.05 | 170  | 0.7076          |
+| 0.5392        | 24.41 | 180  | 0.7094          |
+| 0.5354        | 25.76 | 190  | 0.7100          |
+| 0.5278        | 27.12 | 200  | 0.7105          |
+| 0.5275        | 28.47 | 210  | 0.7110          |
+| 0.5249        | 29.83 | 220  | 0.7123          |
+| 0.5204        | 31.19 | 230  | 0.7123          |
+| 0.5198        | 32.54 | 240  | 0.7123          |
+| 0.5195        | 33.9  | 250  | 0.7124          |
 ### Framework versions
 - PEFT 0.4.0
 - Transformers 4.38.2
+- Pytorch 2.4.0+cu121
 - Datasets 2.13.1
 - Tokenizers 0.15.2
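
The two validation-loss columns above can be compared directly. The following is a minimal sketch: the loss values are copied from the README tables in this diff, while the names `STEPS`, `OLD_VAL`, `NEW_VAL`, and `best_step` are illustrative and not part of the training code.

```python
# Validation losses copied from the two README tables in this diff,
# one value per evaluation step (10, 20, ..., 250).
STEPS = list(range(10, 251, 10))

OLD_VAL = [  # previous run (removed table)
    3.3486, 1.9851, 1.3075, 1.1210, 1.0222,
    0.9562, 0.9131, 0.8582, 0.7907, 0.7136,
    0.7013, 0.7011, 0.7049, 0.7194, 0.7222,
    0.7294, 0.7347, 0.7418, 0.7473, 0.7600,
    0.7580, 0.7646, 0.7659, 0.7662, 0.7658,
]
NEW_VAL = [  # this commit (added table)
    3.5698, 2.7356, 1.7205, 1.2840, 1.0982,
    0.9766, 0.8723, 0.7651, 0.7394, 0.7235,
    0.7157, 0.7089, 0.7100, 0.7042, 0.7070,
    0.7062, 0.7076, 0.7094, 0.7100, 0.7105,
    0.7110, 0.7123, 0.7123, 0.7123, 0.7124,
]


def best_step(steps, losses):
    """Return (step, loss) for the evaluation with the lowest validation loss."""
    i = min(range(len(losses)), key=losses.__getitem__)
    return steps[i], losses[i]


old_best = best_step(STEPS, OLD_VAL)
new_best = best_step(STEPS, NEW_VAL)

# Gap between the final reported loss and the best loss seen during training.
old_gap = round(OLD_VAL[-1] - old_best[1], 4)
new_gap = round(NEW_VAL[-1] - new_best[1], 4)

print("old run: best", old_best, "final", OLD_VAL[-1], "gap", old_gap)
print("new run: best", new_best, "final", NEW_VAL[-1], "gap", new_gap)
```

In both runs the lowest validation loss occurs well before step 250, so the loss reported at the top of the README (the final evaluation) is not the minimum of the run. The gap is much smaller in the new run.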

adapter_config.json CHANGED

@@ -7,11 +7,11 @@
   "init_lora_weights": true,
   "layers_pattern": null,
   "layers_to_transform": null,
-  "lora_alpha": 64,
   "lora_dropout": 0.3,
   "modules_to_save": null,
   "peft_type": "LORA",
-  "r": 32,
   "revision": null,
   "target_modules": [
     "q_proj",

   "init_lora_weights": true,
   "layers_pattern": null,
   "layers_to_transform": null,
+  "lora_alpha": 32,
   "lora_dropout": 0.3,
   "modules_to_save": null,
   "peft_type": "LORA",
+  "r": 16,
   "revision": null,
   "target_modules": [
     "q_proj",

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c79b518cd11b8a6534588d5307e8de14d00c864982e240d6fbb4a42c5c073fee
-size 75523312

 version https://git-lfs.github.com/spec/v1
+oid sha256:853047c7ee6f98c051a455fe53ed043a81d61b9f38d831e7f1d882e9b2d0c0a8
+size 37774528
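
The drop in adapter size lines up with the rank change in `adapter_config.json`. The sketch below works through the arithmetic, assuming the standard LoRA scaling factor `lora_alpha / r` and that adapter weights grow linearly with `r`; the dictionary layout and the `lora_scale` helper are illustrative only.

```python
# Values copied from adapter_config.json and the LFS pointer in this diff.
old = {"r": 32, "lora_alpha": 64, "size_bytes": 75523312}
new = {"r": 16, "lora_alpha": 32, "size_bytes": 37774528}


def lora_scale(cfg):
    """Standard LoRA scaling applied to the low-rank update: lora_alpha / r."""
    return cfg["lora_alpha"] / cfg["r"]


# Both r and lora_alpha were halved, so the effective scaling is unchanged.
old_scale = lora_scale(old)
new_scale = lora_scale(new)

# LoRA adds r * (d_in + d_out) weights per adapted matrix, so halving r
# should roughly halve the file; the small remainder is assumed to be
# safetensors header/metadata overhead.
rank_ratio = old["r"] / new["r"]
size_ratio = old["size_bytes"] / new["size_bytes"]

print(f"scale: {old_scale} -> {new_scale}")
print(f"rank ratio: {rank_ratio}, size ratio: {size_ratio:.4f}")
```

Because the scaling factor stays the same, the change in this commit is a capacity reduction (fewer trainable parameters) rather than a change in how strongly the adapter update is weighted.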

emissions.csv CHANGED

@@ -1,2 +1,2 @@
 timestamp,experiment_id,project_name,duration,emissions,energy_consumed,country_name,country_iso_code,region,on_cloud,cloud_provider,cloud_region
-2024-07-18T15:29:54,95beb92d-f3cb-419a-bd73-35f2aa8381d9,codecarbon,1287.520273923874,0.07720597406987714,0.11487120042662032,United Kingdom,GBR,scotland,N,,


 timestamp,experiment_id,project_name,duration,emissions,energy_consumed,country_name,country_iso_code,region,on_cloud,cloud_provider,cloud_region
+2024-07-25T00:29:17,b81b783c-301a-439a-a0a5-4917c65bc6de,codecarbon,6290.977914571762,0.3557443410838277,0.5292955629092988,United Kingdom,GBR,scotland,N,,
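
The new `emissions.csv` row can be turned into more readable figures. This is a rough sketch that assumes the usual CodeCarbon units (duration in seconds, emissions in kg CO2-equivalent, energy in kWh); those units are not stated in the file itself, and the variable names are illustrative.

```python
import csv
import io

# Header and added row copied from emissions.csv in this diff.
EMISSIONS_CSV = (
    "timestamp,experiment_id,project_name,duration,emissions,energy_consumed,"
    "country_name,country_iso_code,region,on_cloud,cloud_provider,cloud_region\n"
    "2024-07-25T00:29:17,b81b783c-301a-439a-a0a5-4917c65bc6de,codecarbon,"
    "6290.977914571762,0.3557443410838277,0.5292955629092988,"
    "United Kingdom,GBR,scotland,N,,\n"
)

row = next(csv.DictReader(io.StringIO(EMISSIONS_CSV)))

duration_s = float(row["duration"])            # assumed: seconds
emissions_kg = float(row["emissions"])         # assumed: kg CO2-equivalent
energy_kwh = float(row["energy_consumed"])     # assumed: kWh

duration_h = duration_s / 3600
avg_power_kw = energy_kwh / duration_h         # mean draw over the run
intensity = emissions_kg / energy_kwh          # kg CO2eq per kWh

print(f"duration: {duration_h:.2f} h")
print(f"average power: {avg_power_kw:.3f} kW")
print(f"carbon intensity: {intensity:.3f} kg CO2eq/kWh")
```

Under those assumptions, the tracked run in this commit lasted roughly 1.75 hours, compared with about 0.36 hours (1287.5 s) for the row it replaces.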

runs/Jul24_22-44-22_msc-modeltrain-pod/events.out.tfevents.1721861066.msc-modeltrain-pod.678.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:64674eeea02a95ded93bfb6003762d75495f8ca12b599682b5fa2540e8d4fb08
+size 17034

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5cef72bd14db9c868844ff3c9d70e303cc81b9c07a101d5f05f0fa45c6adaafe
 size 4984

 version https://git-lfs.github.com/spec/v1
+oid sha256:10b3b3a3d7323b4bda4c1a482867d25717c65236d1bd44bb96cd5c9ce33dd107
 size 4984