End of training

Files changed (5)

README.md +40 -26
adapter_model.safetensors +1 -1
emissions.csv +1 -1
runs/Jul29_16-35-55_msc-modeltrain-pod/events.out.tfevents.1722270959.msc-modeltrain-pod.10499.0 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.4797
 ## Model description
@@ -32,6 +32,20 @@ More information needed
 ## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:
@@ -50,31 +64,31 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 3.365         | 1.36  | 10   | 2.0638          |
-| 1.3671        | 2.71  | 20   | 0.9814          |
-| 0.817         | 4.07  | 30   | 0.7618          |
-| 0.6648        | 5.42  | 40   | 0.7134          |
-| 0.5897        | 6.78  | 50   | 0.6871          |
-| 0.5076        | 8.14  | 60   | 0.6776          |
-| 0.4545        | 9.49  | 70   | 0.7360          |
-| 0.4059        | 10.85 | 80   | 0.7673          |
-| 0.3544        | 12.2  | 90   | 0.8158          |
-| 0.3161        | 13.56 | 100  | 0.8801          |
-| 0.2844        | 14.92 | 110  | 0.9591          |
-| 0.259         | 16.27 | 120  | 0.9817          |
-| 0.2405        | 17.63 | 130  | 1.0922          |
-| 0.2298        | 18.98 | 140  | 1.1705          |
-| 0.2125        | 20.34 | 150  | 1.1817          |
-| 0.2073        | 21.69 | 160  | 1.2862          |
-| 0.1998        | 23.05 | 170  | 1.3352          |
-| 0.1912        | 24.41 | 180  | 1.3434          |
-| 0.1883        | 25.76 | 190  | 1.4113          |
-| 0.1851        | 27.12 | 200  | 1.4113          |
-| 0.1796        | 28.47 | 210  | 1.4654          |
-| 0.1805        | 29.83 | 220  | 1.4565          |
-| 0.1768        | 31.19 | 230  | 1.4650          |
-| 0.1763        | 32.54 | 240  | 1.4769          |
-| 0.1752        | 33.9  | 250  | 1.4797          |
 ### Framework versions

 This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.4554
 ## Model description
 ## Training procedure
+The following `bitsandbytes` quantization config was used during training:
+- quant_method: bitsandbytes
+- _load_in_8bit: False
+- _load_in_4bit: True
+- llm_int8_threshold: 6.0
+- llm_int8_skip_modules: None
+- llm_int8_enable_fp32_cpu_offload: False
+- llm_int8_has_fp16_weight: False
+- bnb_4bit_quant_type: nf4
+- bnb_4bit_use_double_quant: True
+- bnb_4bit_compute_dtype: bfloat16
+- load_in_4bit: True
+- load_in_8bit: False
 ### Training hyperparameters
 The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 3.4063        | 1.36  | 10   | 2.0249          |
+| 1.4234        | 2.71  | 20   | 1.1088          |
+| 0.9874        | 4.07  | 30   | 0.8900          |
+| 0.7207        | 5.42  | 40   | 0.6961          |
+| 0.5784        | 6.78  | 50   | 0.6823          |
+| 0.5088        | 8.14  | 60   | 0.6767          |
+| 0.4453        | 9.49  | 70   | 0.7067          |
+| 0.3935        | 10.85 | 80   | 0.7432          |
+| 0.3417        | 12.2  | 90   | 0.8008          |
+| 0.3026        | 13.56 | 100  | 0.9167          |
+| 0.2754        | 14.92 | 110  | 0.9432          |
+| 0.2507        | 16.27 | 120  | 0.9834          |
+| 0.2359        | 17.63 | 130  | 1.0581          |
+| 0.2213        | 18.98 | 140  | 1.1612          |
+| 0.2075        | 20.34 | 150  | 1.1553          |
+| 0.2011        | 21.69 | 160  | 1.3062          |
+| 0.1959        | 23.05 | 170  | 1.3247          |
+| 0.1891        | 24.41 | 180  | 1.3318          |
+| 0.1865        | 25.76 | 190  | 1.3603          |
+| 0.1825        | 27.12 | 200  | 1.3980          |
+| 0.1797        | 28.47 | 210  | 1.4180          |
+| 0.178         | 29.83 | 220  | 1.4311          |
+| 0.176         | 31.19 | 230  | 1.4476          |
+| 0.1748        | 32.54 | 240  | 1.4538          |
+| 0.1753        | 33.9  | 250  | 1.4554          |
 ### Framework versions
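
The `bitsandbytes` settings added to the README in this commit correspond to keyword arguments of `transformers.BitsAndBytesConfig`. The sketch below rebuilds that configuration from the listed values. The names `BNB_KWARGS` and `build_bnb_config` are hypothetical, and the diff does not show how the training script actually constructed its config, so treat this as one plausible reconstruction rather than the author's code.

```python
# Values copied from the quantization config listed in the README diff.
# The underscore-prefixed entries (_load_in_4bit, _load_in_8bit) in the README
# mirror the public flags, so only the public names are passed here.
BNB_KWARGS = {
    "load_in_4bit": True,
    "load_in_8bit": False,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
    # Passed as a string; BitsAndBytesConfig resolves it to torch.bfloat16.
    "bnb_4bit_compute_dtype": "bfloat16",
}


def build_bnb_config():
    """Build the config object (assumes transformers and torch are installed)."""
    # Imported lazily so the dictionary above can be inspected without them.
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(**BNB_KWARGS)
```

A config built this way would typically be passed as `quantization_config=build_bnb_config()` to `AutoModelForCausalLM.from_pretrained` when loading the base model named in the README.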

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7baefad0749b95301d1ef5729beb829eccb0fff2ef98c4b1dfe752fecdb4b7cf
 size 151020944

 version https://git-lfs.github.com/spec/v1
+oid sha256:0ea850e17a642a80a9f6054cba639a863c38fd5ee587fc288a24e2a510e28b46
 size 151020944
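
Each binary file in this commit is stored as a Git LFS pointer with three fields: `version`, `oid`, and `size`. As a small illustration, the pointers for `adapter_model.safetensors` can be parsed and compared to confirm that the content hash changed while the file size did not. The helper name `parse_lfs_pointer` is hypothetical; the pointer text is copied from the diff above.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:7baefad0749b95301d1ef5729beb829eccb0fff2ef98c4b1dfe752fecdb4b7cf
size 151020944
"""

NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:0ea850e17a642a80a9f6054cba639a863c38fd5ee587fc288a24e2a510e28b46
size 151020944
"""

old_ptr = parse_lfs_pointer(OLD_POINTER)
new_ptr = parse_lfs_pointer(NEW_POINTER)

# A new oid with an identical size suggests the adapter weights were retrained
# with the same tensor shapes; this is an inference, not something the diff states.
weights_changed = old_ptr["oid"] != new_ptr["oid"]
same_size = int(old_ptr["size"]) == int(new_ptr["size"])
print(f"weights changed: {weights_changed}, same size: {same_size}")
```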

emissions.csv CHANGED

@@ -1,2 +1,2 @@
 timestamp,experiment_id,project_name,duration,emissions,energy_consumed,country_name,country_iso_code,region,on_cloud,cloud_provider,cloud_region
-2024-07-25T20:07:05,71cb2d14-a3e9-44f2-9adf-aa99d60af3f0,codecarbon,6487.100156784058,0.3647758733647877,0.5427331623606918,United Kingdom,GBR,scotland,N,,

 timestamp,experiment_id,project_name,duration,emissions,energy_consumed,country_name,country_iso_code,region,on_cloud,cloud_provider,cloud_region
+2024-07-29T16:59:42,0b1b335d-594e-49e2-84c9-dbc256266dc6,codecarbon,1422.8740487098694,0.08038502085419474,0.11960115720427114,United Kingdom,GBR,scotland,N,,
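
To compare the two CodeCarbon rows in `emissions.csv`, the following sketch parses them and derives the run length in minutes and the emissions per unit of energy. It assumes CodeCarbon's usual units (duration in seconds, emissions in kg CO2eq, energy in kWh); the file itself does not state them. The names `CSV_TEXT` and `summarize` are hypothetical.

```python
import csv
import io

# Header plus the old (removed) and new (added) rows from emissions.csv.
CSV_TEXT = """timestamp,experiment_id,project_name,duration,emissions,energy_consumed,country_name,country_iso_code,region,on_cloud,cloud_provider,cloud_region
2024-07-25T20:07:05,71cb2d14-a3e9-44f2-9adf-aa99d60af3f0,codecarbon,6487.100156784058,0.3647758733647877,0.5427331623606918,United Kingdom,GBR,scotland,N,,
2024-07-29T16:59:42,0b1b335d-594e-49e2-84c9-dbc256266dc6,codecarbon,1422.8740487098694,0.08038502085419474,0.11960115720427114,United Kingdom,GBR,scotland,N,,
"""


def summarize(row):
    """Convert one CSV row into a few comparable numbers."""
    emissions = float(row["emissions"])      # assumed kg CO2eq
    energy = float(row["energy_consumed"])   # assumed kWh
    return {
        "duration_min": float(row["duration"]) / 60.0,  # assumed seconds in the file
        "emissions": emissions,
        "energy": energy,
        "intensity": emissions / energy,     # emissions per unit of energy
    }


old_row, new_row = csv.DictReader(io.StringIO(CSV_TEXT))
old_summary, new_summary = summarize(old_row), summarize(new_row)

for label, s in (("old", old_summary), ("new", new_summary)):
    print(
        f"{label}: {s['duration_min']:.1f} min, "
        f"{s['emissions']:.4f} emissions, {s['energy']:.4f} energy, "
        f"intensity {s['intensity']:.4f}"
    )
```

Both runs give nearly the same emissions-to-energy ratio, which is expected when CodeCarbon applies a single regional carbon intensity; the drop in emissions tracks the shorter duration recorded for the new run.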

runs/Jul29_16-35-55_msc-modeltrain-pod/events.out.tfevents.1722270959.msc-modeltrain-pod.10499.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7d705f844566094626f09e436f750a967f03599708c798d76a36bfc00dfd78e5
+size 17477

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fa99f7e0c9cee069a0ee479ad9d6186ca6da7c27642e43b0a7cf82a3fc09d7e6
 size 4984

 version https://git-lfs.github.com/spec/v1
+oid sha256:73cf781d850b973106fdd5e079cb1a7baf30c96f78fa7dac138c5e1e1cf3d9a6
 size 4984