End of training

Files changed (5)

README.md +27 -27
adapter_model.safetensors +1 -1
emissions.csv +1 -1
runs/Jul17_16-52-03_msc-modeltrain-pod/events.out.tfevents.1721235127.msc-modeltrain-pod.1963.0 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.9224
 ## Model description
@@ -49,7 +49,7 @@ The following `bitsandbytes` quantization config was used during training:
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 3e-05
 - train_batch_size: 16
 - eval_batch_size: 8
 - seed: 42
@@ -64,31 +64,31 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 3.791         | 1.33  | 10   | 3.6476          |
-| 3.2811        | 2.67  | 20   | 2.9195          |
-| 2.3899        | 4.0   | 30   | 1.8723          |
-| 1.5443        | 5.33  | 40   | 1.3519          |
-| 1.2394        | 6.67  | 50   | 1.1884          |
-| 1.1162        | 8.0   | 60   | 1.1023          |
-| 1.0377        | 9.33  | 70   | 1.0551          |
-| 0.9831        | 10.67 | 80   | 1.0228          |
-| 0.9476        | 12.0  | 90   | 0.9988          |
-| 0.9032        | 13.33 | 100  | 0.9850          |
-| 0.8799        | 14.67 | 110  | 0.9668          |
-| 0.8581        | 16.0  | 120  | 0.9503          |
-| 0.8315        | 17.33 | 130  | 0.9457          |
-| 0.8077        | 18.67 | 140  | 0.9422          |
-| 0.7921        | 20.0  | 150  | 0.9362          |
-| 0.7752        | 21.33 | 160  | 0.9318          |
-| 0.7614        | 22.67 | 170  | 0.9306          |
-| 0.7559        | 24.0  | 180  | 0.9233          |
-| 0.7441        | 25.33 | 190  | 0.9237          |
-| 0.7345        | 26.67 | 200  | 0.9237          |
-| 0.7341        | 28.0  | 210  | 0.9205          |
-| 0.7288        | 29.33 | 220  | 0.9195          |
-| 0.7237        | 30.67 | 230  | 0.9219          |
-| 0.7255        | 32.0  | 240  | 0.9210          |
-| 0.7273        | 33.33 | 250  | 0.9224          |
 ### Framework versions

 This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.5993
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 0.0002
 - train_batch_size: 16
 - eval_batch_size: 8
 - seed: 42
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 3.329         | 1.33  | 10   | 1.8003          |
+| 1.296         | 2.67  | 20   | 1.0774          |
+| 0.9489        | 4.0   | 30   | 0.9022          |
+| 0.7167        | 5.33  | 40   | 0.7270          |
+| 0.552         | 6.67  | 50   | 0.7372          |
+| 0.4766        | 8.0   | 60   | 0.7281          |
+| 0.4153        | 9.33  | 70   | 0.7673          |
+| 0.3614        | 10.67 | 80   | 0.8597          |
+| 0.3238        | 12.0  | 90   | 0.8915          |
+| 0.2923        | 13.33 | 100  | 0.9281          |
+| 0.2648        | 14.67 | 110  | 1.0239          |
+| 0.2483        | 16.0  | 120  | 1.0198          |
+| 0.2311        | 17.33 | 130  | 1.1314          |
+| 0.2196        | 18.67 | 140  | 1.2578          |
+| 0.2109        | 20.0  | 150  | 1.3155          |
+| 0.1997        | 21.33 | 160  | 1.2602          |
+| 0.1927        | 22.67 | 170  | 1.4758          |
+| 0.191         | 24.0  | 180  | 1.4080          |
+| 0.1834        | 25.33 | 190  | 1.4783          |
+| 0.1799        | 26.67 | 200  | 1.5217          |
+| 0.1796        | 28.0  | 210  | 1.5525          |
+| 0.1738        | 29.33 | 220  | 1.5714          |
+| 0.1725        | 30.67 | 230  | 1.5953          |
+| 0.1727        | 32.0  | 240  | 1.5980          |
+| 0.172         | 33.33 | 250  | 1.5993          |
 ### Framework versions
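
In the added rows, validation loss reaches its minimum early and then climbs while training loss keeps falling, a pattern usually read as overfitting. The following is a minimal Python sketch, not part of the training code, that locates the lowest-validation-loss step using only the values from the added rows; the variable names are illustrative.

```python
# (step, validation_loss) pairs copied from the added rows of the results table.
val_loss_by_step = [
    (10, 1.8003), (20, 1.0774), (30, 0.9022), (40, 0.7270), (50, 0.7372),
    (60, 0.7281), (70, 0.7673), (80, 0.8597), (90, 0.8915), (100, 0.9281),
    (110, 1.0239), (120, 1.0198), (130, 1.1314), (140, 1.2578), (150, 1.3155),
    (160, 1.2602), (170, 1.4758), (180, 1.4080), (190, 1.4783), (200, 1.5217),
    (210, 1.5525), (220, 1.5714), (230, 1.5953), (240, 1.5980), (250, 1.5993),
]

# Step with the lowest validation loss, and the last logged step for comparison.
best_step, best_loss = min(val_loss_by_step, key=lambda pair: pair[1])
final_step, final_loss = val_loss_by_step[-1]

print(f"lowest validation loss: {best_loss:.4f} at step {best_step}")
print(f"final validation loss:  {final_loss:.4f} at step {final_step}")
```

The reported evaluation loss (1.5993) matches the final row rather than the minimum. If the run used the `transformers` `Trainer`, options such as `load_best_model_at_end` and `metric_for_best_model` in `TrainingArguments`, or an `EarlyStoppingCallback`, are the usual way to keep the best checkpoint instead; whether this run set any of them is not visible in the diff.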

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a5e4c8865b08c12ceac25c011d87948ac0e7211cabbde5084dc52c8b2b6ab979
 size 75523312

 version https://git-lfs.github.com/spec/v1
+oid sha256:cdd34584c6acc5f0d9ac55c6e1190f8f78557a7c4ce7d5d4bea77352db827001
 size 75523312

emissions.csv CHANGED

@@ -1,2 +1,2 @@
 timestamp,experiment_id,project_name,duration,emissions,energy_consumed,country_name,country_iso_code,region,on_cloud,cloud_provider,cloud_region
-2024-07-17T16:49:56,14ad89ba-873d-4177-8763-006e7acb7e4e,codecarbon,708.9762728214264,0.049197013388741065,0.07319796237858917,United Kingdom,GBR,scotland,N,,


 timestamp,experiment_id,project_name,duration,emissions,energy_consumed,country_name,country_iso_code,region,on_cloud,cloud_provider,cloud_region
+2024-07-17T17:04:02,acf495a5-fe7d-4741-820f-f1df91f69def,codecarbon,714.2170021533966,0.0483531364122497,0.07194239682851639,United Kingdom,GBR,scotland,N,,
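
To compare the removed and added `emissions.csv` rows, a short Python sketch using the standard `csv` module is given here. It assumes CodeCarbon's usual units (duration in seconds, emissions in kg CO2-equivalent, energy in kWh); the diff itself does not state units, so treat the labels in the output as assumptions.

```python
import csv
import io

# Header plus the removed and added data rows from the emissions.csv diff.
EMISSIONS_CSV = """\
timestamp,experiment_id,project_name,duration,emissions,energy_consumed,country_name,country_iso_code,region,on_cloud,cloud_provider,cloud_region
2024-07-17T16:49:56,14ad89ba-873d-4177-8763-006e7acb7e4e,codecarbon,708.9762728214264,0.049197013388741065,0.07319796237858917,United Kingdom,GBR,scotland,N,,
2024-07-17T17:04:02,acf495a5-fe7d-4741-820f-f1df91f69def,codecarbon,714.2170021533966,0.0483531364122497,0.07194239682851639,United Kingdom,GBR,scotland,N,,
"""

rows = list(csv.DictReader(io.StringIO(EMISSIONS_CSV)))
old, new = rows

for label, row in (("old", old), ("new", new)):
    duration_s = float(row["duration"])        # assumed: seconds
    emissions = float(row["emissions"])        # assumed: kg CO2-equivalent
    energy = float(row["energy_consumed"])     # assumed: kWh
    # Ratio of emissions to energy, i.e. the implied grid carbon intensity.
    intensity = emissions / energy
    print(f"{label}: {duration_s:.1f} s, {emissions:.4f} kg, "
          f"{energy:.4f} kWh, {intensity:.4f} kg/kWh")
```

Both runs last about twelve minutes and report nearly identical energy use, which is consistent with the unchanged batch size and step count in the README diff.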

runs/Jul17_16-52-03_msc-modeltrain-pod/events.out.tfevents.1721235127.msc-modeltrain-pod.1963.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b0d440fd270502e1558e2529598331a95c504da06d0caa214e6192f6b986db1c
+size 17469

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b9a734bbf314d44fae2a458e8b7d37bcf2e0accb9a4366f729a3412790bc98da
 size 4984

 version https://git-lfs.github.com/spec/v1
+oid sha256:ef8e5dc5533306720bffeb0cf13b6ace0ad8000b5f6b5240b681a005a13a301e
 size 4984
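
The binary files in this commit appear only as Git LFS pointers (`version`, `oid`, `size`), so the diff shows that the content hash changed while the size stayed the same. A minimal Python sketch, assuming a locally downloaded copy of the file, for checking it against a pointer; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any library.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict) -> bool:
    """Return True if the file's size and hash agree with the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with path.open("rb") as f:
        # Read in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# Pointer for the updated training_args.bin, copied from the diff.
pointer = parse_lfs_pointer("""\
version https://git-lfs.github.com/spec/v1
oid sha256:ef8e5dc5533306720bffeb0cf13b6ace0ad8000b5f6b5240b681a005a13a301e
size 4984
""")
print(pointer["algo"], pointer["size"])
```

With a local copy in place, `matches_pointer(Path("training_args.bin"), pointer)` would confirm that the download corresponds to this commit; the path here is a placeholder.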