---
license: llama3
library_name: peft
tags:
- generated_from_trainer
base_model: meta-llama/Meta-Llama-3-8B-Instruct
model-index:
- name: Llama3_devops
  results: []
---

# Llama3_devops

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.3252

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- lr_scheduler_warmup_steps: 100
- training_steps: 12001
- mixed_precision_training: Native AMP
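For reproducibility, the list above maps onto `transformers.TrainingArguments` roughly as sketched below. This is a sketch, not the recorded training script: `output_dir` and the evaluation/logging cadence are assumptions (the 100-step evaluation interval is inferred from the results table), and the Adam betas/epsilon shown are the Trainer defaults, matching the values listed above.

```python
from transformers import TrainingArguments

# Minimal sketch mirroring the hyperparameters listed above.
# output_dir and the eval/logging cadence are assumptions, not
# values recorded in this card.
training_args = TrainingArguments(
    output_dir="Llama3_devops",        # assumed
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,     # effective train batch size: 4
    seed=42,
    adam_beta1=0.9,                    # Trainer defaults, matching the card
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    warmup_steps=100,                  # when set, warmup_steps takes precedence over warmup_ratio
    max_steps=12001,
    fp16=True,                         # "Native AMP" mixed precision
    eval_strategy="steps",             # assumed from the 100-step cadence in the table below
    eval_steps=100,
    logging_steps=100,
)
```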
### Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 1.4137        | 0.0612 | 100   | 1.7127          |
| 1.4861        | 0.1224 | 200   | 1.6550          |
| 1.3809        | 0.1837 | 300   | 1.6320          |
| 1.6918        | 0.2449 | 400   | 1.6155          |
| 1.5341        | 0.3061 | 500   | 1.6085          |
| 1.326         | 0.3673 | 600   | 1.6069          |
| 1.4157        | 0.4285 | 700   | 1.6039          |
| 1.477         | 0.4897 | 800   | 1.5980          |
| 2.091         | 0.5510 | 900   | 1.5930          |
| 1.4464        | 0.6122 | 1000  | 1.5901          |
| 1.5648        | 0.6734 | 1100  | 1.5888          |
| 1.7804        | 0.7346 | 1200  | 1.5885          |
| 1.7443        | 0.7958 | 1300  | 1.5874          |
| 1.721         | 0.8571 | 1400  | 1.5850          |
| 1.5615        | 0.9183 | 1500  | 1.5828          |
| 1.5138        | 0.9795 | 1600  | 1.5816          |
| 2.0057        | 1.0407 | 1700  | 1.5811          |
| 1.6474        | 1.1019 | 1800  | 1.5811          |
| 1.8227        | 1.1635 | 1900  | 1.5812          |
| 1.3724        | 1.2247 | 2000  | 1.5799          |
| 1.2722        | 1.2859 | 2100  | 1.5790          |
| 1.5611        | 1.3471 | 2200  | 1.5784          |
| 1.5327        | 1.4083 | 2300  | 1.5782          |
| 1.5264        | 1.4695 | 2400  | 1.5782          |
| 1.5766        | 1.5308 | 2500  | 1.5779          |
| 1.7018        | 1.5920 | 2600  | 1.5772          |
| 1.201         | 1.6532 | 2700  | 1.5765          |
| 1.4864        | 1.7144 | 2800  | 1.5762          |
| 1.2907        | 1.7756 | 2900  | 1.5760          |
| 1.6052        | 1.8369 | 3000  | 1.5760          |
| 1.3841        | 1.3711 | 3100  | 1.3650          |
| 1.3509        | 1.4153 | 3200  | 1.3555          |
| 1.349         | 1.4595 | 3300  | 1.3518          |
| 1.4748        | 1.5038 | 3400  | 1.3499          |
| 1.0276        | 1.5480 | 3500  | 1.3492          |
| 1.3901        | 1.5922 | 3600  | 1.3491          |
| 1.2557        | 1.6364 | 3700  | 1.3447          |
| 1.146         | 1.6807 | 3800  | 1.3422          |
| 1.3166        | 1.7249 | 3900  | 1.3408          |
| 1.4498        | 1.7691 | 4000  | 1.3401          |
| 1.2284        | 1.8134 | 4100  | 1.3399          |
| 1.2182        | 1.8576 | 4200  | 1.3398          |
| 1.2163        | 1.9018 | 4300  | 1.3379          |
| 1.2242        | 1.9460 | 4400  | 1.3367          |
| 1.2829        | 1.9903 | 4500  | 1.3360          |
| 1.214         | 2.0345 | 4600  | 1.3356          |
| 1.2161        | 2.0787 | 4700  | 1.3355          |
| 1.2942        | 2.1230 | 4800  | 1.3355          |
| 1.2288        | 2.1672 | 4900  | 1.3343          |
| 1.3177        | 2.2114 | 5000  | 1.3337          |
| 1.3833        | 2.2556 | 5100  | 1.3332          |
| 1.658         | 2.2999 | 5200  | 1.3329          |
| 1.3888        | 2.3441 | 5300  | 1.3329          |
| 1.3027        | 2.3883 | 5400  | 1.3328          |
| 1.4974        | 2.4326 | 5500  | 1.3321          |
| 1.1546        | 2.4768 | 5600  | 1.3316          |
| 1.2156        | 2.5210 | 5700  | 1.3313          |
| 1.3549        | 2.5652 | 5800  | 1.3311          |
| 1.3213        | 2.6095 | 5900  | 1.3310          |
| 1.3492        | 2.6537 | 6000  | 1.3310          |
| 1.3454        | 2.6979 | 6100  | 1.3306          |
| 1.4238        | 2.7421 | 6200  | 1.3302          |
| 1.4476        | 2.7864 | 6300  | 1.3299          |
| 1.2525        | 2.8306 | 6400  | 1.3298          |
| 1.343         | 2.8748 | 6500  | 1.3298          |
| 1.3299        | 2.9191 | 6600  | 1.3298          |
| 1.4081        | 2.9633 | 6700  | 1.3293          |
| 1.4621        | 3.0075 | 6800  | 1.3290          |
| 1.0876        | 3.0517 | 6900  | 1.3289          |
| 1.3061        | 3.0960 | 7000  | 1.3288          |
| 1.2202        | 3.1402 | 7100  | 1.3287          |
| 1.3105        | 3.1844 | 7200  | 1.3287          |
| 1.3631        | 3.2287 | 7300  | 1.3284          |
| 1.3136        | 3.2729 | 7400  | 1.3282          |
| 1.442         | 3.3171 | 7500  | 1.3281          |
| 1.3141        | 3.3613 | 7600  | 1.3280          |
| 1.3445        | 3.4056 | 7700  | 1.3280          |
| 1.2843        | 3.4498 | 7800  | 1.3279          |
| 1.342         | 3.4940 | 7900  | 1.3277          |
| 1.2877        | 3.5383 | 8000  | 1.3275          |
| 1.4434        | 3.5825 | 8100  | 1.3274          |
| 1.2827        | 3.6267 | 8200  | 1.3273          |
| 1.1758        | 3.6709 | 8300  | 1.3273          |
| 1.3382        | 3.7152 | 8400  | 1.3273          |
| 1.2126        | 3.7594 | 8500  | 1.3271          |
| 1.4859        | 3.8036 | 8600  | 1.3270          |
| 1.1627        | 3.8479 | 8700  | 1.3269          |
| 1.5215        | 3.8921 | 8800  | 1.3268          |
| 1.6232        | 3.9363 | 8900  | 1.3268          |
| 1.3434        | 3.9805 | 9000  | 1.3268          |
| 1.1927        | 4.0248 | 9100  | 1.3267          |
| 1.2415        | 4.0690 | 9200  | 1.3265          |
| 1.1639        | 4.1132 | 9300  | 1.3264          |
| 1.2402        | 4.1575 | 9400  | 1.3264          |
| 1.295         | 4.2017 | 9500  | 1.3264          |
| 1.1189        | 4.2459 | 9600  | 1.3264          |
| 1.2794        | 4.2901 | 9700  | 1.3263          |
| 1.1904        | 4.3344 | 9800  | 1.3261          |
| 1.1547        | 4.3786 | 9900  | 1.3261          |
| 1.3298        | 4.4228 | 10000 | 1.3260          |
| 1.1915        | 4.4670 | 10100 | 1.3260          |
| 1.2256        | 4.5113 | 10200 | 1.3260          |
| 1.3068        | 4.5555 | 10300 | 1.3259          |
| 1.5124        | 4.5997 | 10400 | 1.3258          |
| 1.3894        | 4.6440 | 10500 | 1.3258          |
| 1.1934        | 4.6882 | 10600 | 1.3257          |
| 1.2746        | 4.7324 | 10700 | 1.3257          |
| 1.2689        | 4.7766 | 10800 | 1.3257          |
| 1.3315        | 4.8209 | 10900 | 1.3256          |
| 1.4784        | 4.8651 | 11000 | 1.3255          |
| 1.2925        | 4.9093 | 11100 | 1.3255          |
| 1.2004        | 4.9536 | 11200 | 1.3254          |
| 1.4289        | 4.9978 | 11300 | 1.3254          |
| 1.354         | 5.0420 | 11400 | 1.3254          |
| 1.1891        | 5.0862 | 11500 | 1.3253          |
| 1.3498        | 5.1305 | 11600 | 1.3253          |
| 1.3814        | 5.1747 | 11700 | 1.3252          |
| 1.4559        | 5.2189 | 11800 | 1.3252          |
| 1.2006        | 5.2632 | 11900 | 1.3252          |
| 1.3107        | 5.3074 | 12000 | 1.3252          |

### Framework versions

- PEFT 0.11.1
- Transformers 4.41.2
- Pytorch 2.1.2
- Datasets 2.18.0
- Tokenizers 0.19.1
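## How to use

This repository contains a PEFT adapter rather than full model weights, so it is loaded on top of the base model. The sketch below is illustrative, not a recorded recipe: the adapter repo id is a placeholder, the dtype and device settings are assumptions, and the sample prompt simply reflects the model's name.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "<your-username>/Llama3_devops"  # placeholder: replace with the actual repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # assumption; any supported dtype works
    device_map="auto",
)
# Attach the fine-tuned adapter to the base model.
model = PeftModel.from_pretrained(base_model, adapter_id)

# Llama 3 Instruct expects the chat template.
messages = [{"role": "user", "content": "Explain blue-green deployments in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=256,
    pad_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```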