--- license: mit library_name: peft tags: - llama-factory - lora - generated_from_trainer base_model: Qwen/Qwen2-7B-Instruct model-index: - name: '06072000' results: [] --- # 06072000 This model is a fine-tuned version of [Qwen/Qwen2-7B-Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct) on the alpaca_formatted_ift_eft_dft_rft_share5k_2048 dataset. It achieves the following results on the evaluation set: - Loss: 0.8293 ## Model description Qwen2 is the new series of Qwen large language models. For Qwen2, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters, including a Mixture-of-Experts model. This repo contains the instruction-tuned 7B Qwen2 model. Compared with the state-of-the-art opensource language models, including the previous released Qwen1.5, Qwen2 has generally surpassed most opensource models and demonstrated competitiveness against proprietary models across a series of benchmarks targeting for language understanding, language generation, multilingual capability, coding, mathematics, reasoning, etc. Qwen2-7B-Instruct supports a context length of up to 131,072 tokens, enabling the processing of extensive inputs. Please refer to [this section](#processing-long-texts) for detailed instructions on how to deploy Qwen2 for handling long texts. For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2/), [GitHub](https://github.com/QwenLM/Qwen2), and [Documentation](https://qwen.readthedocs.io/en/latest/). ## Intended uses & limitations More information needed ## Training and evaluation data More information needed ## Training procedure ### Training hyperparameters The following hyperparameters were used during training: - learning_rate: 0.0001 - train_batch_size: 2 - eval_batch_size: 2 - seed: 42 - distributed_type: multi-GPU - num_devices: 2 - gradient_accumulation_steps: 4 - total_train_batch_size: 16 - total_eval_batch_size: 4 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 - lr_scheduler_type: cosine - lr_scheduler_warmup_steps: 700 - num_epochs: 5.0 ### Training results | Training Loss | Epoch | Step | Validation Loss | |:-------------:|:------:|:----:|:---------------:| | 0.7753 | 0.9586 | 700 | 0.8477 | | 0.7738 | 1.9172 | 1400 | 0.8355 | | 0.5842 | 2.8757 | 2100 | 0.8306 | | 0.9188 | 3.8343 | 2800 | 0.8303 | | 0.8923 | 4.7929 | 3500 | 0.8290 | ### Framework versions - PEFT 0.10.0 - Transformers 4.40.0 - Pytorch 2.1.0+cu121 - Datasets 2.14.5 - Tokenizers 0.19.1