---
license: apache-2.0
datasets:
- argilla/distilabel-intel-orca-dpo-pairs
library_name: transformers
pipeline_tag: text-generation
---

# Chikuma_10.7B - V2 (Enhanced with DPO)


This model is the **DPO fine-tuned version** of [Chikuma_10.7B](https://huggingface.co/sethuiyer/Chikuma_10.7B), which was a depth-upscaled merge of:

* [sethuiyer/SynthIQ-7b](https://huggingface.co/sethuiyer/SynthIQ-7b)
* [openchat/openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106)

The name "Chikuma" is inspired by the [Chikuma River](https://en.wikipedia.org/wiki/Shinano_River), the longest in Japan, known for its continuous flow and meandering path. This metaphorically represents the model's depth, fluidity, and adaptability in processing and understanding language.

# Dataset used for Fine Tuning

Dataset: `argilla/distilabel-intel-orca-dpo-pairs`

The dataset used was roughly 3,000 samples, all rated high quality according to `chosen_score`. The following filters were applied to the original dataset:

```python
dataset = dataset.filter(
    lambda r:
        r["status"] != "tie" and
        r["chosen_score"] >= 8 and
        not r["in_gsm8k_train"]
)
```

# Chat Template

The chat template for Chikuma_10.7B - V2 is a modified version of ChatML, optimized for improved interaction and engagement:

```
<|im_start|>GPT4 Correct system: {system} Always use <|end_of_turn|> when you want to end the answer. <|im_end|>
<|im_start|>GPT4 Correct user: {user}<|im_end|>
<|im_start|>GPT4 Correct Assistant: {assistant}<|im_end|>
```

### Training Environment

- Hardware: Single A100 80GB GPU on RunPod, used for approximately 1.5 hours.
- Training Script: Accessible via [Google Colab Notebook](https://colab.research.google.com/drive/15iFBr1xWgztXvhrj5I9fBv20c7CFOPBE?usp=sharing). Special thanks to [mlabonne](https://huggingface.co/mlabonne) for providing the template.

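To make the chat template above concrete, here is a minimal sketch that fills it in by hand with plain string formatting. The function name `render_prompt` is hypothetical, and the exact whitespace between turns and the open assistant turn used to cue generation are assumptions; in practice the tokenizer's own `apply_chat_template` should be treated as the source of truth.

```python
# Hypothetical helper: renders the card's chat template as a plain string.
# Whitespace and the open assistant turn are assumptions, not taken from
# the tokenizer configuration.
END_HINT = "Always use <|end_of_turn|> when you want to end the answer."


def render_prompt(system: str, user: str, assistant: str = "") -> str:
    """Return a prompt string that follows the modified ChatML layout."""
    turns = [
        f"<|im_start|>GPT4 Correct system: {system} {END_HINT} <|im_end|>",
        f"<|im_start|>GPT4 Correct user: {user}<|im_end|>",
    ]
    if assistant:
        # A completed assistant turn, e.g. for building multi-turn history.
        turns.append(
            f"<|im_start|>GPT4 Correct Assistant: {assistant}<|im_end|>"
        )
    else:
        # Leave the assistant turn open so the model continues from here.
        turns.append("<|im_start|>GPT4 Correct Assistant:")
    return "\n".join(turns)


if __name__ == "__main__":
    print(render_prompt("You are a helpful assistant chatbot.",
                        "What is a large language model?"))
```

Rendering the string manually like this is mainly useful for checking what the model actually sees; it does not replace the tokenizer-driven formatting.
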
## Usage

```python
import transformers
from transformers import AutoTokenizer

# Placeholder: set this to the model's Hugging Face repo ID or a local path.
new_model = "<model-repo-id-or-local-path>"

# Load the tokenizer, which carries the chat template
tokenizer = AutoTokenizer.from_pretrained(new_model)

# Create pipeline
pipeline = transformers.pipeline(
    "text-generation",
    model=new_model,
    tokenizer=tokenizer,
    device="cuda"
)

# Format prompt
message = [
    {"role": "system", "content": "You are a helpful assistant chatbot. Always use <|end_of_turn|> when you want to end the answer."},
    {"role": "user", "content": "What is a large language model?"}
]
prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)

# Generate text
sequences = pipeline(
    prompt,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1,
    max_length=512,
)
print(sequences[0]['generated_text'])
```

## Things in Pipeline:

1. Manual testing and evaluation against GPT-4 on text-generation-webui across 45 sample complex prompts.
2. Nous Benchmark
3. GGUF Format
4. Ollama Model (if model benchmarks are good)

## Acknowledgements

A heartfelt appreciation goes to the vibrant open-source community, particularly:

* The Intel team for publishing a great open dataset and showing how well it worked in the first place.
* Teknium and NousResearch for their awesome work and models.
* Maxime for sharing such great resources.
* Argilla for publishing argilla/distilabel-intel-orca-dpo-pairs.