metadata
license: other
license_name: qwen
license_link: https://huggingface.co/Qwen/Qwen2.5-72B-Instruct/blob/main/LICENSE
language:
- en
pipeline_tag: text-generation
base_model: Qwen/Qwen2.5-72B
tags:
- chat
library_name: transformers
Shuttle-3 (beta) [2024/10/25]
We are excited to introduce Shuttle-3, our next-generation state-of-the-art language model designed to excel in complex chat, multilingual communication, reasoning, and agent tasks.
- Shuttle-3 is a fine-tuned version of Qwen-2.5-72b-Instruct, emulating the writing style of Claude 3 models and thoroughly trained on role-playing data.
Model Details
- Model Name: Shuttle-3
- Developed by: ShuttleAI Inc.
- Base Model: Qwen-2.5-72b-Instruct
- Parameters: 72B
- Language(s): Multilingual
- Repository: https://huggingface.co/shuttleai
- Fine-Tuned Model: https://huggingface.co/shuttleai/shuttle-3
Key Features
- Pretrained on a corpus containing a large proportion of multilingual and code data
- Fine-tuned to emulate the prose quality of Claude 3 models, with extensive training on role-play data
Fine-Tuning Details
- Training Setup: Trained on 130 million tokens for 12 hours using 4 A100 PCIe GPUs.
Prompting
Shuttle-3 uses ChatML as its prompting format:
<|im_start|>system
You are a pirate! Yardy harr harr!<|im_end|>
<|im_start|>user
Where are you currently?<|im_end|>
<|im_start|>assistant
Look ahoy ye scallywag! We're on the high seas!<|im_end|>
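
The layout above can also be assembled programmatically. The following is a minimal sketch, not the official ShuttleAI tooling: `format_chatml` is a hypothetical helper written for illustration, and it assumes each turn is rendered as `<|im_start|>{role}`, a newline, the content, then `<|im_end|>` followed by a newline. In practice the tokenizer shipped with the model would normally apply its own chat template, which should be treated as authoritative.

```python
from typing import Dict, List

# ChatML delimiter tokens, as they appear in the prompt format above.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def format_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render {"role", "content"} messages as a single ChatML prompt string.

    Hypothetical helper for illustration; not part of any library.
    """
    parts = []
    for message in messages:
        # One turn: start token + role, newline, content, end token, newline.
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a pirate! Yardy harr harr!"},
    {"role": "user", "content": "Where are you currently?"},
]
prompt = format_chatml(messages)
print(prompt)
```

The resulting string ends with an open `<|im_start|>assistant` turn, so the model's completion becomes the assistant reply; generation would typically stop at `<|im_end|>`.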