Model description
This is a Vicuna-like model with only 160M parameters, fine-tuned from LLaMA-160m on ShareGPT data.
The training setup follows that of the Vicuna suite.
The model was developed mainly as a base Small Speculative Model for the MCSD paper. Compared with LLaMA-160m, it aligns better with the Vicuna models, with little loss of alignment to the LLaMA models.
Draft Model | Target Model | Alignment |
---|---|---|
LLaMA-68/160M | LLaMA-13/33B | π |
LLaMA-68/160M | Vicuna-13/33B | π |
Vicuna-68/160M | LLaMA-13/33B | π |
Vicuna-68/160M | Vicuna-13/33B | π |
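
This card does not say how alignment between a draft model and a target model is measured. As a rough illustration only, one common proxy in standard speculative sampling is the expected probability that a token proposed by the draft model is accepted by the target model, which equals the sum over the vocabulary of min(p, q) for target distribution p and draft distribution q. The sketch below computes that quantity on toy distributions; the function name `acceptance_rate` and the numbers are hypothetical and are not taken from the MCSD paper or from this model.

```python
def acceptance_rate(target_probs, draft_probs):
    """Expected acceptance probability of one drafted token.

    Under standard speculative sampling, a token x drawn from the draft
    distribution q is accepted with probability min(1, p(x) / q(x)),
    so the expected acceptance is sum_x min(p(x), q(x)).
    This is an illustrative proxy for alignment, not the card's metric.
    """
    if len(target_probs) != len(draft_probs):
        raise ValueError("distributions must cover the same vocabulary")
    return sum(min(p, q) for p, q in zip(target_probs, draft_probs))


# Toy next-token distributions over a 3-token vocabulary (made-up values).
target = [0.5, 0.3, 0.2]
well_aligned_draft = [0.45, 0.35, 0.2]
poorly_aligned_draft = [0.1, 0.1, 0.8]

print(acceptance_rate(target, well_aligned_draft))
print(acceptance_rate(target, poorly_aligned_draft))
```

A draft model whose distribution is closer to the target's gets a higher value, which means more drafted tokens survive verification and fewer target-model forward passes are needed per generated token.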