LLaMA 33B fine-tuned on wikitext_document_level
with a linear RoPE scaling factor of 8, for a 16k-token context length (8 × the base model's 2,048 tokens).
This is a merged version of llama33b-16k-qlora.
Note that this is not an instruct model: it is base LLaMA with an extended sequence length.
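
To illustrate what a linear RoPE scaling factor of 8 means, here is a minimal sketch in plain Python. It is not the code used to train this model. It assumes the standard RoPE frequency base of 10000 and a head dimension of 128, and the names `rope_angles`, `BASE_CONTEXT`, and `SCALE` are made up for this example. The idea is that each position index is divided by the scaling factor before the rotary angles are computed, so positions up to 16k fall inside the range the base model saw during pretraining.

```python
import math

BASE_CONTEXT = 2048  # context length of base LLaMA, in tokens
SCALE = 8            # linear RoPE scaling factor stated in this card


def rope_angles(position, head_dim=128, base=10000.0, scale=1.0):
    """Return the rotary angles for one token position.

    head_dim and base are assumed values for illustration.
    With linear scaling, the position is divided by `scale`
    before the usual per-dimension frequencies are applied.
    """
    scaled_pos = position / scale
    return [scaled_pos * base ** (-2 * i / head_dim) for i in range(head_dim // 2)]


# Extended context implied by the scaling factor.
extended_context = BASE_CONTEXT * SCALE

# The last position of the extended window, after scaling,
# still lies inside the original position range.
last_scaled = (extended_context - 1) / SCALE

print(extended_context)
print(last_scaled < BASE_CONTEXT)
```

With this scheme, position 8 in the scaled model gets the same rotary angles as position 1 in the unscaled model, which is why fine-tuning is still needed: the model has to adapt to the finer spacing between neighbouring positions.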