
Model Card for gpt-sw3-6.7b-v2-translator-gguf

The gpt-sw3-6.7b-v2-translator is a fine-tuned version of gpt-sw3-6.7b-v2-instruct, trained on a carefully selected dataset of translation pairs gathered by AI Sweden.

Intended usage:

Translate text from English to Swedish, or from Swedish to English.

How to use:

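Translate from English to Swedish (Ollama Modelfile; replace SIZE with the suffix of one of the files listed under Versions, e.g. Q4_K_M):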

FROM ./gpt-sw3-6-7b-v2-translator-SIZE.gguf
TEMPLATE "<|endoftext|><s>User: Översätt till Svenska från Engelska\n{{ .Prompt }}<s>Bot:"
PARAMETER stop <s>
PARAMETER stop User:

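Translate from Swedish to English (Ollama Modelfile; replace SIZE with the suffix of one of the files listed under Versions, e.g. Q4_K_M):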

FROM ./gpt-sw3-6-7b-v2-translator-SIZE.gguf
TEMPLATE "<|endoftext|><s>User: Översätt till Engelska från Svenska\n{{ .Prompt }}<s>Bot:"
PARAMETER stop <s>
PARAMETER stop User:
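
The two Modelfiles above differ only in the instruction line, so they can be generated from one helper. The following is a minimal Python sketch, not part of the original card: the function names `build_prompt` and `build_modelfile` and the direction keys `en-sv` / `sv-en` are hypothetical, and it assumes that the `\n` inside the TEMPLATE string is rendered as a newline in the final prompt.

```python
# Instruction lines copied verbatim from the two TEMPLATE lines above.
INSTRUCTIONS = {
    "en-sv": "Översätt till Svenska från Engelska",
    "sv-en": "Översätt till Engelska från Svenska",
}


def build_prompt(text: str, direction: str = "en-sv") -> str:
    """Return the raw prompt the TEMPLATE is assumed to expand to."""
    instruction = INSTRUCTIONS[direction]
    # Assumption: the template's "\n" becomes a real newline at inference time.
    return f"<|endoftext|><s>User: {instruction}\n{text}<s>Bot:"


def build_modelfile(gguf_path: str, direction: str = "en-sv") -> str:
    """Return Modelfile text matching the layout shown above."""
    instruction = INSTRUCTIONS[direction]
    lines = [
        f"FROM {gguf_path}",
        # Keep the backslash-n literal, exactly as in the original Modelfile.
        'TEMPLATE "<|endoftext|><s>User: ' + instruction + '\\n{{ .Prompt }}<s>Bot:"',
        "PARAMETER stop <s>",
        "PARAMETER stop User:",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_modelfile("./gpt-sw3-6-7b-v2-translator-Q4_K_M.gguf", "sv-en"))
```

`build_prompt` is useful when the GGUF file is loaded by a runtime that takes a raw prompt string instead of a Modelfile; in that case the same stop strings (`<s>` and `User:`) would need to be passed to that runtime separately.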

Versions:

gpt-sw3-6-7b-v2-translator-Q4.gguf
gpt-sw3-6-7b-v2-translator-Q4_K_M.gguf
gpt-sw3-6-7b-v2-translator-Q8.gguf
gpt-sw3-6-7b-v2-translator-f16.gguf
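
To help choose between these files, the sketch below estimates a lower bound on the size of each one from the parameter count reported for this model (6.98B). The bits-per-weight values are nominal assumptions, not figures from this card: real GGUF quantization formats also store per-block scales and metadata, so actual files are somewhat larger.

```python
N_PARAMS = 6.98e9  # parameter count reported for this model

# Assumed nominal bits per weight for each listed file (lower bound only).
NOMINAL_BITS = {"Q4": 4, "Q4_K_M": 4, "Q8": 8, "f16": 16}


def nominal_size_gb(n_params: float, bits_per_weight: int) -> float:
    """Weights-only size in decimal gigabytes, ignoring format overhead."""
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for name, bits in NOMINAL_BITS.items():
        print(f"{name}: at least {nominal_size_gb(N_PARAMS, bits):.2f} GB")
```

The estimate covers the weights only; running the model also needs memory for the context (KV cache) and runtime buffers.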

Training & Data:

Training ran on a single NVIDIA DGX using DeepSpeed ZeRO stage 3 for three epochs on roughly 4 GB of carefully selected translation data. It is a full fine-tune: all model parameters were updated.
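
For readers unfamiliar with DeepSpeed ZeRO stage 3, a configuration of that kind might look roughly like the sketch below. Only the stage number comes from this card; the precision, batch size, and accumulation values are illustrative assumptions and are not the settings used to train this model.

```python
import json

# Hypothetical DeepSpeed configuration. Only "stage": 3 is taken from the
# card; every other value here is an illustrative assumption.
ds_config = {
    "zero_optimization": {
        "stage": 3,  # partition optimizer state, gradients and parameters
        "overlap_comm": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
    "bf16": {"enabled": True},
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 8,
}


def config_json() -> str:
    """Serialize the config to the JSON text DeepSpeed reads from a file."""
    return json.dumps(ds_config, indent=2)


if __name__ == "__main__":
    print(config_json())
```

Stage 3 shards the parameters themselves across GPUs in addition to gradients and optimizer state, which is what makes a full fine-tune of a model this size fit on one multi-GPU node.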

Epoch  Training Loss  Evaluation Loss
1      1.309          1.281
2      1.161          1.242
3      1.053          1.219
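
One way to read the table is to look at the gap between evaluation and training loss per epoch. The short sketch below computes it from the numbers above; the helper name `eval_train_gaps` is hypothetical. The evaluation loss keeps falling across all three epochs while the gap to the training loss widens.

```python
# (epoch, training loss, evaluation loss), copied from the table above.
LOSSES = [(1, 1.309, 1.281), (2, 1.161, 1.242), (3, 1.053, 1.219)]


def eval_train_gaps(rows):
    """Return evaluation loss minus training loss for each epoch."""
    return [round(ev - tr, 3) for _, tr, ev in rows]


if __name__ == "__main__":
    for (epoch, _, _), gap in zip(LOSSES, eval_train_gaps(LOSSES)):
        print(f"epoch {epoch}: eval - train = {gap:+.3f}")
```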

Model details:

GGUF format, 6.98B params, gpt2 architecture.
