llm-jp-1.3b-v1.0-aya

This is llm-jp's llm-jp-1.3b-v1.0 model, fine-tuned on the Japanese examples from Cohere's aya dataset.
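Selecting the Japanese examples might look roughly like the sketch below. It works on in-memory records rather than the real dataset, and the field names (`inputs`, `targets`, `language`) and the label value `"Japanese"` are assumptions about the schema, not something this card confirms. The helper `select_japanese` is hypothetical.

```python
def select_japanese(records, language_field="language", target="Japanese"):
    """Keep only the records whose language field equals the target label.

    Both the field name and the label value are assumptions about the
    dataset schema; adjust them to match the actual columns.
    """
    return [r for r in records if r.get(language_field) == target]


# Toy records standing in for dataset rows (illustrative only).
sample = [
    {"inputs": "自然言語処理とは何か", "targets": "toy answer", "language": "Japanese"},
    {"inputs": "What is NLP?", "targets": "toy answer", "language": "English"},
]

japanese_only = select_japanese(sample)
print(len(japanese_only))
```

The same predicate could be passed to a dataset library's filter method once the real column names are confirmed.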

| Model | llm-jp-eval AVG |
| --- | --- |
| kcoopermiller/llm-jp-1.3b-v1.0-aya | 0.0698 |
| llm-jp/llm-jp-1.3b-v1.0 | 0.047 |
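To put the two scores in proportion, the short calculation below derives the absolute and relative gain from the reported averages. The helper `relative_gain` is a hypothetical name used only for illustration; the two numbers are the ones reported here.

```python
def relative_gain(baseline, finetuned):
    """Return the improvement over the baseline as a fraction of the baseline."""
    return (finetuned - baseline) / baseline


baseline_avg = 0.047    # llm-jp/llm-jp-1.3b-v1.0
finetuned_avg = 0.0698  # kcoopermiller/llm-jp-1.3b-v1.0-aya

print(f"absolute gain: {finetuned_avg - baseline_avg:.4f}")
print(f"relative gain: {relative_gain(baseline_avg, finetuned_avg):.1%}")
```

Both averages are small in absolute terms, so the relative figure should be read with that in mind.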

How to use

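import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model; device_map="auto" places the weights on the available device(s).
tokenizer = AutoTokenizer.from_pretrained("kcoopermiller/llm-jp-1.3b-v1.0-aya")
model = AutoModelForCausalLM.from_pretrained("kcoopermiller/llm-jp-1.3b-v1.0-aya", device_map="auto")

# Prompt: "What is natural language processing?"
text = "自然言語処理とは何か"
tokenized_input = tokenizer.encode(text, add_special_tokens=False, return_tensors="pt").to(model.device)

# Sample up to 20 new tokens without tracking gradients.
with torch.no_grad():
    output = model.generate(
        tokenized_input,
        max_new_tokens=20,
        do_sample=True,
        top_p=0.90,
        temperature=0.7,
    )[0]

# The decoded text contains the prompt followed by the generated continuation.
print(tokenizer.decode(output))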
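
For a decoder-only model, `generate` returns the prompt tokens followed by the new tokens, so the decoded string repeats the prompt. A minimal sketch of keeping only the continuation is shown below, using plain lists in place of tensors. The helper `continuation_only` is a hypothetical name, and the token IDs are made up.

```python
def continuation_only(output_ids, prompt_length):
    """Drop the first prompt_length token IDs and return the rest."""
    return output_ids[prompt_length:]


# Made-up token IDs standing in for a real prompt and its generated output.
prompt_ids = [101, 102, 103]
full_output = prompt_ids + [7, 8, 9]

print(continuation_only(full_output, len(prompt_ids)))  # [7, 8, 9]
```

Applied to the snippet above, the same slice would be `output[tokenized_input.shape[1]:]`, passed to `tokenizer.decode`.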

Model size: 1.32B params (Safetensors, tensor type F32)