
szürkemarha-mistral v1

This is the first (test) version of a Hungarian-language instruction-following model.

Usage

This repo contains an app.py script that launches a Gradio interface for more convenient use.
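The app.py itself is not reproduced here, but a minimal sketch of such a Gradio wrapper could look like the following (the answer function is a hypothetical stand-in for the model call shown in the full code example below):

import gradio as gr

def answer(instruction, context):
    # Hypothetical stand-in: the real app.py would build the prompt and
    # run the model exactly as in the code example below.
    return f"(model output for: {instruction} / {context})"

gr.Interface(fn=answer, inputs=["text", "text"], outputs="text").launch()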

Or from code, roughly like this:

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, GenerationConfig

BASE_MODEL = "mistralai/Mistral-7B-v0.1"
LORA_WEIGHTS = "boapps/szurkemarha-mistral"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)

# Prefer CUDA; fall back to Apple's MPS backend where available.
device = "cuda"
if hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
    device = "mps"

# Quantize the base model to 4-bit NF4 so the 7B weights fit in consumer GPU memory.
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, quantization_config=nf4_config)

# Apply the LoRA adapter on top of the quantized base model
# (bfloat16 to match the 4-bit compute dtype above).
model = PeftModel.from_pretrained(model, LORA_WEIGHTS, torch_dtype=torch.bfloat16)
model.eval()

prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Melyik megyében található az alábbi város?

### Input:
Pécs

### Response:"""
inputs = tokenizer(prompt, return_tensors="pt")
input_ids = inputs["input_ids"].to(device)
# Low temperature plus beam search for more deterministic, factual answers.
generation_config = GenerationConfig(
    temperature=0.1,
    top_p=0.75,
    top_k=40,
    num_beams=4,
)
with torch.no_grad():
    generation_output = model.generate(
        input_ids=input_ids,
        generation_config=generation_config,
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=256,
    )
s = generation_output.sequences[0]
output = tokenizer.decode(s)
# Print only the text generated after the "### Response:" marker.
print(output.split("### Response:")[1].strip())
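The prompt above follows the Alpaca template. To reuse it for other instructions, a small helper can build it; this is a hypothetical sketch, not part of the repo, and the no-input variant uses the standard Alpaca wording, which may not exactly match what the model was trained on:

def build_prompt(instruction: str, input_text: str = "") -> str:
    """Build an Alpaca-style prompt; the Input section is optional."""
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:"
    )

prompt = build_prompt("Melyik megyében található az alábbi város?", "Pécs")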