⭐My custom LLM 13B⭐

Model Details

Model Developers

Kyujin Han (kyujinpy)

Model Architecture

My custom LLM 13B is an auto-regressive language model based on the LLaMA2 transformer architecture.

Base Model

beomi/llama-2-koen-13b

Training Dataset

kyujinpy/OpenOrca-ko-v3

Model comparisons

Ko-LLM leaderboard (as of 11/27)

Model	Average	Ko-ARC	Ko-HellaSwag	Ko-MMLU	Ko-TruthfulQA	Ko-CommonGen V2
⭐My custom LLM 13B-v1⭐	50.19	45.99	56.93	41.78	41.66	64.58
⭐My custom LLM 13B-v4⭐	49.89	45.05	57.06	41.83	42.93	62.57
⭐My custom LLM 13B-v8⭐	49.84	45.65	56.98	41.37	41.42	59.50
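
The Average column appears to be the arithmetic mean of the five benchmark scores. The sketch below recomputes it from the rows above as a sanity check; the dictionary layout and the `mean_score` helper are illustrative names, not part of any leaderboard tooling. For v1 and v4 the recomputed mean matches the listed average after rounding. For v8 the listed scores give 48.98 rather than the listed 49.84, so one value in that row may have been transcribed incorrectly.

```python
# Scores copied from the table above, in column order:
# Ko-ARC, Ko-HellaSwag, Ko-MMLU, Ko-TruthfulQA, Ko-CommonGen V2
SCORES = {
    "v1": [45.99, 56.93, 41.78, 41.66, 64.58],
    "v4": [45.05, 57.06, 41.83, 42.93, 62.57],
    "v8": [45.65, 56.98, 41.37, 41.42, 59.50],
}

# Average column exactly as listed in the table.
LISTED_AVERAGE = {"v1": 50.19, "v4": 49.89, "v8": 49.84}


def mean_score(scores):
    """Arithmetic mean of the benchmark scores, rounded to two decimals."""
    return round(sum(scores) / len(scores), 2)


for version, scores in SCORES.items():
    recomputed = mean_score(scores)
    listed = LISTED_AVERAGE[version]
    status = "matches" if recomputed == listed else "differs from"
    print(f"{version}: recomputed {recomputed:.2f} {status} listed {listed:.2f}")
```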

Implementation Code

# Load Custom-KoLLM-13B-v8 and its tokenizer from the Hugging Face Hub
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

repo = "PracticeLLM/Custom-KoLLM-13B-v8"
OpenOrca = AutoModelForCausalLM.from_pretrained(
    repo,
    return_dict=True,
    torch_dtype=torch.float16,  # half precision to reduce memory use
    device_map='auto'           # place layers on the available devices automatically
)
OpenOrca_tokenizer = AutoTokenizer.from_pretrained(repo)
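
Once the model and tokenizer are loaded, text can be generated along the following lines. This is a minimal sketch, not the author's inference script: `generate_text` is a hypothetical helper, and the greedy decoding and `max_new_tokens=128` default are assumptions. The card does not specify a prompt template, so the prompt is passed through unchanged.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=128):
    """Generate a continuation of `prompt` and return only the new text.

    `model` and `tokenizer` are the objects returned by
    AutoModelForCausalLM.from_pretrained / AutoTokenizer.from_pretrained.
    """
    # Tokenize and move the tensors to the device holding the model inputs.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # ASSUMPTION: greedy decoding (do_sample=False); the card states no
    # sampling settings.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )

    # generate() returns prompt + continuation; drop the prompt tokens.
    prompt_length = inputs["input_ids"].shape[1]
    new_tokens = output_ids[0][prompt_length:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

With the objects from the loading code above, a call would look like `generate_text(OpenOrca, OpenOrca_tokenizer, "한국의 수도는 어디인가요?")`. Loading a 13B-parameter model in float16 takes roughly 26 GB for the weights alone, so `device_map='auto'` may spread the layers across several GPUs or offload some of them to CPU.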

Hyperparameters

QLoRA
lora_target_modules '[gate_proj, down_proj, up_proj]'
lora_r 64
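
To give a sense of what these settings imply, the sketch below estimates how many trainable parameters the LoRA adapters add and shows how the same values could be passed to `peft.LoraConfig`. Only `lora_r` and `lora_target_modules` come from this card. The model dimensions are assumed to be the standard LLaMA-2 13B ones (hidden size 5120, MLP intermediate size 13824, 40 layers), and `lora_alpha` and `lora_dropout` are placeholder values, since the card does not state them.

```python
# Values stated in the Hyperparameters section.
LORA_R = 64
LORA_TARGET_MODULES = ["gate_proj", "down_proj", "up_proj"]

# ASSUMPTION: standard LLaMA-2 13B dimensions (not stated in this card).
HIDDEN_SIZE = 5120
INTERMEDIATE_SIZE = 13824
NUM_LAYERS = 40

# (in_features, out_features) of each targeted MLP projection.
MODULE_SHAPES = {
    "gate_proj": (HIDDEN_SIZE, INTERMEDIATE_SIZE),
    "up_proj": (HIDDEN_SIZE, INTERMEDIATE_SIZE),
    "down_proj": (INTERMEDIATE_SIZE, HIDDEN_SIZE),
}


def lora_params_per_module(in_features, out_features, r):
    """LoRA adds two matrices per module: A (r x in) and B (out x r)."""
    return r * (in_features + out_features)


def count_lora_params(r=LORA_R):
    """Total trainable LoRA parameters across all layers."""
    per_layer = sum(
        lora_params_per_module(*MODULE_SHAPES[name], r)
        for name in LORA_TARGET_MODULES
    )
    return per_layer * NUM_LAYERS


def build_lora_config(lora_alpha=16, lora_dropout=0.05):
    """Build a peft LoraConfig from the card's values.

    ASSUMPTION: lora_alpha and lora_dropout are placeholders, not the
    values used to train this model.
    """
    from peft import LoraConfig  # imported lazily; requires `pip install peft`

    return LoraConfig(
        r=LORA_R,
        target_modules=LORA_TARGET_MODULES,
        lora_alpha=lora_alpha,
        lora_dropout=lora_dropout,
        bias="none",
        task_type="CAUSAL_LM",
    )


print(f"Trainable LoRA parameters: {count_lora_params():,}")
```

Under these assumptions the adapters add about 145 million trainable parameters, roughly 1% of the 13B base model, which is what makes fine-tuning on a 4-bit quantized base (the "Q" in QLoRA) practical on limited hardware.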
