
CodeMind-Llama3.1-8B-unsloth

CodeMind is a language model developed to assist with solving and learning from coding test problems. The model is fine-tuned on posts written by LeetCode users, with the aim of providing answers specialized for coding tests.

Model Information

  • Base Model: meta-llama/Meta-Llama-3.1-8B-Instruct
  • Model Size: 8.03B parameters (BF16, safetensors)
  • Fine-Tuning: Performed with the unsloth library, starting from the unsloth/Meta-Llama-3.1-8B-Instruct checkpoint
  • Fine-Tuning Process: Follows the unsloth Llama 3.1 conversational notebook

Dataset Used

Posts written by LeetCode users were used as the training data for fine-tuning.

How to Use the Model

The model is available on the Hugging Face model hub and can be integrated into applications through the unsloth/transformers APIs. It is designed to generate explanations, code snippets, and guidance for coding problems and programming-related questions. The example below shows a typical inference workflow; the max_seq_length, dtype, and load_in_4bit values at the top are example settings, not requirements.

# See demo-Llama3.1.ipynb for details
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template
from IPython.display import display, Markdown

max_seq_length = 4096  # example value; any length up to the model's context window
dtype = None           # None lets unsloth auto-detect (float16 or bfloat16)
load_in_4bit = True    # load 4-bit quantized weights to reduce memory use

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "LimYeri/CodeMind-Llama3.1-8B-unsloth",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

tokenizer = get_chat_template(
    tokenizer,
    chat_template = "llama-3.1",
)
FastLanguageModel.for_inference(model)  # enable unsloth's faster inference path

messages = [
    {"role": "system", "content": "You are a kind coding test teacher."},
    {"role": "user", "content": "Enter your coding problem or question here."},
]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize = True,
    add_generation_prompt = True,  # required so the model begins an assistant turn
    return_tensors = "pt",
).to("cuda")

outputs = model.generate(input_ids = inputs, max_new_tokens = 3000, use_cache = True,
                         temperature = 0.5, min_p = 0.3)  # adjust temperature and min_p as needed
# Keep only the assistant's reply from the decoded chat transcript
text = tokenizer.batch_decode(outputs)[0].split('assistant<|end_header_id|>\n\n')[1].strip()
display(Markdown(text))

LoRA Configuration

  • r: 16
  • lora_alpha: 16
  • lora_dropout: 0
  • bias: "none"
  • use_gradient_checkpointing: "unsloth"
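
For reference, here is a minimal sketch of how these hyperparameters map onto unsloth's FastLanguageModel.get_peft_model call. The target_modules list is the usual Llama projection set from the unsloth notebooks, an assumption rather than something stated in this card:

from unsloth import FastLanguageModel

# Sketch only: attaches a LoRA adapter using the hyperparameters listed above.
# target_modules is assumed from the standard unsloth Llama recipe, not from this card.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",  # unsloth's memory-saving checkpointing
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)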

Training Settings

  • Per Device Train Batch Size: 8
  • Gradient Accumulation Steps: 2
  • Warmup Steps: 200
  • Number of Training Epochs: 5
  • Learning Rate: 2e-4
  • fp16: not is_bfloat16_supported()
  • bf16: is_bfloat16_supported()
  • Logging Steps: 20
  • Optimizer: "adamw_8bit"
  • Weight Decay: 0.01
  • LR Scheduler Type: "linear"
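
Putting these settings together, a minimal sketch of the trainer setup, assuming the TRL SFTTrainer API used in the unsloth notebooks; dataset, the "text" field name, and output_dir are placeholders, not taken from this card:

from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

max_seq_length = 4096  # example value, matching the inference snippet above

# Sketch only: `dataset` and dataset_text_field are placeholders.
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    args = TrainingArguments(
        per_device_train_batch_size = 8,
        gradient_accumulation_steps = 2,
        warmup_steps = 200,
        num_train_epochs = 5,
        learning_rate = 2e-4,
        fp16 = not is_bfloat16_supported(),  # fall back to fp16 if bf16 unsupported
        bf16 = is_bfloat16_supported(),
        logging_steps = 20,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        output_dir = "outputs",
    ),
)
trainer.train()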

Evaluation Results (Open LLM Leaderboard)

Metric       Value
Average      22.17
IFEval       64.9
BBH          24.19
MATH Lvl 5    9.97
GPQA          1.9
MUSR          6.04
MMLU-PRO     26

Fine-Tuning Code

Detailed fine-tuning code and settings can be found in the CodeMind-Extended GitHub repository.
