---
pipeline_tag: text-generation
license: apache-2.0
language:
  - en
tags:
  - SOLAR-10.7B-v1.0
  - Open-platypus-Commercial
base_model: upstage/SOLAR-10.7B-v1.0
datasets:
  - kyujinpy/Open-platypus-Commercial
model-index:
  - name: T3Q-platypus-SOLAR-10.7B-v1.0
    results: []
---

**Update @ 2024.03.07**

# T3Q-platypus-SOLAR-10.7B-v1.0

This model is a fine-tuned version of [upstage/SOLAR-10.7B-v1.0](https://huggingface.co/upstage/SOLAR-10.7B-v1.0), trained on the [kyujinpy/Open-platypus-Commercial](https://huggingface.co/datasets/kyujinpy/Open-platypus-Commercial) dataset.

**Model Developers** Chihoon Lee (chlee10), T3Q
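
For reference, a minimal inference sketch using the standard `transformers` API. The repository id `chlee10/T3Q-platypus-SOLAR-10.7B-v1.0` and the Alpaca-style prompt are illustrative assumptions, not details taken from this card:

```python
# Minimal inference sketch. The repo id below is an assumption; adjust to the
# actual model path. The prompt format is likewise an Alpaca-style guess.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "chlee10/T3Q-platypus-SOLAR-10.7B-v1.0"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # ~21 GB at fp16 for a 10.7B model
    device_map="auto",
)

prompt = "### Instruction:\nExplain LoRA fine-tuning in one sentence.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```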

## Training hyperparameters

The following hyperparameters were used during training:

```bash
python finetune.py \
    --base_model upstage/SOLAR-10.7B-v1.0 \
    --data-path kyujinpy/Open-platypus-Commercial \
    --output_dir ./T3Q-platypus-SOLAR-10.7B-v1.0 \
    --batch_size 64 \
    --micro_batch_size 1 \
    --num_epochs 1 \
    --learning_rate 3e-5 \
    --cutoff_len 4096 \
    --val_set_size 0 \
    --lora_r 16 \
    --lora_alpha 16 \
    --lora_dropout 0.05 \
    --lora_target_modules '[q_proj, k_proj, v_proj, o_proj, gate_proj, down_proj, up_proj, lm_head]' \
    --train_on_inputs False \
    --add_eos_token False \
    --group_by_length False \
    --prompt_template_name user_prompt \
    --lr_scheduler 'cosine'
    # --warmup_steps 100
```
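
The LoRA flags above correspond roughly to the following `peft` configuration. This is a sketch of the equivalent `LoraConfig`, assuming the `peft` library is used, and is not taken from the card's actual `finetune.py`:

```python
# Sketch: the LoRA hyperparameters above expressed as a peft.LoraConfig.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

lora_config = LoraConfig(
    r=16,               # --lora_r
    lora_alpha=16,      # --lora_alpha
    lora_dropout=0.05,  # --lora_dropout
    target_modules=[    # --lora_target_modules
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "down_proj", "up_proj", "lm_head",
    ],
    bias="none",
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained("upstage/SOLAR-10.7B-v1.0")
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # adapters only; base weights stay frozen
```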

## Framework versions

- Transformers 4.34.1
- Pytorch 2.1.0+cu121
- Datasets 2.13.0
- Tokenizers 0.14.1
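
A quick sanity check that a local environment matches the versions listed above (the expected strings are copied from this list):

```python
# Verify the environment matches the versions this model was trained with.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.34.1",
    "torch": "2.1.0+cu121",
    "datasets": "2.13.0",
    "tokenizers": "0.14.1",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    have = installed[name]
    status = "OK" if have == want else f"MISMATCH (expected {want})"
    print(f"{name}: {have} {status}")
```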