chlee10 committed on
Commit
9a088d1
•
1 Parent(s): 7f72575

Update README.md

Files changed (1)
  1. README.md +27 -20
README.md CHANGED
@@ -26,26 +26,33 @@ This model is a fine-tuned version of upstage/SOLAR-10.7B-v1.0
  The following hyperparameters were used during training:
 
  ```python
- python finetune.py \
- --base_model PracticeLLM/Twice-KoSOLAR-16.1B-test \
- --data-path kyujinpy/KOR-OpenOrca-Platypus-v3 \
- --output_dir ./Twice-KoSOLAR-16.1B-instruct-test \
- --batch_size 64 \
- --micro_batch_size 1 \
- --num_epochs 1 \
- --learning_rate 3e-5 \
- --cutoff_len 4096 \
- --val_set_size 0 \
- --lora_r 16 \
- --lora_alpha 16 \
- --lora_dropout 0.05 \
- --lora_target_modules '[q_proj, k_proj, v_proj, o_proj, gate_proj, down_proj, up_proj, lm_head]' \
- --train_on_inputs False \
- --add_eos_token False \
- --group_by_length False \
- --prompt_template_name user_prompt \
- --lr_scheduler 'cosine' \
- #--warmup_steps 100 \
+ # Hyperparameters related to the dataset and number of training epochs
+ batch_size = 16
+ num_epochs = 1
+ micro_batch = 1
+ gradient_accumulation_steps = batch_size // micro_batch
+
+ # Hyperparameters for the training method
+ cutoff_len = 4096
+ lr_scheduler = 'cosine'
+ warmup_ratio = 0.06  # warmup_steps = 100
+ learning_rate = 4e-4
+ optimizer = 'adamw_torch'
+ weight_decay = 0.01
+ max_grad_norm = 1.0
+
+ # LoRA config
+ lora_r = 16
+ lora_alpha = 16
+ lora_dropout = 0.05
+ lora_target_modules = ["gate_proj", "down_proj", "up_proj"]
+
+ # Options for the input values produced by the tokenizer
+ train_on_inputs = False
+ add_eos_token = False
+
+ # NEFTune params
+ noise_alpha: int = 5
  ```
 
  ## Framework versions
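
The updated README lists these values as bare assignments and does not show the training script that consumes them. As a point of reference only, the sketch below shows one way the same values could be expressed with the Hugging Face `peft` and `transformers` APIs; the `LoraConfig`/`TrainingArguments` mapping, the `bias` and `task_type` settings, and the `./output` path are illustrative assumptions, not the author's actual setup.

```python
# Minimal sketch: mapping the README's values onto peft / transformers.
# This is NOT the author's finetune.py; names and defaults below that are
# not in the README (bias, task_type, output_dir) are assumptions.
from peft import LoraConfig
from transformers import TrainingArguments

batch_size = 16
micro_batch = 1
gradient_accumulation_steps = batch_size // micro_batch  # 16 // 1 = 16

lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["gate_proj", "down_proj", "up_proj"],
    bias="none",              # assumed default
    task_type="CAUSAL_LM",    # assumed default
)

training_args = TrainingArguments(
    output_dir="./output",    # placeholder path
    per_device_train_batch_size=micro_batch,
    gradient_accumulation_steps=gradient_accumulation_steps,
    num_train_epochs=1,
    learning_rate=4e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.06,
    optim="adamw_torch",
    weight_decay=0.01,
    max_grad_norm=1.0,
    neftune_noise_alpha=5,    # NEFTune; available in recent transformers releases
)
```

In this reading, `cutoff_len` would correspond to the maximum sequence length applied at tokenization time, and `train_on_inputs` / `add_eos_token` would be handled in the data-collation step rather than in `TrainingArguments`.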