Commit de1e105 by quantumaikr • 1 Parent(s): 6d0ad17

Update README.md

Files changed (1)
  1. README.md +137 -1
README.md CHANGED
@@ -8,6 +8,12 @@ tags:
  - llama
  ---

  # KoreanLM: Korean Language Model Project

  KoreanLM is an open-source project for developing Korean language models. Most current language models focus on English, so they are trained on relatively little Korean data and often tokenize Korean text inefficiently. The KoreanLM project was started to address these problems and provide a language model optimized for Korean.
@@ -19,4 +25,134 @@ KoreanLM is an open-source project for developing Korean language models

  2. Introduce an efficient tokenization method: adopt a new tokenization method that analyzes Korean text efficiently and accurately, improving language model performance.

- 3. Improve the usability of large language models: today's very large language models are difficult for companies to fine-tune on their own data. To address this, we adjust the size of the Korean language model to improve usability and make it easier to apply to natural language processing tasks.
  - llama
  ---

+
+ <p align="center" width="100%">
+ <img src="https://i.imgur.com/snFDU0P.png" alt="KoreanLM icon" style="width: 500px; display: block; margin: auto; border-radius: 10%;">
+ </p>
+
+
  # KoreanLM: Korean Language Model Project

  KoreanLM is an open-source project for developing Korean language models. Most current language models focus on English, so they are trained on relatively little Korean data and often tokenize Korean text inefficiently. The KoreanLM project was started to address these problems and provide a language model optimized for Korean.
 
  2. Introduce an efficient tokenization method: adopt a new tokenization method that analyzes Korean text efficiently and accurately, improving language model performance.

+ 3. Improve the usability of large language models: today's very large language models are difficult for companies to fine-tune on their own data. To address this, we adjust the size of the Korean language model to improve usability and make it easier to apply to natural language processing tasks.
+
+
+ ## Usage
+
+ KoreanLM is distributed through its GitHub repository. To use the project, install it as follows.
+
+ ```bash
+ git clone https://github.com/quantumaikr/KoreanLM.git
+ cd KoreanLM
+ pip install -r requirements.txt
+ ```
+
+ ## Example
+
+ The following example loads the model and tokenizer with the transformers library.
+
+ ```python
+ import transformers
+
+ # Load the KoreanLM weights and the matching tokenizer from the Hugging Face Hub.
+ model = transformers.AutoModelForCausalLM.from_pretrained("quantumaikr/KoreanLM")
+ tokenizer = transformers.AutoTokenizer.from_pretrained("quantumaikr/KoreanLM")
+ ```
+
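Once the model and tokenizer are loaded, generating text might look like the minimal sketch below. The helper names `generate_text` and `demo`, the prompt, and `max_new_tokens=64` are illustrative assumptions and do not come from the repository; only `from_pretrained`, `generate`, and `decode` are standard transformers calls.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=64):
    """Generate a continuation of `prompt` (hypothetical helper, not from the repo)."""
    # Tokenize the prompt into PyTorch tensors.
    inputs = tokenizer(prompt, return_tensors="pt")
    # Pass only the fields the model needs; max_new_tokens bounds the continuation length.
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
    )
    # Decode the first (and only) sequence, dropping special tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def demo():
    """Load the model named in the README and print one completion."""
    import transformers  # imported lazily so generate_text has no hard dependency

    name = "quantumaikr/KoreanLM"  # model ID taken from the README
    model = transformers.AutoModelForCausalLM.from_pretrained(name)
    tokenizer = transformers.AutoTokenizer.from_pretrained(name)
    print(generate_text(model, tokenizer, "한국어 언어모델이란"))
```

Calling `demo()` downloads the weights on first use, so it is left uncalled here. Decoding settings such as sampling or beam search can be passed to `generate` in the same way as `max_new_tokens`.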
+
+ ## Training (fine-tuning)
+
+ ```bash
+ torchrun --nproc_per_node=4 --master_port=1004 train.py \
+     --model_name_or_path quantumaikr/KoreanLM \
+     --data_path korean_data.json \
+     --num_train_epochs 3 \
+     --cache_dir './data' \
+     --bf16 True \
+     --tf32 True \
+     --per_device_train_batch_size 4 \
+     --per_device_eval_batch_size 4 \
+     --gradient_accumulation_steps 8 \
+     --evaluation_strategy "no" \
+     --save_strategy "steps" \
+     --save_steps 500 \
+     --save_total_limit 1 \
+     --learning_rate 2e-5 \
+     --weight_decay 0. \
+     --warmup_ratio 0.03 \
+     --lr_scheduler_type "cosine" \
+     --logging_steps 1 \
+     --fsdp "full_shard auto_wrap" \
+     --fsdp_transformer_layer_cls_to_wrap 'OPTDecoderLayer'
+ ```
+
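The effective global batch size implied by the command above follows from three of its flags. The arithmetic sketch below makes that explicit; the function name and the 50,000-example dataset size are hypothetical, and the step counts are rough estimates rather than the exact numbers a trainer would report.

```python
import math


def effective_batch_size(num_gpus, per_device_batch, grad_accum_steps):
    """Number of training examples consumed per optimizer step."""
    return num_gpus * per_device_batch * grad_accum_steps


# Values taken from the torchrun command above:
# --nproc_per_node=4, --per_device_train_batch_size 4, --gradient_accumulation_steps 8
global_batch = effective_batch_size(num_gpus=4, per_device_batch=4, grad_accum_steps=8)
print(global_batch)  # 128

# Hypothetical dataset size, for illustration only.
num_examples = 50_000
steps_per_epoch = num_examples // global_batch    # rough estimate
total_steps = steps_per_epoch * 3                 # --num_train_epochs 3
warmup_steps = math.ceil(total_steps * 0.03)      # --warmup_ratio 0.03
print(steps_per_epoch, total_steps, warmup_steps)  # 390 1170 36
```

With fewer GPUs, raising `--gradient_accumulation_steps` by the same factor keeps the effective batch size at 128.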
+ ```bash
+ pip install deepspeed
+ torchrun --nproc_per_node=4 --master_port=1004 train.py \
+     --deepspeed "./deepspeed.json" \
+     --model_name_or_path quantumaikr/KoreanLM \
+     --data_path korean_data.json \
+     --num_train_epochs 3 \
+     --cache_dir './data' \
+     --bf16 True \
+     --tf32 True \
+     --per_device_train_batch_size 4 \
+     --per_device_eval_batch_size 4 \
+     --gradient_accumulation_steps 8 \
+     --evaluation_strategy "no" \
+     --save_strategy "steps" \
+     --save_steps 2000 \
+     --save_total_limit 1 \
+     --learning_rate 2e-5 \
+     --weight_decay 0. \
+     --warmup_ratio 0.03
+ ```
+
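The DeepSpeed command above expects a `./deepspeed.json` file whose contents are not shown here. As a rough sketch only, the script below writes a minimal ZeRO stage 2 configuration. The choice of stage 2, the function names, and every value are assumptions rather than the repository's actual settings. It also assumes that `train.py` uses the Hugging Face Trainer, as its flags suggest, in which case the `"auto"` placeholders are filled in from the command-line arguments.

```python
import json


def build_deepspeed_config():
    """Return a minimal, assumed DeepSpeed config (not the repository's file)."""
    return {
        "bf16": {"enabled": "auto"},            # follows --bf16
        "zero_optimization": {"stage": 2},      # assumed: shard optimizer state and gradients
        "gradient_accumulation_steps": "auto",  # follows --gradient_accumulation_steps
        "gradient_clipping": "auto",
        "train_batch_size": "auto",
        "train_micro_batch_size_per_gpu": "auto",  # follows --per_device_train_batch_size
    }


def write_deepspeed_config(path="deepspeed.json"):
    """Write the config as JSON and return the path that was written."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(build_deepspeed_config(), f, indent=2)
    return path
```

Calling `write_deepspeed_config("./deepspeed.json")` produces a file at the path the command refers to. If the repository ships its own `deepspeed.json`, that file should take precedence over this sketch.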
+ ## Training (LoRA)
+
+ ```bash
+ python finetune-lora.py \
+     --base_model 'quantumaikr/KoreanLM' \
+     --data_path './korean_data.json' \
+     --output_dir './KoreanLM-LoRA' \
+     --cache_dir './data'
+ ```
+
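The training commands above read `korean_data.json` through `--data_path`, but its format is not described here. Because the repository credits Stanford Alpaca and alpaca-lora, one plausible layout is an Alpaca-style list of `instruction` / `input` / `output` records. This is an assumption, so check the repository's own data loader before relying on it. The sample record and the helper names are made up for illustration.

```python
import json

# Assumed Alpaca-style schema; not confirmed by the README.
REQUIRED_KEYS = {"instruction", "input", "output"}


def validate_records(records):
    """Return the indices of records that lack any of the assumed keys."""
    return [i for i, record in enumerate(records) if not REQUIRED_KEYS <= set(record)]


def save_dataset(records, path):
    """Validate the records, then write them as UTF-8 JSON."""
    bad = validate_records(records)
    if bad:
        raise ValueError(f"records missing required keys: {bad}")
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps Korean text readable in the file.
        json.dump(records, f, ensure_ascii=False, indent=2)


# Made-up sample record.
sample = [
    {
        "instruction": "다음 문장을 영어로 번역하세요.",
        "input": "안녕하세요.",
        "output": "Hello.",
    }
]
```

`save_dataset(sample, "korean_data.json")` would then write a file in this assumed layout.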
+ ## Inference
+
+ ```bash
+ python generate.py \
+     --load_8bit \
+     --share_gradio \
+     --base_model 'quantumaikr/KoreanLM' \
+     --lora_weights 'quantumaikr/KoreanLM-LoRA' \
+     --cache_dir './data'
+ ```
+
+ ## Pretrained model release and web demo
+
+ [Trained model](https://huggingface.co/quantumaikr/KoreanLM/tree/main)
+
+ [Web demo](https://ntcswxk8f4m9zf-7860.proxy.runpod.net/)
+
+
+
+
+ ## Contributing
+
+ 1. Report issues: open an issue for any problem or possible improvement related to the KoreanLM project.
+
+ 2. Write code: you can write code to make improvements or add new features. Please submit it as a Pull Request.
+
+ 3. Write and translate documentation: help raise the quality of the project by contributing to its documentation or translations.
+
+ 4. Test and give feedback: reports of bugs or possible improvements you find while using the project are a great help.
+
+ ## License
+
+ The KoreanLM project is released under the Apache 2.0 License. Please observe the terms of the license when using the project.
+
+
+ ## Technical inquiries
+
+ If you have questions about the KoreanLM project, please contact us by email or through a GitHub issue. We hope this project helps research and development on Korean language models, and we welcome your interest and participation.
+
+
+ Email: [email protected]
+
+
+ ---
+
+ This repository has implementations inspired by [open_llama](https://github.com/openlm-research/open_llama), [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca) and [alpaca-lora](https://github.com/tloen/alpaca-lora) projects.