🌟 Check out the Taiwan-LLM Demo Chat-UI 🌟

Model Card for Taiwan LLM 13B v2.0 chat

Taiwan LLM is a language model tailored for Traditional Chinese, with a focus on the linguistic and cultural contexts of Taiwan. Built on a large base model, it is enriched with diverse Taiwanese textual sources and refined through supervised fine-tuning (SFT). The model is designed for language understanding and generation that align closely with Taiwan's cultural nuances, and it shows improved performance on benchmarks such as TC-Eval. For details on Taiwan LLM's development and features, refer to our technical report.

Model description

Model type: A 13B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
Language(s) (NLP): Primarily Traditional Chinese (zh-tw)
Finetuned from model: yentinglin/Taiwan-LLM-13B-v2.0-base

Model Sources

Repository: https://github.com/MiuLab/Taiwan-LLaMa
Demo: https://twllm.com/

Performance

TMMLUS+ score: 24.76727075757576

Intended uses

Here's how you can run the model using the pipeline() function from 🤗 Transformers:

# pip install "transformers>=4.34"  (quoted so the shell does not treat >= as a redirect)
# pip install accelerate

import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="yentinglin/Taiwan-LLM-13B-v2.0-chat", torch_dtype=torch.bfloat16, device_map="auto")

# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {
        "role": "system",
        "content": "你是一個人工智慧助理",
    },
    {"role": "user", "content": "東北季風如何影響台灣氣候？"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
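By default, the text-generation pipeline returns the prompt followed by the completion in `generated_text`, so the printed output above includes the formatted system and user messages. The following is a minimal sketch of separating the reply from the prompt; the helper name `extract_reply` and the sample strings are hypothetical and only illustrate the idea.

```python
def extract_reply(prompt: str, generated_text: str) -> str:
    """Return only the newly generated part of a pipeline output.

    Assumes generated_text begins with the exact prompt string, which is
    the pipeline's default behaviour (return_full_text=True).
    """
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    # Fall back to the full text if the prompt prefix is not present.
    return generated_text.strip()


# Hypothetical strings standing in for `prompt` and outputs[0]["generated_text"].
example_prompt = "PROMPT TEXT "
example_output = "PROMPT TEXT generated reply"
print(extract_reply(example_prompt, example_output))
```

Alternatively, passing `return_full_text=False` to the pipeline call should return only the completion, without any post-processing.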

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
distributed_type: multi-GPU
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.03
num_epochs: 5.0
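
To illustrate what a cosine schedule with a 0.03 warmup ratio and a peak learning rate of 5e-05 implies, here is a minimal sketch of the usual linear-warmup-then-cosine-decay curve. The total number of optimizer steps is not stated in this card, so `total_steps=1000` is a hypothetical value, and the function `cosine_lr` is written for illustration rather than taken from the training code.

```python
import math


def cosine_lr(step: int, total_steps: int,
              peak_lr: float = 5e-5, warmup_ratio: float = 0.03) -> float:
    """Learning rate at a given step: linear warmup, then cosine decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr over the warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# total_steps is a hypothetical value chosen only for this illustration.
for s in (0, 15, 30, 500, 1000):
    print(s, cosine_lr(s, total_steps=1000))
```

With these assumptions the rate climbs during the first 3% of steps, peaks at 5e-05, and then decays smoothly toward zero by the final step.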

Citation

If you find Taiwan LLM useful in your work, please cite it with:

@misc{lin2023taiwan,
      title={Taiwan LLM: Bridging the Linguistic Divide with a Culturally Aligned Language Model}, 
      author={Yen-Ting Lin and Yun-Nung Chen},
      year={2023},
      eprint={2311.17487},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Acknowledgement

Taiwan LLM v2 was developed in collaboration with Ubitus K.K., which provided valuable compute resources for the project.

Open LLM Leaderboard

Task	Version	Metric	Value		Stderr
leaderboard:arc:challenge:25	0	acc	0.5529	±	0.0145
		acc_norm	0.5862	±	0.0144
leaderboard:gsm8k:5	0	qem	0.3177	±	0.0128
leaderboard:hellaswag:10	0	acc	0.6307	±	0.0048
		acc_norm	0.8327	±	0.0037
leaderboard:mmlu:_average:5		acc	0.5483	±	0.0356
leaderboard:mmlu:abstract_algebra:5	0	acc	0.3400	±	0.0476
leaderboard:mmlu:anatomy:5	0	acc	0.5111	±	0.0432
leaderboard:mmlu:astronomy:5	0	acc	0.5789	±	0.0402
leaderboard:mmlu:business_ethics:5	0	acc	0.5100	±	0.0502
leaderboard:mmlu:clinical_knowledge:5	0	acc	0.6000	±	0.0302
leaderboard:mmlu:college_biology:5	0	acc	0.5764	±	0.0413
leaderboard:mmlu:college_chemistry:5	0	acc	0.4100	±	0.0494
leaderboard:mmlu:college_computer_science:5	0	acc	0.4500	±	0.0500
leaderboard:mmlu:college_mathematics:5	0	acc	0.3800	±	0.0488
leaderboard:mmlu:college_medicine:5	0	acc	0.5434	±	0.0380
leaderboard:mmlu:college_physics:5	0	acc	0.2941	±	0.0453
leaderboard:mmlu:computer_security:5	0	acc	0.7000	±	0.0461
leaderboard:mmlu:conceptual_physics:5	0	acc	0.4468	±	0.0325
leaderboard:mmlu:econometrics:5	0	acc	0.2719	±	0.0419
leaderboard:mmlu:electrical_engineering:5	0	acc	0.4552	±	0.0415
leaderboard:mmlu:elementary_mathematics:5	0	acc	0.3175	±	0.0240
leaderboard:mmlu:formal_logic:5	0	acc	0.3413	±	0.0424
leaderboard:mmlu:global_facts:5	0	acc	0.3700	±	0.0485
leaderboard:mmlu:high_school_biology:5	0	acc	0.6323	±	0.0274
leaderboard:mmlu:high_school_chemistry:5	0	acc	0.4581	±	0.0351
leaderboard:mmlu:high_school_computer_science:5	0	acc	0.5400	±	0.0501
leaderboard:mmlu:high_school_european_history:5	0	acc	0.6364	±	0.0376
leaderboard:mmlu:high_school_geography:5	0	acc	0.6970	±	0.0327
leaderboard:mmlu:high_school_government_and_politics:5	0	acc	0.7617	±	0.0307
leaderboard:mmlu:high_school_macroeconomics:5	0	acc	0.4974	±	0.0254
leaderboard:mmlu:high_school_mathematics:5	0	acc	0.3296	±	0.0287
leaderboard:mmlu:high_school_microeconomics:5	0	acc	0.5336	±	0.0324
leaderboard:mmlu:high_school_physics:5	0	acc	0.3709	±	0.0394
leaderboard:mmlu:high_school_psychology:5	0	acc	0.7468	±	0.0186
leaderboard:mmlu:high_school_statistics:5	0	acc	0.4074	±	0.0335
leaderboard:mmlu:high_school_us_history:5	0	acc	0.7108	±	0.0318
leaderboard:mmlu:high_school_world_history:5	0	acc	0.7046	±	0.0297
leaderboard:mmlu:human_aging:5	0	acc	0.6323	±	0.0324
leaderboard:mmlu:human_sexuality:5	0	acc	0.5878	±	0.0432
leaderboard:mmlu:international_law:5	0	acc	0.6694	±	0.0429
leaderboard:mmlu:jurisprudence:5	0	acc	0.7037	±	0.0441
leaderboard:mmlu:logical_fallacies:5	0	acc	0.6564	±	0.0373
leaderboard:mmlu:machine_learning:5	0	acc	0.3393	±	0.0449
leaderboard:mmlu:management:5	0	acc	0.7087	±	0.0450
leaderboard:mmlu:marketing:5	0	acc	0.8333	±	0.0244
leaderboard:mmlu:medical_genetics:5	0	acc	0.5400	±	0.0501
leaderboard:mmlu:miscellaneous:5	0	acc	0.7382	±	0.0157
leaderboard:mmlu:moral_disputes:5	0	acc	0.6127	±	0.0262
leaderboard:mmlu:moral_scenarios:5	0	acc	0.3788	±	0.0162
leaderboard:mmlu:nutrition:5	0	acc	0.6046	±	0.0280
leaderboard:mmlu:philosophy:5	0	acc	0.6270	±	0.0275
leaderboard:mmlu:prehistory:5	0	acc	0.6204	±	0.0270
leaderboard:mmlu:professional_accounting:5	0	acc	0.3582	±	0.0286
leaderboard:mmlu:professional_law:5	0	acc	0.3931	±	0.0125
leaderboard:mmlu:professional_medicine:5	0	acc	0.5184	±	0.0304
leaderboard:mmlu:professional_psychology:5	0	acc	0.5556	±	0.0201
leaderboard:mmlu:public_relations:5	0	acc	0.6818	±	0.0446
leaderboard:mmlu:security_studies:5	0	acc	0.6122	±	0.0312
leaderboard:mmlu:sociology:5	0	acc	0.7164	±	0.0319
leaderboard:mmlu:us_foreign_policy:5	0	acc	0.8200	±	0.0386
leaderboard:mmlu:virology:5	0	acc	0.4578	±	0.0388
leaderboard:mmlu:world_religions:5	0	acc	0.7661	±	0.0325
leaderboard:truthfulqa:mc:0	0	truthfulqa_mc1	0.2840	±	0.0158
		truthfulqa_mc2	0.4423	±	0.0146
leaderboard:winogrande:5	0	acc	0.7593	±	0.0120
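
When comparing rows in the table above, the Stderr column can be turned into an approximate confidence interval. The sketch below assumes the reported value is a standard error of the mean and uses a normal approximation (z = 1.96 for roughly 95% coverage); the helper `confidence_interval` is hypothetical and not part of any evaluation harness.

```python
def confidence_interval(value: float, stderr: float, z: float = 1.96):
    """Approximate confidence interval under a normal approximation."""
    half_width = z * stderr
    return value - half_width, value + half_width


# ARC-Challenge accuracy and standard error taken from the table above.
low, high = confidence_interval(0.5529, 0.0145)
print(f"arc:challenge acc: {low:.4f} to {high:.4f}")
```

Two scores whose intervals overlap heavily should not be read as a meaningful difference, which matters most for the smaller MMLU subtasks where the standard error is around 0.05.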

TC-Eval

Task	Version	Metric	Value		Stderr
community:tc-eval-v2:drcd:0	0	pem	0.6848	±	0.0079
		pqem	0.6799	±	0.0079
community:tc-eval-v2:penguin_table:0	0	acc	0.2361	±	0.0355
community:tc-eval-v2:_average:5		acc	0.3508	±	0.0318
community:tc-eval-v2:tmmluplus-accounting:5	0	acc	0.2565	±	0.0317
community:tc-eval-v2:tmmluplus-administrative_law:5	0	acc	0.2833	±	0.0220
community:tc-eval-v2:tmmluplus-advance_chemistry:5	0	acc	0.3333	±	0.0427
community:tc-eval-v2:tmmluplus-agriculture:5	0	acc	0.1987	±	0.0326
community:tc-eval-v2:tmmluplus-anti_money_laundering:5	0	acc	0.5597	±	0.0430
community:tc-eval-v2:tmmluplus-auditing:5	0	acc	0.2836	±	0.0192
community:tc-eval-v2:tmmluplus-basic_medical_science:5	0	acc	0.2841	±	0.0146
community:tc-eval-v2:tmmluplus-business_management:5	0	acc	0.4245	±	0.0421
community:tc-eval-v2:tmmluplus-chinese_language_and_literature:5	0	acc	0.2714	±	0.0316
community:tc-eval-v2:tmmluplus-clinical_psychology:5	0	acc	0.3840	±	0.0437
community:tc-eval-v2:tmmluplus-computer_science:5	0	acc	0.4195	±	0.0375
community:tc-eval-v2:tmmluplus-culinary_skills:5	0	acc	0.4589	±	0.0292
community:tc-eval-v2:tmmluplus-dentistry:5	0	acc	0.3885	±	0.0244
community:tc-eval-v2:tmmluplus-economics:5	0	acc	0.3053	±	0.0233
community:tc-eval-v2:tmmluplus-education:5	0	acc	0.4355	±	0.0447
community:tc-eval-v2:tmmluplus-education_(profession_level):5	0	acc	0.2819	±	0.0204
community:tc-eval-v2:tmmluplus-educational_psychology:5	0	acc	0.4489	±	0.0376
community:tc-eval-v2:tmmluplus-engineering_math:5	0	acc	0.2718	±	0.0441
community:tc-eval-v2:tmmluplus-finance_banking:5	0	acc	0.3037	±	0.0397
community:tc-eval-v2:tmmluplus-financial_analysis:5	0	acc	0.2801	±	0.0230
community:tc-eval-v2:tmmluplus-fire_science:5	0	acc	0.2500	±	0.0390
community:tc-eval-v2:tmmluplus-general_principles_of_law:5	0	acc	0.3113	±	0.0452
community:tc-eval-v2:tmmluplus-geography_of_taiwan:5	0	acc	0.4492	±	0.0180
community:tc-eval-v2:tmmluplus-human_behavior:5	0	acc	0.3883	±	0.0278
community:tc-eval-v2:tmmluplus-insurance_studies:5	0	acc	0.3487	±	0.0173
community:tc-eval-v2:tmmluplus-introduction_to_law:5	0	acc	0.3165	±	0.0303
community:tc-eval-v2:tmmluplus-jce_humanities:5	0	acc	0.3444	±	0.0504
community:tc-eval-v2:tmmluplus-junior_chemistry:5	0	acc	0.3158	±	0.0322
community:tc-eval-v2:tmmluplus-junior_chinese_exam:5	0	acc	0.4171	±	0.0374
community:tc-eval-v2:tmmluplus-junior_math_exam:5	0	acc	0.2286	±	0.0318
community:tc-eval-v2:tmmluplus-junior_science_exam:5	0	acc	0.3427	±	0.0326
community:tc-eval-v2:tmmluplus-junior_social_studies:5	0	acc	0.4683	±	0.0446
community:tc-eval-v2:tmmluplus-logic_reasoning:5	0	acc	0.2734	±	0.0379
community:tc-eval-v2:tmmluplus-macroeconomics:5	0	acc	0.3187	±	0.0230
community:tc-eval-v2:tmmluplus-management_accounting:5	0	acc	0.2977	±	0.0313
community:tc-eval-v2:tmmluplus-marketing_management:5	0	acc	0.4624	±	0.0520
community:tc-eval-v2:tmmluplus-mechanical:5	0	acc	0.4831	±	0.0462
community:tc-eval-v2:tmmluplus-music:5	0	acc	0.3993	±	0.0294
community:tc-eval-v2:tmmluplus-national_protection:5	0	acc	0.4929	±	0.0345
community:tc-eval-v2:tmmluplus-nautical_science:5	0	acc	0.2777	±	0.0191
community:tc-eval-v2:tmmluplus-occupational_therapy_for_psychological_disorders:5	0	acc	0.4438	±	0.0213
community:tc-eval-v2:tmmluplus-official_document_management:5	0	acc	0.3559	±	0.0322
community:tc-eval-v2:tmmluplus-optometry:5	0	acc	0.2804	±	0.0148
community:tc-eval-v2:tmmluplus-organic_chemistry:5	0	acc	0.3486	±	0.0459
community:tc-eval-v2:tmmluplus-pharmacology:5	0	acc	0.3397	±	0.0197
community:tc-eval-v2:tmmluplus-pharmacy:5	0	acc	0.2174	±	0.0209
community:tc-eval-v2:tmmluplus-physical_education:5	0	acc	0.3966	±	0.0367
community:tc-eval-v2:tmmluplus-physics:5	0	acc	0.2371	±	0.0434
community:tc-eval-v2:tmmluplus-politic_science:5	0	acc	0.3407	±	0.0150
community:tc-eval-v2:tmmluplus-real_estate:5	0	acc	0.3804	±	0.0509
community:tc-eval-v2:tmmluplus-secondary_physics:5	0	acc	0.3393	±	0.0449
community:tc-eval-v2:tmmluplus-statistics_and_machine_learning:5	0	acc	0.3438	±	0.0318
community:tc-eval-v2:tmmluplus-taiwanese_hokkien:5	0	acc	0.2636	±	0.0389
community:tc-eval-v2:tmmluplus-taxation:5	0	acc	0.2507	±	0.0224
community:tc-eval-v2:tmmluplus-technical:5	0	acc	0.4204	±	0.0247
community:tc-eval-v2:tmmluplus-three_principles_of_people:5	0	acc	0.5396	±	0.0424
community:tc-eval-v2:tmmluplus-trade:5	0	acc	0.2251	±	0.0187
community:tc-eval-v2:tmmluplus-traditional_chinese_medicine_clinical_medicine:5	0	acc	0.3094	±	0.0278
community:tc-eval-v2:tmmluplus-trust_practice:5	0	acc	0.3292	±	0.0235
community:tc-eval-v2:tmmluplus-ttqav2:5	0	acc	0.6726	±	0.0443
community:tc-eval-v2:tmmluplus-tve_chinese_language:5	0	acc	0.4161	±	0.0225
community:tc-eval-v2:tmmluplus-tve_design:5	0	acc	0.4542	±	0.0227
community:tc-eval-v2:tmmluplus-tve_mathematics:5	0	acc	0.2733	±	0.0365
community:tc-eval-v2:tmmluplus-tve_natural_sciences:5	0	acc	0.3349	±	0.0229
community:tc-eval-v2:tmmluplus-veterinary_pathology:5	0	acc	0.2544	±	0.0259
community:tc-eval-v2:tmmluplus-veterinary_pharmacology:5	0	acc	0.3259	±	0.0202
