KF-DeBERTa

KakaoBank and FnGuide are releasing a language model specialized for the financial domain.

Model description

  • KF-DeBERTa is a language model trained jointly on a general-domain corpus and a financial-domain corpus.
  • The model architecture is based on DeBERTa-v2.
    • DeBERTa-v3, which uses ELECTRA's RTD as its training objective, performed considerably worse on some tasks (KLUE-RE, WoS, Retrieval), so DeBERTa-v2 was chosen as the final architecture.
  • The model performs strongly on both general-domain and financial-domain downstream tasks.
    • To validate financial-domain downstream performance thoroughly, the model was evaluated on a variety of datasets.
    • It outperformed existing language models in both the general and financial domains; on the KLUE Benchmark in particular, it outperformed RoBERTa-Large.

Usage

from transformers import AutoModel, AutoTokenizer

# Load the pretrained encoder and its tokenizer from the Hugging Face Hub.
model = AutoModel.from_pretrained("kakaobank/kf-deberta-base")
tokenizer = AutoTokenizer.from_pretrained("kakaobank/kf-deberta-base")

# Sample sentence: "KakaoBank and FnGuide release a finance-specialized language model."
text = "카카오뱅크와 에프엔가이드가 금융특화 언어모델을 공개합니다."
tokens = tokenizer.tokenize(text)
print(tokens)

# Encode the sentence as PyTorch tensors and run a forward pass.
inputs = tokenizer(text, return_tensors="pt")
model_output = model(**inputs)
print(model_output)

Benchmark

  • For every task, only the following basic hyperparameter search was performed.
    • batch size: {16, 32}
    • learning_rate: {1e-5, 3e-5, 5e-5}
    • weight_decay: {0, 0.01}
    • warmup_proportion: {0, 0.1}
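
The search space above can be enumerated as a full grid. This is a minimal sketch, assuming an exhaustive grid search; the source lists only the candidate values, not the search procedure. The names `SEARCH_SPACE` and `iter_configs` are hypothetical, and `batch_size` is written with an underscore so it can serve as a key.

```python
from itertools import product

# Candidate values copied from the list above.
SEARCH_SPACE = {
    "batch_size": [16, 32],
    "learning_rate": [1e-5, 3e-5, 5e-5],
    "weight_decay": [0, 0.01],
    "warmup_proportion": [0, 0.1],
}


def iter_configs(space):
    """Yield one dict per combination of candidate values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


configs = list(iter_configs(SEARCH_SPACE))
print(len(configs))  # 2 * 3 * 2 * 2 combinations
print(configs[0])
```

Each yielded dict can be passed to whatever fine-tuning routine is in use; under the exhaustive-grid assumption that is 24 runs per task.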

KLUE Benchmark

| Model | YNAT (F1) | KLUE-ST (Pearsonr/F1) | KLUE-NLI (ACC) | KLUE-NER (F1-Entity/F1-Char) | KLUE-RE (F1-micro/AUC) | KLUE-DP (UAS/LAS) | KLUE-MRC (EM/ROUGE) | WoS (JGA/F1-S) | AVG |
|---|---|---|---|---|---|---|---|---|---|
| mBERT (Base) | 82.64 | 82.97/75.93 | 72.90 | 75.56/88.81 | 58.39/56.41 | 88.53/86.04 | 49.96/55.57 | 35.27/88.60 | 71.26 |
| XLM-R (Base) | 84.52 | 88.88/81.20 | 78.23 | 80.48/92.14 | 57.62/57.05 | 93.12/87.23 | 26.76/53.36 | 41.54/89.81 | 72.28 |
| XLM-R (Large) | 87.30 | 93.08/87.17 | 86.40 | 82.18/93.20 | 58.75/63.53 | 92.87/87.82 | 35.23/66.55 | 42.44/89.88 | 76.17 |
| KR-BERT (Base) | 85.36 | 87.50/77.92 | 77.10 | 74.97/90.46 | 62.83/65.42 | 92.87/87.13 | 48.95/58.38 | 45.60/90.82 | 74.67 |
| KoELECTRA (Base) | 85.99 | 93.14/85.89 | 86.87 | 86.06/92.75 | 62.67/57.46 | 90.93/87.07 | 59.54/65.64 | 39.83/88.91 | 77.34 |
| KLUE-BERT (Base) | 86.95 | 91.01/83.44 | 79.87 | 83.71/91.17 | 65.58/68.11 | 93.07/87.25 | 62.42/68.15 | 46.72/91.59 | 78.50 |
| KLUE-RoBERTa (Small) | 85.95 | 91.70/85.42 | 81.00 | 83.55/91.20 | 61.26/60.89 | 93.47/87.50 | 58.25/63.56 | 46.65/91.50 | 77.28 |
| KLUE-RoBERTa (Base) | 86.19 | 92.91/86.78 | 86.30 | 83.81/91.09 | 66.73/68.11 | 93.75/87.77 | 69.56/74.64 | 47.41/91.60 | 80.48 |
| KLUE-RoBERTa (Large) | 85.88 | 93.20/86.13 | 89.50 | 84.54/91.45 | 71.06/73.33 | 93.84/87.93 | 75.26/80.30 | 49.39/92.19 | 82.43 |
| KF-DeBERTa (Base) | 87.51 | 93.24/87.73 | 88.37 | 89.17/93.30 | 69.70/75.07 | 94.05/87.97 | 72.59/78.08 | 50.21/92.59 | 82.83 |
  • Bold marks the highest score among all models; underline marks the highest score among base models.
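
The marking rule can be reproduced from the table values. The sketch below applies it to the KLUE-NLI column only, and assumes that a "base model" is any row whose name ends in "(Base)"; the helper `best` and the dict name are hypothetical.

```python
# KLUE-NLI accuracy copied from the KLUE Benchmark table.
KLUE_NLI_ACC = {
    "mBERT (Base)": 72.90,
    "XLM-R (Base)": 78.23,
    "XLM-R (Large)": 86.40,
    "KR-BERT (Base)": 77.10,
    "KoELECTRA (Base)": 86.87,
    "KLUE-BERT (Base)": 79.87,
    "KLUE-RoBERTa (Small)": 81.00,
    "KLUE-RoBERTa (Base)": 86.30,
    "KLUE-RoBERTa (Large)": 89.50,
    "KF-DeBERTa (Base)": 88.37,
}


def best(scores, only_base=False):
    """Return the (model, score) pair with the highest score."""
    candidates = {
        model: score
        for model, score in scores.items()
        # Assumption: base models are the rows labelled "(Base)".
        if not only_base or model.endswith("(Base)")
    }
    return max(candidates.items(), key=lambda item: item[1])


best_overall = best(KLUE_NLI_ACC)               # would be shown in bold
best_base = best(KLUE_NLI_ACC, only_base=True)  # would be underlined
print(best_overall)
print(best_base)
```

On this column the two marks fall on different rows: KLUE-RoBERTa (Large) is highest overall, while KF-DeBERTa (Base) is highest among the base models.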

Financial-domain benchmark

| Model | FN-Sentiment (v1) (ACC) | FN-Sentiment (v2) (ACC) | FN-Adnews (ACC) | FN-NER (F1-micro) | KorFPB (ACC) | KorFiQA-SA (MSE) | KorHeadline (Mean F1) | Avg (excl. FiQA-SA) |
|---|---|---|---|---|---|---|---|---|
| KLUE-RoBERTa (Base) | 98.26 | 91.21 | 96.34 | 90.31 | 90.97 | 0.0589 | 81.11 | 94.03 |
| KoELECTRA (Base) | 98.26 | 90.56 | 96.98 | 89.81 | 92.36 | 0.0652 | 80.69 | 93.90 |
| KF-DeBERTa (Base) | 99.36 | 92.29 | 97.63 | 91.80 | 93.47 | 0.0553 | 82.12 | 95.27 |
  • FN-Sentiment: financial-domain sentiment analysis
  • FN-Adnews: financial-domain advertorial news classification
  • FN-NER: financial-domain named entity recognition
  • KorFPB: Korean translation of FinancialPhraseBank
    • Cite: Malo, Pekka, et al. "Good debt or bad debt: Detecting semantic orientations in economic texts." Journal of the Association for Information Science and Technology 65.4 (2014): 782-796.
  • KorFiQA-SA: Korean translation of FiQA-SA
    • Cite: Maia, Macedo & Handschuh, Siegfried & Freitas, Andre & Davis, Brian & McDermott, Ross & Zarrouk, Manel & Balahur, Alexandra. (2018). WWW'18 Open Challenge: Financial Opinion Mining and Question Answering. WWW '18: Companion Proceedings of the The Web Conference 2018. 1941-1942. 10.1145/3184558.3192301.
  • KorHeadline: Korean translation of Gold Commodity News and Dimensions
    • Cite: Sinha, A., & Khandait, T. (2021, April). Impact of News on the Commodity Market: Dataset and Results. In Future of Information and Communication Conference (pp. 589-601). Springer, Cham.

General-domain benchmark

| Model | NSMC (ACC) | PAWS (ACC) | KorNLI (ACC) | KorSTS (spearman) | KorQuAD (EM/F1) | Avg (excl. KorQuAD) |
|---|---|---|---|---|---|---|
| KLUE-RoBERTa (Base) | 90.47 | 84.79 | 81.65 | 84.40 | 86.34/94.40 | 85.33 |
| KoELECTRA (Base) | 90.63 | 84.45 | 82.24 | 85.53 | 84.83/93.45 | 85.71 |
| KF-DeBERTa (Base) | 91.36 | 86.14 | 84.54 | 85.99 | 86.60/95.07 | 87.01 |
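
The Avg column of the general-domain table can be checked by hand. This sketch assumes it is the unweighted mean of the four non-KorQuAD scores, rounded to two decimals; the source does not state the formula, but that assumption reproduces the reported values. The names `SCORES` and `average` are hypothetical.

```python
# Scores copied from the general-domain table; KorQuAD is left out of the average.
SCORES = {
    "KLUE-RoBERTa (Base)": {"NSMC": 90.47, "PAWS": 84.79, "KorNLI": 81.65, "KorSTS": 84.40},
    "KoELECTRA (Base)": {"NSMC": 90.63, "PAWS": 84.45, "KorNLI": 82.24, "KorSTS": 85.53},
    "KF-DeBERTa (Base)": {"NSMC": 91.36, "PAWS": 86.14, "KorNLI": 84.54, "KorSTS": 85.99},
}


def average(task_scores):
    """Unweighted mean of the task scores, rounded to two decimals."""
    return round(sum(task_scores.values()) / len(task_scores), 2)


averages = {model: average(tasks) for model, tasks in SCORES.items()}
for model, avg in averages.items():
    print(f"{model}: {avg:.2f}")
```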

License

The KF-DeBERTa source code and model are released under the MIT License.
The full license text is available in the MIT file.
We accept no liability for any damages arising from use of the model.

Citation

@inproceedings{jeon-etal-2023-kfdeberta,
  title         = {KF-DeBERTa: Financial Domain-specific Pre-trained Language Model},
  author        = {Eunkwang Jeon and Jungdae Kim and Minsang Song and Joohyun Ryu},
  booktitle     = {Proceedings of the 35th Annual Conference on Human and Cognitive Language Technology},
  month         = {oct},
  year          = {2023},
  publisher     = {Korean Institute of Information Scientists and Engineers},
  url           = {http://www.hclt.kr/symp/?lnb=conference},
  pages         = {143--148},
}