uer committed on
Commit d5be715
1 Parent(s): 51f6e02

Update README.md

Files changed (1)
  1. README.md +5 -6
README.md CHANGED
@@ -6,11 +6,11 @@ widget:
 
 ---
 
-# Chinese RoBERTa Base Model for QA
+# Chinese RoBERTa-Base Model for QA
 
 ## Model description
 
-The model is used for extractive question answering. You can download the model from the link [roberta-base-chinese-extractive-qa](https://huggingface.co/uer/roberta-base-chinese-extractive-qa).
+The model is used for extractive question answering. You can download the model from the link [roberta-base-chinese-extractive-qa](https://huggingface.co/uer/roberta-base-chinese-extractive-qa).
 
 ## How to use
 
@@ -27,20 +27,19 @@ You can use the model directly with a pipeline for extractive question answering
 
 ## Training data
 
-Training data contains three datasets ,including [cmrc2018](https://github.com/ymcui/cmrc2018), [webqa](https://spaces.ac.cn/archives/4338) and [莱斯杯](https://www.kesci.com/home/competition/5d142d8cbb14e6002c04e14a/content/0).
+Training data comes from three sources: [cmrc2018](https://github.com/ymcui/cmrc2018), [webqa](https://spaces.ac.cn/archives/4338), and [laisi](https://www.kesci.com/home/competition/5d142d8cbb14e6002c04e14a/content/0). We only use the train sets of the three datasets.
 
 ## Training procedure
 
 The model is fine-tuned by [UER-py](https://github.com/dbiir/UER-py/) on [Tencent Cloud TI-ONE](https://cloud.tencent.com/product/tione/). We fine-tune three epochs with a sequence length of 512 on the basis of the pre-trained model [chinese_roberta_L-12_H-768](https://huggingface.co/uer/chinese_roberta_L-12_H-768).
 
 ```
-python3 run_cmrc.py --dataset_path lyric_dataset.pt \
-                    --pretrained_model_path models/cluecorpussmall_roberta_base_seq512_model.bin-250000 \
+python3 run_cmrc.py --pretrained_model_path models/cluecorpussmall_roberta_base_seq512_model.bin-250000 \
                     --vocab_path models/google_zh_vocab.txt \
                     --train_path extractive_qa.json \
                     --dev_path datasets/cmrc2018/dev.json \
                     --output_model_path models/extractive_qa_model.bin \
-                    --learning_rate 3e-5 --batch_size 32 --epochs_num 3 \
+                    --learning_rate 3e-5 --batch_size 32 --epochs_num 3 --seq_length 512 \
                     --embedding word_pos_seg --encoder transformer --mask fully_visible
 ```
 
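
One step the "Training procedure" section leaves implicit is converting the fine-tuned UER-py checkpoint into Hugging Face format so it can be loaded from the repository above. A minimal sketch, assuming UER-py's bundled conversion script keeps its current name and flags (the script path, output filename, and `--layers_num` value are assumptions, not part of this commit):

```
# Assumed UER-py conversion script; verify the name in the scripts/ directory of your UER-py checkout.
python3 scripts/convert_bert_extractive_qa_from_uer_to_huggingface.py --input_model_path models/extractive_qa_model.bin \
                                                                      --output_model_path pytorch_model.bin \
                                                                      --layers_num 12
```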
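
The second hunk's context line refers to the README's "How to use" section, which is about Transformers' question-answering pipeline. A minimal sketch of that usage; the question/context strings are illustrative assumptions, not text from this commit:

```
from transformers import pipeline

# Load the fine-tuned model and its tokenizer directly from the repository named above.
qa = pipeline("question-answering", model="uer/roberta-base-chinese-extractive-qa")

# Hypothetical question/context pair: "Who is the author of the famous poem
# 'If Life Deceives You'?" against a short passage about Pushkin.
result = qa(question="著名诗歌《假如生活欺骗了你》的作者是谁？",
            context="普希金是俄国著名的诗人，代表作有《假如生活欺骗了你》等。")
print(result)  # dict with 'answer', 'score', 'start', and 'end' keys
```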