Update README.md
README.md
CHANGED
@@ -232,15 +232,15 @@ Table 3. Metrics for different downstream tasks, comparing our different models
 </figure>

-Table 4. Metrics for different downstream tasks, comparing our different models as well as other relevant BERT variations from the literature. Dataset for POS and NER is CoNLL 2002. POS, NER and PAWS-X used max length 512 and batch size 16. Batch size for XNLI is 16 too (max length 512). All models were fine-tuned for 5 epochs. Results marked with * indicate
+Table 4. Metrics for different downstream tasks, comparing our different models as well as other relevant BERT variations from the literature. Dataset for POS and NER is CoNLL 2002. POS, NER and PAWS-X used max length 512 and batch size 16. Batch size for XNLI is 16 too (max length 512). All models were fine-tuned for 5 epochs. Results marked with * indicate that more than one attempt was needed for convergence. The Stepwise checkpoint was at 204,000 steps during these tests.
 </caption>

 | Model        | POS (F1/Acc)         | NER (F1/Acc)        | PAWS-X (Acc) | XNLI (Acc) |
 |--------------|----------------------|---------------------|--------------|------------|
-| BERT-m       | 0.9630 / 0.9689 | 0.8616 / 0.9790 | 0.
+| BERT-m       | 0.9630 / 0.9689 | 0.8616 / 0.9790 | 0.8895* | 0.7606 |
 | BERT-wwm     | 0.9639 / 0.9693 | 0.8596 / 0.9790 | 0.8720* | **0.8012** |
-| BSC-BNE      | **0.9655 / 0.9706** | 0.8764 / 0.9818 | 0.5765* | 0.
-| Beta         | 0.9616 / 0.9669 | 0.8640 / 0.9799 | 0.
+| BSC-BNE      | **0.9655 / 0.9706** | 0.8764 / 0.9818 | 0.5765* | 0.7771* |
+| Beta         | 0.9616 / 0.9669 | 0.8640 / 0.9799 | 0.8670* | 0.7751* |
 | Random       | 0.9651 / 0.9700 | 0.8638 / 0.9802 | 0.8800* | 0.7795 |
 | Stepwise     | 0.9642 / 0.9693 | 0.8726 / 0.9818 | 0.8825* | 0.7799 |
 | Gaussian     | 0.9644 / 0.9692 | **0.8779 / 0.9820** | 0.8875* | 0.7843 |