Update README.md
README.md CHANGED
@@ -155,7 +155,7 @@ Our final models were trained on a different number of steps and sequence length
 
 <figure>
 
-<caption>Table 1. Evaluation made by the Barcelona Supercomputing Center of their models and BERTIN (beta, seq len 128), from their
+<caption>Table 1. Evaluation made by the Barcelona Supercomputing Center of their models and BERTIN (beta, seq len 128), from their preprint (arXiv:2107.07253).</caption>
 
 | Dataset     | Metric   | RoBERTa-b | RoBERTa-l | BETO   | mBERT  | BERTIN |
 |-------------|----------|-----------|-----------|--------|--------|--------|
@@ -187,7 +187,7 @@ All of our models attained good accuracy values during training in the masked-la
 
 </figure>
 
-###Downstream Tasks
+### Downstream Tasks
 
 We are currently in the process of applying our language models to downstream tasks.
 For simplicity, we will abbreviate the different models as follows:
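
The "Downstream Tasks" section touched by this diff refers to fine-tuning the BERTIN checkpoints on supervised tasks. A minimal sketch of that step with the Hugging Face `transformers` library, assuming the checkpoint is published on the Hub under an ID like `bertin-project/bertin-roberta-base-spanish` (an assumed ID here; check the project's model card for the exact name):

```python
# Sketch: loading a BERTIN checkpoint for a downstream
# text-classification task via Hugging Face transformers.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "bertin-project/bertin-roberta-base-spanish"  # assumed Hub ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, num_labels=2)

# Tokenize a Spanish example; max_length=128 matches the beta
# model's sequence length mentioned in the table caption above.
inputs = tokenizer(
    "Este es un ejemplo de entrada en español.",
    padding="max_length",
    truncation=True,
    max_length=128,
    return_tensors="pt",
)
outputs = model(**inputs)
print(outputs.logits.shape)  # (1, num_labels); untrained head until fine-tuned
```

The classification head is randomly initialized, so this model still needs fine-tuning (e.g. with `Trainer`) on labeled data before its predictions are meaningful.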