- [Licensing information](#licensing-information)
- [Funding](#funding)
- [Disclaimer](#disclaimer)
</details>

## Model description
The longformer-base-4096-bne-es model is the [Longformer](https://huggingface.co/allenai/longformer-base-4096) version of the [roberta-base-bne](https://huggingface.co/PlanTL-GOB-ES/roberta-base-bne) masked language model for the Spanish language. Starting from the **roberta-base-bne** checkpoint, the model was further pre-trained with the masked language modeling (MLM) objective on long documents from our biomedical and clinical corpora.
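
As a minimal usage sketch (assuming the standard 🤗 Transformers `fill-mask` pipeline; the example sentence is invented and not from this card), the model can be queried for masked-token predictions:

```python
from transformers import pipeline

# Minimal sketch: load the checkpoint for masked-token prediction.
# The model id comes from the links in this card; the sentence is an invented example.
fill_mask = pipeline("fill-mask", model="PlanTL-GOB-ES/longformer-base-4096-bne-es")

# RoBERTa-style tokenizers use "<mask>" as the mask token.
for prediction in fill_mask("El paciente fue trasladado al <mask> tras la intervención."):
    print(prediction["token_str"], round(prediction["score"], 4))
```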
 
### Tokenization and pre-training
The training corpus was tokenized using the byte-level version of Byte-Pair Encoding (BPE) employed in the original [RoBERTa](https://arxiv.org/abs/1907.11692) model, with a vocabulary size of 50,262 tokens. Pre-training consisted of masked language model training following the approach used for RoBERTa base, and took 40 hours in total on 8 computing nodes, each equipped with 2 AMD MI50 GPUs with 32 GB of VRAM.
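
As an illustrative sketch of the tokenizer described above (the expected values in the comments follow from the figures in this card, not from running the code):

```python
from transformers import AutoTokenizer

# Load the byte-level BPE tokenizer; the model id comes from the links in this card.
tokenizer = AutoTokenizer.from_pretrained("PlanTL-GOB-ES/longformer-base-4096-bne-es")

print(len(tokenizer))              # vocabulary size, 50,262 per this card
print(tokenizer.model_max_length)  # Longformer context window, 4096 tokens

# Byte-level BPE falls back to subword pieces instead of an <unk> token:
print(tokenizer.tokenize("electroencefalografía"))
```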

## Evaluation

When fine-tuned on downstream tasks, this model achieved the following performance:

| Dataset      | Metric   | [**Longformer-base**](https://huggingface.co/PlanTL-GOB-ES/longformer-base-4096-bne-es) |
|--------------|----------|------------|
| MLDoc        | F1       | 0.9608 |
| CoNLL-NERC   | F1       | 0.8757 |
| CAPITEL-NERC | F1       | 0.8985 |
| PAWS-X       | F1       | 0.8878 |
| UD-POS       | F1       | 0.9903 |
| CAPITEL-POS  | F1       | 0.9853 |
| SQAC         | F1       | 0.8026 |
| STS          | Combined | 0.8338 |
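
The numbers above come from the authors' fine-tuning runs. As a purely hypothetical sketch of what such a fine-tuning setup could look like (the toy dataset, label count, and hyperparameters below are invented, not those used for the table):

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Hypothetical sketch only: toy data and hyperparameters, not the authors' setup.
model_id = "PlanTL-GOB-ES/longformer-base-4096-bne-es"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

# Toy document-classification data standing in for a corpus such as MLDoc.
data = Dataset.from_dict({
    "text": ["Informe clínico de un paciente con fiebre.",
             "Resumen de noticias de economía."],
    "label": [0, 1],
})

def tokenize(batch):
    # The 4096-token window lets whole documents be encoded without chunking.
    return tokenizer(batch["text"], truncation=True, max_length=4096)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-out", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=data.map(tokenize, batched=True),
    tokenizer=tokenizer,  # enables dynamic padding via the default data collator
)
trainer.train()
```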

## Additional information