julian-schelb committed
Commit f091ca6
1 Parent(s): b18e680
Update README.md
README.md CHANGED
@@ -24,6 +24,16 @@ datasets:
 
 ## Model description
 
+## Training data
+
+## Evaluation results
+
+This model achieves the following results (measured using the validation portion of the [wikiann](https://huggingface.co/datasets/wikiann) dataset):
+
+| Metric | Value |
+|:------:|:----:|
+| loss   | 87.6 |
+
 ## About RoBERTa
 
 This model is a fine-tuned version of [XLM-RoBERTa](https://huggingface.co/xlm-roberta-large). The original model was pre-trained on 2.5TB of filtered CommonCrawl data containing 100 languages. It was introduced in the paper [Unsupervised Cross-lingual Representation Learning at Scale](https://arxiv.org/abs/1911.02116) by Conneau et al. and first released in [this repository](https://github.com/pytorch/fairseq/tree/master/examples/xlmr).
@@ -38,10 +48,6 @@ This way, the model learns an inner representation of 100 languages that can the
 
 This model is limited by its training dataset of entity-annotated news articles from a specific span of time. This may not generalize well for all use cases in different domains.
 
-## Training data
-
-## Metrics
-
 ## Usage
 
 You can use this model with the AutoTokenizer and AutoModelForTokenClassification classes:
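The added "Evaluation results" section reports a single validation loss on wikiann. As a minimal sketch of how that split can be loaded for inspection, assuming the `datasets` library and the English configuration (the diff does not say which language subsets were used):

```python
from datasets import load_dataset

# Validation split of wikiann; "en" is an assumed configuration, since the
# card does not state which language subset(s) the loss was measured on.
wikiann_val = load_dataset("wikiann", "en", split="validation")

# Each example holds pre-tokenized words and integer NER tags.
example = wikiann_val[0]
print(example["tokens"])
print(example["ner_tags"])
```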
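The usage snippet itself is not included in this diff. A minimal sketch of what the card describes, assuming the checkpoint is published under the committer's namespace (the model ID below is a guess and may differ):

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Hypothetical model ID based on the committer's namespace; replace with
# the actual repository ID of this fine-tuned checkpoint.
model_id = "julian-schelb/roberta-ner-multilingual"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

text = "Angela Merkel visited the Louvre in Paris."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Map each token to its highest-scoring entity label.
predictions = logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, label_id in zip(tokens, predictions.tolist()):
    print(token, model.config.id2label[label_id])
```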