AI-Nordics/bert-large-swedish-cased

Severine committed on Feb 15, 2022

Commit b7925d4 • 1 Parent(s): 5c9d117

Update README.md

Files changed (1)

README.md +2 -2

README.md CHANGED

@@ -5,7 +5,7 @@ language: sv
 # A Swedish Bert model
 ## Model description
-This model has the same architecture as the Bert Large model in [this paper](https://arxiv.org/abs/1810.04805). It was trained with a batch size of 512 in 600k steps. It is implemented with the Megatron Bert Architecture containing following parameters:
+This model follows the Bert Large model architecture as implemented in [Megatron-LM framework](https://github.com/NVIDIA/Megatron-LM). It was trained with a batch size of 512 in 600k steps. The model contains following parameters:
 <figure>
 | Hyperparameter       | Value      |
@@ -18,7 +18,7 @@ This model has the same architecture as the Bert Large model in [this paper](htt
 ## Training data
-The model is pretrained on a Swedish text corpus of around 80 GB from a variety of sources as shown below.
+The model is pretrained on a Swedish text corpus of around 85 GB from a variety of sources as shown below.
 <figure>
 | Dataset       | Genre      | Size(GB)|