Update README.md
@@ -5,7 +5,7 @@ language: sv
 # A Swedish Bert model
 
 ## Model description
-This model
+This model follows the Bert Large model architecture as implemented in the [Megatron-LM framework](https://github.com/NVIDIA/Megatron-LM). It was trained with a batch size of 512 for 600k steps. The model contains the following parameters:
 <figure>
 
 | Hyperparameter | Value |
@@ -18,7 +18,7 @@ This model has the same architecture as the Bert Large model in [this paper](htt
 
 
 ## Training data
-The model is pretrained on a Swedish text corpus of around
+The model is pretrained on a Swedish text corpus of around 85 GB from a variety of sources, as shown below.
 <figure>
 
 | Dataset | Genre | Size(GB)|
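The architecture described above (Bert Large shape, batch size 512, 600k steps) implies a parameter count that can be sanity-checked. Below is a minimal sketch, assuming the standard BERT Large dimensions (24 layers, hidden size 1024, feed-forward size 4096, 512 positions) and, as a placeholder only, the English BERT vocabulary size of 30522 — the Swedish model's actual vocabulary size is not stated in this README:

```python
def bert_param_count(vocab_size=30522, hidden=1024, layers=24,
                     ffn=4096, max_positions=512, segments=2):
    """Approximate parameter count of a BERT encoder (pooler and MLM head omitted)."""
    # Token, position, and segment embeddings, plus the embedding LayerNorm.
    embeddings = (vocab_size + max_positions + segments) * hidden + 2 * hidden
    # Self-attention: Q, K, V, and output projections (weights + biases).
    attention = 4 * (hidden * hidden + hidden)
    # Feed-forward block: hidden -> ffn -> hidden (weights + biases).
    feed_forward = hidden * ffn + ffn + ffn * hidden + hidden
    # Two LayerNorms per encoder layer (scale + shift each).
    layer_norms = 2 * 2 * hidden
    per_layer = attention + feed_forward + layer_norms
    return embeddings + layers * per_layer

print(f"{bert_param_count() / 1e6:.0f}M parameters")  # prints "334M parameters"
```

This lands near the commonly quoted ~340M figure for BERT Large; the gap is the pooler and masked-language-model head, which the sketch leaves out.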