Severine committed on
Commit
b7925d4
1 Parent(s): 5c9d117

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -5,7 +5,7 @@ language: sv
 # A Swedish Bert model

 ## Model description
-This model has the same architecture as the Bert Large model in [this paper](https://arxiv.org/abs/1810.04805). It was trained with a batch size of 512 in 600k steps. It is implemented with the Megatron Bert Architecture containing following parameters:
+This model follows the Bert Large model architecture as implemented in [Megatron-LM framework](https://github.com/NVIDIA/Megatron-LM). It was trained with a batch size of 512 in 600k steps. The model contains following parameters:
 <figure>

 | Hyperparameter | Value |
@@ -18,7 +18,7 @@ This model has the same architecture as the Bert Large model in [this paper](htt


 ## Training data
-The model is pretrained on a Swedish text corpus of around 80 GB from a variety of sources as shown below.
+The model is pretrained on a Swedish text corpus of around 85 GB from a variety of sources as shown below.
 <figure>

 | Dataset | Genre | Size(GB)|