timpal0l committed on
Commit
99f6c76
1 Parent(s): 2517300

Update README.md

Files changed (1)
  1. README.md +7 -1
README.md CHANGED
@@ -71,4 +71,10 @@ print(response[0]["generated_text"].split("<s>Bot: ")[-1])
  ```
 
  ## Training & Data:
- The training was done on 1 NVIDIA DGX using DeepSpeed ZeRO 3 for three epochs on roughly 4GB of carefully selected translation data. It is a full finetune of all of the model parameters.
+ The training was done on 1 NVIDIA DGX using DeepSpeed ZeRO 3 for three epochs on roughly 4GB of carefully selected translation data. It is a full finetune of all of the model parameters.
+
+ | Epoch | Training Loss | Evaluation Loss |
+ |-------|---------------|-----------------|
+ | 1 | 1.309 | 1.281 |
+ | 2 | 1.161 | 1.242 |
+ | 3 | 1.053 | 1.219 |
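
One way to read the loss table added in this commit is to turn each evaluation loss into a perplexity and compare it with the training loss. The sketch below does that under the assumption that the reported losses are mean per-token cross-entropy in nats, which the README does not state. The names `losses` and `summarize` are hypothetical and exist only for illustration.

```python
import math

# Loss values copied from the table in this commit:
# (epoch, training loss, evaluation loss)
losses = [
    (1, 1.309, 1.281),
    (2, 1.161, 1.242),
    (3, 1.053, 1.219),
]


def summarize(rows):
    """Return per-epoch evaluation perplexity and the eval-minus-train gap.

    Assumption: each loss is a mean cross-entropy in nats, so
    perplexity = exp(loss). If the losses use another base or
    reduction, the perplexity figures do not apply.
    """
    summary = []
    for epoch, train_loss, eval_loss in rows:
        summary.append(
            {
                "epoch": epoch,
                "eval_ppl": math.exp(eval_loss),
                "gap": eval_loss - train_loss,
            }
        )
    return summary


summary = summarize(losses)
for row in summary:
    print(
        f"epoch {row['epoch']}: "
        f"eval perplexity ~ {row['eval_ppl']:.2f}, "
        f"eval - train gap = {row['gap']:+.3f}"
    )
```

The gap turns positive from epoch 2 onward while the evaluation loss keeps falling, so the table shows the model fitting the training data faster than it generalizes, without the evaluation loss yet getting worse.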