Delta-Vector committed on
Commit 83d850a (1 parent: 44e01ec)

Update README.md

Files changed (1): README.md +1 -5
README.md CHANGED
@@ -215,8 +215,4 @@ The training was done for 2 epochs. We used 8 x [H100s](https://www.nvidia.com/
215 215  
216 216  ## Safety
217 217  
218     - Avoid misusing this model, or you’ll need a ‘clicker’ to reset reality. ;)
219     -
220     - ## Musings
221     -
222     - One of the members of Anthracite had an interesting idea: finetune a smaller model for 4 epochs at a lower learning rate, on the reasoning that "smaller models learn slower." [Kalomaze]() provided access to 8 x A40s, and we finetuned what is now [Darkens-8B]() for 4 epochs (its 2.5-epoch checkpoint was released as [Tor-8B]()). The result was impressive: the 4-epoch model was not overfit at all and was pleasant to use. Lucy Knada then allowed me to do a full-parameter finetune with the same configuration as Darkens/Tor-8B (with some minor dataset tweaks) on 8 x H100s. We hosted and tested the models, and I gave the green light to release the 4-epoch version as Magnum 9B V4, releasing the 2-epoch version as my own. Both were extremely good models, but in testing I preferred the 2-epoch one: it was not as "suggestive" as Magnum models (and Claude RP log-trained models) tend to be, it would not dive into Claudeisms right out of the gate, and it worked for both safe-for-work and "other" purposes.
    218 + Nein.
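For readers curious how the "more epochs at a lower learning rate" idea from the removed Musings paragraph translates into practice, here is a minimal sketch using the Hugging Face `transformers` `TrainingArguments` API. The output path and every hyperparameter value are illustrative assumptions; the actual Darkens/Tor-8B training configuration is not part of this commit.

```python
# Minimal sketch (assumed values, not the real config): a smaller model
# trained for more epochs at a lower peak learning rate, per the reasoning
# "smaller models learn slower" in the removed Musings paragraph.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out/small-model-sketch",  # hypothetical output path
    num_train_epochs=4,                   # more epochs for the smaller model
    learning_rate=1e-5,                   # lower peak LR (assumed value)
    lr_scheduler_type="cosine",           # standard cosine decay schedule
    warmup_ratio=0.05,                    # brief warmup (assumed value)
    bf16=True,                            # mixed precision on A40/H100-class GPUs
)
```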