TristanBehrens committed on
Commit
874c0f8
1 Parent(s): bf96b11

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +57 -0
README.md ADDED
@@ -0,0 +1,57 @@
+ ---
+ language:
+ - en
+ tags:
+ - NLP
+ license: mit
+ datasets:
+ - TristanBehrens/bach_garland_2024-100K
+ base_model: None
+ ---
+
+ # bach_garland_xlstm - An xLSTM Model
+
+ ![Trained with Helibrunna](banner.jpg)
+
+ Trained with [Helibrunna](https://github.com/AI-Guru/helibrunna) by [Dr. Tristan Behrens](https://de.linkedin.com/in/dr-tristan-behrens-734967a2).
+
+ ## Configuration
+
+ ```
+ training:
+   model_name: bach_garland_xlstm
+   batch_size: 4
+   lr: 0.001
+   lr_warmup_steps: 5000
+   lr_decay_until_steps: 50000
+   lr_decay_factor: 0.001
+   weight_decay: 0.1
+   amp_precision: bfloat16
+   weight_precision: float32
+   enable_mixed_precision: true
+   num_epochs: 4
+   output_dir: output/bach_garland_xlstm
+   save_every_step: 500
+   log_every_step: 10
+   wandb_project: bach_garland_xlstm
+   torch_compile: false
+ model:
+   num_blocks: 4
+   embedding_dim: 64
+   mlstm_block:
+     mlstm:
+       num_heads: 4
+   slstm_block:
+     slstm:
+       num_heads: 4
+   slstm_at:
+   - 2
+   context_length: 4096
+   vocab_size: 178
+ dataset:
+   hugging_face_id: TristanBehrens/bach_garland_2024-100K
+ tokenizer:
+   type: whitespace
+   fill_token: '[EOS]'
+
+ ```
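
The configuration above implies a learning-rate schedule and a block layout without spelling either out. The following is a minimal sketch of one plausible reading, not code taken from Helibrunna. It assumes that `lr_warmup_steps`, `lr_decay_until_steps`, and `lr_decay_factor` describe a linear warmup followed by a cosine decay down to `lr * lr_decay_factor`, and that `slstm_at` lists zero-based block indices. The function names `learning_rate` and `block_layout` are hypothetical and exist only for illustration.

```python
import math

# Values copied from the "training" section of the configuration.
LR = 0.001
WARMUP_STEPS = 5000
DECAY_UNTIL_STEPS = 50000
DECAY_FACTOR = 0.001


def learning_rate(step: int) -> float:
    """Assumed schedule: linear warmup, then cosine decay to LR * DECAY_FACTOR."""
    min_lr = LR * DECAY_FACTOR
    if step < WARMUP_STEPS:
        # Ramp linearly from 0 up to the peak learning rate.
        return LR * step / WARMUP_STEPS
    if step >= DECAY_UNTIL_STEPS:
        # After the decay window, hold the floor value.
        return min_lr
    # Fraction of the decay window that has elapsed, in [0, 1).
    progress = (step - WARMUP_STEPS) / (DECAY_UNTIL_STEPS - WARMUP_STEPS)
    return min_lr + 0.5 * (LR - min_lr) * (1.0 + math.cos(math.pi * progress))


def block_layout(num_blocks: int, slstm_at: list[int]) -> list[str]:
    """Assumes slstm_at holds zero-based indices; every other block is an mLSTM block."""
    return ["sLSTM" if i in slstm_at else "mLSTM" for i in range(num_blocks)]


if __name__ == "__main__":
    for step in (0, 2500, 5000, 27500, 50000):
        print(step, learning_rate(step))
    print(block_layout(num_blocks=4, slstm_at=[2]))
```

Under these assumptions, the four-block model would place a single sLSTM block at the third position, with mLSTM blocks everywhere else, and the learning rate would bottom out at `0.001 * 0.001` once step 50000 is reached.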