Katpeeler committed on
Commit
36dbc44
1 Parent(s): ee86502

Update README.md

Files changed (1):
  1. README.md +40 -9
README.md CHANGED
@@ -1,5 +1,4 @@
 ---
-license: mit
 base_model: gpt2
 tags:
 - generated_from_trainer
@@ -8,29 +7,53 @@ model-index:
 results: []
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-
 # midi_model_3
 
-This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
+This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on the js-fakes-4bars dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.5542
 
 ## Model description
 
-More information needed
+This model generates encoded MIDI that follows the format of JS-Fakes chorales.
+This representation makes it possible to train traditional language models on MIDI data.
+See also Magenta's note-seq library [here](https://github.com/magenta/note-seq).
 
 ## Intended uses & limitations
 
-More information needed
+For generating basic encoded MIDI in the JS-Fakes style, as a proof of concept.
+This model is very limited; it demonstrates that this kind of model can be trained and hosted completely free of charge.
 
 ## Training and evaluation data
 
-More information needed
+This model is trained on the js-fakes-4bars dataset, a tokenized version of the JS-Fakes dataset by Omar Peracha.
+
+- Link to the original dataset [here](https://github.com/omarperacha/js-fakes)
+- Link to the tokenized dataset [here](https://huggingface.co/datasets/TristanBehrens/js-fakes-4bars)
+- Training set: 4.02k rows
+- Test set: 463 rows
+
+The data encodes MIDI information as text. Here are some examples of what the tokens mean:
+
+- PIECE_START (The start of the piece.)
+- PIECE_END (The end of the piece.)
+- STYLE=JSFAKES (A style tag, which is unused in this dataset.)
+- GENRE=JSFAKES (A genre tag, also unused in this dataset.)
+- TRACK_START (The start of an instrument's track.)
+- TRACK_END (The end of an instrument's track.)
+- INST=48 (The instrument the following notes belong to.)
+- BAR_START (The start of a musical measure.)
+- BAR_END (The end of a musical measure.)
+- NOTE_ON=57 (Starts the given MIDI note.)
+- NOTE_OFF=57 (Ends the given MIDI note.)
+- TIME_DELTA=4 (How long the note plays for.)
 
 ## Training procedure
 
+Training was done on Google Colab's free tier, using a single 15 GB Tesla T4 GPU.
+Training was logged through Weights & Biases.
+The full training notebook can be found [here](https://colab.research.google.com/drive/1uvv-ChthIrmEJMBOVyL7mTm4dcf4QZq7#scrollTo=34kpyWSnaJE1).
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
@@ -43,6 +66,14 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_ratio: 0.01
 - num_epochs: 10
 
+### Training statistics
+
+- Total training runtime: 787 seconds (around 13 minutes)
+- Training samples per second: 45.91
+- Training steps per second: 11.484
+- Average GPU power draw: 66 W
+- Average GPU temperature: 77 °C
+
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
@@ -84,4 +115,4 @@ The following hyperparameters were used during training:
 - Transformers 4.35.2
 - Pytorch 2.1.0+cu118
 - Datasets 2.15.0
-- Tokenizers 0.15.0
+- Tokenizers 0.15.0
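
The token list in the updated card describes an event-style encoding that is straightforward to decode. Below is a minimal sketch of turning such a sequence back into (pitch, start, end) note events; the helper name is hypothetical, and treating TIME_DELTA as advancing a per-track clock is an assumption, not something the card states.

```python
def parse_tokens(tokens):
    """Decode a js-fakes-4bars style token sequence into (pitch, start, end) tuples."""
    notes = []       # finished notes
    open_notes = {}  # pitch -> onset time
    time = 0         # per-track clock, in TIME_DELTA units
    for tok in tokens:
        if tok == "TRACK_START":
            # Each track runs on its own clock, so reset at track boundaries.
            time = 0
            open_notes = {}
        elif tok.startswith("TIME_DELTA="):
            time += int(tok.split("=", 1)[1])
        elif tok.startswith("NOTE_ON="):
            open_notes[int(tok.split("=", 1)[1])] = time
        elif tok.startswith("NOTE_OFF="):
            pitch = int(tok.split("=", 1)[1])
            if pitch in open_notes:
                notes.append((pitch, open_notes.pop(pitch), time))
    return notes

# One note held for 4 time steps, using the example tokens from the card.
seq = ("PIECE_START STYLE=JSFAKES GENRE=JSFAKES TRACK_START INST=48 "
       "BAR_START NOTE_ON=57 TIME_DELTA=4 NOTE_OFF=57 BAR_END "
       "TRACK_END PIECE_END").split()
print(parse_tokens(seq))  # -> [(57, 0, 4)]
```

A decoder along these lines is what would sit between the model's text output and an actual MIDI writer such as note-seq.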
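
The numeric values in tokens like NOTE_ON=57 are standard MIDI note numbers, where note 60 is middle C (C4). A small hypothetical helper (not part of the card) to make them readable:

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def midi_to_name(note: int) -> str:
    """Convert a MIDI note number to a pitch name, e.g. 60 -> 'C4'."""
    # MIDI octaves are offset by -1 relative to note // 12, so that 60 lands on C4.
    return f"{NOTE_NAMES[note % 12]}{note // 12 - 1}"

print(midi_to_name(57))  # -> "A3"
print(midi_to_name(60))  # -> "C4"
```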