Undi95 committed on
Commit
9ed9e92
1 Parent(s): 67c3c50

Update README.md

  1. README.md +42 -7
README.md CHANGED
@@ -22,21 +22,56 @@ As some people have told us our models are sloppy, Ikari decided to say fuck it

  Our dataset stayed the same since day one, we added data over time, cleaned them, and repeat. After not releasing model for a while because we were never satisfied, we think it's time to come back!

+ # Prompt template: Mistral
+
+ ```
+ <s>[INST] {input} [/INST] {output}</s>
+ ```

  ## Credits:
  - Undi
  - IkariDev

- ## Training data used:
- We will point out all dataset we used here, please be patient the time we get them all back kek.
+ ## Training data we used to make our dataset:

- Temporary credit for the following madlads, who contributed to the datasets we have build over time: Gryphe, Caitlyn, Kalomaze, Gifted Gummy Bee, Sao [...]
+ - [Epiculous/Gnosis](https://huggingface.co/Epiculous/Gnosis)
+ - [ChaoticNeutrals/Luminous_Opus](https://huggingface.co/datasets/ChaoticNeutrals/Luminous_Opus)
+ - [ChaoticNeutrals/Synthetic-Dark-RP](https://huggingface.co/datasets/ChaoticNeutrals/Synthetic-Dark-RP)
+ - [ChaoticNeutrals/Synthetic-RP](https://huggingface.co/datasets/ChaoticNeutrals/Synthetic-RP)
+ - [Gryphe/Sonnet3.5-SlimOrcaDedupCleaned](https://huggingface.co/datasets/Gryphe/Sonnet3.5-SlimOrcaDedupCleaned)
+ - [Gryphe/Opus-WritingPrompts](https://huggingface.co/datasets/Gryphe/Opus-WritingPrompts)
+ - [meseca/writing-opus-6k](https://huggingface.co/datasets/meseca/writing-opus-6k)
+ - [meseca/opus-instruct-9k](https://huggingface.co/datasets/meseca/opus-instruct-9k)
+ - [PJMixers/grimulkan_theory-of-mind-ShareGPT](https://huggingface.co/datasets/PJMixers/grimulkan_theory-of-mind-ShareGPT)
+ - [NobodyExistsOnTheInternet/ToxicQAFinal](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal)
+ - [Undi95/toxic-dpo-v0.1-sharegpt](https://huggingface.co/datasets/Undi95/toxic-dpo-v0.1-sharegpt)
+ - [cgato/SlimOrcaDedupCleaned](https://huggingface.co/datasets/cgato/SlimOrcaDedupCleaned)
+ - [kalomaze/Opus_Instruct_25k](https://huggingface.co/datasets/kalomaze/Opus_Instruct_25k)
+ - [Doctor-Shotgun/no-robots-sharegpt](https://huggingface.co/datasets/Doctor-Shotgun/no-robots-sharegpt)
+ - [Norquinal/claude_multiround_chat_30k](https://huggingface.co/datasets/Norquinal/claude_multiround_chat_30k)
+ - [nothingiisreal/Claude-3-Opus-Instruct-15K](https://huggingface.co/datasets/nothingiisreal/Claude-3-Opus-Instruct-15K)
+ - All the Aesirs dataset, cleaned, unslopped
+ - All le luminae dataset, cleaned, unslopped
+ - Small part of Airoboros reduced

- # Prompt template: Mistral
+ We sadly didn't find the sources of the following, DM us if you recognize your set !

- ```
- <s>[INST] {input} [/INST] {output}</s>
- ```
+ - Opus_Instruct-v2-6.5K-Filtered-v2-sharegpt
+ - claude_sharegpt_trimmed
+ - CapybaraPure_Decontaminated-ShareGPT_reduced
+
+ ## Datasets credits:
+ - Epiculous
+ - ChaoticNeutrals
+ - Gryphe
+ - meseca
+ - PJMixers
+ - NobodyExistsOnTheInternet
+ - cgato
+ - kalomaze
+ - Doctor-Shotgun
+ - Norquinal
+ - nothingiisreal

  ## Others

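
The Mistral prompt template that this commit moves higher in the README can be filled in with plain string formatting. The sketch below is only an illustration, not code from the model authors: the function names `build_training_example` and `build_inference_prompt` are hypothetical, and stopping the inference prompt right after `[/INST]` with no trailing space is an assumption.

```python
# Template string copied verbatim from the README diff.
MISTRAL_TEMPLATE = "<s>[INST] {input} [/INST] {output}</s>"


def build_training_example(user_input: str, model_output: str) -> str:
    """Fill both slots of the template, as for one complete single-turn sample."""
    return MISTRAL_TEMPLATE.format(input=user_input, output=model_output)


def build_inference_prompt(user_input: str) -> str:
    """Keep only the part of the template before {output}.

    Assumption: at inference time the prompt ends right after [/INST]
    and the model generates the output and closing </s> itself.
    """
    prefix = MISTRAL_TEMPLATE.split("{output}")[0]
    return prefix.format(input=user_input).rstrip()


if __name__ == "__main__":
    print(build_training_example("Hi", "Hello"))
    print(build_inference_prompt("Hi"))
```

Some tokenizers prepend the `<s>` token on their own, so depending on the inference stack the literal `<s>` may need to be dropped from the string to avoid a doubled start token.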