"""
# https://huggingface.co/datasets/Tongjilibo/self_cognition

https://huggingface.co/datasets/arcee-ai/The-Tome
# https://huggingface.co/datasets/Locutusque/function-calling-chatml
# https://huggingface.co/datasets/cognitivecomputations/SystemChat-2.0
# https://huggingface.co/datasets/cognitivecomputations/open-instruct-uncensored
# https://huggingface.co/datasets/arcee-ai/reasoning-sharegpt
# https://huggingface.co/datasets/arcee-ai/infini-instruct-top-500k
# https://huggingface.co/datasets/arcee-ai/BAAI-Infinity-Instruct-System
# https://huggingface.co/datasets/arcee-ai/financial-instructions-cleaned-2

https://huggingface.co/datasets/HuggingFaceH4/no_robots
https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k
https://huggingface.co/datasets/HuggingFaceH4/deita-10k-v0-sft
https://huggingface.co/datasets/NousResearch/hermes-function-calling-v1
https://huggingface.co/datasets/teknium/OpenHermes-2.5
https://huggingface.co/datasets/Open-Orca/slimorca-deduped-cleaned-corrected
https://huggingface.co/datasets/allenai/ultrafeedback_binarized_cleaned

https://huggingface.co/datasets/arcee-ai/EvolKit-20k
https://huggingface.co/datasets/ise-uiuc/Magicoder-Evol-Instruct-110K
https://huggingface.co/datasets/WizardLMTeam/WizardLM_evol_instruct_V2_196k
https://huggingface.co/datasets/arcee-ai/agent-data
https://huggingface.co/datasets/ai2-adapt-dev/olmoe-commercial

https://huggingface.co/datasets/ai2-adapt-dev/openmath-2-math

https://huggingface.co/datasets/KingNish/reasoning-base-20k
https://huggingface.co/datasets/Magpie-Align/Magpie-Reasoning-150K
https://huggingface.co/datasets/thesven/gsm8k-reasoning
"""

# Non-conversation

"""
# https://huggingface.co/datasets/gair-prox/RedPajama-pro
# https://huggingface.co/datasets/codecomplete/base_dataset
# https://huggingface.co/datasets/SivilTaram/starcoder2-documentation
"""