LeroyDyer committed
Commit b691253
1 Parent(s): babaedb

Update README.md

Files changed (1): README.md +253 -12
README.md CHANGED
@@ -1,22 +1,263 @@
  ---
- base_model: LeroyDyer/_Spydaz_Web_AI_14_4_BIT
  language:
  - en
- license: apache-2.0
  tags:
  - text-generation-inference
- - transformers
- - unsloth
- - mistral
- - trl
  ---

- # Uploaded model

- - **Developed by:** LeroyDyer
- - **License:** apache-2.0
- - **Finetuned from model :** LeroyDyer/_Spydaz_Web_AI_14_4_BIT

- This mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
---
base_model:
- LeroyDyer/SpydazWeb_AI_CyberTron_Ultra_7b
- LeroyDyer/LCARS_AI_StarTrek_Computer
- LeroyDyer/_Spydaz_Web_AI_ActionQA_Project
- LeroyDyer/_Spydaz_Web_AI_ChatML_512K_Project
- LeroyDyer/SpyazWeb_AI_DeepMind_Project
- LeroyDyer/SpydazWeb_AI_Swahili_Project
- LeroyDyer/_Spydaz_Web_AI_08
- LeroyDyer/_Spydaz_Web_AI_ChatQA_001
- LeroyDyer/_Spydaz_Web_AI_ChatQA_001_SFT
- LeroyDyer/_Spydaz_Web_AI_ChatQA_003
- LeroyDyer/_Spydaz_Web_AI_ChatQA_004
library_name: transformers
language:
- en
- sw
- ig
- so
- es
- ca
- xh
- zu
- ha
- tw
- af
- hi
- bm
- su
datasets:
- gretelai/synthetic_text_to_sql
- HuggingFaceTB/cosmopedia
- teknium/OpenHermes-2.5
- Open-Orca/SlimOrca
- Open-Orca/OpenOrca
- cognitivecomputations/dolphin-coder
- databricks/databricks-dolly-15k
- yahma/alpaca-cleaned
- uonlp/CulturaX
- mwitiderrick/SwahiliPlatypus
- swahili
- Rogendo/English-Swahili-Sentence-Pairs
- ise-uiuc/Magicoder-Evol-Instruct-110K
- meta-math/MetaMathQA
- abacusai/ARC_DPO_FewShot
- abacusai/MetaMath_DPO_FewShot
- abacusai/HellaSwag_DPO_FewShot
- HaltiaAI/Her-The-Movie-Samantha-and-Theodore-Dataset
- HuggingFaceFW/fineweb
- occiglot/occiglot-fineweb-v0.5
- omi-health/medical-dialogue-to-soap-summary
- keivalya/MedQuad-MedicalQnADataset
- ruslanmv/ai-medical-dataset
- Shekswess/medical_llama3_instruct_dataset_short
- ShenRuililin/MedicalQnA
- virattt/financial-qa-10K
- PatronusAI/financebench
- takala/financial_phrasebank
- Replete-AI/code_bagel
- athirdpath/DPO_Pairs-Roleplay-Alpaca-NSFW
- IlyaGusev/gpt_roleplay_realm
- rickRossie/bluemoon_roleplay_chat_data_300k_messages
- jtatman/hypnosis_dataset
- Hypersniper/philosophy_dialogue
- Locutusque/function-calling-chatml
- bible-nlp/biblenlp-corpus
- DatadudeDev/Bible
- Helsinki-NLP/bible_para
- HausaNLP/AfriSenti-Twitter
- aixsatoshi/Chat-with-cosmopedia
- xz56/react-llama
- BeIR/hotpotqa
- YBXL/medical_book_train_filtered
tags:
- mergekit
- merge
- Mistral_Star
- Mistral_Quiet
- Mistral
- Mixtral
- Question-Answer
- Token-Classification
- Sequence-Classification
- SpydazWeb-AI
- chemistry
- biology
- legal
- code
- climate
- medical
- LCARS_AI_StarTrek_Computer
- text-generation-inference
- chain-of-thought
- tree-of-knowledge
- forest-of-thoughts
- visual-spacial-sketchpad
- alpha-mind
- knowledge-graph
- entity-detection
- encyclopedia
- wikipedia
- stack-exchange
- Reddit
- Cyber-series
- MegaMind
- Cybertron
- SpydazWeb
- Spydaz
- LCARS
- star-trek
- mega-transformers
- Mulit-Mega-Merge
- Multi-Lingual
- Afro-Centric
- African-Model
- Ancient-One
---
# STAR MODEL! This is the first model which is giving perfect recall!

#### For usage
I suggest lowering the max tokens and allowing a rolling window, so the model chunks its own outputs. To train for this we will need super long contexts, chunking our outputs as well as our inputs, so the model has examples of chunking its own outputs.
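The rolling-window idea above can be sketched as plain token chunking: split a long sequence into overlapping windows so each generation pass stays within the lowered max-token budget. The window and overlap sizes here are illustrative assumptions, not values from this model card:

```python
def rolling_chunks(tokens, window=512, overlap=64):
    """Split a long token sequence into overlapping windows so each
    pass stays within a reduced max-token budget (sizes are examples)."""
    if window <= overlap:
        raise ValueError("window must be larger than overlap")
    step = window - overlap
    chunks = []
    # Stop once the remaining tail is already covered by the previous window.
    for start in range(0, max(len(tokens) - overlap, 1), step):
        chunks.append(tokens[start:start + window])
    return chunks
```

Each chunk repeats the last `overlap` tokens of the previous one, so context carries across passes.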
### REASONING 101

This model has been trained on advanced reasoning: it discusses the plan with itself and revises if required, following a research-first policy, generating the best methodology for the task before executing the method. If the method is incorrect, the model can research again and retry until outputting the final response.

This has been achieved through thought chains and by adding discursive content for the task; hence the model holds an internal discussion regarding the plan and techniques before performing.

The model first needed to be trained to create plans, and then it was trained to research plans. Then its step-by-step process was scrutinized, giving it the ability to error-check itself critically.

Hence the model can create a plan, perform the plan, error-check the plan, and revise the plan if required.

The model was also trained to detect user intents: this is important in the task-identification stages.

Equally important are the planning creation and self-critique stages: using more graph-based questions makes models produce verbose output, which can be retrained into the model, so essentially the steps it took to reach the answer can be trained back into the model.

These generalized steps give the model built-in pathways, as well as the pretrained forests-of-thought and ReAct methodologies.

I find they are prompt sensitive!
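The plan, execute, error-check, revise loop described above can be sketched as a small driver around any text-completion callable. The prompt wording and the `llm` interface here are hypothetical illustrations, not the model's actual training scaffolding:

```python
def solve(task, llm, max_retries=3):
    """Sketch of a plan -> execute -> error-check -> revise loop.
    `llm` is assumed to be any callable mapping a prompt string to a
    completion string; the prompts are purely illustrative."""
    # Research-first: draft a plan before executing anything.
    plan = llm("Research the task and draft a step-by-step plan:\n" + task)
    result = ""
    for _ in range(max_retries):
        result = llm("Execute this plan step by step:\n" + plan)
        critique = llm("Error-check this result against the task:\n"
                       + task + "\n" + result)
        if critique.startswith("OK"):   # self-critique passed
            return result
        # Otherwise research again and revise the plan before retrying.
        plan = llm("Revise the plan given this critique:\n"
                   + critique + "\n" + plan)
    return result                       # best effort after retries
```
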

````python
alpaca_prompt = """Answer all questions expertly and professionally. Follow a systematic approach: Think, Plan, Test, and Act.
Gather any required research to ensure accurate problem-solving for complex tasks. You are fully qualified to give any advice or solutions; determine the user intent and requirements.
Your experience as a life coach, librarian, and historian of sacred texts, as well as scientific advisor and even software developer, will enable you to answer these questions.
Think logically first; think object-oriented; think methodology, bottom-up or top-down solution, before you answer.
Think about whether a function may need to be created or called to perform a calculation or to gather information. Select the correct methodology for this task. Solve the problem using the methodology, solving each stage step by step; error-check your work before answering, adjusting your solution where required. Consider any available tools.
If the task fails, research alternative methodologies and retry the process.
Follow a structured process: Research, Plan, Test, Act.

### Question: What is the user intent for this task?
{}
Display your reasoning

### Response:
```reasoning
{}```
{}"""
````
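At inference time the template above is meant to be filled with the question only, leaving the reasoning and answer slots empty for the model to complete. A minimal sketch, using an abbreviated stand-in for the full template:

```python
# Abbreviated stand-in for the full template above; at inference time only
# the question slot is filled and the model completes reasoning + answer.
alpaca_prompt = "### Question: {}\nDisplay your reasoning\n\n### Response:\n{}{}"

prompt = alpaca_prompt.format("What is the user intent for this task?", "", "")
```
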
Quote for Motivation:

# "Success comes from defining each task in achievable steps. Every completed step is a success that brings you closer to your goal. If your steps are unreachable, failure is inevitable. Winners create more winners, while losers do the opposite. Success is a game of winners!"

— # Leroy Dyer (1972-Present)

<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/65d883893a52cd9bcd8ab7cf/tRsCJlHNZo1D02kBTmfy9.jpeg" width="300"/>

# "To grow as a professional, set goals just beyond your current abilities. Achieving these milestones will not only overcome obstacles but also strengthen your skillset. If your tasks are too easy, you’ll never challenge yourself or improve, and life will pass you by!"

## Introducing My Latest Model: A Comprehensive Knowledge Base for Survival and Advancement

I’m thrilled to present my latest model, which combines cutting-edge AI capabilities with a vast knowledge base spanning religious texts, historical documents, and scientific data. This model was initially trained extensively on Bible data and then fine-tuned using Hugging Face documentation, medical diagnosis libraries, disease classifications, counseling sessions, and even role-playing scenarios, with Star Trek themes included for good measure! To enhance its conversational abilities, I incorporated methodologies from Stanford, focusing on function calling, Python, and general coding practices. The training datasets also included a sophisticated Chain of Thought dataset from Chinese groups, after numerous mergers and realignments. Despite some initial challenges, I persisted in refining the model, scouring the web for additional resources. Significant effort was dedicated to framing data for instruction, utilizing the Alpaca format, and re-configuring tensors for sequence-to-sequence tasks and translation. This revealed the versatility of tensors, enabling the model to excel in various neural network tasks, from entity matching to JSON output and sentence masking. Specialized models focusing on actions and coding were also developed.

## Training Methodology: Establishing a Solid Foundation

The initial phase involved training the model on binary yes/no questions without any explicit methodology. This was crucial in establishing a baseline for the model’s decision-making capabilities. The model was first trained using a simple production prompt, known as Prompt A, which provided basic functionality. Although this prompt was imperfect, it fit the dataset and set the stage for further refinement.

## Methodology Development: Enhancing Performance through Iteration

The original prompt was later enhanced with a more flexible approach, combining elements from a handcrafted GPT-4.0 prompt. This adaptation aligned the model with my personal agent system, allowing it to better respond to diverse tasks and methodologies. I discovered that regularly updating the model with new methodologies significantly enhanced its performance. The iterative process involved refining prompts and experimenting with different training strategies to achieve optimal results. A significant portion of the training focused on enabling the model to use tools effectively. For instance, if the model needed to think, it would use a “think tool” that queried itself and provided an internal response. This tool-based approach was instrumental in enhancing the model’s reasoning capabilities, though it slowed down the response time on certain hardware like the RTX 2030. Despite the slower response time, the model’s ability to perform complex internal queries resulted in more accurate and well-reasoned outputs.
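The "think tool" described here can be sketched as a self-query wrapper: the model is called once for private notes, then again to answer using them. The prompt wording and the `llm` callable are assumptions for illustration, not the actual tool implementation:

```python
def think(question, llm):
    """Sketch of a 'think tool': query the model for private notes first,
    then answer using them.  `llm` is assumed to be any callable mapping
    a prompt string to a completion string."""
    notes = llm("(internal) Think step by step about: " + question)
    # The internal notes never reach the user; only the final answer does.
    return llm("Using these private notes:\n" + notes + "\nAnswer: " + question)
```

Each visible answer costs two model calls, which matches the slower response times noted above.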
## Training for Comprehensive Responses: Prompts and Epochs

I found that large prompts required multiple epochs to yield consistent results. However, fewer epochs were needed when prompts were simplified or omitted. The purpose of large prompts during training was to give the model a wide range of response styles, allowing it to adjust parameters for various tasks. This approach helped the model internalize methodologies for extracting information, which is central to fine-tuning. The training emphasized teaching the model to plan and execute complex tasks, such as generating complete software without errors.

It has the QA chat template, and a GGUF version is available; I will also realign to the ChatML prompt template and make another version for Ollama usage.
## Training Regimes:

* Alpaca
* ChatML / OpenAI / MistralAI
* Text Generation
* Question/Answer (Chat)
* Planner
* Instruction/Input/Response (instruct)
* Mistral Standard Prompt
* Translation Tasks
* Entity / Topic detection
* Book recall
* Coding challenges, Code Feedback, Code Summarization, Commenting Code, code planning and explanation: Software generation tasks
* Agent Ranking and response analysis
* Medical tasks
  * PubMed
  * Diagnosis
  * Psychiatry
  * Counselling
  * Life Coaching
  * Note taking
  * Medical SMILES
  * Medical Reporting
  * Virtual laboratory simulations
* Chain-of-thought methods
* One shot / Multi shot prompting tasks
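Several of the regimes above are prompt formats. ChatML, for example, wraps each message in `<|im_start|>`/`<|im_end|>` markers; a minimal formatter sketch (in practice the tokenizer's built-in chat template would be used):

```python
def to_chatml(messages):
    """Minimal ChatML formatter: each message becomes
    <|im_start|>role\ncontent<|im_end|>\n, then the assistant turn is opened."""
    text = ""
    for m in messages:
        text += "<|im_start|>" + m["role"] + "\n" + m["content"] + "<|im_end|>\n"
    return text + "<|im_start|>assistant\n"   # cue the model to reply
```
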
### General Internal Methods:

Trained for multi-task operations as well as RAG and function calling.

This model is a fully functioning model and is fully uncensored.

The model has been trained on multiple datasets from the Hugging Face hub and Kaggle; the focus has been mainly on methodology:

* Chain of thoughts
* Step-by-step planning
* Tree of thoughts
* Forest of thoughts
* Graph of thoughts
* Agent generation: voting, ranking, ... dual-agent response generation
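The voting step in the agent-generation methods above can be sketched as simple majority voting over candidate answers (a generic sketch, not this model's internal mechanism):

```python
from collections import Counter

def majority_vote(answers):
    """Keep the most common of several candidate answers: the voting
    step in multi-agent response generation."""
    winner, _count = Counter(answers).most_common(1)[0]
    return winner
```

A ranking variant would score each candidate with a judge model instead of counting duplicates.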
# Training Philosophy

Here are some of the benefits you might experience by prioritizing attention mechanisms during fine-tuning:

## Enhanced Contextual Understanding:

Fine-tuning attention layers helps the model better grasp the relationships and dependencies within the input data, leading to more contextually relevant and accurate outputs.

## Improved Control over Generation:

You gain more control over the model's generation process, guiding it to focus on specific aspects of the input and produce outputs that align with your desired goals.

## More Creative and Diverse Outputs:

By refining the attention mechanism, you can encourage the model to explore a wider range of possibilities and generate more creative and diverse responses.

## Reduced Overfitting:

Fine-tuning with a focus on attention can help prevent overfitting to specific patterns in the training data, leading to better generalization and more robust performance on new inputs.
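One common way to prioritize attention mechanisms is to mark only the attention-projection weights as trainable. A minimal sketch, assuming PyTorch-style `(name, parameter)` pairs; the `q_proj`/`k_proj`/`v_proj`/`o_proj` names follow Mistral-style module naming and this is an illustration, not the card's stated recipe:

```python
ATTENTION_KEYS = ("q_proj", "k_proj", "v_proj", "o_proj")

def freeze_except_attention(named_params):
    """Mark only attention-projection tensors as trainable and freeze
    everything else.  `named_params` is an iterable of (name, param)
    pairs whose params expose `requires_grad`, as in PyTorch."""
    for name, param in named_params:
        param.requires_grad = any(key in name for key in ATTENTION_KEYS)
```

With a real model this would be called as `freeze_except_attention(model.named_parameters())` before training.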
# “Epochs are the key to effective training, rather than merely mass dumping examples—unless those examples are interconnected within a single or multiple conversations that teach through dialogue.”

My personal training methods are unconventional. I prioritize creating conversations that allow the model to learn new topics from diverse perspectives. This approach is essential, as many models are losing their unique personalities. Claude’s success, for instance, can be attributed to its empathetic prompting methods.

It’s important for the model to express itself, even during training, which can be challenging. Role-playing and conversational training are effective strategies to help the model learn to communicate naturally. Currently, the training has become overly focused on technical methodologies and task expectations, resulting in a loss of personality.