---
language:
- eng
tags:
- llama-2
- sft
license:
- mit
datasets:
- LDJnr/Puffin
---

![puffin](https://i.imgur.com/R2xTHMb.png)

## **Redmond-Puffin-70B**

**The first commercially available language model released by Nous Research!**

This is a larger version of Puffin, which was originally the world's first third-party Llama-2 fine-tune. It leverages a hand-curated set of 3,000 high-quality examples, many of which take full advantage of the 4096-token context length of Llama 2. The model was fine-tuned by Nous Research, with LDJ leading the training and dataset curation, along with significant dataset formation contributions by J-Supha.

Special thanks to Pygmalion AI for sponsoring the compute.

Special thanks to Emozilla for assisting with training experimentation and benchmarking.

## Model Training

Redmond-Puffin 70B is a new model trained for multiple epochs on a dataset of 3,000 carefully curated GPT-4 examples, most of which are long-context conversations between a real human and GPT-4.

Additional data came from carefully curated subsections of datasets such as CamelAI's Physics, Chemistry, Biology, and Math.

## Prompt Format

The recommended prompt format is:

```
### human:

### response:
```

Optional recommended pre-prompt / system prompt:

```
### human: Interact in conversation to the best of your ability, please be concise, logical, intelligent and coherent.

### response: Sure! sounds good.
```
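
For illustration, the template above can be applied programmatically. The helper below is a minimal sketch, not part of the model release: the function name `format_puffin_prompt` and the turn structure are assumptions.

```python
# Hypothetical helper: renders a list of (human, response) turns into the
# "### human:" / "### response:" template recommended above.
def format_puffin_prompt(turns, system_prompt=None):
    """turns: list of (human_text, response_text_or_None) pairs, oldest first."""
    parts = []
    if system_prompt is not None:
        # The optional pre-prompt is simply a first human/response exchange.
        parts.append(f"### human: {system_prompt}")
        parts.append("### response: Sure! sounds good.")
    for human, response in turns:
        parts.append(f"### human: {human}")
        # Leave the final response empty so the model completes it.
        parts.append(f"### response:{' ' + response if response else ''}")
    return "\n\n".join(parts)

prompt = format_puffin_prompt([("What is a puffin?", None)])
```

Feeding the returned string to a Llama-2 runtime as the raw prompt reproduces the recommended format; the trailing `### response:` leaves the assistant turn open for the model to complete.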

## When should I use Puffin or Hermes 2?

Although full benchmarks have not yet been completed for this release of Puffin, the original Puffin 13B and Hermes-2 13B both beat the previous SOTA on the GPT4All benchmarks, with Hermes-2 winning by a 0.1% margin over Puffin.

Overall, for general-purpose zero-shot and/or single-turn instructions, Hermes will likely be the way to go. Puffin may be preferred for creative long-conversation interactions, like having Puffin play a character or help brainstorm creative ideas or concepts that make contextual sense within an already deep conversation.

Thank you to reddit user WolframRavenwolf for this comprehensive analysis and comparison of Puffin and Hermes: https://www.reddit.com/r/LocalLLaMA/comments/158j9r9/nous_hermes_llama2_vs_redmond_puffin_13b/

## Example Outputs!

![puffin](https://i.imgur.com/P0MsN8B.png)

![puffin](https://i.imgur.com/8EO3ThV.png)

![puffin](https://i.imgur.com/5IWolFw.png)

![puffin](https://i.imgur.com/TQui8m7.png)

![puffin](https://i.imgur.com/tderIfl.png)

## Notable Features

- The first Llama-2 based fine-tuned model released by Nous Research.

- Ability to recall information up to 2023 without internet access (ChatGPT's cutoff date is in 2021).

- Pretrained on 2 trillion tokens of text (double the amount of most open LLMs).

- Pretrained with a context length of 4096 tokens, and fine-tuned on a significant number of multi-turn conversations reaching that full token limit.

- The first commercially available language model released by Nous Research.

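
Because the fine-tuning emphasized multi-turn conversations that reach the full 4096-token window, callers eventually have to drop the oldest turns. The sketch below is a hypothetical illustration only: the names and the crude 4-characters-per-token estimate are assumptions, and a real deployment should count tokens with the model's actual tokenizer.

```python
# Hypothetical sketch: keep only the most recent turns that fit in the
# 4096-token context window, leaving room for the model's reply.
CONTEXT_TOKENS = 4096
CHARS_PER_TOKEN = 4  # crude placeholder estimate, not a property of the model

def estimate_tokens(text):
    # Rough length-based estimate; swap in the real tokenizer in practice.
    return max(1, len(text) // CHARS_PER_TOKEN)

def trim_to_context(turns, reserve_for_reply=512):
    """turns: list of strings, oldest first. Returns the most recent turns
    whose estimated total stays within the window minus a reply budget."""
    budget = CONTEXT_TOKENS - reserve_for_reply
    kept, total = [], 0
    for turn in reversed(turns):  # walk from newest to oldest
        cost = estimate_tokens(turn)
        if total + cost > budget:
            break
        kept.append(turn)
        total += cost
    return list(reversed(kept))  # restore chronological order
```

Walking newest-to-oldest guarantees the most recent context survives, which matches how long multi-turn conversations were used during fine-tuning.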

## Future Plans

This is a relatively early build amongst the grand plans for the future of Puffin!

Current limitations: Some token mismatch problems have been identified; these may affect the current output quality. We plan to have this solved in Puffin V2, along with other improvements.

## How you can help

In the near future we plan on leveraging the help of domain-specific expert volunteers to eliminate any mathematically/verifiably incorrect answers from our training curations.

If you have at least a bachelor's degree in mathematics, physics, biology, or chemistry and would like to volunteer even just 30 minutes of your expertise, please contact LDJ on Discord!

## Benchmarks (new benchmarks coming soon; here are the 13B benchmarks for now)

As of Puffin's release, it achieved a new SOTA on the GPT4All benchmarks, supplanting Hermes for the #1 position!
(Rounded to the nearest tenth)

Previous SOTA: Hermes - 68.8
New SOTA: Puffin - 69.9 (+1.1)

Puffin 13B supplants Hermes-2 for the #1 spot in Arc-E, HellaSwag, and Winogrande!

Puffin also perfectly ties with Hermes in PIQA; however, Hermes-2 still excels in much of Big Bench and AGIEval, so it's highly recommended you give it a try as well!