DavidAU committed on
Commit
fa8add6
1 Parent(s): ab3c7f0

Update README.md

  1. README.md +107 -3
README.md CHANGED
@@ -1,3 +1,107 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ tags:
6
+ - creative
7
+ - story
8
+ - writing
9
+ - fiction
10
+ - float32
11
+ - roleplaying
12
+ - rp
13
+ - enhanced
14
+ - space whale
15
+ - 32 bit upscale
16
+ ---
17
+
18
+ [uploads in progress]
19
+
20
+ <font color=red><h3> Ultra High Remaster of the incredible: Psyonic-Cetacean-20b - Imatrix Plus. </h3></font>
21
+
22
+ This is a Floating Point 32 upscale, where all components and merges were remastered to floating point 32.
23
+ This includes all the merges (recreated from master files) and, where possible, substituting full FP32 models.
24
+
25
+ The goal: Carry forward maximum precision right up to the point where it is "GGUFed".
26
+
27
+ This includes an F32 master file for GGUF too... at a whopping 78 GB.
28
+
29
+ WHY?
30
+
31
+ Because F32 keeps about 7 significant DECIMAL digits per weight, while BF16 keeps only 2 to 3.
32
+
33
+ And as each merge / model is modified there are "losses" along the way.
34
+
35
+ These losses are carried forward and in turn lead to more losses.
36
+
37
+ And decimal points are critical to model performance.
38
+
39
+ SMALL?
40
+
41
+ Yes... but multiplied by each merge and each compression: 20 billion times.
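
To make the size of the rounding loss concrete, here is a minimal sketch in Python (standard library only) that rounds a single value to F32 and to BF16 and compares the errors. The helper names `to_f32`, `to_bf16`, and `rounding_error`, and the sample weight, are hypothetical and chosen for illustration; this is not the tooling used to build the remaster.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_bf16(x: float) -> float:
    """Round a finite Python float to the nearest bfloat16 value.

    bfloat16 is the top 16 bits of a float32, so we round the float32
    bit pattern to 16 bits (round-to-nearest-even) and zero the rest.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bias = 0x7FFF + ((bits >> 16) & 1)      # round-to-nearest-even bias
    bits = (bits + bias) & 0xFFFF0000       # keep sign, exponent, 7 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def rounding_error(x: float, cast) -> float:
    """Absolute error introduced by storing x through the given cast."""
    return abs(cast(x) - x)

if __name__ == "__main__":
    weight = 0.1234567  # hypothetical weight value, for illustration only
    print("f32 :", to_f32(weight), "error", rounding_error(weight, to_f32))
    print("bf16:", to_bf16(weight), "error", rounding_error(weight, to_bf16))
```

For a value of this magnitude the BF16 error is bounded by 2^-12 while the F32 error is bounded by 2^-28, which is the gap the remaster tries to preserve through each merge step.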
42
+
43
+ <B>The result:</b>
44
+
45
+ At Q2K an impressive drop of 533 points in perplexity. (lower is better)
46
+ (VS: Q2K original base model: PPL = 9.8077 +/- 0.06821 )
47
+
48
+ At Q4KM a whopping drop of 976 points in perplexity.
49
+ (VS: Q4KM original base model -> PPL = 8.7858 +/- 0.06074)
50
+
51
+ At Q6 an awesome drop of 234 points in perplexity.
52
+ (VS: Q6 original base model -> PPL = 8.6070 +/- 0.05907 )
53
+
54
+ To put this in perspective, "Q6" now operates ABOVE the original full precision version of "Psyonic-Cetacean-20b"
55
+ and Q4KM operates at close to Q6 level quality.
56
+
57
+ This is because at "Q6" the quant / compressed model is considered to be accurate within "+0.0008 ppl" of the full,
58
+ uncompressed / unquanted model and it exceeds this threshold by over 200 points.
59
+
60
+ <I> Imatrix quants take this even further, in most cases DOUBLING the "drop" in perplexity realized in the reg quants. </i>
61
+
62
+ Q4km-imatrix :
63
+
64
+ Final estimate: PPL = 8.6095 +/- 0.05898
65
+
66
+ (Non imatrix: Final estimate: PPL = 8.6902 +/- 0.05985 )
67
+
68
+ (VS: Q4km base model -> PPL = 8.7858 +/- 0.06074)
69
+
70
+ (VS: Q6 BASE model -> Final estimate: PPL = 8.6070 +/- 0.05907)
71
+
72
+
73
+ But... what about Q8?
74
+
75
+ The mountain moved:
76
+
77
+ 150 points better: PPL = 8.5850 +/- 0.05881 VS: BASE/ORIGINAL: PPL = 8.6012 +/- 0.05900
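
The "points" above can be recomputed from the listed PPL pairs. The sketch below assumes one "point" equals 0.0001 PPL, which is an inference from the Q8 line rather than something the README states; the constant `POINT`, the helper `drop_in_points`, and the dictionary labels are hypothetical names used only for this illustration.

```python
# Assumption: one "point" is 0.0001 of perplexity (inferred, not stated in the README).
POINT = 0.0001

def drop_in_points(base_ppl: float, new_ppl: float) -> int:
    """Perplexity drop from base to new, in points (positive means improvement)."""
    return round((base_ppl - new_ppl) / POINT)

# (base PPL, remaster PPL) pairs copied from the README text.
results = {
    "Q4KM remaster vs Q4KM base": (8.7858, 8.6902),
    "Q4KM imatrix vs Q4KM base": (8.7858, 8.6095),
    "Q8 remaster vs Q8 base": (8.6012, 8.5850),
}

if __name__ == "__main__":
    for label, (base, new) in results.items():
        print(f"{label}: {drop_in_points(base, new)} points")
```

With the values as listed, this arithmetic gives 956 points for Q4KM, 1763 for Q4KM imatrix, and 162 for Q8, close to but not identical with the quoted 976 and 150.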
78
+
79
+ <B>The bottom line here is this:</b>
80
+
81
+ Higher quality instruction following and output.
82
+
83
+ Likewise, you can use a smaller quant, with higher tokens per second, and still get great quality.
84
+
85
+ Same great model... turbo charged.
86
+
87
+ This is the first group of remasters.
88
+
89
+ <B>The FOUR Horsemen:</B>
90
+
91
+ This repo will be followed by a "reg quant plus" repo, which adds additional components into the GGUF (all levels) at floating point 32
92
+ precision to further increase the sheer creativity and raw AI horsepower.
93
+
94
+ This process shaves an extra 50-100 points off perplexity... again.
95
+
96
+ Following this group will be a full float 32 precision Imatrix (including reg quants "imatrixed").
97
+
98
+ Test results VS org and "ultra" regular quants will be posted when they come in.
99
+
100
+ Then comes an Imatrix Plus repo (with the same float 32 enhancement as "reg quant plus") that will push the limit even more.
101
+
102
+ Details of all methods (and pitfalls to avoid) employed to make these high precision remasters will be
103
+ posted shortly along with comparisons of the original model and the new ultra remaster.
104
+
105
+ Thanks again to Jeb Carter, the original creator of "Psyonic-Cetacean 20B"
106
+
107
+ [ https://huggingface.co/jebcarter/psyonic-cetacean-20B ]