What was your thought process on this version? I thought it would be a good idea to open a discussion.

#3
by Joseph717171 - opened

So, I downloaded your newest version and GGUF-quantized it using Bartowski's imatrix. What did you do differently in this version compared to the previous ones? And what was your thought process and the goal you had in mind? 😋

Owner • edited Jul 3

This was my attempt at doing a partial train over the SPPO Iter3 model with some of the datasets, splitting the others over 0.2, and then merging the two together with SLERP. The motivation was to improve foundational logic and hopefully the diversity of prose used by the end model. Honestly, this version feels overbaked and repetitive compared to 0.5 in my own testing, albeit with a small improvement to logic. (I'm waiting on additional user feedback before I make any decisions about new variations.)
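
(For anyone curious about the mechanics, here is a rough sketch of what a SLERP merge does tensor by tensor. The t=0.5 weight and helper names are illustrative only, not the exact recipe used for this model.)

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors."""
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_dir = a_flat / (a_flat.norm() + eps)
    b_dir = b_flat / (b_flat.norm() + eps)
    dot = torch.clamp(torch.dot(a_dir, b_dir), -1.0, 1.0)
    omega = torch.acos(dot)
    if omega.abs() < 1e-4:
        # Nearly parallel tensors: fall back to plain linear interpolation.
        merged = (1 - t) * a_flat + t * b_flat
    else:
        so = torch.sin(omega)
        merged = (torch.sin((1 - t) * omega) / so) * a_flat + (torch.sin(t * omega) / so) * b_flat
    return merged.reshape(a.shape).to(a.dtype)

def merge_state_dicts(state_a: dict, state_b: dict, t: float = 0.5) -> dict:
    """Hypothetical helper: merge two same-architecture checkpoints tensor by tensor."""
    return {name: slerp(t, state_a[name], state_b[name]) for name in state_a}
```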

I noticed the model gets confused over whether it's me or itself when roleplaying. Consistently the model will RP as me when it should RP as itself - it's kind of funny in an annoying way. 😂

Interesting, I have several users testing and this is the first I've heard of that issue on this variation. (I've also been testing extensively to try to get the samplers dialed in and have yet to see that.)

I was curious: would you be able to make use of Self-Play Preference Optimization for your model? It would be interesting to see what it learns from the datasets. Keep in mind, I have no idea how big an undertaking this would be. But it's fun to imagine and speculate. 🤔

Training over the base model they provided is super feasible, but setting up the pipeline and running it myself on one of the models post-training is a whole other magnitude of compute.
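
(Roughly, one self-play iteration would look something like the sketch below: sample several responses per prompt from the current checkpoint, rank them with a pairwise judge, build preference pairs, train, and repeat. The generation settings and the rank_candidates stub are placeholders, not the actual UCLA-AGI pipeline.)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3"  # base model named in this thread

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

def sample_candidates(prompt: str, k: int = 4) -> list[str]:
    """Draw k candidate responses from the current policy for one prompt."""
    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(
        input_ids,
        do_sample=True,
        temperature=1.0,
        top_p=0.9,
        max_new_tokens=512,
        num_return_sequences=k,
    )
    return [tokenizer.decode(o[input_ids.shape[-1]:], skip_special_tokens=True) for o in outputs]

def rank_candidates(prompt: str, candidates: list[str]) -> list[float]:
    """Placeholder: the SPPO recipe uses a pairwise preference model (e.g. PairRM)
    to estimate win probabilities here; swap in whatever judge is available."""
    raise NotImplementedError

# Each iteration: sample -> rank -> build preference pairs -> run the
# preference-optimization training step -> start the next iteration from the new checkpoint.
```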

You are using the L3 presets provided in the 0.72 repo, correct - not the ChatML ones from 0.5?

That's what I was thinking: continuing Self-Play Preference Optimization off of their UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3. πŸ˜‹

Yes, I don't use ChatML for LLaMa-3-8B-Instruct Fine-Tunes. πŸ˜‚

Owner • edited Jul 3

Just making sure: 0.6 was trained on ChatML (and I provided experimental presets for 0.5 in ChatML at one point), so people were using those presets with the other versions... and it was leading to some problems.
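
(For reference, the two prompt formats being mixed up look roughly like this - simplified, without BOS or system turns.)

```python
# Llama-3 Instruct formatting (what the 0.72 L3 presets follow):
llama3_prompt = (
    "<|start_header_id|>user<|end_header_id|>\n\n"
    "{user_message}<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)

# ChatML formatting (what 0.6 was trained on and the experimental 0.5 presets used):
chatml_prompt = (
    "<|im_start|>user\n"
    "{user_message}<|im_end|>\n"
    "<|im_start|>assistant\n"
)
```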

Are you on Discord? I like HuggingFace's discussion forums. However, as a place for dynamic debate and discussion, they are a bit antiquated and limited in their utility. It would be great to communicate there as well, if you're down. 🤔

We could use NeverSleep's discord. πŸ˜‹

https://discord.gg/AT5gpexk

I just saw you're already on there. My bad. πŸ˜…
