Request

#5
by isr431 - opened

This model is spectacular and is approaching Claude 3 levels of creativity/prose. I know this might be a long shot, but I was wondering if you'd consider fine-tuning Qwen 2.5 14B? I've found it's quite a bit smarter than Nemo and I really like its writing style (though it does contain GPT slop and refusals). Just thought I'd throw that out there - would love to hear what you think!

Hear hear. I think Nemo's tokenizer is intriguing, but I'd like to see what can happen with that Qwen 2.5 14b model!

Anthracite org

We can definitely try. In the meantime, you might like this finetune done by another person on SuperNova Medius 14B (a cross-architecture Qwen-Llama3 distillation done by Arcee) - https://huggingface.co/underwoods/medius-erebus-magnum-14b

No guarantees it will work for you, but in my testing it seemed pretty good.
