IT'S NOT REAL
scam?
@vihangsharma
They had a nice reply here:
https://huggingface.co/gradientai/Llama-3-8B-Instruct-Gradient-1048k/discussions/11
Hey @vihangsharma, as mentioned in the other threads, we have worked on better alignment.
@rombodawg We liked your meme!
https://huggingface.co/gradientai/Llama-3-70B-Instruct-Gradient-262k Let us know if you are interested in doing the same for 8B.
I would love to see 8B reach the same effectiveness at extremely high-context inference as the 70B. The majority of the open-source community runs modest hardware, at most an RTX 3090 with 24GB of VRAM, and even that's rare. An update to the 8B-Instruct model would be astounding.
Much appreciation to the Gradient team, thanks for this amazing model. I haven't tried it beyond 100k tokens, but I'm actively using inputs of roughly 26k to 100k tokens, including a very long system prompt, exactly the use case Mark described. Miqu is great at handling those scenarios, but its 32k context is very limiting, so I had to go back and forth with GPT-4o. Gradient 70B 262k fills that gap, and I've replaced Miqu with it. I'm now happily using Gradient's 262k to process my ~100k-token system prompt. Gradient's work opens up new possibilities.
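For anyone curious, here's roughly how I run it with Hugging Face transformers. This is a minimal sketch, not an official Gradient recipe: the dtype, the `device_map` setting, and the `system_prompt.txt` path are illustrative placeholders you'll need to adapt to your own hardware and files.

```python
# Minimal sketch: long-system-prompt inference with the 262k model.
# Assumptions (not from this thread): bf16-capable GPUs, enough VRAM to
# shard a 70B model, and a local system_prompt.txt holding the long prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gradientai/Llama-3-70B-Instruct-Gradient-262k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 hardware
    device_map="auto",           # shard across available GPUs
)

# Hypothetical file containing the very long (~100k-token) system prompt.
with open("system_prompt.txt") as f:
    long_system_prompt = f.read()

messages = [
    {"role": "system", "content": long_system_prompt},
    {"role": "user", "content": "Summarize the key constraints above."},
]

# Format with the model's Llama-3 chat template and generate.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```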