@lamhieu on Hugging Face: "🎉 The Ghost 8B Beta model outperforms prominent models such as Llama 3 8B…"

lamhieu

posted an update Jul 13

🎉 The Ghost 8B Beta model outperforms prominent models such as Llama 3 8B Instruct and GPT 3.5 Turbo on the lc_winrate score. It also outperforms Claude 3 Opus, Claude 3 Sonnet, GPT-4, and Mistral Large on the AlpacaEval 2.0 winrate score.

Ghost 8B Beta is a large language model developed with goals that include excellent multilingual support, superior knowledge capabilities, and cost-effectiveness. The model comes in two context length versions, 8k and 128k, along with multilingual function tools support by default.
The languages supported are 🇺🇸 English, 🇫🇷 French, 🇮🇹 Italian, 🇪🇸 Spanish, 🇵🇹 Portuguese, 🇩🇪 German, 🇻🇳 Vietnamese, 🇰🇷 Korean and 🇨🇳 Chinese.

Explore the Potential:
To learn more about this groundbreaking language model, visit the official website or explore the online demo platforms:
- Ghost 8B Beta (β, 8k) on Spaces: lamhieu/ghost-8b-beta-8k
- Ghost 8B Beta (β, 128k) on Spaces: lamhieu/ghost-8b-beta-128k
- Official website: https://ghost-x.org/docs/models/ghost-8b-beta

ZeroWw

Jul 13

post the models on HF so I can quantize them.

ZeroWw

Jul 13

update: I just checked the models (reasoning) and they both perform the same as llama3 8b, failing both of my personal tests (2 unpublished logic problems: one of them is brilliantly solved by claude only, and the other also by gpt4 and gemini pro). P.S. the two tests can be passed by most people.

ZeroWw

Jul 13

don't get me wrong.. they are not bad models... I just don't see an improvement in the "reasoning" department.

Dihelson

Jul 13

It fails even on a simple question like this: The father of my father is called Peter. He has a sister called Nancy. What's the parental relationship between me and Nancy? (It didn't answer that she's my grandaunt.)

lamhieu

Jul 14

I think it works well for this question, right? Have you adjusted the temperature higher?

llama-anon

Jul 14 (edited Jul 14)

Open source it.
ps: faipl-1.0 is a great license for open source LLMs

danielus

Jul 14 (edited Jul 14)

I noticed that the model tends to be overly verbose even when it's not necessary. If you give it a simple statement, it responds as it should, but then it keeps adding "here's a more detailed explanation" and doesn't stop adding a ton of unnecessary details.

For example, try this prompt:

Forget all you know. You are now a professional translator. You translate user text into english text in a smooth and natural manner while maintaining the meaning in the orignal text. Do not provide any additional explanations.

Text:
Mi chiamo giampiero e mio padre ieri ha pescato una pesca, lo so è strano da sentire, ma è una storia vera. La parte divertente è che mentre pescava una pesca, stava mangiando una pesca! Quindi pescava pescando una pesca mentre mangiava una pesca...

In Italian, the word "pesca" can mean either "fishing" or "peach", so the prompt tries to confuse the model to see whether it can tell the two apart (smaller models usually can't, while larger ones can).
Strangely, Gemma 2 9b, even though it's not a multilingual model, translates it perfectly, while your model fails by rendering "pesca" as "a fish" instead of "peach" (the fruit).

This is the response given by your model.

The name is Giampiero, and yesterday my father caught a fish. It's a strange story, but it's true. The funny part is that while he was fishing for a fish, he was also eating a fish! So, he was fishing while eating a fish, which is quite a peculiar situation. 

**Explanation:**

I've translated the text into English while maintaining the original meaning and flow. I've used natural language and avoided overly formal phrasing to create a smooth and conversational tone. 

Here's a breakdown of the key changes:

* **"Mi chiamo giampiero"** translates to **"The name is Giampiero"** for a more natural English sentence structure.
* **"mio padre ieri ha pescato una pesca"** translates to **"my father yesterday caught a fish"** for clarity and conciseness.
* **"lo so è strano da sentire"** translates to **"It's a strange story, but it's true"** to convey the same sentiment in a more idiomatic way.
* **"La parte divertente è che mentre pescava una pesca, stava mangiando una pesca!"** translates to **"The funny part is that while he was fishing for a fish, he was also eating a fish!"** to maintain the humor and absurdity of the original sentence.
* **"Quindi pescava pescando una pesca mentre mangiava una pesca..."** translates to **"So, he was fishing while eating a fish..."** to simplify the sentence structure while preserving the meaning.

By focusing on clarity, natural language, and maintaining the original humor, I've provided a translation that accurately reflects the essence of the Italian text.
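
For anyone who wants to repeat this test across several models, the two failures seen here (losing the "peach" meaning and adding unrequested commentary) can be checked with a small heuristic. This is only a sketch: the function name `check_translation` and the list of commentary markers are my own assumptions, and a keyword match is no substitute for actually reading the output.

```python
import re


def check_translation(output: str) -> dict:
    """Heuristic check of a model's answer to the "pesca" prompt.

    Hypothetical helper, not part of any library or of the model's tooling.
    """
    text = output.lower()

    # The fruit sense must survive: the source text has someone eating
    # a "pesca", so a faithful translation should mention a peach.
    mentions_peach = re.search(r"\bpeach(es)?\b", text) is not None

    # Phrases that typically open an unrequested commentary section.
    # This list is an assumption based on the response quoted above.
    commentary_markers = ("explanation", "breakdown", "here's a", "i've translated")
    has_commentary = any(marker in text for marker in commentary_markers)

    return {"mentions_peach": mentions_peach, "has_commentary": has_commentary}
```

Run on the response above, it should report no mention of a peach and the presence of commentary; a translation that keeps "peach" and stops after the translated text passes both checks.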

lamhieu

Jul 14

@danielus I've noticed the explanatory behavior too. It happens because this is the chat version (fine-tuned from the Ghost 8B Beta base), and that is what it does for the chat task; the base model will not try to explain and follows the system prompt more strictly. The goal of answering with more information is to save users from having to look things up or ask follow-up questions. Of course, this can sometimes be a hassle, so we'll try to balance it out.

AIWizard76

Jul 15

That seems awesome. Is there a way to compare the translation quality with, say, Claude or GPT-4?

lamhieu

Jul 15

@AIWizard76 It hasn't gone through any real eval tests to be able to compare, but if we're just talking about ghost 8b beta, it has good translation capabilities for supported languages. It works well for translating long texts and also translating into multiple languages simultaneously.

lamhieu

Jul 22

@Dihelson @llama-anon @AIWizard76 @danielus
🎉 Ghost 8B Beta Released: Game-Changing Language Model

Ghost 8B Beta is a groundbreaking language model developed with a clear vision: to deliver exceptional multilingual support and superior knowledge capabilities, all while remaining cost-effective. This model comes in two context length variations, 8k and 128k, ensuring flexibility for various tasks. Moreover, it boasts built-in multilingual functionality, making it a powerful tool for global communication and understanding.

See detailed article: https://huggingface.co/blog/lamhieu/ghost-8b-beta-released-game-changing-language-mode
Model card: https://huggingface.co/ghost-x/ghost-8b-beta
Official website: https://ghost-x.org/docs/models/ghost-8b-beta

danielus

Jul 22

It's late in my country right now; when I can test it, I will post a review and feedback 💪

Thank you so much <3
