New vision language model with 9x fewer image tokens, super efficient
Aligned with DPO to reduce hallucinations
Apache 2.0 license
Demo: hf.co/spaces/NexaAIDev/omnivlm-dpo-demo
Model: NexaAIDev/omnivision-968M
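The announcement says the model was aligned with DPO. As background, here is a minimal sketch of the standard DPO (Direct Preference Optimization) loss on one preference pair; the function name, arguments, and `beta=0.1` default are illustrative choices, not details from the model card.

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for a single preference pair.

    Arguments are summed log-probabilities of the chosen/rejected
    responses under the policy (pi_*) and the frozen reference (ref_*).
    beta scales how strongly the policy is pushed away from the reference.
    """
    # Difference of policy-vs-reference log-ratios for the two responses
    logits = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # Logistic (Bradley-Terry) loss: low when the chosen response is favored
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When the policy favors the chosen response more than the reference does, `logits` is positive and the loss drops below `log(2)`; at indifference the loss is exactly `log(2)`.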
How long did it take to reply, and what are your context window limits? What model type?
It takes 3-5 seconds to reply when the prompt is longer than 30-50 words on average, and latency increases linearly with the number of tokens in the prompt. The one in the picture is Llama 3 1B, but the one I'm using right now is Arco 2, which is a Llama-based model and can't retain any kind of general knowledge. I noticed with Qwen 2 (and later confirmed with Meta's models) that you don't need a lot of parameters to get general knowledge; you just need tons of data.
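The latency behavior described above (3-5 s for 30-50 word prompts, growing linearly with prompt tokens) can be sketched as a simple linear model. The constants below are hypothetical values chosen only so the output falls in the reported range; they are not measurements from the poster's setup.

```python
def estimate_latency(n_tokens, base_s=1.0, per_token_s=0.08):
    """Toy linear latency model: fixed overhead plus per-token cost.

    base_s and per_token_s are illustrative constants picked to land in
    the 3-5 s range the reply mentions for ~30-50 word prompts.
    """
    return base_s + per_token_s * n_tokens
```

With these assumed constants, a ~40-token prompt comes out around 4.2 s, consistent with the 3-5 s figure, and doubling the prompt length adds a fixed increment, which is what "increases linearly" implies.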
As a model tweaker, it's such a huge relief to know we'll have HF for years to come.
@karpathy , @teknium and @KnutJaegersberg are cool guys to follow, but seriously, what happened to @TheBloke ?
congrats on this achievement!
Sorry, forgot to add it. It's added now as an Apache 2.0 license.