fdaudens (Florent Daudens)

posted an update about 17 hours ago

Post

455

🚀 DeepSeek just dropped DeepSeek-R1-Lite-Preview with “reasoning” capacity.

- Matches OpenAI o1-preview on AIME & MATH benchmarks.
- Transparent process output
- Open-source model to be released

Try it out: https://chat.deepseek.com/

reacted to fracapuano's post with ❤️ 1 day ago

Post

932

Sharing what we have built over the course of the weekend at the @llamameta hackathon, by Cerebral Valley in London 🇬🇧 👇

@gabrycina @calebgcc and I competed with 200+ participants and 50+ teams in a 24-hour sprint centered around hacking for impact! We focused on applications of robotics for those in need of assisted living, aiming to enable greater autonomy and make robotics more accessible in everyday life.

Complete list of assets 👇
🤗 trained robotics policies
v1:
- fracapuano/moss-pills
- fracapuano/moss-cup
v2:
- fracapuano/meta-grasp

🤗 datasets
v1:
- fracapuano/pills
- fracapuano/cup
v2:
- fracapuano/cupim

You can find a live demo of our submission at: https://x.com/_fracapuano/status/1858102728691458554

If you want to know more about how we collected 100GB+ of data, trained multiple RL-policies using @lerobot and used Llama-3.2 models to handle user interactions and switch between tasks, go ahead and have a look! Also, don't be a stranger, and reach out 🦾

Our project is fully open-source, for the community (and ourselves, 👨‍🍳) to build! A huge thank you to @cadene for the help (and the robot 🤭) - truly feeling these hugs-vibes 🤗 , and to @thomwolf and @clem for sharing our work across

Little extra:
➡️ Our 🧠EEG waves🧠-based control of the 🦾robotic arm🦾

posted an update 1 day ago

Post

684

My new favorite bookmark: AnyChat. The ultimate AI Swiss Army knife that lets you switch between ChatGPT, Gemini, Claude, LLaMA, Grok & more—all in one place!

Really cool work by @akhaliq

akhaliq/anychat

replied to cfahlgren1's post 2 days ago

The Crown!

posted an update 3 days ago

Post

1507

🚀 @Qwen just dropped 2.5-Turbo!

1M token context (that's entire "War and Peace"!) + 4.3x faster processing speed. Same price, way more power 🔥

Check out the demo: Qwen/Qwen2.5-Turbo-1M-Demo

#QWEN

posted an update 6 days ago

Post

1153

🪄 MagicQuill: AI that reads your mind for image edits! Point at what bugs you, and it suggests the perfect fixes. No more manual editing headaches. Try it here: AI4Editing/MagicQuill

reacted to merve's post with 🔥 6 days ago

Post

1920

Amazing past days at open ML, it's raining coding models, let's have a recap 🌧️ Find all models and datasets here merve/nov-15-releases-67372d0ebdc354756a52ecd0

Models
💻 Coding: Qwen team released two Qwen2.5-Coder checkpoints of 32B and 7B. Infly released OpenCoder: 1.5B and 8B coding models with instruction SFT'd versions and their datasets! 💗

🖼️ Image/Video Gen: Alibaba vision lab released In-context LoRA -- 10 LoRA models on different themes based on Flux. Also, Mochi, the SOTA video generation model with an Apache 2.0 license, is now natively supported in diffusers 👏

🖼️ VLMs/Multimodal: NexaAIDev released Omnivision 968M, a new vision language model aligned with DPO to reduce hallucinations; it also comes with GGUF checkpoints 👏 Microsoft released LLM2CLIP, a new CLIP-like model with a longer context window, allowing complex text inputs and better search

🎮 AGI?: Etched released Oasis 500M, a diffusion based open world model that takes keyboard input and outputs gameplay 🤯

Datasets
Common Corpus: A text dataset of 2T tokens under a permissive license, covering EN/FR across various sources: code, science, finance, culture 📖

reacted to AdinaY's post with 🔥 6 days ago

Post

2479

Let’s dive into the exciting releases from the Chinese community last week 🔥🚀
More details 👉 https://huggingface.co/zh-ai-community

Code model:
✨Qwen 2.5 coder by Alibaba Qwen
Qwen/qwen25-coder-66eaa22e6f99801bf65b0c2f
✨OpenCoder by InflyAI - Fully open code model🙌
infly/opencoder-672cec44bbb86c39910fb55e

Image model:
✨Hunyuan3D-1.0 by Tencent
tencent/Hunyuan3D-1

MLLM:
✨JanusFlow by DeepSeek
deepseek-ai/JanusFlow-1.3B
✨Mono-InternVL-2B by OpenGVlab
OpenGVLab/Mono-InternVL-2B

Video model:
✨CogVideoX 1.5 by ChatGLM
THUDM/CogVideoX1.5-5B-SAT

Audio model:
✨Fish Agent by FishAudio
fishaudio/fish-agent-v0.1-3b

Dataset:
✨OPI dataset by BAAIBeijing
BAAI/OPI

posted an update 7 days ago

Post

737

@maxiw just created a dataset of the posts on the Hub and gathered some stats: https://huggingface.co/posts/maxiw/833289193510507
The ❤️ is winning on the Hub!

reacted to maxiw's post with 🔥❤️ 8 days ago

Post

4485

I was curious to see what people post here on HF so I created a dataset with all HF Posts: maxiw/hf-posts

Some interesting stats:

Top 5 Authors by Total Impressions:
-----------------------------------
@merve : 171,783 impressions (68 posts)
@fdaudens : 135,253 impressions (81 posts)
@singhsidhukuldeep : 122,591 impressions (81 posts)
@akhaliq : 119,526 impressions (78 posts)
@MonsterMMORPG : 112,500 impressions (45 posts)

Top 5 Users by Number of Reactions Given:
----------------------------------------
@osanseviero : 1278 reactions
@clem : 910 reactions
@John6666 : 899 reactions
@victor : 674 reactions
@samusenps : 655 reactions

Top 5 Most Used Reactions:
-------------------------
❤️: 7048 times
🔥: 5921 times
👍: 4856 times
🚀: 2549 times
🤗: 2065 times
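
Rankings like the ones above can be reproduced with a few lines of Python. The sketch below is only an illustration: the records are made up, and the field names (`author`, `impressions`, `reactions`) are assumptions, not the actual schema of maxiw/hf-posts, so they would need adjusting to the real columns.

```python
from collections import Counter, defaultdict

# Hypothetical records; field names are assumptions, not the real dataset schema.
posts = [
    {"author": "alice", "impressions": 1200, "reactions": ["❤️", "🔥"]},
    {"author": "bob", "impressions": 800, "reactions": ["❤️"]},
    {"author": "alice", "impressions": 300, "reactions": ["🚀", "❤️"]},
]

def top_authors_by_impressions(posts, n=5):
    """Return (author, total_impressions, post_count) sorted by total impressions."""
    totals = defaultdict(int)
    counts = Counter()
    for post in posts:
        totals[post["author"]] += post["impressions"]
        counts[post["author"]] += 1
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(author, total, counts[author]) for author, total in ranked[:n]]

def most_used_reactions(posts, n=5):
    """Return the n most frequent reactions across all posts."""
    tally = Counter(r for post in posts for r in post["reactions"])
    return tally.most_common(n)

for author, total, count in top_authors_by_impressions(posts):
    print(f"@{author} : {total:,} impressions ({count} posts)")
for reaction, times in most_used_reactions(posts):
    print(f"{reaction}: {times} times")
```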

9 replies

posted an update 8 days ago

Post

1817

Been reading about the pushback against the "bigger models = better AI" narrative today.

@thomwolf tackled this head on at Web Summit and highlighted how important small models are (and why closed-source companies haven't pushed for this 😬). They're crushing it: today's 1B parameter models outperform last year's 10B models.

Fascinating to hear him talk about the secret sauce behind this approach.

posted an update 9 days ago

Post

1816

Fascinating point from @thomwolf at Web Summit: AI misuse (deepfakes, fake news) is actually easier with closed models than with open-source ones.

This challenges the common narrative that open-source AI is inherently more dangerous. The reality is more nuanced - while we may think open source is technically easier to misuse, closed models' accessibility and product-focused design appear to be driving more actual harm.

Important context for current AI safety discussions and regulation debates.

Do you agree? 👇

1 reply

posted an update 10 days ago

Post

2220

🤯 AI progress keeps blowing my mind! Just experienced Qwen's new Coder demo - built a complete flashcard web app with a single prompt. The results are incredible!

This demo is part of the new Qwen2.5 Coder family (0.5B to 32B models), surpassing/matching GPT-4o and Claude 3.5 Sonnet across multiple coding benchmarks.

- 128K context window for 14B/32B models
- Drop-in replacement for GPT-4 in Cursor & Artifacts
- Models on the Hub under Apache 2.0 license

🔗 Try it yourself: Qwen/Qwen2.5-Coder-Artifacts

This is democratization of coding in real-time. Excited to see AI tools becoming more capable and accessible.

What would you build with this? Share your ideas below! 👇

#AI #Programming #TechInnovation #OpenSource #SoftwareDevelopment

1 reply

posted an update 16 days ago

Post

2358

Just tested Argilla's new data annotation feature - it's a game changer for AI project quality.

Upload CSVs, work with published datasets, or improve existing ones directly on the Hugging Face Hub. Setup took < 2 minutes, no code needed (see example below where I selected a dataset to classify tweets in categories).

Real world impact: Missing in Chicago won a Pulitzer using a similar approach - 200 volunteers labeled police misconduct files to train their model. That's the power of good data annotation.

Three immediate use cases I see:
- Build collaborative training sets with your community (surprisingly underused in AI journalism)
- Turn your website chatbot logs into high-quality fine-tuning data
- Compare generated vs published content (great for SEO headlines)

Works for solo projects or teams up to 100 people. All integrated with the Hugging Face Hub for immediate model training.

Interesting to see tools like this making data quality more accessible. Data quality is the hidden driver of AI success that we don't talk about enough.

- Check out the blogpost: https://huggingface.co/blog/argilla-ui-hub
- And the quickstart guide: https://docs.argilla.io/latest/getting_started/quickstart/

reacted to m-ric's post with 🚀 16 days ago

Post

2483

𝗛𝘂𝗻𝘆𝘂𝗮𝗻-𝗟𝗮𝗿𝗴𝗲 𝗷𝘂𝘀𝘁 𝗿𝗲𝗹𝗲𝗮𝘀𝗲𝗱 𝗯𝘆 𝗧𝗲𝗻𝗰𝗲𝗻𝘁: 𝗟𝗮𝗿𝗴𝗲𝘀𝘁 𝗲𝘃𝗲𝗿 𝗼𝗽𝗲𝗻 𝗠𝗼𝗘 𝗟𝗟𝗠, 𝗼𝗻𝗹𝘆 𝟱𝟮𝗕 𝗮𝗰𝘁𝗶𝘃𝗲 𝗽𝗮𝗿𝗮𝗺𝗲𝘁𝗲𝗿𝘀 𝗯𝘂𝘁 𝗯𝗲𝗮𝘁𝘀 𝗟𝗟𝗮𝗠𝗔 𝟯.𝟭-𝟰𝟬𝟱𝗕 𝗼𝗻 𝗺𝗼𝘀𝘁 𝗮𝗰𝗮𝗱𝗲𝗺𝗶𝗰 𝗯𝗲𝗻𝗰𝗵𝗺𝗮𝗿𝗸𝘀 🚀

⚡ Mixture of Experts (MoE) architecture: 389B parameters in total, but only 52B are activated for any input

🧪 Trained on 7T tokens, including 1.5T tokens of synthetic data

🏗️ Architecture: Novel "recycle routing" prevents token dropping when experts are overloaded

📊 Great benchmark results: Surpasses Llama-3.1-405B-Instruct in most benchmarks although it has 8x fewer active parameters
‣ Impressive perf on MATH: 77.4

🐋 Large context length: up to 256K tokens

🔒 License:
‣ Commercial use allowed, except if your products have >100M monthly active users
‣ No access in the EU

🤗 Model weights available on HF!

Read the full paper here 👉 Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent (2411.02265)
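
The "8x fewer active parameters" figure follows directly from the numbers quoted above. A quick back-of-the-envelope check, using only the parameter counts from this post (the function name is made up for illustration):

```python
def moe_ratios(total_b=389, active_b=52, dense_b=405):
    """Compare active MoE parameters to the total and to a dense baseline.

    All values are in billions of parameters, taken from the post above.
    """
    active_fraction = active_b / total_b  # share of the MoE used per input
    dense_ratio = dense_b / active_b      # dense model size vs. active MoE params
    return active_fraction, dense_ratio

active_fraction, dense_ratio = moe_ratios()
print(f"Active share of Hunyuan-Large: {active_fraction:.1%}")
print(f"405B dense vs. 52B active: {dense_ratio:.1f}x")
```

So roughly 13% of the weights are used for any given input, and the 405B dense model has about 7.8 times as many active parameters, which rounds to the "8x" in the post.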

posted an update 20 days ago

Post

1192

🎙️ "We need digital sobriety." @sasha challenges Big Tech's race for nuclear energy on BBC AI Decoded. Instead of pursuing more power, shouldn't we first ask if we really need AI everywhere?

Such an eye-opening chat! Check it out here: https://www.youtube.com/watch?v=3wAduy52mGc

posted an update 21 days ago

Post

2346

First AI Journalism Lab cohort just wrapped - endless inspiration for newsrooms:
- Ludwig Siegele built an AI style checker for The Economist
- Rodney Gibbs created a tool helping small newsrooms analyze stories through user needs
- Monsur Hussain developed an AI trend-monitoring system for fact-checking WhatsApp claims
- David Cohn built a system for analyzing audience engagement
- Clare Spencer crafted video personas with AI

The insights on adoption during the discussion were fascinating - their approach really resonated with me. Instead of forcing AI tools onto teams, they emphasized getting skeptics involved early in testing and creating safe spaces for open discussion. Start small with enthusiastic participants, build a community of internal AI champions, and focus on solving specific problems rather than pushing for adoption.

As a coach, I also learned a lot. My 5 key takeaways:
- Newsrooms are bursting with AI x journalism innovation
- Internal alignment > technical challenges. Strong dev/PM relationships = magic
- Early prototyping + user involvement = better adoption. Set realistic expectations & embrace feedback
- Cross-newsroom collaboration supercharges innovation
- Great products can emerge in weeks with proper scoping

See the projects: https://www.youtube.com/watch?v=5PMxMDfDI_0

Kudos to Kyle Plantz, Nikita Roy, and the Craig Newmark Graduate School of Journalism at CUNY for making it happen!

1 reply

posted an update 23 days ago

Post

2260

🔍 NYT leveraged AI to investigate election interference by analyzing 400+ hours of recorded meetings - that's 5M words of data!

AI spotted patterns, humans verified facts. Every AI-flagged quote was manually verified against source recordings. Really appreciate that they published their full methodology - transparency matters when using AI in journalism.

A perfect blend of tech & journalism.

The future of journalism isn't robots replacing reporters - it's AI helping humans process massive datasets more efficiently. Sometimes the most powerful tech solutions are the least flashy ones.

Read the article: https://www.nytimes.com/interactive/2024/10/28/us/politics/inside-the-movement-behind-trumps-election-lies.html?unlocked_article_code=1.Vk4.ucv9.dbHVquTQaf0G&smid=nytcore-ios-share

reacted to albertvillanova's post with 🚀 23 days ago

Post

3089

🚀 Exciting update! You can now compare multiple models side-by-side with the Hugging Face Open LLM Comparator! 📊

open-llm-leaderboard/comparator

Dive into multi-model evaluations, pinpoint the best model for your needs, and explore insights across top open LLMs all in one place. Ready to level up your model comparison game?

Articles

Bringing Open-Source Models to Spreadsheets 🚀

Ethics and Society Newsletter #6: Building Better AI: The Importance of Data Quality
