Ann Huang PRO

erinys

AI & ML interests

None yet

Recent Activity

reacted to reach-vb's post with πŸš€ 1 day ago
reacted to reach-vb's post with πŸ”₯ 1 day ago
reacted to maxiw's post with πŸ€— 8 days ago

Articles

Organizations

erinys's activity

reacted to reach-vb's post with πŸš€πŸ”₯ 1 day ago
view post
Post
3918
What a brilliant week for Open Source AI!

Qwen 2.5 Coder by Alibaba - 0.5B / 1.5B / 3B / 7B / 14B/ 32B (Base + Instruct) Code generation LLMs, with 32B tackling giants like Gemnini 1.5 Pro, Claude Sonnet
Qwen/qwen25-coder-66eaa22e6f99801bf65b0c2f

LLM2CLIP from Microsoft - Leverage LLMs to train ultra-powerful CLIP models! Boosts performance over the previous SOTA by ~17%
microsoft/llm2clip-672323a266173cfa40b32d4c

Athene v2 Chat & Agent by NexusFlow - SoTA general LLM fine-tuned from Qwen 2.5 72B excels at Chat + Function Calling/ JSON/ Agents
Nexusflow/athene-v2-6735b85e505981a794fb02cc

Orca Agent Instruct by Microsoft - 1 million instruct pairs covering text editing, creative writing, coding, reading comprehension, etc - permissively licensed
microsoft/orca-agentinstruct-1M-v1

Ultravox by FixieAI - 70B/ 8B model approaching GPT4o level, pick any LLM, train an adapter with Whisper as Audio Encoder
reach-vb/ultravox-audio-language-model-release-67373b602af0a52b2a88ae71

JanusFlow 1.3 by DeepSeek - Next iteration of their Unified MultiModal LLM Janus with RectifiedFlow
deepseek-ai/JanusFlow-1.3B

Common Corpus by Pleais - 2,003,039,184,047 multilingual, commercially permissive and high quality tokens!
PleIAs/common_corpus

I'm sure I missed a lot, can't wait for the next week!

Put down in comments what I missed! πŸ€—
reacted to maxiw's post with πŸ€—β€οΈ 8 days ago
view post
Post
4483
I was curious to see what people post here on HF so I created a dataset with all HF Posts: maxiw/hf-posts

Some interesting stats:

Top 5 Authors by Total Impressions:
-----------------------------------
@merve : 171,783 impressions (68 posts)
@fdaudens : 135,253 impressions (81 posts)
@singhsidhukuldeep : 122,591 impressions (81 posts)
@akhaliq : 119,526 impressions (78 posts)
@MonsterMMORPG : 112,500 impressions (45 posts)

Top 5 Users by Number of Reactions Given:
----------------------------------------
@osanseviero : 1278 reactions
@clem : 910 reactions
@John6666 : 899 reactions
@victor : 674 reactions
@samusenps : 655 reactions

Top 5 Most Used Reactions:
-------------------------
❀️: 7048 times
πŸ”₯: 5921 times
πŸ‘: 4856 times
πŸš€: 2549 times
πŸ€—: 2065 times
Β·
posted an update about 1 month ago
reacted to jsulz's post with πŸ”₯ about 1 month ago
view post
Post
1638
The Hugging Face Hub hosts over 1.5M Model, Dataset, and Space repositories. To scale to 10M+, the XetHub team (https://huggingface.co/xet-team) is replacing Git LFS with a new technology that improves storage and transfer capabilities with some future developer experience benefits to boot.

Thanks to @yuchenglow and @port8080 (for their analysis covering LFS usage from March 2022–Sept 2024), we now have insights into what we’re storing. Check out the Gradio app to explore:
- Storage growth over time
- File types over all repositories
- Some simple optimizations we're investigating

xet-team/lfs-analysis
replied to their post about 2 months ago
view reply

This is great feedback @John6666 - and I've seen your suggestions in the other thread as well. As a non-ML engineer myself, it's been really interesting to explore HF with fresh eyes! We're doing some early exploration on HF understandability and discoverability in our team - would you be open to chatting sometime about potential approaches? We'd love to get your feedback!

posted an update about 2 months ago
view post
Post
1937
We shut down XetHub today after almost 2 years. What we learned from launching our Git-scaled product from scratch:
- Don't make me change my workflow
- Data inertia is real
- ML best practices are still evolving

Closing the door on our public product lets us focus on our new goal of scaling HF Hub's storage backend to improve devX for a larger community. We'd love to hear your thoughts on what experiences we can improve!

Read the full post: https://xethub.com/blog/shutting-down-xethub-learnings-and-takeaways
Β·
reacted to merve's post with πŸ”₯ about 2 months ago
view post
Post
3981
If you feel like you missed out for ECCV 2024, there's an app to browse the papers, rank for popularity, filter for open models, datasets and demos πŸ“

Get started at ECCV/ECCV2024-papers ✨
posted an update about 2 months ago
view post
Post
1370
We did a thing! Eight weeks into our Hugging Face tenure, we can demo a round-trip of Xet-backed files from our local machine to a prod Hugging Face S3 bucket and back. πŸš€

It’s been exciting to dive into how the Hub is built and design our steel thread through the infrastructure. Now that the thread is up, we can kick off project Capacious Extremis πŸͺ„ to add all the other goodies: authentication, authorization, deduplication, privacy, and more.

What does this mean for you? You’re one step closer to ⚑ faster downloads, uploads, and iterative development on Hugging Face Hub!
This is our first step toward replacing Git LFS as the Hub's storage backend: https://huggingface.co/blog/xethub-joins-hf

Check out the demo on LinkedIn to see the transfer in action: https://www.linkedin.com/posts/annux_youve-heard-of-blue-steel-but-have-activity-7245062126535405568-3cvJ
reacted to jsulz's post with πŸš€ about 2 months ago
view post
Post
1910
In August, the XetHub team joined Hugging Face
- https://huggingface.co/blog/xethub-joins-hf - and we’ve been rolling up our sleeves to bring the best of both worlds together. We started with a deep dive into the current state of files stored with Git LFS on the Hub.

Getting this information was no small feat. We had to:
* Analyze a complete database dump of all repositories and files stored in Git LFS across Hugging Face.
* Parse through metadata on file sizes and types to accurately map the storage breakdown across Spaces, Models, and Datasets.

You can read more about the findings (with some jaw-dropping stats + charts) here https://www.linkedin.com/feed/update/urn:li:activity:7244486280351285248