Lucain Pouget PRO
AI & ML interests
Articles
Organizations
Wauplin's activity
1,000 spots available first-come first serve with some surprises during the stream!
You can register and add to your calendar here: https://streamyard.com/watch/JS2jHsUP3NDM
We've just released ๐๐๐๐๐๐๐๐๐๐๐_๐๐๐ v0.25.0 and it's packed with powerful new features and improvements!
โจ ๐ง๐ผ๐ฝ ๐๐ถ๐ด๐ต๐น๐ถ๐ด๐ต๐๐:
โข ๐ ๐จ๐ฝ๐น๐ผ๐ฎ๐ฑ ๐น๐ฎ๐ฟ๐ด๐ฒ ๐ณ๐ผ๐น๐ฑ๐ฒ๐ฟ๐ with ease using
huggingface-cli upload-large-folder
. Designed for your massive models and datasets. Much recommended if you struggle to upload your Llama 70B fine-tuned model ๐คกโข ๐ ๐ฆ๐ฒ๐ฎ๐ฟ๐ฐ๐ต ๐๐ฃ๐: new search filters (gated status, inference status) and fetch trending score.
โข โก๐๐ป๐ณ๐ฒ๐ฟ๐ฒ๐ป๐ฐ๐ฒ๐๐น๐ถ๐ฒ๐ป๐: major improvements simplifying chat completions and handling async tasks better.
Weโve also introduced tons of bug fixes and quality-of-life improvements - thanks to the awesome contributions from our community! ๐ช
๐ก Check out the release notes: Wauplin/huggingface_hub#8
Want to try it out? Install the release with:
pip install huggingface_hub==0.25.0
Thanks for the ping @clem !
This documentation is more recent regarding HfApi
(the Python client). You have methods like model_info
and list_models
to get details about models (and similarly with datasets and Spaces). In addition to the package reference, we also have a small guide on how to use it.
Otherwise, if you are interested in the HTTP endpoint to build your requests yourself, here is the API reference.
Depends what you want to do. We have full documentation here: https://huggingface.co/docs/huggingface_hub/index. You can find many guides showing you how to use the library.
Are you referring to Agents in transformers
? If yes, here is the docs about it: https://huggingface.co/docs/transformers/agents. Regarding tools, TGI supports them and the InferenceClient from huggingface_hub as well, meaning you can pass tools to chat_completion
(see "Example using tools:" section in https://huggingface.co/docs/huggingface_hub/v0.24.0/en/package_reference/inference_client#huggingface_hub.InferenceClient.chat_completion). These tools parameters were already available on huggingface_hub 0.23.x.
Hope this answers your question :)
Exciting updates include:
โก InferenceClient is now a drop-in replacement for OpenAI's chat completion!
โจ Support for response_format, adapter_id , truncate, and more in InferenceClient
๐พ Serialization module with a save_torch_model helper that handles shared layers, sharding, naming convention, and safe serialization. Basically a condensed version of logic scattered across safetensors, transformers , accelerate
๐ Optimized HfFileSystem to avoid getting rate limited when browsing HuggingFaceFW/fineweb
๐จ HfApi & CLI improvements: prevent empty commits, create repo inside resource group, webhooks API, more options in the Search API, etc.
Check out the full release notes for more details:
Wauplin/huggingface_hub#7
๐
I asked 8 LLMs to "Tell me a bedtime story about bears and waffles."
Claude 3.5 Sonnet and GPT-4o gave me the worst stories: no conflict, no moral, zero creativity.
In contrast, smaller models were quite creative and wrote stories involving talking waffle trees and bears ostracized for their love of waffles.
Here you can see a comparison between Claude 3.5 Sonnet and NeuralDaredevil-8B-abliterated. They both start with a family of bears but quickly diverge in terms of personality, conflict, etc.
I mapped it to the hero's journey to have some kind of framework. Prompt engineering can definitely help here, but it's still disappointing that the larger models don't create better stories right off the bat.
Do you know why smaller models outperform the frontier models here?
Mostly that it's better integrated with HF services. If you pass a model_id
you can use the serverless Inference API without setting an base_url
. No need either to pass an api_key
if you are already logged in (with $HF_TOKEN
environment variable or huggingface-cli login
). If you are an Inference Endpoint user (i.e. deploying a model using https://ui.endpoints.huggingface.co/), you get a seamless integration to make requests to it with URL already configured. Finally, you are assured that the client will stay up to date with latest updates in TGI/Inference API/Inference Endpoints.
Why use the InferenceClient?
๐ Seamless transition: keep your existing code structure while leveraging LLMs hosted on the Hugging Face Hub.
๐ค Direct integration: easily launch a model to run inference using our Inference Endpoint service.
๐ Stay Updated: always be in sync with the latest Text-Generation-Inference (TGI) updates.
More details in https://huggingface.co/docs/huggingface_hub/main/en/guides/inference#openai-compatibility
I'm Alex, I'm 16, I've been an internship at Hugging Face for a little over a week and I've already learned a lot about using and prompting LLM models. With @victor as tutor I've just finished a space that analyzes your feelings by prompting an LLM chat model. The aim is to extend it so that it can categorize hugging face posts.
alex-abb/LLM_Feeling_Analyzer
Weโre embracing a larger mission, becoming part of a brilliant and kind team and a shared vision about the future of AI.
Over the past year, weโve been collaborating with Hugging Face on countless projects: launching partner of Docker Spaces, empowering the community to clean Alpaca translations into Spanish and other languages, launching argilla/notus-7b-v1 building on Zephyrโs learnings, the Data is Better Together initiative with hundreds of community contributors, or releasing argilla/OpenHermesPreferences, one of the largest open preference tuning datasets
After more than 2,000 Slack messages and over 60 people collaborating for over a year, it already felt like we were part of the same team, pushing in the same direction. After a week of the smoothest transition you can imagine, weโre now the same team.
To those of you whoโve been following us, this wonโt be a huge surprise, but it will be a big deal in the coming months. This acquisition means weโll double down on empowering the community to build and collaborate on high quality datasets, weโll bring full support for multimodal datasets, and weโll be in a better place to collaborate with the Open Source AI community. For enterprises, this means that the Enterprise Hub will unlock highly requested features like single sign-on and integration with Inference Endpoints.
As a founder, I am proud of the Argilla team. We're now part of something bigger and a larger team but with the same values, culture, and goals. Grateful to have shared this journey with my beloved co-founders Paco and Amรฉlie.
Finally, huge thanks to the Chief Llama Officer @osanseviero for sparking this and being such a great partner during the acquisition process.
Would love to answer any questions you have so feel free to add them below!
ModelHubMixin is a class developed by HF to integrate AI models with the hub with ease and it comes with 3 methods :
* save_pretrained
* from_pretrained
* push_to_hub
Shoutout to @nielsr , @Wauplin and everyone else on HF for their awesome work ๐ค
If you are not familiar with ModelHubMixin and you are looking for extra resources you might consider :
* docs: https://huggingface.co/docs/huggingface_hub/main/en/package_reference/mixins
๐blog about training models with the trainer API and using ModelHubMixin: https://huggingface.co/blog/not-lain/trainer-api-and-mixin-classes
๐GitHub repo with pip integration: https://github.com/not-lain/PyTorchModelHubMixin-template
๐basic guide: https://huggingface.co/posts/not-lain/884273241241808
They are used to seamlessly integrate your AI model with huggingface and to save/ load your model easily ๐
1๏ธโฃ make sure you're using the appropriate library version
pip install -qU "huggingface_hub>=0.22"
2๏ธโฃ inherit from the appropriate class
from huggingface_hub import PyTorchModelHubMixin
from torch import nn
class MyModel(nn.Module,PyTorchModelHubMixin):
def __init__(self, a, b):
super().__init__()
self.layer = nn.Linear(a,b)
def forward(self,inputs):
return self.layer(inputs)
first_model = MyModel(3,1)
4๏ธโฃ push the model to the hub (or use save_pretrained method to save locally)
first_model.push_to_hub("not-lain/test")
5๏ธโฃ Load and initialize the model from the hub using the original class
pretrained_model = MyModel.from_pretrained("not-lain/test")
Exciting updates include:
๐ Seamless download to local dir!
๐ก Grammar and Tools in InferenceClient!
๐ Documentation full translated to Korean!
๐ฅ User API: get likes, upvotes, nb of repos, etc.!
๐งฉ Better model cards and encoding for ModelHubMixin!
Check out the full release notes for more details:
Wauplin/huggingface_hub#6
๐
First, google/codegemma-release-66152ac7b683e2667abdee11 - a new set of code-focused Gemma models at 2B and 7B, in both pretrained and instruction-tuned variants. These exhibit outstanding performance on academic benchmarks and (in my experience) real-life usage. Read more in the excellent HuggingFace blog: https://huggingface.co/blog/codegemma
Second, ( google/recurrentgemma-release-66152cbdd2d6619cb1665b7a), which is based on the outstanding Google DeepMind research in Griffin: https://arxiv.org/abs/2402.19427. RecurrentGemma is a research variant that enables higher throughput and vastly improved memory usage. We are excited about new architectures, especially in the lightweight Gemma sizes, where innovations like RecurrentGemma can scale modern AI to many more use cases.
For details on the launches of these models, check out our launch blog -- and please do not hesitate to send us feedback. We are excited to see what you build with CodeGemma and RecurrentGemma!
Huge thanks to the Hugging Face team for helping ensure that these models work flawlessly in the Hugging Face ecosystem at launch!