Artples (Artur Lauche)

reacted to their post with 🚀 3 months ago

Post

2393

Looking for a combination of speed and quality? Look no further! I've created a space that merges Open WebUI's excellent interface and features with the lightning-fast performance of the Groq API. Experience top-tier models in no time. Try it out for free here:
L-AI/groq-chat

"A big thank you to Groq for providing their fantastic API at no cost!"

posted an update 3 months ago

Post

2393

Looking for a combination of speed and quality? Look no further! I've created a space that merges Open WebUI's excellent interface and features with the lightning-fast performance of the Groq API. Experience top-tier models in no time. Try it out for free here:
L-AI/groq-chat

"A big thank you to Groq for providing their fantastic API at no cost!"

replied to Smoke666's post 5 months ago

From the code, it looks like he uses Llama 70b hosted with Groq with following parameters: max_tokens=1024, temperature=1.3 and as a system prompt: You are a useful assistant. You reply with efficient answers.

reacted to Taylor658's post with 👀 5 months ago

Post

4229

Luma AI has just launched Dream Machine, a Sora and Kling AI-like tool that generates videos from simple text and images. 🎥
Dream Machine is out of beta and offers a free tier to test it out.

I tried this extremely simple prompt with the pic below and thought the capture of my prompt into a drone camera-like video was decent:

You are a drone operator. Create a 30-second video from a drone heading eastbound over the western suburbs of Bismarck, North Dakota, looking east towards the city on an overcast summer evening during the golden hour from an altitude of 200 ft.

Dream Machine also has a paid tier. However, like its paid tier text-to-image brethren from 2023 (who all fared EXTREMELY badly once good text-to-image capabilities became the norm in open and closed source LLMs), time will tell if the pay tier model will work for text and image to video. ⏳

This will be evident in 3 to 5 months once GPT-5, Gemini-2, Mistral-9, Llama 4, et al., all models with enhanced multimodal capabilities, are released. 🚀

posted an update 5 months ago

Post

2720

I've tried out my new Space for copying Websites with Gemini 1.5 Flash and i gave it a image of Huggingchat. The results were interesting, but you can see it for yourself.

Space:
L-AI/Gemini-UI-Generator

Website-Demo:
https://leunos.com/hf-chat-fake

Screenshot:

1 reply

·

reacted to Taylor658's post with 🤗 5 months ago

Post

2020

huggingface, SakanaAILabs and @arcee_ai are sponsoring a Model Merging Competition with really sweet 💰cash prizes💰 at the 2024 NeurIPSConf! (https://neurips.cc) 🎉

Submissions are now open and will remain open until September 2024. 🚀

🔗 Register here: https://llm-merging.github.io/
🗣️ Join the Discord discussion: https://discord.com/invite/dPBHEVnV

1 reply

·

replied to their post 6 months ago

It's already on HF:
Models: https://huggingface.co/collections/google/paligemma-release-6643a9ffbf57de2ae0448dda
Demonstration: https://huggingface.co/spaces/google/paligemma-hf

posted an update 6 months ago

Post

1256

Hello everyone,

I wanted to share some exciting news: Google has just launched PaliGemma, a new Gemma Model which is multimodal and has 3 billion parameters.

What do you all think about this development? Are you as intrigued by its potential as I am?

4 replies

·

reacted to thomwolf's post with ❤️ 7 months ago

Post

4777

A Little guide to building Large Language Models in 2024

This is a post-recording of a 75min lecture I gave two weeks ago on how to train a LLM from scratch in 2024. I tried to keep it short and comprehensive – focusing on concepts that are crucial for training good LLM but often hidden in tech reports.

In the lecture, I introduce the students to all the important concepts/tools/techniques for training good performance LLM:
* finding, preparing and evaluating web scale data
* understanding model parallelism and efficient training
* fine-tuning/aligning models
* fast inference

There is of course many things and details missing and that I should have added to it, don't hesitate to tell me you're most frustrating omission and I'll add it in a future part. In particular I think I'll add more focus on how to filter topics well and extensively and maybe more practical anecdotes and details.

Now that I recorded it I've been thinking this could be part 1 of a two-parts series with a 2nd fully hands-on video on how to run all these steps with some libraries and recipes we've released recently at HF around LLM training (and could be easily adapted to your other framework anyway):
*datatrove for all things web-scale data preparation: https://github.com/huggingface/datatrove
*nanotron for lightweight 4D parallelism LLM training: https://github.com/huggingface/nanotron
*lighteval for in-training fast parallel LLM evaluations: https://github.com/huggingface/lighteval

Here is the link to watch the lecture on Youtube: https://www.youtube.com/watch?v=2-SPH9hIKT8
And here is the link to the Google slides: https://docs.google.com/presentation/d/1IkzESdOwdmwvPxIELYJi8--K3EZ98_cL6c5ZcLKSyVg/edit#slide=id.p

Enjoy and happy to hear feedback on it and what to add, correct, extend in a second part.

2 replies

·

Artur Lauche

AI & ML interests

Organizations

Artples's activity