chansung (chansung park)

reacted to their post with 🤗 7 days ago

Post

1596

🎙️ Listen to the audio "Podcast" of every single Hugging Face Daily Papers.

Now, "AI Paper Reviewer" project can automatically generates audio podcasts on any papers published on arXiv, and this is integrated into the GitHub Action pipeline. I sounds pretty similar to hashtag#NotebookLM in my opinion.

🎙️ Try out yourself at https://deep-diver.github.io/ai-paper-reviewer/

This audio podcast is powered by Google technologies: 1) Google DeepMind Gemini 1.5 Flash model to generate scripts of a podcast, then 2) Google Cloud Vertex AI's Text to Speech model to synthesize the voice turning the scripts into the natural sounding voices (with latest addition of "Journey" voice style)

"AI Paper Reviewer" is also an open source project. Anyone can use it to build and own a personal blog on any papers of your interests. Hence, checkout the project repository below if you are interested in!
: https://github.com/deep-diver/paper-reviewer

This project is going to support other models including open weights soon for both text-based content generation and voice synthesis for the podcast. The only reason I chose Gemini model is that it offers a "free-tier" which is enough to shape up this projects with non-realtime batch generations. I'm excited to see how others will use this tool to explore the world of AI research, hence feel free to share your feedback and suggestions!

1 reply

·

posted an update 7 days ago

Post

1596

🎙️ Listen to the audio "Podcast" of every single Hugging Face Daily Papers.

Now, "AI Paper Reviewer" project can automatically generates audio podcasts on any papers published on arXiv, and this is integrated into the GitHub Action pipeline. I sounds pretty similar to hashtag#NotebookLM in my opinion.

🎙️ Try out yourself at https://deep-diver.github.io/ai-paper-reviewer/

This audio podcast is powered by Google technologies: 1) Google DeepMind Gemini 1.5 Flash model to generate scripts of a podcast, then 2) Google Cloud Vertex AI's Text to Speech model to synthesize the voice turning the scripts into the natural sounding voices (with latest addition of "Journey" voice style)

"AI Paper Reviewer" is also an open source project. Anyone can use it to build and own a personal blog on any papers of your interests. Hence, checkout the project repository below if you are interested in!
: https://github.com/deep-diver/paper-reviewer

This project is going to support other models including open weights soon for both text-based content generation and voice synthesis for the podcast. The only reason I chose Gemini model is that it offers a "free-tier" which is enough to shape up this projects with non-realtime batch generations. I'm excited to see how others will use this tool to explore the world of AI research, hence feel free to share your feedback and suggestions!

1 reply

·

reacted to their post with 👍 15 days ago

Post

4450

Effortlessly stay up-to-date with AI research trends using a new AI tool, "AI Paper Reviewer" !!

It analyzes a list of Hugging Face Daily Papers(w/ @akhaliq ) and turn them into insightful blog posts. This project leverages Gemini models (1.5 Pro, 1.5 Flash, and 1.5 Flash-8B) for content generation and Upstage Document Parse for parsing the layout and contents.
blog link: https://deep-diver.github.io/ai-paper-reviewer/

Also, here is the link of GitHub repository for parsing and generating pipeline. By using this, you can easily build your own GitHub static pages based on any arXiv papers with your own interest!
: https://github.com/deep-diver/paper-reviewer

posted an update 15 days ago

Post

4450

Effortlessly stay up-to-date with AI research trends using a new AI tool, "AI Paper Reviewer" !!

It analyzes a list of Hugging Face Daily Papers(w/ @akhaliq ) and turn them into insightful blog posts. This project leverages Gemini models (1.5 Pro, 1.5 Flash, and 1.5 Flash-8B) for content generation and Upstage Document Parse for parsing the layout and contents.
blog link: https://deep-diver.github.io/ai-paper-reviewer/

Also, here is the link of GitHub repository for parsing and generating pipeline. By using this, you can easily build your own GitHub static pages based on any arXiv papers with your own interest!
: https://github.com/deep-diver/paper-reviewer

reacted to m-ric's post with 👀 7 months ago

Post

2776

💰❌ 𝐑𝐞𝐬𝐞𝐚𝐫𝐜𝐡 𝐟𝐨𝐫 𝐭𝐡𝐞 𝐯𝐞𝐫𝐲 𝐆𝐏𝐔 𝐏𝐨𝐨𝐫 - 𝐒𝐜𝐚𝐥𝐢𝐧𝐠 𝐥𝐚𝐰𝐬 𝐫𝐞𝐩𝐥𝐢𝐜𝐚𝐭𝐢𝐨𝐧

🎆 Good news: 𝘆𝗼𝘂 𝗰𝗮𝗻 𝗱𝗼 𝗰𝘂𝘁𝘁𝗶𝗻𝗴-𝗲𝗱𝗴𝗲 𝗿𝗲𝘀𝗲𝗮𝗿𝗰𝗵 𝘄𝗶𝘁𝗵 𝗮 𝗰𝗮𝗹𝗰𝘂𝗹𝗮𝘁𝗼𝗿 𝗮𝗻𝗱 𝗠𝗶𝗰𝗿𝗼𝘀𝗼𝗳𝘁 𝗣𝗮𝗶𝗻𝘁 𝟮𝟬𝟬𝟲!

The Chinchilla experiments (by Google DeepMind) ran hundreds of pre-trainings with models >1B parameters (I do not want to imagine how much that cost) to 𝗳𝗶𝗻𝗱 𝘁𝗵𝗲 𝗼𝗽𝘁𝗶𝗺𝗮𝗹 𝗿𝗮𝘁𝗶𝗼 𝗼𝗳 𝗺𝗼𝗱𝗲𝗹 𝘀𝗶𝘇𝗲 𝘃𝘀 𝘁𝗿𝗮𝗶𝗻𝗶𝗻𝗴 𝘁𝗼𝗸𝗲𝗻𝘀. Why is this question so important?
Well, you only ever have access to a fixed compute, counted in FLOPs (floating point operations). So if your model is bigger, you will have less compute to train on many tokens, and if you want to train on more tokens, your model will be smaller. When model trainings cost million, you absolutely need to get this right.

The new paper "Chinchilla Scaling: A replication attempt" by Epoch AI sets on on the ambitious goal of reproducing this.

But since the authors do not have infinite money, they decided to directly run their computations from DeepMind's own experiments! They took the figure from the last experiment (cf slide below), measured point positions, picked color codes, and ended up reconstructing the underlying data.

💥 They then just fit the scaling laws proposed by the Chinchilla Authors, but arrived at wildly different results! They find that as a rough rule of thumb, you should use 20 training tokens for each parameter in your model, instead of the 70 obtained in the original paper. They also point out inconsistencies in the paper, and unrealistically narrow confidence intervals.

➡️ This only contradicts the results from the last (out of 3) experiments in the Chinchilla paper. And the model trained at the end of the Chinchilla paper still seems properly scaled.

✅ But it does show that a tiny bit more theoretical work can go a long way, especially given the huge financial costs that such an error can have!

reacted to their post with 👍🔥 7 months ago

Post

4006

🦙🦙 LLaMA Duo project update

Last time, I gave a brief introduction about LLaMA Duo project with @sayakpaul . It is a simple toolset to aligning sLLM with service LLM with coverage dataset 👉🏻 (https://huggingface.co/posts/chansung/708646454991943).
- coverage dataset is what we believe to be the most important/desired (instruction, response) pairs. In system thinking, each instruction could be an analogy of a function from traditional programming. We make unit tests and measure the coverage % for all the features/functions. Similarly, we need to ensure if our fine-tuned model could handle what % of given instructions from coverage dataset satisfactory (hence coverage dataset).

We have tested it with "Coding" category of data from HuggingFaceH4/no_robots dataset. It has about 300 SFT training data points under Coding category. After fine-tuning Gemma 7B model on that, the result was very poor. LLaMA Duo's evaluation tool gave < 20% of metrics in similarity and preciseness on the test split.

So, we used LLaMA Duo's synthetic data generation tool to generate 60k data points that looks similar to the original dataset. We first created ~10k synthetic data points, then created 50k more based on the synthetic dataset itself.

After fine-tuning Gemma 7B on the 60k synthetic dataset, the evaluation result went up to 80~90% high. Also, when testing out the model in UI, it tends to give good responses.

It is a good showcase to transition from service LLM to sLLM or having a backup sLLM for service LLM failure scenarios. I am going to expand this experiments on all categories of no_robots dataset. It will roughly generate > 100k data points.

Here are some links:
- LLaMA Duo project repo: https://github.com/deep-diver/llamaduo
- 60k Coding synthetic dataset: chansung/merged_ds_coding
- Fine-tuned Gemma 7B model: chansung/coding_llamaduo_60k_v0.2

posted an update 7 months ago

Post

4006

🦙🦙 LLaMA Duo project update

Last time, I gave a brief introduction about LLaMA Duo project with @sayakpaul . It is a simple toolset to aligning sLLM with service LLM with coverage dataset 👉🏻 (https://huggingface.co/posts/chansung/708646454991943).
- coverage dataset is what we believe to be the most important/desired (instruction, response) pairs. In system thinking, each instruction could be an analogy of a function from traditional programming. We make unit tests and measure the coverage % for all the features/functions. Similarly, we need to ensure if our fine-tuned model could handle what % of given instructions from coverage dataset satisfactory (hence coverage dataset).

We have tested it with "Coding" category of data from HuggingFaceH4/no_robots dataset. It has about 300 SFT training data points under Coding category. After fine-tuning Gemma 7B model on that, the result was very poor. LLaMA Duo's evaluation tool gave < 20% of metrics in similarity and preciseness on the test split.

So, we used LLaMA Duo's synthetic data generation tool to generate 60k data points that looks similar to the original dataset. We first created ~10k synthetic data points, then created 50k more based on the synthetic dataset itself.

After fine-tuning Gemma 7B on the 60k synthetic dataset, the evaluation result went up to 80~90% high. Also, when testing out the model in UI, it tends to give good responses.

It is a good showcase to transition from service LLM to sLLM or having a backup sLLM for service LLM failure scenarios. I am going to expand this experiments on all categories of no_robots dataset. It will roughly generate > 100k data points.

Here are some links:
- LLaMA Duo project repo: https://github.com/deep-diver/llamaduo
- 60k Coding synthetic dataset: chansung/merged_ds_coding
- Fine-tuned Gemma 7B model: chansung/coding_llamaduo_60k_v0.2

reacted to their post with 🤗 7 months ago

Post

4389

💻 Smoothing the Transition from Service LLM to Local LLM

Imagine your go-to LLM service is down, or you need to use it offline – yikes! This project is all about having that "Plan B" ready to go. Here's LLaMA Duo I've been building with @sayakpaul :

✨ Fine-tune a smaller LLM: We used Hugging Face's alignment-handbook to teach a smaller-sized LLM to mimic my favorite large language model. Think of it as that super-smart AI assistant getting a capable understudy.

🤖 Batch Inference: Let's get that fine-tuned LLM working! My scripts generate lots of text like a champ, and we've made sure things run smoothly even with bigger workloads.

🧐 Evaluation: How well is my small LLM doing? We integrated with the Gemini API to use it as an expert judge – it compares my model's work to the original. Talk about a tough critic!

🪄 Synthetic Data Generation: Need to boost that model's performance? Using Gemini's feedback, we can create even more training data, custom-made to make the LLM better.

🧱 Building Blocks: This isn't just a one-time thing – it's a toolkit for all kinds of LLMOps work. Want to change your evaluation metrics? Bring in models trained differently? Absolutely, let's make it happen.

Why this project is awesome:

💪 Reliability: Keep things running no matter what happens to your main LLM source.
🔒 Privacy: Process sensitive information on your own terms.
🗺️ Offline capable: No internet connection? No problem!
🕰️ Version Control: Lock in your favorite LLM's behavior, even if the service model changes.

We'm excited to share the code on GitHub. Curious to see what you all think! 👉🏻 https://github.com/deep-diver/llamaduo

posted an update 7 months ago

Post

4389

💻 Smoothing the Transition from Service LLM to Local LLM

Imagine your go-to LLM service is down, or you need to use it offline – yikes! This project is all about having that "Plan B" ready to go. Here's LLaMA Duo I've been building with @sayakpaul :

✨ Fine-tune a smaller LLM: We used Hugging Face's alignment-handbook to teach a smaller-sized LLM to mimic my favorite large language model. Think of it as that super-smart AI assistant getting a capable understudy.

🤖 Batch Inference: Let's get that fine-tuned LLM working! My scripts generate lots of text like a champ, and we've made sure things run smoothly even with bigger workloads.

🧐 Evaluation: How well is my small LLM doing? We integrated with the Gemini API to use it as an expert judge – it compares my model's work to the original. Talk about a tough critic!

🪄 Synthetic Data Generation: Need to boost that model's performance? Using Gemini's feedback, we can create even more training data, custom-made to make the LLM better.

🧱 Building Blocks: This isn't just a one-time thing – it's a toolkit for all kinds of LLMOps work. Want to change your evaluation metrics? Bring in models trained differently? Absolutely, let's make it happen.

Why this project is awesome:

💪 Reliability: Keep things running no matter what happens to your main LLM source.
🔒 Privacy: Process sensitive information on your own terms.
🗺️ Offline capable: No internet connection? No problem!
🕰️ Version Control: Lock in your favorite LLM's behavior, even if the service model changes.

We'm excited to share the code on GitHub. Curious to see what you all think! 👉🏻 https://github.com/deep-diver/llamaduo

replied to their post 8 months ago

awesome! going to add one more env var to switch mode then :)

reacted to their post with 👍 8 months ago

Post

2536

Realize LLM powered idea on Hugging Face Space.

I made Space for you to duplicate, then it comes with Gradio and LLM served by Hugging Face's efficient Text Generation Inference(TGI) framework packed into a single machine.

It provides a sample app code snippet with gr.ChatInterface. However, it is not limited to chat usage, but you can leverage the efficiency of TGI for any sort of apps built in Gradio.

Have you ever enjoyed playing with Hugging Chat? Then, you will enjoy writing your own idea with this. Because both are built on top of TGI!

Focus on your app code, and go beyond chat!

chansung/gradio_together_tgi

2 replies

·

posted an update 8 months ago

Post

2536

Realize LLM powered idea on Hugging Face Space.

I made Space for you to duplicate, then it comes with Gradio and LLM served by Hugging Face's efficient Text Generation Inference(TGI) framework packed into a single machine.

It provides a sample app code snippet with gr.ChatInterface. However, it is not limited to chat usage, but you can leverage the efficiency of TGI for any sort of apps built in Gradio.

Have you ever enjoyed playing with Hugging Chat? Then, you will enjoy writing your own idea with this. Because both are built on top of TGI!

Focus on your app code, and go beyond chat!

chansung/gradio_together_tgi

2 replies

·

reacted to sayakpaul's post with ❤️ 8 months ago

Post

1909

How about engaging in a creative chat with your favorite video character? 💬

@chansung and I worked on a weekend project combining the benefits of Gemini 1.0 and powerful chat models like Zephyr to demo this.

We use Gemini 1.0 to produce the personality traits of any character found in an input video. We then prepare a system prompt with the discovered traits to start chatting with an LLM (Zephyr in this case).

Managing a video captioning model is a little out of our expertise, hence Gemini FTW here 😶‍🌫️

👨‍💻 Code: https://github.com/deep-diver/Vid2Persona
🤗 Demo: chansung/vid2persona

reacted to their post with 🤗 8 months ago

Post

🎥 🤾 Vid2Persona: talk to person from video clip

A fun project over the last week with @sayakpaul . It has a simple pipeline from extracting traits of video characters to chatting with them.

Under the hood, this project leverages the power of both commercial and open source models. We used Google's Gemini 1.0 Pro Vision model to understand the video content directly, then we used HuggingFaceH4/zephyr-7b-beta model to make conversation!

Try it Hugging Face Space and let us know what you think.
: chansung/vid2persona

The space application is a dedicated implementation for ZeroGPU environment + Hugging Face Inference API with PRO account. If you wish to host it on your own environment, consider duplicate the space or run locally with the project repository
: https://github.com/deep-diver/Vid2Persona

posted an update 8 months ago

Post

🎥 🤾 Vid2Persona: talk to person from video clip

A fun project over the last week with @sayakpaul . It has a simple pipeline from extracting traits of video characters to chatting with them.

Under the hood, this project leverages the power of both commercial and open source models. We used Google's Gemini 1.0 Pro Vision model to understand the video content directly, then we used HuggingFaceH4/zephyr-7b-beta model to make conversation!

Try it Hugging Face Space and let us know what you think.
: chansung/vid2persona

The space application is a dedicated implementation for ZeroGPU environment + Hugging Face Inference API with PRO account. If you wish to host it on your own environment, consider duplicate the space or run locally with the project repository
: https://github.com/deep-diver/Vid2Persona

reacted to their post with 👍 9 months ago

Post

Updating PaperQA Gradio app and Hugging Face Space.
: Link ➡️ chansung/paper_qa
: Standalone repo ➡️ https://github.com/deep-diver/paperqa-ui

The final goal is to let ppl have their own paper archive. At the end, You will be able to easily *clone* on local or Hugging Face Space with Google's Gemini API Key (which is free), Hugging Face Access Token. You can just drop arXiv IDs at the bottom, then all the auto analyze papers are automatically archived on Hugging Face Dataset repo.

Here are few updates included, and dig in the source code if you want similar features for your use cases!
🖥️ making complex UI + fully responsive
+ making UI as quickly as possible (avoid server-client when possible)
💬 Permanent Chat history management with in-browser local storage
+ Chat history management *per* paper
+ Chat history management in lazy mode (too many paper, impossible to create chat history for every single paper beforehand, hence)

Current plan is to support Gemini and any open source models on Hugging Face PRO account, but will expand it to GPT4 soon.

Any suggestion on this project is welcome! possibly,
- hooking up RAG system (open models' context length is small)
- hooking up Internet search system
- image/figure analysis
....

posted an update 9 months ago

Post

Updating PaperQA Gradio app and Hugging Face Space.
: Link ➡️ chansung/paper_qa
: Standalone repo ➡️ https://github.com/deep-diver/paperqa-ui

The final goal is to let ppl have their own paper archive. At the end, You will be able to easily *clone* on local or Hugging Face Space with Google's Gemini API Key (which is free), Hugging Face Access Token. You can just drop arXiv IDs at the bottom, then all the auto analyze papers are automatically archived on Hugging Face Dataset repo.

Here are few updates included, and dig in the source code if you want similar features for your use cases!
🖥️ making complex UI + fully responsive
+ making UI as quickly as possible (avoid server-client when possible)
💬 Permanent Chat history management with in-browser local storage
+ Chat history management *per* paper
+ Chat history management in lazy mode (too many paper, impossible to create chat history for every single paper beforehand, hence)

Current plan is to support Gemini and any open source models on Hugging Face PRO account, but will expand it to GPT4 soon.

Any suggestion on this project is welcome! possibly,
- hooking up RAG system (open models' context length is small)
- hooking up Internet search system
- image/figure analysis
....

reacted to ArthurZ's post with 🤝 9 months ago

Post

mamba is now available in transformers. Thanks to @tridao and @albertgu for this brilliant model! 🚀 and the amazing mamba-ssm kernels powering this!
Checkout the collection here:
state-spaces/transformers-compatible-mamba-65e7b40ab87e5297e45ae406

5 replies

·

reacted to vikhyatk's post with ❤️ 9 months ago

Post

Just released moondream2 - a small 1.8B parameter vision language model. Now fully open source (Apache 2.0) so you can use it without restrictions on commercial use!

vikhyatk/moondream2

8 replies

·

chansung park PRO

AI & ML interests

Recent Activity

Articles

dstack to manage clusters of on-prem servers for AI workloads with ease

dstack: Your LLM Launchpad - From Fine-Tuning to Serving, Simplified

Deploying 🤗 ViT on Vertex AI

Deploying 🤗 ViT on Kubernetes with TF Serving

Organizations

chansung's activity