huggingface/HuggingDiscussions · [FEEDBACK] Daily Papers

Hugging Face org Jun 12

Note that this is not a post for adding new papers; it is for feedback on the Daily Papers community update feature.

How to submit a paper to the Daily Papers, like @akhaliq (AK)?

Submitting is available to paper authors.
Only recent papers (less than 7 days old) can be featured on the Daily.

Then drop the arXiv ID in the form at https://huggingface.co/papers/submit

Add media to the paper (images, videos) when relevant.
You can start a discussion to engage with the community.

Please check out the documentation
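
As a rough illustration of the two submission rules above (an arXiv ID is required, and the paper must be less than 7 days old), here is a minimal Python sketch. The helper names `extract_arxiv_id` and `is_recent` are hypothetical, the regular expression assumes new-style arXiv identifiers such as 2406.01954, and which date the submission form actually uses to compute a paper's age is an assumption, not something this post specifies.

```python
import re
from datetime import date, timedelta
from typing import Optional

# Assumption: new-style arXiv identifiers look like "2406.01954",
# optionally followed by a version suffix such as "v2".
ARXIV_ID = re.compile(r"(\d{4}\.\d{4,5})(v\d+)?")

# "Less than 7d" is the only rule taken from the post itself.
MAX_AGE = timedelta(days=7)


def extract_arxiv_id(text: str) -> Optional[str]:
    """Return the bare arXiv ID found in a URL or string, or None."""
    match = ARXIV_ID.search(text)
    return match.group(1) if match else None


def is_recent(published: date, today: date) -> bool:
    """True if the paper is less than 7 days old on `today`."""
    return timedelta(0) <= today - published < MAX_AGE


if __name__ == "__main__":
    print(extract_arxiv_id("https://arxiv.org/abs/2406.01954"))
    print(is_recent(date(2024, 6, 12), date(2024, 6, 17)))
```

The `published` date has to be supplied by the caller, because an arXiv ID only encodes the year and month of the identifier, not the day the paper became public.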

RollingPig

Jun 17

https://arxiv.org/abs/2406.01954

runninglsy

Jun 18

edited Jun 27

We are excited to share our recent work on MLLM architecture design titled "Ovis: Structural Embedding Alignment for Multimodal Large Language Model".

Paper: https://arxiv.org/abs/2405.20797
Github: https://github.com/AIDC-AI/Ovis
Model: https://huggingface.co/AIDC-AI/Ovis-Clip-Llama3-8B
Data: https://huggingface.co/datasets/AIDC-AI/Ovis-dataset

Yiwen-ntu

Jun 18

This comment has been hidden

kramp

Hugging Face org Jun 18

@Yiwen-ntu for now we support only videos as paper covers in the Daily.

devichand

Jun 20

We are excited to share our work titled "Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models": https://arxiv.org/abs/2406.12644

lmc22

22 days ago

Text4Seg: Reimagining Image Segmentation as Text Generation
Paper: https://arxiv.org/abs/2410.09855
Github: https://github.com/mc-lan/Text4Seg

hhyangcs

21 days ago

Depth Any Video with Scalable Synthetic Data

Depth Any Video introduces a scalable synthetic data pipeline, capturing 40,000 video clips from diverse games, and leverages powerful priors of generative video diffusion models to advance video depth estimation. By incorporating rotary position encoding, flow matching, and a mixed-duration training strategy, it robustly handles varying video lengths and frame rates. Additionally, a novel depth interpolation method enables high-resolution depth inference, achieving superior spatial accuracy and temporal consistency over previous models.

Arxiv link: https://arxiv.org/abs/2410.10815
Project page: https://depthanyvideo.github.io
Code: https://github.com/Nightmare-n/DepthAnyVideo
Huggingface gradio demo: https://huggingface.co/spaces/hhyangcs/depth-any-video

ronniecao

13 days ago

edited 13 days ago

We are excited to share our recently proposed code completion benchmark, "Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion?".

📑 https://arxiv.org/abs/2410.01353
🚀 https://github.com/LingmaTongyi/Codev-Bench
🤗 https://huggingface.co/datasets/TongyiLingma/CodevBench

shallowdream204

12 days ago

📑 https://arxiv.org/abs/2410.18666
🚀 https://github.com/shallowdream204/DreamClear
🤗 https://huggingface.co/shallowdream204/DreamClear

zpschang

5 days ago

Hi AK and HF team,

Our paper https://arxiv.org/abs/2411.00785, titled "IGOR: Image-GOal Representations Are the Atomic Control Units for Foundation Models in Embodied AI", was only made public today after being on hold at arXiv for more than 7 days. However, the Daily Papers submission website shows that it is more than 7 days old. We would appreciate your help in posting the paper on Daily Papers.

ionutmodo

4 days ago

edited 4 days ago

Hi AK and HF team,

I am happy to introduce our MicroAdam optimizer, a low-memory variant of the Adam optimizer with a memory footprint of 0.9d bytes, compared to 2d bytes for AdamW-8bits. We achieve this by storing only 99% sparse gradients and reconstructing the optimizer states at each step, which is fast thanks to our optimized CUDA kernels. MicroAdam was developed mainly with finetuning tasks in mind. Please check out our work:

(Paper) 📑: https://arxiv.org/pdf/2405.15593
(Code) 🚀: https://github.com/IST-DASLab/MicroAdam
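
To make the 0.9d versus 2d comparison above concrete, here is a small arithmetic sketch in Python. It assumes d denotes the number of model parameters; the 7-billion-parameter model size and the function names are hypothetical and chosen only for illustration. Only the two per-parameter byte counts come from the post.

```python
# Bytes of optimizer state per model parameter, as quoted in the post.
MICROADAM_BYTES_PER_PARAM = 0.9
ADAMW_8BIT_BYTES_PER_PARAM = 2.0


def optimizer_state_bytes(num_params: int, bytes_per_param: float) -> float:
    """Optimizer-state footprint in bytes for a model with `num_params` parameters."""
    return num_params * bytes_per_param


def relative_saving(new: float, baseline: float) -> float:
    """Fraction of the baseline footprint that is saved by the new one."""
    return 1.0 - new / baseline


if __name__ == "__main__":
    d = 7_000_000_000  # hypothetical model size, not from the post
    micro = optimizer_state_bytes(d, MICROADAM_BYTES_PER_PARAM)
    adamw8 = optimizer_state_bytes(d, ADAMW_8BIT_BYTES_PER_PARAM)
    print(f"MicroAdam:  {micro / 1e9:.1f} GB")    # 6.3 GB
    print(f"AdamW-8bit: {adamw8 / 1e9:.1f} GB")   # 14.0 GB
    print(f"Saving:     {relative_saving(micro, adamw8):.0%}")  # 55%
```

Under these assumptions the optimizer state shrinks by a bit more than half. Weights, gradients, and activations are not counted here.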

ionutmodo

2 days ago

Hi everyone,

I would like to introduce GridSearcher, a tool we have been developing in our DAS-Lab @ ISTA to speed up the hyper-parameter tuning process. GridSearcher is a pure Python project designed to replace the bash scripts typically used to run grids of parameters for ML projects. It provides a more flexible and user-friendly way to manage and execute multiple programs in parallel. It is designed for systems where users have direct SSH access to machines and can run their Python scripts right away.

Do you access your GPUs via SLURM? No problem: you can run srun --gres=gpu:8 --partition=gpu100 --time=10-00:00:00 --mem=1000G --cpus-per-task=200 --pty bash to request an interactive bash session on a cluster node, which gives you direct access to that node, and then use GridSearcher.

I am sure our project will help you save time. Please check out our code on GitHub:

(Code) 🚀: https://github.com/IST-DASLab/GridSearcher/
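
For readers unfamiliar with the idea of running a grid of parameters in parallel from Python instead of bash, here is a generic sketch using only the standard library. This is not GridSearcher's API: the functions `expand_grid`, `build_command`, and `run_grid` are hypothetical, and the `--name value` flag convention for the launched script is an assumption.

```python
import itertools
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def expand_grid(grid: dict) -> list:
    """Turn {"lr": [0.1, 0.01], "seed": [1, 2]} into one dict per combination."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in itertools.product(*grid.values())]


def build_command(script_args: list, params: dict) -> list:
    """Build a Python command line, passing each parameter as `--name value`."""
    cmd = [sys.executable, *script_args]
    for name, value in params.items():
        cmd += [f"--{name}", str(value)]
    return cmd


def run_grid(script_args: list, grid: dict, max_parallel: int = 2) -> list:
    """Run every combination, at most `max_parallel` at a time.

    Returns a list of (command, return code) pairs in grid order.
    """
    commands = [build_command(script_args, p) for p in expand_grid(grid)]

    def run(cmd):
        return subprocess.run(cmd, capture_output=True, text=True)

    # Threads are enough here: each worker just waits on a child process.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        results = list(pool.map(run, commands))
    return [(cmd, res.returncode) for cmd, res in zip(commands, results)]


if __name__ == "__main__":
    # "train.py" is a placeholder for your own training script.
    for cmd, code in run_grid(["train.py"], {"lr": [0.1, 0.01], "seed": [1, 2]}):
        print(code, " ".join(cmd))
```

A real tool would also handle GPU assignment, logging, and retries, which this sketch leaves out.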