Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2408.09085

LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3 • 31
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Paper • 2409.11406 • Published Sep 17 • 25
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27 • 121
Segment Anything with Multiple Modalities

Paper • 2408.09085 • Published Aug 17 • 21

Segment Anything with Multiple Modalities

Paper • 2408.09085 • Published Aug 17 • 21
weihao1115/wmmf_sam_on_hsi

Mask Generation • Updated Sep 9 • 2
weihao1115/wmmf_sam_on_mslidar

Mask Generation • Updated Sep 9 • 2
weihao1115/wmmf_sam_on_sar

Mask Generation • Updated Sep 9 • 2

An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels

Paper • 2406.09415 • Published Jun 13 • 50
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities

Paper • 2406.09406 • Published Jun 13 • 13
VideoGUI: A Benchmark for GUI Automation from Instructional Videos

Paper • 2406.10227 • Published Jun 14 • 9
What If We Recaption Billions of Web Images with LLaMA-3?

Paper • 2406.08478 • Published Jun 12 • 39

MotionLLM: Understanding Human Behaviors from Human Motions and Videos

Paper • 2405.20340 • Published May 30 • 19
Spectrally Pruned Gaussian Fields with Neural Compensation

Paper • 2405.00676 • Published May 1 • 8
Paint by Inpaint: Learning to Add Image Objects by Removing Them First

Paper • 2404.18212 • Published Apr 28 • 27
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report

Paper • 2405.00732 • Published Apr 29 • 118

iVideoGPT: Interactive VideoGPTs are Scalable World Models

Paper • 2405.15223 • Published May 24 • 12
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models

Paper • 2405.15574 • Published May 24 • 53
An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27 • 85
Matryoshka Multimodal Models

Paper • 2405.17430 • Published May 27 • 30

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs