Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2310.19773

Video Understanding

MM-VID: Advancing Video Understanding with GPT-4V(ision)

Paper • 2310.19773 • Published Oct 30, 2023 • 19
Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models

Paper • 2310.05863 • Published Oct 9, 2023 • 1
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Paper • 2311.06242 • Published Nov 10, 2023 • 84
I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization

Paper • 2311.10126 • Published Nov 16, 2023 • 7

MM-VID: Advancing Video Understanding with GPT-4V(ision)

Paper • 2310.19773 • Published Oct 30, 2023 • 19

Matryoshka Diffusion Models

Paper • 2310.15111 • Published Oct 23, 2023 • 40
De-Diffusion Makes Text a Strong Cross-Modal Interface

Paper • 2311.00618 • Published Nov 1, 2023 • 21
MM-VID: Advancing Video Understanding with GPT-4V(ision)

Paper • 2310.19773 • Published Oct 30, 2023 • 19
SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding

Paper • 2310.15308 • Published Oct 23, 2023 • 22

MM-VID: Advancing Video Understanding with GPT-4V(ision)

Paper • 2310.19773 • Published Oct 30, 2023 • 19

Vision-Language

SILC: Improving Vision Language Pretraining with Self-Distillation

Paper • 2310.13355 • Published Oct 20, 2023 • 6
Woodpecker: Hallucination Correction for Multimodal Large Language Models

Paper • 2310.16045 • Published Oct 24, 2023 • 14
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Paper • 2201.12086 • Published Jan 28, 2022 • 3
ImageNetVC: Zero-Shot Visual Commonsense Evaluation on 1000 ImageNet Categories

Paper • 2305.15028 • Published May 24, 2023 • 1

MADLAD-400: A Multilingual And Document-Level Large Audited Dataset

Paper • 2309.04662 • Published Sep 9, 2023 • 22
Neurons in Large Language Models: Dead, N-gram, Positional

Paper • 2309.04827 • Published Sep 9, 2023 • 16
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs

Paper • 2309.05516 • Published Sep 11, 2023 • 9
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs

Paper • 2309.03907 • Published May 18, 2023 • 8

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs