- Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion
  Paper • 2310.03502 • Published • 77
- GLIGEN: Open-Set Grounded Text-to-Image Generation
  Paper • 2301.07093 • Published • 3
- Music Consistency Models
  Paper • 2404.13358 • Published • 12
- Align Your Steps: Optimizing Sampling Schedules in Diffusion Models
  Paper • 2404.14507 • Published • 21
Collections
Collections including paper arxiv:2310.03502
- Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion
  Paper • 2310.03502 • Published • 77
- Transferable and Principled Efficiency for Open-Vocabulary Segmentation
  Paper • 2404.07448 • Published • 11
- RegionGPT: Towards Region Understanding Vision Language Model
  Paper • 2403.02330 • Published • 2
- GLIGEN: Open-Set Grounded Text-to-Image Generation
  Paper • 2301.07093 • Published • 3

- Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion
  Paper • 2310.03502 • Published • 77
- Transferable and Principled Efficiency for Open-Vocabulary Segmentation
  Paper • 2404.07448 • Published • 11
- Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models
  Paper • 2404.07973 • Published • 30
- COCONut: Modernizing COCO Segmentation
  Paper • 2404.08639 • Published • 27

- Demystifying CLIP Data
  Paper • 2309.16671 • Published • 20
- Model Stock: All we need is just a few fine-tuned models
  Paper • 2403.19522 • Published • 10
- Bigger is not Always Better: Scaling Properties of Latent Diffusion Models
  Paper • 2404.01367 • Published • 20
- On the Scalability of Diffusion-based Text-to-Image Generation
  Paper • 2404.02883 • Published • 17

- U-Net: Convolutional Networks for Biomedical Image Segmentation
  Paper • 1505.04597 • Published • 7
- Image Segmentation using U-Net Architecture for Powder X-ray Diffraction Images
  Paper • 2310.16186 • Published • 2
- H-DenseUNet: Hybrid Densely Connected UNet for Liver and Tumor Segmentation from CT Volumes
  Paper • 1709.07330 • Published • 2
- Deep LOGISMOS: Deep Learning Graph-based 3D Segmentation of Pancreatic Tumors on CT scans
  Paper • 1801.08599 • Published • 2

- FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation
  Paper • 2403.06775 • Published • 3
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
  Paper • 2010.11929 • Published • 6
- Data Incubation -- Synthesizing Missing Data for Handwriting Recognition
  Paper • 2110.07040 • Published • 2
- A Mixture of Expert Approach for Low-Cost Customization of Deep Neural Networks
  Paper • 1811.00056 • Published • 2

- One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
  Paper • 2306.07967 • Published • 24
- Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
  Paper • 2306.07954 • Published • 113
- TryOnDiffusion: A Tale of Two UNets
  Paper • 2306.08276 • Published • 73
- Seeing the World through Your Eyes
  Paper • 2306.09348 • Published • 33