- UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Paper • 2310.00704 • Published • 19
- Structural Similarities Between Language Models and Neural Response Measurements
Paper • 2306.01930 • Published • 2
- Streaming Transformer ASR with Blockwise Synchronous Beam Search
Paper • 2006.14941 • Published • 2
- NU-GAN: High resolution neural upsampling with GAN
Paper • 2010.11362 • Published • 2
Collections including paper arxiv:2310.00704
- Music ControlNet: Multiple Time-varying Controls for Music Generation
Paper • 2311.07069 • Published • 43
- FLAP: Fast Language-Audio Pre-training
Paper • 2311.01615 • Published • 16
- MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models
Paper • 2310.11954 • Published • 24
- MusicLDM: Enhancing Novelty in Text-to-Music Generation Using Beat-Synchronous Mixup Strategies
Paper • 2308.01546 • Published • 17
- Large-Scale Automatic Audiobook Creation
Paper • 2309.03926 • Published • 53
- UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Paper • 2310.00704 • Published • 19
- Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts
Paper • 2309.11977 • Published • 2
- SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models
Paper • 2308.16692 • Published • 1
- Loop Copilot: Conducting AI Ensembles for Music Generation and Iterative Editing
Paper • 2310.12404 • Published • 15
- MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models
Paper • 2310.11954 • Published • 24
- A Survey of AI Music Generation Tools and Models
Paper • 2308.12982 • Published • 1
- UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Paper • 2310.00704 • Published • 19
- Woodpecker: Hallucination Correction for Multimodal Large Language Models
Paper • 2310.16045 • Published • 14
- HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Paper • 2310.14566 • Published • 25
- SILC: Improving Vision Language Pretraining with Self-Distillation
Paper • 2310.13355 • Published • 6
- Conditional Diffusion Distillation
Paper • 2310.01407 • Published • 20
- NExT-GPT: Any-to-Any Multimodal LLM
Paper • 2309.05519 • Published • 78
- Large Language Model for Science: A Study on P vs. NP
Paper • 2309.05689 • Published • 20
- AstroLLaMA: Towards Specialized Foundation Models in Astronomy
Paper • 2309.06126 • Published • 16
- Large Language Models for Compiler Optimization
Paper • 2309.07062 • Published • 22
- Attention Is All You Need
Paper • 1706.03762 • Published • 44
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Paper • 2005.11401 • Published • 12
- LoRA: Low-Rank Adaptation of Large Language Models
Paper • 2106.09685 • Published • 30
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Paper • 2205.14135 • Published • 11
- OmnimatteRF: Robust Omnimatte with 3D Background Modeling
Paper • 2309.07749 • Published • 7
- AudioSR: Versatile Audio Super-resolution at Scale
Paper • 2309.07314 • Published • 24
- Generative Image Dynamics
Paper • 2309.07906 • Published • 52
- MagiCapture: High-Resolution Multi-Concept Portrait Customization
Paper • 2309.06895 • Published • 27
- Natural Language Supervision for General-Purpose Audio Representations
Paper • 2309.05767 • Published • 9
- AudioSR: Versatile Audio Super-resolution at Scale
Paper • 2309.07314 • Published • 24
- FoleyGen: Visually-Guided Audio Generation
Paper • 2309.10537 • Published • 8
- Toward Joint Language Modeling for Speech Units and Text
Paper • 2310.08715 • Published • 7
- MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Paper • 2309.04662 • Published • 22
- Neurons in Large Language Models: Dead, N-gram, Positional
Paper • 2309.04827 • Published • 16
- Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
Paper • 2309.05516 • Published • 9
- DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
Paper • 2309.03907 • Published • 8