-
BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data
Paper • 2402.08093 • Published • 54 -
E3 TTS: Easy End-to-End Diffusion-based Text to Speech
Paper • 2311.00945 • Published • 14 -
Matcha-TTS: A fast TTS architecture with conditional flow matching
Paper • 2309.03199 • Published • 11 -
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Paper • 1712.05884 • Published • 2
Collections
Discover the best community collections!
Collections including paper arxiv:1712.05884
-
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Paper • 1712.05884 • Published • 2 -
Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization
Paper • 2404.09956 • Published • 11 -
Music Consistency Models
Paper • 2404.13358 • Published • 12 -
FlashSpeech: Efficient Zero-Shot Speech Synthesis
Paper • 2404.14700 • Published • 29
-
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Paper • 1712.05884 • Published • 2 -
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild
Paper • 2403.16973 • Published • 2 -
High Fidelity Neural Audio Compression
Paper • 2210.13438 • Published • 3 -
RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis
Paper • 2404.03204 • Published • 7
-
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
Paper • 1801.03924 • Published • 2 -
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
Paper • 2403.15042 • Published • 25 -
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Paper • 1712.05884 • Published • 2 -
Efficient Memory Management for Large Language Model Serving with PagedAttention
Paper • 2309.06180 • Published • 25
-
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Paper • 2310.00704 • Published • 19 -
Structural Similarities Between Language Models and Neural Response Measurements
Paper • 2306.01930 • Published • 2 -
Streaming Transformer ASR with Blockwise Synchronous Beam Search
Paper • 2006.14941 • Published • 2 -
NU-GAN: High resolution neural upsampling with GAN
Paper • 2010.11362 • Published • 2