AudioLLM - a zerozeyi Collection

zerozeyi 's Collections

LLM

3D

AudioLLM

updated Jul 29

GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities

Paper • 2406.11768 • Published Jun 17 • 20
Investigating Decoder-only Large Language Models for Speech-to-text Translation

Paper • 2407.03169 • Published Jul 3 • 9
PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation

Paper • 2407.02869 • Published Jul 3 • 18
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs

Paper • 2407.04051 • Published Jul 4 • 35
Stable Audio Open

Paper • 2407.14358 • Published Jul 19 • 23