view article Article MedEmbed: Fine-Tuned Embedding Models for Medical / Clinical IR By abhinand • 21 days ago • 30
Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models Paper • 2409.12139 • Published Sep 18 • 11
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling Paper • 2408.16532 • Published Aug 29 • 47
view article Article Introducing AuraFace: Open-Source Face Recognition and Identity Preservation Models By isidentical • Aug 26 • 35
LongVILA: Scaling Long-Context Visual Language Models for Long Videos Paper • 2408.10188 • Published Aug 19 • 51
view article Article Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging By akjindal53244 • Aug 19 • 73
view article Article Revolutionizing Video Transcription: Unveiling Gemma-2b-it and Langchain in the Era of Transformers By Andyrasika • Mar 12 • 3
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models Paper • 2407.09025 • Published Jul 12 • 128
view article Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Jun 24 • 177
view article Article Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Apr 22 • 78
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22 • 251
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Paper • 2402.14905 • Published Feb 22 • 126
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones Paper • 2312.16862 • Published Dec 28, 2023 • 30
Aligning Large Multimodal Models with Factually Augmented RLHF Paper • 2309.14525 • Published Sep 25, 2023 • 29
Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts Paper • 2307.07218 • Published Jul 14, 2023 • 26