LLM Reasoning Papers Collection Papers to improve reasoning capabilities of LLMs • 10 items • Updated 1 day ago • 19
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 7 items • Updated about 11 hours ago • 31
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated 1 day ago • 128
Apollo: Band-sequence Modeling for High-Quality Audio Restoration Paper • 2409.08514 • Published 7 days ago • 5
Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection Paper • 2409.08513 • Published 7 days ago • 8
DrawingSpinUp: 3D Animation from Single Character Drawings Paper • 2409.08615 • Published 7 days ago • 10
A Diffusion Approach to Radiance Field Relighting using Multi-Illumination Synthesis Paper • 2409.08947 • Published 6 days ago • 11
InstantDrag: Improving Interactivity in Drag-based Image Editing Paper • 2409.08857 • Published 6 days ago • 24
Policy Filtration in RLHF to Fine-Tune LLM for Code Generation Paper • 2409.06957 • Published 9 days ago • 5
Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models Paper • 2409.06277 • Published 10 days ago • 12
One missing piece in Vision and Language: A Survey on Comics Understanding Paper • 2409.09502 • Published 5 days ago • 23
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval Paper • 2409.10516 • Published 3 days ago • 26
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation Paper • 2409.09214 • Published 6 days ago • 36
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning Paper • 2409.12183 • Published 1 day ago • 14
Preference Tuning with Human Feedback on Language, Speech, and Vision Tasks: A Survey Paper • 2409.11564 • Published 2 days ago • 11
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution Paper • 2409.12191 • Published 1 day ago • 38
DataGemma Release Collection A series of pioneering open models that help ground LLMs in real-world data through Data Commons. • 2 items • Updated 7 days ago • 53
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think Paper • 2409.11355 • Published 2 days ago • 23
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion Paper • 2409.11406 • Published 2 days ago • 19
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 27 items • Updated 1 day ago • 460
WebInstruct 🌐 Embeddings 🧱 Models Collection A collection of SoTA embedding models fine-tuned on the WebInstruct dataset to learn to pair instructions with their responses • 3 items • Updated 15 days ago • 11
Awesome Document AI Collection A collection of open-source document AI 📄 📝 📈 • 27 items • Updated Mar 11 • 65
Qihoo-T2X: An Efficiency-Focused Diffusion Transformer via Proxy Tokens for Text-to-Any-Task Paper • 2409.04005 • Published 14 days ago • 16
Evaluating Multiview Object Consistency in Humans and Image Models Paper • 2409.05862 • Published 10 days ago • 8
Robot Utility Models: General Policies for Zero-Shot Deployment in New Environments Paper • 2409.05865 • Published 10 days ago • 14
POINTS: Improving Your Vision-language Model with Affordable Strategies Paper • 2409.04828 • Published 12 days ago • 21
MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery Paper • 2409.05591 • Published 10 days ago • 24
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance Paper • 2409.04593 • Published 13 days ago • 19
OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs Paper • 2409.05152 • Published 11 days ago • 27
MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct Paper • 2409.05840 • Published 10 days ago • 43
Towards a Unified View of Preference Learning for Large Language Models: A Survey Paper • 2409.02795 • Published 15 days ago • 70
SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation Paper • 2409.06633 • Published 9 days ago • 14
Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis Paper • 2409.06135 • Published 10 days ago • 14
LLaMA-Omni: Seamless Speech Interaction with Large Language Models Paper • 2409.06666 • Published 9 days ago • 51
INTRA: Interaction Relationship-aware Weakly Supervised Affordance Grounding Paper • 2409.06210 • Published 10 days ago • 24
GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering Paper • 2409.06595 • Published 9 days ago • 37
Roleplay, Creative Writing, Uncensored, NSFW Collection Newest models at bottom of the page. Please review the org model card for special instruction(s) / Template(s) / Tips for use. • 248 items • Updated about 20 hours ago • 34
Minitron 4B Derivative Collection These models are tuned over a healed Minitron Width Base 4B model. These models should perform near the level of Llama 2 7B for RP. • 10 items • Updated about 21 hours ago • 5
Granite Guardian Collection A collection of models created by IBM for safeguarding language models. • 2 items • Updated 10 days ago • 5
Physics of Language Models: Part 1, Context-Free Grammar Paper • 2305.13673 • Published May 23, 2023 • 6
Llama-3.1 Quantization Collection Neural Magic quantized Llama-3.1 models • 21 items • Updated 8 days ago • 32
Robust Speech Recognition via Large-Scale Weak Supervision Paper • 2212.04356 • Published Dec 6, 2022 • 17
Whisper Release Collection Whisper includes both English-only and multilingual checkpoints for ASR and ST, ranging from 39M params for the tiny models to 1.5B params for large. • 12 items • Updated Sep 13, 2023 • 74
Theia Collection Distilling Diverse Vision Foundation Models for Robot Learning • 4 items • Updated Jul 29 • 9
Quantized-Llama Collection Quantized Llama models in 2-, 4-, and 8-bit versions • 5 items • Updated 1 day ago • 3