ucyang (Unchun Yang)

upvoted 4 collections 1 day ago

upvoted a collection 2 days ago

Llama3-8B-1.58

Collection

A trio of powerful models: fine-tuned from Llama3-8b-Instruct, with BitNet architecture! • 3 items • Updated 5 days ago • 8

upvoted an article 2 days ago

Article

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

2 days ago • 104

upvoted a paper 2 days ago

Let's Verify Step by Step

Paper • 2305.20050 • Published May 31, 2023 • 8

upvoted a paper 3 days ago

Seed-Music: A Unified Framework for High Quality and Controlled Music Generation

Paper • 2409.09214 • Published 6 days ago • 38

upvoted a paper 4 days ago

Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning

Paper • 2406.12050 • Published Jun 17 • 16

upvoted a collection 7 days ago

DataGemma Release

Collection

A series of pioneering open models that help ground LLMs in real-world data through Data Commons. • 2 items • Updated 8 days ago • 53

upvoted a paper 9 days ago

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published 17 days ago • 72

upvoted a collection 9 days ago

DeepSeek-V2.5

Collection

1 item • Updated 14 days ago • 19

upvoted 11 papers 10 days ago

LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture

Paper • 2409.02889 • Published 16 days ago • 53

MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark

Paper • 2409.02813 • Published 16 days ago • 27

LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models

Paper • 2409.00509 • Published 20 days ago • 38

Kvasir-VQA: A Text-Image Pair GI Tract Dataset

Paper • 2409.01437 • Published 18 days ago • 70

OLMoE: Open Mixture-of-Experts Language Models

Paper • 2409.02060 • Published 17 days ago • 74

Law of Vision Representation in MLLMs

Paper • 2408.16357 • Published 22 days ago • 92

Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

Paper • 2408.15998 • Published 23 days ago • 81

Writing in the Margins: Better Inference Pattern for Long Context Retrieval

Paper • 2408.14906 • Published 24 days ago • 137

Foundation Models for Music: A Survey

Paper • 2408.14340 • Published 25 days ago • 38

Show-o: One Single Transformer to Unify Multimodal Understanding and Generation

Paper • 2408.12528 • Published 29 days ago • 50

Controllable Text Generation for Large Language Models: A Survey

Paper • 2408.12599 • Published 29 days ago • 61

upvoted a collection 11 days ago

TableBench

Collection

TableBench • 8 items • Updated Aug 14 • 2

upvoted a paper 13 days ago

The Future of Open Human Feedback

Paper • 2408.16961 • Published Aug 15 • 19

upvoted a collection 15 days ago

Yi-Coder

Collection

4 items • Updated 16 days ago • 28

upvoted a paper 16 days ago

Robust Speech Recognition via Large-Scale Weak Supervision

Paper • 2212.04356 • Published Dec 6, 2022 • 17

upvoted 2 collections 16 days ago

Whisper Release

Collection

Whisper includes both English-only and multilingual checkpoints for ASR and ST, ranging from 38M params for the tiny models to 1.5B params for large. • 12 items • Updated Sep 13, 2023 • 74

NVEagle

Collection

4 items • Updated 22 days ago • 11

upvoted an article 16 days ago

Article

The 5 Most Under-Rated Tools on Hugging Face

29 days ago • 74

upvoted an article 17 days ago

Article

Fine-Tune Whisper with 🤗 Transformers

Nov 3, 2022 • 85

upvoted a paper 18 days ago

Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models

Paper • 2408.02442 • Published Aug 5 • 17

upvoted a paper 20 days ago

ZeRO: Memory Optimizations Toward Training Trillion Parameter Models

Paper • 1910.02054 • Published Oct 4, 2019 • 4

upvoted a collection 21 days ago

Qwen2-VL

Collection

Vision-language model series based on Qwen2 • 15 items • Updated 2 days ago • 114

upvoted 3 papers 22 days ago

To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published about 1 month ago • 40

TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Paper • 2408.09174 • Published Aug 17 • 51

Segment Anything with Multiple Modalities

Paper • 2408.09085 • Published Aug 17 • 20

upvoted 2 papers 23 days ago

Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents

Paper • 2408.07199 • Published Aug 13 • 19

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published 29 days ago • 109

upvoted 3 papers 24 days ago

LongVILA: Scaling Long-Context Visual Language Models for Long Videos

Paper • 2408.10188 • Published Aug 19 • 51

JPEG-LM: LLMs as Image Generators with Canonical Codec Representations

Paper • 2408.08459 • Published Aug 15 • 44

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

Paper • 2408.08872 • Published Aug 16 • 96

upvoted 6 papers 25 days ago

InfinityMATH: A Scalable Instruction Tuning Dataset in Programmatic Mathematical Reasoning

Paper • 2408.07089 • Published Aug 9 • 12

DeepSpeak Dataset v1.0

Paper • 2408.05366 • Published Aug 9 • 10

Generative Photomontage

Paper • 2408.07116 • Published Aug 13 • 19

LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

Paper • 2408.07055 • Published Aug 13 • 65

ControlNeXt: Powerful and Efficient Control for Image and Video Generation

Paper • 2408.06070 • Published Aug 12 • 52

VITA: Towards Open-Source Interactive Omni Multimodal LLM

Paper • 2408.05211 • Published Aug 9 • 46

upvoted a paper 27 days ago

Sapiens: Foundation for Human Vision Models

Paper • 2408.12569 • Published 29 days ago • 84

upvoted a collection 27 days ago

CLAIR and APO

Collection

Data and Models for the paper "Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment" • 8 items • Updated Aug 14 • 3

upvoted 2 papers 27 days ago

Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment

Paper • 2408.06266 • Published Aug 12 • 9

GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI

Paper • 2408.03361 • Published Aug 6 • 85

upvoted a collection 29 days ago

Jamba-1.5

Collection

The AI21 Jamba family of models are state-of-the-art, hybrid SSM-Transformer instruction following foundation models • 2 items • Updated 29 days ago • 71

upvoted an article about 1 month ago

Article

Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging

Aug 19 • 72

upvoted 2 collections about 1 month ago

Tinyllama-1.1B-v1.1

Collection

3 items • Updated Apr 2 • 4

mEdIT

Collection

Collection of the publicly available mEdIT dataset and instruction-tuned models for multilingual text revision. • 3 items • Updated May 17 • 2

upvoted a paper about 1 month ago

CoEdIT: Text Editing by Task-Specific Instruction Tuning

Paper • 2305.09857 • Published May 17, 2023 • 7

upvoted 2 collections about 1 month ago

CoEdIT

Collection

Collection of the publicly available CoEdIT dataset and instruction-tuned models for text editing. • 6 items • Updated Apr 15 • 6

💻 Local SmolLMs

Collection

SmolLM models in MLC, ONNX and GGUF format for local applications + in-browser demos • 14 items • Updated about 1 month ago • 40

upvoted a paper about 1 month ago

DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search

Paper • 2408.08152 • Published Aug 15 • 51
