Efficient Deep Learning: A Comprehensive Overview of Optimization Techniques Article • By Isayoften • Aug 26 • 34
Addition is All You Need for Energy-efficient Language Models Paper • 2410.00907 • Published Oct 1 • 144
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization Paper • 2410.08815 • Published Oct 11 • 42
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights Paper • 2410.09008 • Published Oct 11 • 16
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18 • 218
Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published Sep 19 • 135
DataGemma Release Collection A series of pioneering open models that help ground LLMs in real-world data through Data Commons. • 2 items • Updated Sep 12 • 78
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published Jun 20 • 85
To Code, or Not To Code? Exploring Impact of Code in Pre-training Paper • 2408.10914 • Published Aug 20 • 41
Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models Article • Mar 20 • 66
The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism Paper • 2407.10457 • Published Jul 15 • 22
Q-Sparse: All Large Language Models can be Fully Sparsely-Activated Paper • 2407.10969 • Published Jul 15 • 20
Learning to Refuse: Towards Mitigating Privacy Risks in LLMs Paper • 2407.10058 • Published Jul 14 • 29
Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published Apr 30 • 73