Motoki Wu

tokestermw

AI & ML interests

None yet

Recent Activity

liked a Space 3 days ago

Qwen/Qwen2.5-Turbo-1M-Demo

liked a model 5 days ago

voyageai/voyage-multimodal-3

liked a dataset 5 days ago

microsoft/orca-agentinstruct-1M-v1

Organizations

tokestermw's activity

upvoted a collection 21 days ago

MobileLLM

Collection

Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 8 items • Updated 15 days ago • 95

upvoted a collection 27 days ago

C4AI Aya Expanse

Collection

Aya Expanse is an open-weight research release of a model with highly advanced multilingual capabilities. • 3 items • Updated 28 days ago • 26

upvoted a paper 27 days ago

Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs

Paper • 2404.05719 • Published Apr 8 • 80

upvoted an article about 2 months ago

Article

Our Transformers Code Agent beats the GAIA benchmark!

Jul 1

• 46

upvoted a paper about 2 months ago

Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation

Paper • 2409.12941 • Published Sep 19 • 22

upvoted a collection about 2 months ago

Moshi v0.1 Release

Collection

MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18 • 218

upvoted an article about 2 months ago

Article

Document Similarity Search with ColPali

Sep 21

• 47

upvoted 4 papers 2 months ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19 • 135

OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs

Paper • 2409.05152 • Published Sep 8 • 30

How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data

Paper • 2409.03810 • Published Sep 5 • 30

Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning

Paper • 2402.10110 • Published Feb 15 • 3

upvoted 4 papers 3 months ago

upvoted 2 articles 3 months ago

Article

Improving Hugging Face Training Efficiency Through Packing with Flash Attention

Aug 21

• 22

Article

Perspectives for first principles prompt engineering

Aug 18

• 16

upvoted a paper 3 months ago

BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts

Paper • 2408.08274 • Published Aug 15 • 12

upvoted an article 3 months ago

Article

Tool Use, Unified

Aug 12

• 64

upvoted a paper 3 months ago

ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities

Paper • 2408.04682 • Published Aug 8 • 14