Simple and Scalable Strategies to Continually Pre-train Large Language Models (arXiv:2403.08763, Mar 13, 2024)
Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs (arXiv:2403.20041, Mar 29, 2024)
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models (arXiv:2404.02258, Apr 2, 2024)
Understanding LLMs: A Comprehensive Overview from Training to Inference (arXiv:2401.02038, Jan 4, 2024)
Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory (arXiv:2405.08707, May 14, 2024)
Layer-Condensed KV Cache for Efficient Inference of Large Language Models (arXiv:2405.10637, May 17, 2024)