罗杰斯

rojasdiego

https://rojasdiego.com

AI & ML interests

LLMs for Code Generation

rojasdiego's activity

upvoted a paper 13 days ago

SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration

Paper • 2410.02367 • Published 14 days ago • 45

upvoted a collection 20 days ago

Llama 3.2

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 11 items • Updated 21 days ago • 370

upvoted 3 papers about 1 month ago

Attention Heads of Large Language Models: A Survey

Paper • 2409.03752 • Published Sep 5 • 86

Granite Code Models: A Family of Open Foundation Models for Code Intelligence

Paper • 2405.04324 • Published May 7 • 21

Scaling Granite Code Models to 128K Context

Paper • 2407.13739 • Published Jul 18 • 19

upvoted a collection about 1 month ago

SOTA Code LLMs

Top LLMs for code. Instruct variants. Fit on single A100. • 5 items • Updated Sep 2 • 1

upvoted 2 collections about 2 months ago

Arctic-embed

A collection of text embedding models optimized for retrieval accuracy and efficiency • 6 items • Updated Jul 18 • 14

MoEs papers reading list

59 items • Updated 6 days ago • 134

upvoted 3 papers about 2 months ago

WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling

Paper • 2408.16532 • Published Aug 29 • 46

CogVLM2: Visual Language Models for Image and Video Understanding

Paper • 2408.16500 • Published Aug 29 • 56

Controllable Text Generation for Large Language Models: A Survey

Paper • 2408.12599 • Published Aug 22 • 61

upvoted 2 papers 4 months ago

BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

Paper • 2406.15877 • Published Jun 22 • 45

ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

Paper • 2406.12793 • Published Jun 18 • 31

upvoted 2 papers 8 months ago

MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs

Paper • 2402.15627 • Published Feb 23 • 34

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 596

upvoted a collection 9 months ago

Stable Code

Suite of developer assistant models • 5 items • Updated Apr 8 • 37

upvoted 2 papers 10 months ago

Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws

Paper • 2401.00448 • Published Dec 31, 2023 • 28

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Paper • 2312.00752 • Published Dec 1, 2023 • 138

upvoted 3 papers 11 months ago

What's In My Big Data?

Paper • 2310.20707 • Published Oct 31, 2023 • 10

Exponentially Faster Language Modelling

Paper • 2311.10770 • Published Nov 15, 2023 • 118

AgentTuning: Enabling Generalized Agent Abilities for LLMs

Paper • 2310.12823 • Published Oct 19, 2023 • 35