LLM - a Dinozorus Collection

LLM

updated 16 days ago

The Unreasonable Ineffectiveness of the Deeper Layers
Paper • 2403.17887 • Published Mar 26 • 78

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
Paper • 2404.02258 • Published Apr 2 • 104

ReFT: Representation Finetuning for Language Models
Paper • 2404.03592 • Published Apr 4 • 89

Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
Paper • 2404.03715 • Published Apr 4 • 60

Better & Faster Large Language Models via Multi-token Prediction
Paper • 2404.19737 • Published Apr 30 • 73

Chameleon: Mixed-Modal Early-Fusion Foundation Models
Paper • 2405.09818 • Published May 16 • 125

MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series
Paper • 2405.19327 • Published May 29 • 45

Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
Paper • 2406.08464 • Published Jun 12 • 65

Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems
Paper • 2407.01370 • Published Jul 1 • 85

Searching for Best Practices in Retrieval-Augmented Generation
Paper • 2407.01219 • Published Jul 1 • 11

DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
Paper • 2309.03883 • Published Sep 7, 2023 • 33

Lynx: An Open Source Hallucination Evaluation Model
Paper • 2407.08488 • Published Jul 11

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Paper • 2408.03314 • Published Aug 6 • 33

Writing in the Margins: Better Inference Pattern for Long Context Retrieval
Paper • 2408.14906 • Published Aug 27 • 138

Human Feedback is not Gold Standard
Paper • 2309.16349 • Published Sep 28, 2023 • 5

Differential Transformer
Paper • 2410.05258 • Published 29 days ago • 165

Were RNNs All We Needed?
Paper • 2410.01201 • Published Oct 2 • 46