A General Theoretical Paradigm to Understand Learning from Human Preferences Paper • 2310.12036 • Published Oct 18, 2023 • 14
sentence-transformers-from-synthetic-data Collection Example of using distilabel to generate synthetic triplets data for fine-tuning a Sentence Transformer model • 4 items • Updated Jun 21 • 21
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published May 2 • 116
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6 • 182
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 61
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Sep 25 • 680
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders Paper • 2404.05961 • Published Apr 9 • 64
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models Paper • 2402.19427 • Published Feb 29 • 52
StateFlow: Enhancing LLM Task-Solving through State-Driven Workflows Paper • 2403.11322 • Published Mar 17 • 1
Improving Text Embeddings with Large Language Models Paper • 2401.00368 • Published Dec 31, 2023 • 79
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting Paper • 2309.04269 • Published Sep 8, 2023 • 32
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning Paper • 2307.08691 • Published Jul 17, 2023 • 8
Meta-Transformer: A Unified Framework for Multimodal Learning Paper • 2307.10802 • Published Jul 20, 2023 • 43
Challenges and Applications of Large Language Models Paper • 2307.10169 • Published Jul 19, 2023 • 47
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness Paper • 2205.14135 • Published May 27, 2022 • 11