daily-papers - a tyzhu Collection

updated 2 days ago

RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval

Paper • 2409.10516 • Published about 1 month ago • 34
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse

Paper • 2409.11242 • Published 29 days ago • 5
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models

Paper • 2409.11136 • Published 29 days ago • 21
On the Diagram of Thought

Paper • 2409.10038 • Published about 1 month ago • 11
Video Instruction Tuning With Synthetic Data

Paper • 2410.02713 • Published 13 days ago • 33
Large Language Models as Markov Chains

Paper • 2410.02724 • Published 13 days ago • 28
Contrastive Localized Language-Image Pre-Training

Paper • 2410.02746 • Published 13 days ago • 28
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis

Paper • 2410.02749 • Published 13 days ago • 12
L-CiteEval: Do Long-Context Models Truly Leverage Context for Responding?

Paper • 2410.02115 • Published 14 days ago • 10
Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations

Paper • 2410.02762 • Published 13 days ago • 9
Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models

Paper • 2410.01335 • Published 15 days ago • 5
RATIONALYST: Pre-training Process-Supervision for Improving Reasoning

Paper • 2410.01044 • Published 15 days ago • 34
Not All LLM Reasoners Are Created Equal

Paper • 2410.01748 • Published 14 days ago • 27
Quantifying Generalization Complexity for Large Language Models

Paper • 2410.01769 • Published 14 days ago • 12
InfiniPot: Infinite Context Processing on Memory-Constrained LLMs

Paper • 2410.01518 • Published 14 days ago • 2
Law of the Weakest Link: Cross Capabilities of Large Language Models

Paper • 2409.19951 • Published 17 days ago • 53
Hyper-Connections

Paper • 2409.19606 • Published 18 days ago • 18
Instruction Following without Instruction Tuning

Paper • 2409.14254 • Published 25 days ago • 26
LongGenBench: Long-context Generation Benchmark

Paper • 2410.04199 • Published 11 days ago • 16
Erasing Conceptual Knowledge from Language Models

Paper • 2410.02760 • Published 13 days ago • 12
Differential Transformer

Paper • 2410.05258 • Published 9 days ago • 150
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations

Paper • 2410.02707 • Published 13 days ago • 44
Addition is All You Need for Energy-efficient Language Models

Paper • 2410.00907 • Published 15 days ago • 130
Selective Attention Improves Transformer

Paper • 2410.02703 • Published 13 days ago • 22
Mentor-KD: Making Small Language Models Better Multi-step Reasoners

Paper • 2410.09037 • Published 5 days ago • 4
Rethinking Data Selection at Scale: Random Selection is Almost All You Need

Paper • 2410.09335 • Published 5 days ago • 11
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization

Paper • 2410.08815 • Published 5 days ago • 30
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights

Paper • 2410.09008 • Published 5 days ago • 15
Mechanistic Permutability: Match Features Across Layers

Paper • 2410.07656 • Published 7 days ago • 16
SimpleStrat: Diversifying Language Model Generation with Stratification

Paper • 2410.09038 • Published 5 days ago • 3
PositionID: LLMs can Control Lengths, Copy and Paste with Explicit Positional Awareness

Paper • 2410.07035 • Published 7 days ago • 16