TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization Paper • 2402.13249 • Published Feb 20 • 10
The FinBen: An Holistic Financial Benchmark for Large Language Models Paper • 2402.12659 • Published Feb 20 • 16
Instruction-tuned Language Models are Better Knowledge Learners Paper • 2402.12847 • Published Feb 20 • 24
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models Paper • 2402.13064 • Published Feb 20 • 46
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper • 2402.13753 • Published Feb 21 • 111
User-LLM: Efficient LLM Contextualization with User Embeddings Paper • 2402.13598 • Published Feb 21 • 18
Ouroboros: Speculative Decoding with Large Model Enhanced Drafting Paper • 2402.13720 • Published Feb 21 • 5
FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models Paper • 2402.10986 • Published Feb 16 • 76
Speculative Streaming: Fast LLM Inference without Auxiliary Models Paper • 2402.11131 • Published Feb 16 • 41
AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling Paper • 2402.12226 • Published Feb 19 • 40
LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration Paper • 2402.11550 • Published Feb 18 • 15
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements Paper • 2402.10963 • Published Feb 13 • 9
Vision-Flan: Scaling Human-Labeled Tasks in Visual Instruction Tuning Paper • 2402.11690 • Published Feb 18 • 7
Linear Transformers with Learnable Kernel Functions are Better In-Context Models Paper • 2402.10644 • Published Feb 16 • 78
In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss Paper • 2402.10790 • Published Feb 16 • 40
SPAR: Personalized Content-Based Recommendation via Long Engagement Attention Paper • 2402.10555 • Published Feb 16 • 32
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows Paper • 2402.10379 • Published Feb 16 • 29
LAVE: LLM-Powered Agent Assistance and Language Augmentation for Video Editing Paper • 2402.10294 • Published Feb 15 • 22
LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models Paper • 2402.10524 • Published Feb 16 • 21
Large Language Models as Zero-shot Dialogue State Tracker through Function Calling Paper • 2402.10466 • Published Feb 16 • 16
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement Paper • 2402.14658 • Published Feb 22 • 82
TinyLLaVA: A Framework of Small-scale Large Multimodal Models Paper • 2402.14289 • Published Feb 22 • 19
Copilot Evaluation Harness: Evaluating LLM-Guided Software Programming Paper • 2402.14261 • Published Feb 22 • 10