-
Textbooks Are All You Need
Paper • 2306.11644 • Published • 142 -
Textbooks Are All You Need II: phi-1.5 technical report
Paper • 2309.05463 • Published • 87 -
TinyStories: How Small Can Language Models Be and Still Speak Coherent English?
Paper • 2305.07759 • Published • 33 -
Scaling Synthetic Data Creation with 1,000,000,000 Personas
Paper • 2406.20094 • Published • 94
Collections
Discover the best community collections!
Collections including paper arxiv:2406.20094
-
Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment
Paper • 2401.12474 • Published • 35 -
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
Paper • 2403.12968 • Published • 24 -
RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models
Paper • 2310.00746 • Published • 1 -
LESS: Selecting Influential Data for Targeted Instruction Tuning
Paper • 2402.04333 • Published • 3
-
SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning
Paper • 2409.05556 • Published • 1 -
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
Paper • 2409.04109 • Published • 43 -
A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor?
Paper • 2409.15277 • Published • 34 -
Learning Task Decomposition to Assist Humans in Competitive Programming
Paper • 2406.04604 • Published • 4
-
SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers
Paper • 2407.09413 • Published • 9 -
Scaling Synthetic Data Creation with 1,000,000,000 Personas
Paper • 2406.20094 • Published • 94 -
Arboretum: A Large Multimodal Dataset Enabling AI for Biodiversity
Paper • 2406.17720 • Published • 7 -
LiveBench: A Challenging, Contamination-Free LLM Benchmark
Paper • 2406.19314 • Published • 19
-
Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs
Paper • 2407.00653 • Published • 11 -
Token Erasure as a Footprint of Implicit Vocabulary Items in LLMs
Paper • 2406.20086 • Published • 4 -
UnUnlearning: Unlearning is not sufficient for content regulation in advanced generative AI
Paper • 2407.00106 • Published • 5 -
MIRAI: Evaluating LLM Agents for Event Forecasting
Paper • 2407.01231 • Published • 16
-
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
Paper • 2406.17557 • Published • 86 -
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs
Paper • 2406.16860 • Published • 57 -
Arboretum: A Large Multimodal Dataset Enabling AI for Biodiversity
Paper • 2406.17720 • Published • 7 -
Scaling Synthetic Data Creation with 1,000,000,000 Personas
Paper • 2406.20094 • Published • 94
-
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
Paper • 2406.11813 • Published • 30 -
From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries
Paper • 2406.12824 • Published • 20 -
Tokenization Falling Short: The Curse of Tokenization
Paper • 2406.11687 • Published • 15 -
Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level
Paper • 2406.11817 • Published • 13
-
RLHF Workflow: From Reward Modeling to Online RLHF
Paper • 2405.07863 • Published • 67 -
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Paper • 2405.09818 • Published • 126 -
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
Paper • 2405.15574 • Published • 53 -
An Introduction to Vision-Language Modeling
Paper • 2405.17247 • Published • 85