AugmentedLearning - a samaffolter Collection

Updated Jan 1

What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning

Paper • 2312.15685 • Published Dec 25, 2023 • 17
mistralai/Mixtral-8x7B-Instruct-v0.1

Text Generation • Updated Aug 19 • 877k • 4.18k
microsoft/phi-2

Text Generation • Updated Apr 29 • 263k • 3.24k
TinyLlama/TinyLlama-1.1B-Chat-v1.0

Text Generation • Updated Mar 17 • 1.15M • 1.09k
Are Emergent Abilities in Large Language Models just In-Context Learning?

Paper • 2309.01809 • Published Sep 4, 2023 • 3
Commonsense Knowledge Transfer for Pre-trained Language Models

Paper • 2306.02388 • Published Jun 4, 2023 • 1
Schema-learning and rebinding as mechanisms of in-context learning and emergence

Paper • 2307.01201 • Published Jun 16, 2023 • 2
Finding Neurons in a Haystack: Case Studies with Sparse Probing

Paper • 2305.01610 • Published May 2, 2023 • 2
Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference

Paper • 2308.12066 • Published Aug 23, 2023 • 4
Experts Weights Averaging: A New General Training Scheme for Vision Transformers

Paper • 2308.06093 • Published Aug 11, 2023 • 2
Multi-Head Adapter Routing for Cross-Task Generalization

Paper • 2211.03831 • Published Nov 7, 2022 • 2
Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception

Paper • 2305.06324 • Published May 10, 2023 • 1
Multimodal Foundation Models: From Specialists to General-Purpose Assistants

Paper • 2309.10020 • Published Sep 18, 2023 • 40
MIMIC-IT: Multi-Modal In-Context Instruction Tuning

Paper • 2306.05425 • Published Jun 8, 2023 • 11
Evaluation and Mitigation of Agnosia in Multimodal Large Language Models

Paper • 2309.04041 • Published Sep 7, 2023 • 1
From Sparse to Soft Mixtures of Experts

Paper • 2308.00951 • Published Aug 2, 2023 • 20