- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 143
- Orion-14B: Open-source Multilingual Large Language Models
  Paper • 2401.12246 • Published • 11
- MambaByte: Token-free Selective State Space Model
  Paper • 2401.13660 • Published • 50
- MM-LLMs: Recent Advances in MultiModal Large Language Models
  Paper • 2401.13601 • Published • 44
Collections including paper arxiv:2404.04167
- LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
  Paper • 2409.02897 • Published • 44
- Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
  Paper • 2404.04167 • Published • 12
- MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series
  Paper • 2405.19327 • Published • 46

- DeViDe: Faceted medical knowledge for improved medical vision-language pre-training
  Paper • 2404.03618 • Published • 2
- Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
  Paper • 2404.04167 • Published • 12
- SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification
  Paper • 2305.09781 • Published • 4
- McEval: Massively Multilingual Code Evaluation
  Paper • 2406.07436 • Published • 39

- Freditor: High-Fidelity and Transferable NeRF Editing by Frequency Decomposition
  Paper • 2404.02514 • Published • 9
- Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
  Paper • 2404.04167 • Published • 12
- Length Generalization of Causal Transformers without Position Encoding
  Paper • 2404.12224 • Published • 1
- Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B
  Paper • 2406.07394 • Published • 22

- LLM-ABR: Designing Adaptive Bitrate Algorithms via Large Language Models
  Paper • 2404.01617 • Published • 6
- Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
  Paper • 2404.02905 • Published • 64
- Learning From Mistakes Makes LLM Better Reasoner
  Paper • 2310.20689 • Published • 28
- Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
  Paper • 2404.04167 • Published • 12

- Long-context LLMs Struggle with Long In-context Learning
  Paper • 2404.02060 • Published • 35
- Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks
  Paper • 2211.12588 • Published • 3
- StructLM: Towards Building Generalist Models for Structured Knowledge Grounding
  Paper • 2402.16671 • Published • 26
- Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
  Paper • 2404.04167 • Published • 12

- RakutenAI-7B: Extending Large Language Models for Japanese
  Paper • 2403.15484 • Published • 12
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer
  Paper • 2401.01055 • Published • 54
- Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
  Paper • 2404.04167 • Published • 12
- abhinand/malayalam-llama-7b-instruct-v0.1
  Text Generation • Updated • 959 • 11