maximousblk's Collections
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
Paper • 2312.15166 • Published • 56
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
Paper • 2312.12456 • Published • 41
Cached Transformers: Improving Transformers with Differentiable Memory Cache
Paper • 2312.12742 • Published • 12
Mini-GPTs: Efficient Large Language Models through Contextual Pruning
Paper • 2312.12682 • Published • 8
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
Paper • 2312.11514 • Published • 258
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
Paper • 2312.07987 • Published • 40
Distributed Inference and Fine-tuning of Large Language Models Over The Internet
Paper • 2312.08361 • Published • 25
COLMAP-Free 3D Gaussian Splatting
Paper • 2312.07504 • Published • 10
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Paper • 2312.00752 • Published • 138
SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering
Paper • 2311.12775 • Published • 28
Exponentially Faster Language Modelling
Paper • 2311.10770 • Published • 118
Orca 2: Teaching Small Language Models How to Reason
Paper • 2311.11045 • Published • 70
EdgeMoE: Fast On-Device Inference of MoE-based Large Language Models
Paper • 2308.14352 • Published
Scaling up GANs for Text-to-Image Synthesis
Paper • 2303.05511 • Published • 4
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
Paper • 2312.16862 • Published • 30
One-dimensional Adapter to Rule Them All: Concepts, Diffusion Models and Erasing Applications
Paper • 2312.16145 • Published • 8
Mistral 7B
Paper • 2310.06825 • Published • 47
Generative AI for Math: Part I -- MathPile: A Billion-Token-Scale Pretraining Corpus for Math
Paper • 2312.17120 • Published • 25
Paper • 2312.17244 • Published • 9
Unsupervised Universal Image Segmentation
Paper • 2312.17243 • Published • 19
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Paper • 2401.01325 • Published • 26
Mixtral of Experts
Paper • 2401.04088 • Published • 157
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts
Paper • 2401.04081 • Published • 70
OpenChat: Advancing Open-source Language Models with Mixed-Quality Data
Paper • 2309.11235 • Published • 16
Llama 2: Open Foundation and Fine-Tuned Chat Models
Paper • 2307.09288 • Published • 242
BlackMamba: Mixture of Experts for State-Space Models
Paper • 2402.01771 • Published • 23
Paper • 2402.13144 • Published • 94
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 602
Gemma: Open Models Based on Gemini Research and Technology
Paper • 2403.08295 • Published • 47
GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting
Paper • 2403.08551 • Published • 8
Jamba: A Hybrid Transformer-Mamba Language Model
Paper • 2403.19887 • Published • 104
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
Paper • 2404.02258 • Published • 104
Rho-1: Not All Tokens Are What You Need
Paper • 2404.07965 • Published • 84
Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models
Paper • 2404.07973 • Published • 30
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Paper • 2404.07143 • Published • 103
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
Paper • 2402.19427 • Published • 52
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Paper • 2404.14219 • Published • 251
PowerInfer-2: Fast Large Language Model Inference on a Smartphone
Paper • 2406.06282 • Published • 36
Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters
Paper • 2406.05955 • Published • 22