Edd

Erland

AI & ML interests

None yet

Recent Activity

reacted to danielhanchen's post with 🔥 about 1 hour ago

upvoted a collection about 4 hours ago

Vision/multimodal Models

updated a model 2 days ago

Erland/model-gguf-mistral

Organizations

None yet

Erland's activity

upvoted a collection about 4 hours ago

Vision/multimodal Models

Collection of the most popular vision models including Llama 3.2, LLaVA, Qwen2 VL, Pixtral, PaliGemma and more! • 22 items • Updated about 10 hours ago • 3

upvoted a collection about 1 month ago

LLM Reasoning Papers

Papers to improve reasoning capabilities of LLMs • 15 items • Updated 19 days ago • 76

upvoted 2 papers 2 months ago

Gated Slot Attention for Efficient Linear-Time Sequence Modeling

Paper • 2409.07146 • Published Sep 11 • 19

Attention Heads of Large Language Models: A Survey

Paper • 2409.03752 • Published Sep 5 • 88

upvoted an article 3 months ago

Article

A failed experiment: Infini-Attention, and why we should keep trying?

Aug 14

• 50

upvoted an article 6 months ago

Article

Indexify: Bringing HuggingFace Models to Real-Time Pipelines for Production Applications

May 31

• 7

upvoted a collection 6 months ago

Blackhole

A black hole with lots of high-quality, multilingual dialogue datasets in many fields, which make it easier to train LLMs with SFT and DPO methods. • 32 items • Updated Aug 18 • 6

upvoted a paper 6 months ago

LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report

Paper • 2405.00732 • Published Apr 29 • 118

upvoted 3 papers 8 months ago

Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs

Paper • 2403.20041 • Published Mar 29 • 34

BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences

Paper • 2403.09347 • Published Mar 14 • 20

LocalMamba: Visual State Space Model with Windowed Selective Scan

Paper • 2403.09338 • Published Mar 14 • 7

upvoted a collection 10 months ago

Model Merging

Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12 • 217

upvoted a paper 11 months ago

Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision

Paper • 2312.09390 • Published Dec 14, 2023 • 32

upvoted 3 papers about 1 year ago

Prompt Cache: Modular Attention Reuse for Low-Latency Inference

Paper • 2311.04934 • Published Nov 7, 2023 • 28

Safe RLHF: Safe Reinforcement Learning from Human Feedback

Paper • 2310.12773 • Published Oct 19, 2023 • 28

Vision Transformers Need Registers

Paper • 2309.16588 • Published Sep 28, 2023 • 77

upvoted 4 papers over 1 year ago

OctoPack: Instruction Tuning Code Large Language Models

Paper • 2308.07124 • Published Aug 14, 2023 • 28

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Paper • 2307.15217 • Published Jul 27, 2023 • 36

On the Origin of LLMs: An Evolutionary Tree and Graph for 15,821 Large Language Models

Paper • 2307.09793 • Published Jul 19, 2023 • 46

Extending Context Window of Large Language Models via Positional Interpolation

Paper • 2306.15595 • Published Jun 27, 2023 • 53