quyettv (quyettv)

upvoted 4 papers about 1 month ago

upvoted 2 papers about 2 months ago

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Paper • 2311.06242 • Published Nov 10, 2023 • 77

Longhorn: State Space Models are Amortized Online Learners

Paper • 2407.14207 • Published Jul 19 • 16

upvoted 2 papers 2 months ago

Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems

Paper • 2407.01370 • Published Jul 1 • 84

AgentInstruct: Toward Generative Teaching with Agentic Flows

Paper • 2407.03502 • Published Jul 3 • 43

upvoted an article 3 months ago

Article

Welcome Gemma 2 - Google's new open LLM

Jun 27

• 115

upvoted 4 papers 3 months ago

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Paper • 2406.17557 • Published Jun 25 • 84

Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs

Paper • 2406.15927 • Published Jun 22 • 13

LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs

Paper • 2406.15319 • Published Jun 21 • 60

An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels

Paper • 2406.09415 • Published Jun 13 • 50

upvoted an article 4 months ago

Article

Uncensor any LLM with abliteration

Jun 13

• 312

upvoted 4 papers 4 months ago

ReFT: Representation Finetuning for Language Models

Paper • 2404.03592 • Published Apr 4 • 86

ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models

Paper • 2405.15738 • Published May 24 • 43

LoRA Learns Less and Forgets Less

Paper • 2405.09673 • Published May 15 • 86

MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Paper • 2405.12130 • Published May 20 • 45

upvoted 5 papers 5 months ago

KAN: Kolmogorov-Arnold Networks

Paper • 2404.19756 • Published Apr 30 • 108

Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding

Paper • 2404.16710 • Published Apr 25 • 57

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published Apr 22 • 250

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 590

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6 • 182

upvoted 3 papers 6 months ago

Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction

Paper • 2404.02905 • Published Apr 3 • 63

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2 • 103

MoAI: Mixture of All Intelligence for Large Language and Vision Models

Paper • 2403.07508 • Published Mar 12 • 75

quyettv's activity

MambaByte: Token-free Selective State Space Model

LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery

Jamba: A Hybrid Transformer-Mamba Language Model