Gabriele Sarti

gsarti

https://gsarti.com

AI & ML interests

Interpretability for generative language models

gsarti's activity

upvoted a collection 9 days ago

SmolLM2

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 8 items • Updated 5 days ago • 160

upvoted a paper 10 days ago

The Geometry of Concepts: Sparse Autoencoder Feature Structure

Paper • 2410.19750 • Published about 1 month ago • 1

upvoted 2 papers 11 days ago

Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders

Paper • 2410.20526 • Published 13 days ago • 1

Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics

Paper • 2410.21272 • Published 12 days ago • 1

upvoted a paper 13 days ago

Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering

Paper • 2410.15999 • Published 19 days ago • 17

upvoted 3 papers 18 days ago

Automatically Interpreting Millions of Features in Large Language Models

Paper • 2410.13928 • Published 23 days ago • 1

Decomposing The Dark Matter of Sparse Autoencoders

Paper • 2410.14670 • Published 22 days ago • 1

How Do Multilingual Models Remember? Investigating Multilingual Factual Recall Mechanisms

Paper • 2410.14387 • Published 22 days ago • 1

upvoted 4 papers about 1 month ago

Towards Interpreting Visual Information Processing in Vision-Language Models

Paper • 2410.07149 • Published Oct 9 • 1

What Matters for Model Merging at Scale?

Paper • 2410.03617 • Published Oct 4 • 8

Geometric Signatures of Compositionality Across a Language Model's Lifetime

Paper • 2410.01444 • Published Oct 2 • 1

Instruction Following without Instruction Tuning

Paper • 2409.14254 • Published Sep 21 • 27

upvoted a collection about 2 months ago

ITA-Bench: Italian Benchmarks for LLMs

A collection of Italian benchmarks for Large Language Models. See also our Github repo: https://github.com/SapienzaNLP/ita-bench • 19 items • Updated Sep 23 • 6

upvoted a paper about 2 months ago

A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders

Paper • 2409.14507 • Published Sep 22 • 1

upvoted an article 2 months ago

Selective fine-tuning of Language Models with Spectrum

Article • Sep 3 • 29

upvoted 5 papers 3 months ago

Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models

Paper • 2408.06663 • Published Aug 13 • 15

Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2

Paper • 2408.05147 • Published Aug 9 • 37

Trans-Tokenization and Cross-lingual Vocabulary Transfers: Language Adaptation of LLMs for Low-Resource NLP

Paper • 2408.04303 • Published Aug 8 • 9

Transformer Explainer: Interactive Learning of Text-Generative Models

Paper • 2408.04619 • Published Aug 8 • 154

The Quest for the Right Mediator: A History, Survey, and Theoretical Grounding of Causal Interpretability

Paper • 2408.01416 • Published Aug 2 • 1