Dmitry Ryumin

DmitryRyumin

https://dmitryryumin.github.io

AI & ML interests

Machine Learning and Applications, Multi-Modal Understanding

Recent Activity

liked a Space about 6 hours ago

modelscope/modelscope-studio-beta

updated a Space about 14 hours ago

DmitryRyumin/NewEraAI-Papers

reacted to TuringsSolutions's post with 🔥 9 days ago

DmitryRyumin's activity

upvoted 2 papers about 1 month ago

FAN: Fourier Analysis Networks

Paper • 2410.02675 • Published Oct 3 • 24

Differential Transformer

Paper • 2410.05258 • Published Oct 7 • 166

upvoted 3 papers about 2 months ago

MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages

Paper • 2410.01036 • Published Oct 1 • 14

HeadGAP: Few-shot 3D Head Avatar via Generalizable Gaussian Priors

Paper • 2408.06019 • Published Aug 12 • 13

Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction

Paper • 2409.18124 • Published Sep 26 • 31

upvoted a collection about 2 months ago

Llama 3.2

Meta's new Llama 3.2 vision and text models including 1B, 3B, 11B and 90B. Includes GGUF, 4-bit bnb and original versions. • 20 items • Updated about 13 hours ago • 39

upvoted 3 articles about 2 months ago

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Article • Sep 18 • 202

Exploring the Daily Papers Page on Hugging Face

Article • Sep 23 • 39

XetHub is joining Hugging Face!

Article • Aug 8 • 80

upvoted a collection 2 months ago

Qwen2.5

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Sep 18 • 370

upvoted 5 papers 3 months ago

OLMoE: Open Mixture-of-Experts Language Models

Paper • 2409.02060 • Published Sep 3 • 77

ReMamba: Equip Mamba with Effective Long-Sequence Modeling

Paper • 2408.15496 • Published Aug 28 • 10

The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Paper • 2408.15237 • Published Aug 27 • 37

The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design

Paper • 2408.12503 • Published Aug 22 • 23

Controllable Text Generation for Large Language Models: A Survey

Paper • 2408.12599 • Published Aug 22 • 62

upvoted a collection 3 months ago

Jamba-1.5

The AI21 Jamba family of models are state-of-the-art, hybrid SSM-Transformer instruction following foundation models • 2 items • Updated Aug 22 • 81

upvoted an article 3 months ago

Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging

Article • Aug 19 • 73

upvoted a paper 3 months ago

Transformer Language Models without Positional Encodings Still Learn Positional Information

Paper • 2203.16634 • Published Mar 30, 2022 • 5

upvoted a collection 3 months ago

Qwen2-Audio

Audio-language model series based on Qwen2 • 4 items • Updated Sep 18 • 45

upvoted a paper 3 months ago

Qwen2-Audio Technical Report

Paper • 2407.10759 • Published Jul 15 • 55