belyakoff PRO

https://cval.ai

fangorntb

AI & ML interests

NLP/NLU

Recent Activity

updated a dataset about 22 hours ago

belyakoff/unsloth_requirements

liked a model 1 day ago

pykale/llama-2-7b-ocr

Organizations

belyakoff's activity

upvoted an article 4 months ago

Article

Llama 3.1 - 405B, 70B & 8B with multilinguality and long context

Jul 23

• 215

upvoted 8 papers 6 months ago

Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

Paper • 2405.21060 • Published May 31 • 63

Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models

Paper • 2405.20541 • Published May 30 • 21

Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts

Paper • 2405.19893 • Published May 30 • 29

Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents

Paper • 2310.19923 • Published Oct 30, 2023 • 13

Multi-Task Contrastive Learning for 8192-Token Bilingual Text Embeddings

Paper • 2402.17016 • Published Feb 26 • 5

Your Transformer is Secretly Linear

Paper • 2405.12250 • Published May 19 • 150

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13 • 67

MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Paper • 2405.12130 • Published May 20 • 46

upvoted 2 articles 6 months ago

Article

🕳️ Attention Sinks in LLMs for endless fluency

Oct 9, 2023

• 8

Article

makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch

May 7

• 39

upvoted an article 7 months ago

Article

Inference for PROs

Sep 22, 2023

• 50

upvoted a paper 9 months ago

Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon

Paper • 2401.03462 • Published Jan 7 • 27