🦅 🐍 FalconMamba 7B Collection: the FalconMamba 7B base model, the instruction-tuned version, their 4-bit and GGUF variants, and the demo • 15 items • 27 upvotes
Data Mixture Inference: What do BPE Tokenizers Reveal about their Training Data? • Paper 2407.16607 • Published Jul 23, 2024 • 21 upvotes
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection • Paper 2403.03507 • Published Mar 6, 2024 • 182 upvotes
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits • Paper 2402.17764 • Published Feb 27, 2024 • 602 upvotes
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases • Paper 2402.14905 • Published Feb 22, 2024 • 124 upvotes
StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis • Paper 2401.17093 • Published Jan 30, 2024 • 18 upvotes
AST-T5: Structure-Aware Pretraining for Code Generation and Understanding • Paper 2401.03003 • Published Jan 5, 2024 • 12 upvotes
PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation • Paper 2312.17276 • Published Dec 27, 2023 • 15 upvotes
WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation • Paper 2312.14187 • Published Dec 20, 2023 • 49 upvotes
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention • Paper 2312.07987 • Published Dec 13, 2023 • 40 upvotes