Alireza Mohammadshahi

alirezamsh

AI & ML interests

AI/NLP (NMT, LLMs)

Recent Activity

updated a Space 15 days ago

alirezamsh/small100

liked a Space about 2 months ago

ArtificialAnalysis/LLM-Performance-Leaderboard

liked a dataset about 2 months ago

openbmb/UltraInteract_pair

Articles

Mergoo: Efficiently Build Your Own MoE LLM

Jun 3

• 41

Orchestration of Experts: The First-Principle Multi-Model System

May 30

• 15

alirezamsh's activity

upvoted 2 papers 2 months ago

Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

Paper • 2403.09629 • Published Mar 14 • 73

PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers

Paper • 2406.12430 • Published Jun 18 • 7

upvoted a collection 2 months ago

Probably function calling datasets

Collection

Created using the https://huggingface.co/spaces/librarian-bots/dataset-column-search-api Space. • 39 items • Updated Jul 17 • 36

upvoted a paper 5 months ago

OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI

Paper • 2406.12753 • Published Jun 18 • 14

upvoted 6 papers 7 months ago

A Careful Examination of Large Language Model Performance on Grade School Arithmetic

Paper • 2405.00332 • Published May 1 • 30

Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30 • 117

Better & Faster Large Language Models via Multi-token Prediction

Paper • 2404.19737 • Published Apr 30 • 73

Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published Apr 29 • 68

AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs

Paper • 2404.16873 • Published Apr 21 • 28

FlowMind: Automatic Workflow Generation with LLMs

Paper • 2404.13050 • Published Mar 17 • 33

upvoted a collection 7 months ago

OpenMath

Collection

A collection of models and datasets introduced in "OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset" • 15 items • Updated Oct 1 • 37

upvoted a paper 7 months ago

Textbooks Are All You Need II: phi-1.5 technical report

Paper • 2309.05463 • Published Sep 11, 2023 • 87

upvoted 2 articles 7 months ago

Article

Synthetic data: save money, time and carbon with open source

Feb 16

• 50

Article

Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models

Mar 20

• 66

upvoted 3 papers 7 months ago

Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks

Paper • 2404.14723 • Published Apr 23 • 10

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding

Paper • 2404.16710 • Published Apr 25 • 74

OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework

Paper • 2404.14619 • Published Apr 22 • 126

upvoted a collection 7 months ago

Top 10% instruction tuning datasets

Collection

Collects datasets with 'instruction' in the name and more than 1 download and in the top 10% for the number of likes • 13 items • Updated Jul 3 • 7

upvoted a paper 7 months ago

Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

Paper • 2306.05685 • Published Jun 9, 2023 • 29

upvoted an article 7 months ago

Article

Mixture of Depth is Vibe

Apr 22

• 44