Mert Inan

merterm

http://merterm.github.io/

AI & ML interests

multimodal dialogue, conversational AI, NeuroRoboNLP, sign language processing

Recent Activity

updated a Space 10 days ago

merterm/Learning-Games-Experiment

upvoted a collection 10 days ago

OpenCoder

Organizations

None yet

merterm's activity

upvoted a collection 10 days ago

OpenCoder

Collection

OpenCoder is an open and reproducible code LLM family which matches the performance of top-tier code LLMs. • 9 items • Updated 4 days ago • 70

upvoted a paper 3 months ago

To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20 • 41

upvoted a collection 4 months ago

OLMo Suite

Collection

Artifacts for the first set of OLMo models. • 18 items • Updated 7 days ago • 66

upvoted 2 collections 5 months ago

Core ML Gallery Models

Collection

7 items • Updated Oct 4 • 31

Nemotron 4 340B

Collection

Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated 19 days ago • 158

upvoted a paper 5 months ago

The Prompt Report: A Systematic Survey of Prompting Techniques

Paper • 2406.06608 • Published Jun 6 • 55

upvoted 2 collections 6 months ago

PaliGemma Release

Collection

Pretrained and mix checkpoints for PaliGemma • 16 items • Updated Jul 31 • 137

Granite Code Models

Collection

A series of code models trained by IBM licensed under Apache 2.0 license. We release both the base pretrained and instruct models. • 23 items • Updated 18 days ago • 178

upvoted an article 7 months ago

Releasing Swift Transformers: Run On-Device LLMs in Apple Devices

Article • Published Aug 8, 2023 • 23

upvoted a collection 7 months ago

OpenELM Instruct Models

Collection

4 items • Updated Oct 4 • 113

upvoted 2 papers 8 months ago

StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29 • 136

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Paper • 2403.09611 • Published Mar 14 • 124

upvoted a paper 9 months ago

Flamingo: a Visual Language Model for Few-Shot Learning

Paper • 2204.14198 • Published Apr 29, 2022 • 14

upvoted a collection 9 months ago

Gemma release

Collection

Groups the Gemma models released by the Google team. • 40 items • Updated Jul 31 • 325

upvoted 2 papers 9 months ago

OS-Copilot: Towards Generalist Computer Agents with Self-Improvement

Paper • 2402.07456 • Published Feb 12 • 41

LoRA: Low-Rank Adaptation of Large Language Models

Paper • 2106.09685 • Published Jun 17, 2021 • 30

upvoted 2 papers 10 months ago

Specialized Language Models with Cheap Inference from Limited Domain Data

Paper • 2402.01093 • Published Feb 2 • 45

Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling

Paper • 2401.16380 • Published Jan 29 • 48

upvoted a collection 10 months ago

Mamba

Collection

Mamba SSM Models with hf_integration. • 7 items • Updated Dec 28, 2023 • 7

upvoted a paper over 1 year ago

AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn

Paper • 2306.08640 • Published Jun 14, 2023 • 26