Collections

Collections including paper arxiv:2406.10227

a collection of algorithmic agents for user interfaces/interactions and program synthesis

Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration

Paper • 1802.08802 • Published Feb 24, 2018
Mapping Natural Language Commands to Web Elements

Paper • 1808.09132 • Published Aug 28, 2018
Learning to Navigate the Web

Paper • 1812.09195 • Published Dec 21, 2018
Interactive Task and Concept Learning from Natural Language Instructions and GUI Demonstrations

Paper • 1909.00031 • Published Aug 30, 2019

VideoGUI: A Benchmark for GUI Automation from Instructional Videos

Paper • 2406.10227 • Published Jun 14 • 9
unicamp-dl/SurveySum

Viewer • Updated Sep 2 • 79 • 61 • 1

Vietnamese Dataset

nhuvo/MedEV

Viewer • Updated Mar 29 • 718k • 172 • 5
ontocord/viet4all

Viewer • Updated Apr 27 • 26.3k • 217 • 21
uitnlp/OpenViVQA-dataset

Viewer • Updated Dec 13, 2023 • 11.2k • 185 • 8
uitnlp/vietnamese_students_feedback

Viewer • Updated Oct 13, 2022 • 16.2k • 263 • 13

An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels

Paper • 2406.09415 • Published Jun 13 • 50
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities

Paper • 2406.09406 • Published Jun 13 • 13
VideoGUI: A Benchmark for GUI Automation from Instructional Videos

Paper • 2406.10227 • Published Jun 14 • 9
What If We Recaption Billions of Web Images with LLaMA-3?

Paper • 2406.08478 • Published Jun 12 • 39

BLINK: Multimodal Large Language Models Can See but Not Perceive

Paper • 2404.12390 • Published Apr 18 • 24
SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension

Paper • 2404.16790 • Published Apr 25 • 7
Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots

Paper • 2405.07990 • Published May 13 • 16
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding

Paper • 2406.09411 • Published Jun 13 • 18

NExT-GPT: Any-to-Any Multimodal LLM

Paper • 2309.05519 • Published Sep 11, 2023 • 78
Large Language Model for Science: A Study on P vs. NP

Paper • 2309.05689 • Published Sep 11, 2023 • 20
AstroLLaMA: Towards Specialized Foundation Models in Astronomy

Paper • 2309.06126 • Published Sep 12, 2023 • 16
Large Language Models for Compiler Optimization

Paper • 2309.07062 • Published Sep 11, 2023 • 22

MADLAD-400: A Multilingual And Document-Level Large Audited Dataset

Paper • 2309.04662 • Published Sep 9, 2023 • 22
Neurons in Large Language Models: Dead, N-gram, Positional

Paper • 2309.04827 • Published Sep 9, 2023 • 16
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs

Paper • 2309.05516 • Published Sep 11, 2023 • 9
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs

Paper • 2309.03907 • Published May 18, 2023 • 8
