Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent
Abstract
The search for a general model that can operate seamlessly across multiple domains remains a key goal in machine learning research. The prevailing methodology in Reinforcement Learning (RL) typically limits models to a single task within a unimodal framework, a limitation that contrasts with the broader vision of a versatile, multi-domain model. In this paper, we present Jack of All Trades (JAT), a transformer-based model with a unique design optimized for handling sequential decision-making tasks and multimodal data types. The JAT model demonstrates its robust capabilities and versatility by achieving strong performance on very different RL benchmarks, along with promising results on Computer Vision (CV) and Natural Language Processing (NLP) tasks, all using a single set of weights. The JAT model marks a significant step towards more general, cross-domain AI model design, and notably, it is the first model of its kind to be fully open-sourced (see https://huggingface.co/jat-project/jat), including a pioneering general-purpose dataset.
Community
If this gets developed further, it could open new dimensions in research and models. Great job, guys!!
You can probably try (it requires fine-tuning on your data, of course). Feel free to share if you get anything interesting :)