qingshan777
commited on
Commit
•
95e76c4
1
Parent(s):
004b6f1
Update README.md
Browse files
README.md
CHANGED
@@ -17,9 +17,9 @@ pipeline_tag: visual-question-answering
|
|
17 |
|
18 |
**[IAA: Inner-Adaptor Architecture Empowers Frozen Large Language Model with Multimodal Capabilities](https://www.arxiv.org/abs/2408.12902)**
|
19 |
|
20 |
-
|
21 |
Bin Wang*, Chunyu Xie*, Dawei Leng†, Yuhui Yin(*Equal Contribution, ✝Corresponding Author)
|
22 |
-
|
23 |
[![arXiv](https://img.shields.io/badge/arXiv-2408.12902-b31b1b.svg)](https://www.arxiv.org/abs/2408.12902)
|
24 |
|
25 |
We propose a MLLM based on Inner-Adaptor Architecture (IAA). IAA demonstrates that training with a frozen language model can surpass the models with fine-tuned LLMs in both multimodal comprehension and visual grounding tasks. Moreover, after deployment, our approach incorporates multiple workflows, thereby preserving the NLP proficiency of the language model. With a single download, the model can be finetuned to cater to various task specifications. Enjoy the seamless experience of utilizing our IAA model.
|
|
|
17 |
|
18 |
**[IAA: Inner-Adaptor Architecture Empowers Frozen Large Language Model with Multimodal Capabilities](https://www.arxiv.org/abs/2408.12902)**
|
19 |
|
20 |
+
|
21 |
Bin Wang*, Chunyu Xie*, Dawei Leng†, Yuhui Yin(*Equal Contribution, ✝Corresponding Author)
|
22 |
+
|
23 |
[![arXiv](https://img.shields.io/badge/arXiv-2408.12902-b31b1b.svg)](https://www.arxiv.org/abs/2408.12902)
|
24 |
|
25 |
We propose a MLLM based on Inner-Adaptor Architecture (IAA). IAA demonstrates that training with a frozen language model can surpass the models with fine-tuned LLMs in both multimodal comprehension and visual grounding tasks. Moreover, after deployment, our approach incorporates multiple workflows, thereby preserving the NLP proficiency of the language model. With a single download, the model can be finetuned to cater to various task specifications. Enjoy the seamless experience of utilizing our IAA model.
|