update
README.md
CHANGED
@@ -30,12 +30,6 @@ inference: false
 - **First generalist model to support grounding in Chinese**: Detects bounding boxes from open-domain language expressions in both Chinese and English.
 - **Fine-grained recognition and understanding**: Compared to the 224 resolution currently used by other open-source LVLMs, the 448 resolution improves fine-grained text recognition, document QA, and bounding box annotation.

-<br>
-<p align="center">
-    <img src="assets/demo_vl.gif" width="400"/>
-<p>
-<br>
-
 We release two models of the Qwen-VL series:
 - Qwen-VL: The pre-trained LVLM uses Qwen-7B to initialize the LLM and [Openclip ViT-bigG](https://github.com/mlfoundations/open_clip) to initialize the visual encoder, and connects them with a randomly initialized cross-attention layer. Qwen-VL was trained on about 1.5B image-text pairs. The final image input resolution is 448.
 - Qwen-VL-Chat: A multimodal LLM-based AI assistant trained with alignment techniques.
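The README text in this change contrasts a 224 input resolution with the 448 resolution used by Qwen-VL. As a rough illustration of why that matters for fine-grained tasks, the sketch below counts the image patches a ViT-style encoder would see at each resolution. The patch size of 14 is an assumption (it is not stated in the README), and `patch_tokens` is a hypothetical helper, not part of any Qwen-VL API.

```python
def patch_tokens(resolution: int, patch_size: int = 14) -> int:
    """Count non-overlapping square patches in a square image.

    patch_size=14 is an assumed value for illustration only.
    """
    if resolution % patch_size != 0:
        raise ValueError("resolution must be a multiple of patch_size")
    per_side = resolution // patch_size
    return per_side * per_side


low = patch_tokens(224)   # patches at the 224 resolution
high = patch_tokens(448)  # patches at the 448 resolution
print(low, high, high // low)
```

Under this assumption, doubling the side length quadruples the number of patches, so small text and tight bounding boxes are covered by more visual tokens.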