stepfun-ai/GOT-OCR2_0

Image-Text-to-Text

feature-extraction

vision-language


ucaslcl committed on Sep 13

Commit ff68f33 • 1 Parent(s): 215794b

Update README.md

Files changed (1)

README.md +26 -0


	@@ -57,3 +57,29 @@ print(res)
57
58	```
59

 ```
+## More Multimodal Projects
+👏 Explore more multimodal projects from our team:
+[Vary](https://github.com/Ucas-HaoranWei/Vary) | [Fox](https://github.com/ucaslcl/Fox) | [OneChart](https://github.com/LingyvKong/OneChart)
+## Citation
+If you find our work helpful, please consider citing our papers 📝 and liking this project ❤️!
+```bib
+@article{wei2024general,
+  title={General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model},
+  author={Wei, Haoran and Liu, Chenglong and Chen, Jinyue and Wang, Jia and Kong, Lingyu and Xu, Yanming and Ge, Zheng and Zhao, Liang and Sun, Jianjian and Peng, Yuang and others},
+  journal={arXiv preprint arXiv:2409.01704},
+  year={2024}
+}
+@article{wei2023vary,
+  title={Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models},
+  author={Wei, Haoran and Kong, Lingyu and Chen, Jinyue and Zhao, Liang and Ge, Zheng and Yang, Jinrong and Sun, Jianjian and Han, Chunrui and Zhang, Xiangyu},
+  journal={arXiv preprint arXiv:2312.06109},
+  year={2023}
+}
+```