[email protected] committed
Commit ebd553d (parent: 2cebb4b)

Update readme

Files changed (1): README.md (+4 -2)
README.md CHANGED
@@ -14,10 +14,12 @@ tags:
 
 ## Model Details
 
-Today (September 17th, 2024), we introduce [NVLM 1.0](https://arxiv.org/abs/2409.11402), a family of frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models (e.g., Llama 3-V 405B and InternVL 2). Remarkably, NVLM 1.0 shows improved text-only performance over its LLM backbone after multimodal training. We are open-sourcing the model weights and code for the community.
+Today (September 17th, 2024), we introduce [NVLM 1.0](https://arxiv.org/abs/2409.11402), a family of frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models (e.g., Llama 3-V 405B and InternVL 2). Remarkably, NVLM 1.0 shows improved text-only performance over its LLM backbone after multimodal training.
+
+In this repo, we are open-sourcing NVLM-1.0-D-72B (decoder-only architecture): the model weights and code for the community.
 
 ## Other Resources
-[Inference Code (HF)](https://huggingface.co/nvidia/NVLM-1.0-D-72B/tree/main)   [Training Code (Coming soon)]()   [Website](https://nvlm-project.github.io/)   [Paper](https://arxiv.org/abs/2409.11402)
+[Inference Code (HF)](https://huggingface.co/nvidia/NVLM-D-72B/tree/main)   [Training Code (Coming soon)]()   [Website](https://nvlm-project.github.io/)   [Paper](https://arxiv.org/abs/2409.11402)
 
 ## Benchmark Results
 We train our model with legacy [Megatron-LM](https://github.com/NVIDIA/Megatron-LM/tree/main/megatron/legacy) and adapt the codebase to Huggingface for model hosting, reproducibility, and inference.
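
For readers landing on this commit: the updated "Inference Code (HF)" link points at the hosted checkpoint, and the README notes the Megatron-LM codebase was adapted to Huggingface for inference. Below is a minimal, hypothetical sketch of loading that checkpoint with the transformers Auto* API. The repo id comes from the updated link above, but the dtype, device_map, and trust_remote_code settings are illustrative assumptions, not taken from this commit; check the model card in the repo for the supported invocation.

```python
# Hypothetical usage sketch (not from this commit): load nvidia/NVLM-D-72B
# for inference via the Hugging Face transformers Auto* classes.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "nvidia/NVLM-D-72B"  # repo id from the updated README link

# trust_remote_code is assumed here because multimodal checkpoints like this
# typically ship custom modeling code in the repo rather than living in core
# transformers; verify against the model card before running.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModel.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # half precision; a 72B model is ~144 GB in bf16
    device_map="auto",           # shard across available GPUs (requires `accelerate`)
    trust_remote_code=True,
).eval()
```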