# Model Card
Veagle is a multimodal AI model that integrates visual and textual data. It uses an architecture that pairs an image encoder with a language model,
enabling it to interpret mixed image-and-text inputs. Veagle's training incorporates dataset enhancements that yield strong accuracy on
visual question answering and related tasks, demonstrating an effective handling of multimodal interactions and positioning it as a notable advancement in the field.
Further details about Veagle can be found in our release blog post.
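A common way to connect an image encoder to a language model is to project visual features into the language model's embedding space and prepend them to the text tokens. The sketch below is illustrative only; the module names, dimensions, and projection design are assumptions for exposition, not Veagle's actual implementation:

```python
import torch
import torch.nn as nn

class VisionLanguageBridge(nn.Module):
    """Illustrative sketch of the generic image-encoder + language-model pattern
    described above. This is NOT the actual Veagle architecture."""

    def __init__(self, vision_dim: int = 1024, lm_dim: int = 4096):
        super().__init__()
        # Projects image-encoder features into the language model's embedding space.
        self.projector = nn.Linear(vision_dim, lm_dim)

    def forward(self, image_features: torch.Tensor, text_embeds: torch.Tensor) -> torch.Tensor:
        # image_features: (batch, num_patches, vision_dim) from the image encoder
        # text_embeds:    (batch, seq_len, lm_dim) token embeddings of the text prompt
        visual_tokens = self.projector(image_features)
        # Concatenate visual tokens ahead of the text so the language model
        # attends over both modalities as a single sequence.
        return torch.cat([visual_tokens, text_embeds], dim=1)
```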
## Key Contributions
- Veagle shows strong performance in visual question answering, outperforming existing models in this domain. The paper presents detailed comparisons and metrics that highlight Veagle's capabilities.
- Despite using a smaller, more optimized dataset, Veagle achieves high accuracy and efficiency, demonstrating effective learning from limited data.
- All fine-tuning data for Veagle comes from open-source models, showcasing its ability to learn and adapt without relying on larger, state-of-the-art models like GPT-3.5 or GPT-4.
## Training
- Trained by: SuperAGI Team
- Hardware: 6 x NVIDIA H100 SXM (80 GB)
- Base language model: Mistral 7B
- Duration of fine-tuning: 4 hours
- Number of epochs: 1
- Batch size: 16
- Learning rate: 2e-5
- Warmup ratio: 0.1
- Optimizer: AdamW
- Scheduler: Cosine
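The hyperparameters listed above map naturally onto a Hugging Face `TrainingArguments` configuration. The snippet below is a minimal sketch of that mapping; the output directory, optimizer backend, and precision flag are assumptions, and the team's actual training scripts are not part of this card:

```python
from transformers import TrainingArguments

# Illustrative mapping of the hyperparameters above; not the actual training setup.
training_args = TrainingArguments(
    output_dir="./veagle-finetune",   # hypothetical path
    num_train_epochs=1,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    warmup_ratio=0.1,
    optim="adamw_torch",              # AdamW
    lr_scheduler_type="cosine",
    bf16=True,                        # assumption: bf16 is typical on H100 hardware
)
```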
## Example Prompt
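The exact prompt template is described in the release blog post; a generic visual question-answering prompt of the kind Veagle is designed for might look like the following (the image is supplied alongside the text, and the wording and format here are illustrative, not the official template):

```python
# Illustrative only: the official prompt template and inference API are defined
# in the Veagle release, not by this card.
prompt = "Question: What is the person in the image holding? Answer:"
```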
## Evaluation
![Evaluation results](https://cdn-uploads.huggingface.co/production/uploads/65a8fe900dba6b99a0164a47/bBBFaYI6maW_DKci9nl6L.jpeg)
## The SuperAGI team