This repository contains AWS Inferentia2 and neuronx-compatible checkpoints for Mistral-Large-Instruct. Detailed information about the base model is available on its Model Card.
This model has been exported to the Neuron format using the specific input_shapes and compiler parameters listed below.
SEQUENCE_LENGTH = 4096
BATCH_SIZE = 4
NUM_CORES = 24
PRECISION = "bf16"
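
Because the export fixes these values at compile time, it can help to check a request against them before sending it to the model. The following is a minimal sketch, not part of this repository: `CompiledShapes` and `fits_compiled_shapes` are hypothetical names, and the rule that the longest prompt plus the newly generated tokens must fit within SEQUENCE_LENGTH is an assumption about how the sequence budget is shared.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CompiledShapes:
    """Hypothetical container mirroring the export parameters listed above."""
    sequence_length: int = 4096
    batch_size: int = 4
    num_cores: int = 24
    precision: str = "bf16"


def fits_compiled_shapes(
    shapes: CompiledShapes,
    prompt_token_counts: Sequence[int],
    max_new_tokens: int,
) -> bool:
    """Return True if a batch of prompts fits the compiled shapes.

    Assumption: prompt tokens and generated tokens share one budget of
    `sequence_length` tokens, and a batch may hold at most `batch_size` prompts.
    """
    if not prompt_token_counts or len(prompt_token_counts) > shapes.batch_size:
        return False
    if max_new_tokens < 0 or min(prompt_token_counts) < 0:
        return False
    longest_prompt = max(prompt_token_counts)
    return longest_prompt + max_new_tokens <= shapes.sequence_length


SHAPES = CompiledShapes()

if __name__ == "__main__":
    # Two prompts of 1000 and 2000 tokens, generating up to 512 new tokens each.
    print(fits_compiled_shapes(SHAPES, [1000, 2000], 512))
    # A single 4000-token prompt leaves too little room for 512 new tokens.
    print(fits_compiled_shapes(SHAPES, [4000], 512))
```

Requests that fail this check would need to be truncated, split into smaller batches, or served by a checkpoint compiled with larger shapes.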