---
license: apache-2.0
base_model: mistralai/Mistral-Nemo-Base-2407
tags:
  - general-purpose
  - text-generation
---

# Astra-v1-12B

Astra-v1-12B is a fine-tuned version of the base model Mistral-Nemo-Base-2407, developed for general-purpose natural language processing tasks and trained to replicate the quality and style of Claude 3 Sonnet and Opus.

## Model Details

### Model Description

Astra-v1-12B is a general-purpose, transformer-based language model fine-tuned for instruction following. The fine-tuning was designed to match the generation quality of Claude 3 Sonnet and Opus, and the model is optimized for tasks such as text generation, summarization, and question answering.

### Model Sources

## Uses

### Direct Use

Astra-v1-12B can be used directly for a wide range of NLP tasks (a minimal usage sketch follows the list), including:

- Text generation
- Summarization
- Question answering
- Dialogue systems
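
A minimal sketch of direct use via the transformers pipeline API; the prompt and the max_new_tokens value below are illustrative, not part of the model card:

```python
from transformers import pipeline

# Build a text-generation pipeline around Astra-v1-12B.
generator = pipeline("text-generation", model="P0x0/astra-v1-12b")

# Illustrative prompt; max_new_tokens bounds the reply length.
result = generator("Summarize the water cycle in two sentences.", max_new_tokens=128)
print(result[0]["generated_text"])
```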

### Downstream Use

This model can be further fine-tuned for specific tasks (see the sketch after this list), such as:

- Creative writing
- Instruction-based text completion
- Automated support systems
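
For further fine-tuning, parameter-efficient LoRA training is a common route. The sketch below is only a minimal illustration under stated assumptions: it relies on the peft and datasets libraries, a hypothetical your_dataset.jsonl file with one {"text": ...} record per line, and illustrative hyperparameters that were not published for this model.

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "P0x0/astra-v1-12b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Mistral tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Attach low-rank adapters; these target modules are a common choice
# for Mistral-style architectures and may need adjustment.
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# "your_dataset.jsonl" is a placeholder for your own training data.
dataset = load_dataset("json", data_files="your_dataset.jsonl")["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

dataset = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="astra-lora",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```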

### Out-of-Scope Use

Astra-v1-12B is not intended for real-time decision-making in safety-critical applications, nor for generating harmful or biased content.

## Bias, Risks, and Limitations

As with any large language model, Astra-v1-12B may carry inherent biases from the datasets used in fine-tuning. It is important to monitor and review the outputs when using the model in sensitive applications.

## How to Get Started with the Model

Here is a Python code snippet to get started with Astra-v1-12B:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("P0x0/astra-v1-12b")
model = AutoModelForCausalLM.from_pretrained(
    "P0x0/astra-v1-12b",
    torch_dtype=torch.bfloat16,  # half-precision weights to reduce memory use
    device_map="auto",           # place the model on available GPU(s)
)

input_text = "Explain the theory of relativity in simple terms."
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Without max_new_tokens, generate() stops after a very short default length.
outputs = model.generate(**inputs, max_new_tokens=256)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
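
A note on the snippet: torch_dtype=torch.bfloat16 and device_map="auto" (which requires the accelerate package) are assumptions made here to fit the 12B weights on a single GPU; drop them to load on CPU. For longer or more varied outputs, raise max_new_tokens or pass do_sample=True with a temperature.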