
This is an experimental model.

The idea is:

  • Calculate the difference in weights between a donor model (meta-math/MetaMath-Mistral-7B) and the base model (mistralai/Mistral-7B-v0.1). For each parameter, this difference is the adjustment needed to go from the base state to the donor state.
vector = math_model.state_dict()[k] - base_model.state_dict()[k]
  • The vector from step one is then added to a third model (lex-hue/Delexa-7b). This is intended to transfer the donor's math skills to the third model.
vector = new_math_model.state_dict()[k]
new_v = v + vector.to(v.device)
v.copy_(new_v)
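
The two steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the exact script used to build this model: the helper names `task_vector` and `apply_task_vector` and the `scale` parameter are hypothetical, and tiny `nn.Linear` layers stand in for the three 7B checkpoints so the snippet runs quickly on CPU.

```python
import copy

import torch
from torch import nn


def task_vector(donor_sd, base_sd):
    """Per-parameter difference (donor - base) for keys present in both."""
    return {k: donor_sd[k] - base_sd[k] for k in base_sd if k in donor_sd}


@torch.no_grad()
def apply_task_vector(model, vector, scale=1.0):
    """Add scale * vector to the model's weights in place."""
    # state_dict() tensors share storage with the parameters,
    # so in-place updates change the model itself.
    for k, v in model.state_dict().items():
        if k in vector and v.is_floating_point():
            v.add_(scale * vector[k].to(device=v.device, dtype=v.dtype))


# Toy stand-ins (assumption): tiny layers instead of the 7B checkpoints.
torch.manual_seed(0)
base, donor, target = nn.Linear(4, 2), nn.Linear(4, 2), nn.Linear(4, 2)

before = {k: v.clone() for k, v in target.state_dict().items()}
vec = task_vector(donor.state_dict(), base.state_dict())
apply_task_vector(target, vec)

# Sanity check: base + (donor - base) should reproduce the donor.
rebuilt = copy.deepcopy(base)
apply_task_vector(rebuilt, vec)
```

With real checkpoints, the same functions would presumably be applied to the state dicts of the three models loaded with `AutoModelForCausalLM.from_pretrained`, which requires that all three share the same architecture and parameter names.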

Example:

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "aloobun/CosmicNoodle-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map="auto")


prompt = "For the natural number A, the quotient of A divided by 9 is 6 and the remainder is 5. What is the value of A?\n"
input_ids = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
tokens = model.generate(input_ids.to(device=model.device), max_new_tokens=128, temperature=0.99, top_p=0.95, do_sample=True)
out = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(out)
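
Because the example samples with a high temperature, the generated text varies from run to run. A small check like the one below may help confirm whether a given output reaches the expected answer, which follows from A = divisor × quotient + remainder. The helper names `expected_a` and `mentions_answer` and the `sample` string are hypothetical and only for illustration; `sample` is not real model output.

```python
import re


def expected_a(quotient, divisor, remainder):
    """Reconstruct the dividend from quotient, divisor and remainder."""
    return divisor * quotient + remainder


def mentions_answer(text, answer):
    """True if the answer appears in the text as a standalone integer."""
    return str(answer) in re.findall(r"-?\d+", text)


expected = expected_a(quotient=6, divisor=9, remainder=5)
print(expected)  # 59

# Hypothetical output string, used here only to exercise the check.
sample = "A = 9 * 6 + 5 = 59"
print(mentions_answer(sample, expected))  # True
```

In practice, `out` from the example above would be passed in place of `sample`.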

Model size: 7.24B params (Safetensors). Tensor type: FP16.