Task vector optimization checkpoint, ready for merging.

Trained on MFANN for 12,000 steps. However, due to a slightly higher training loss, I'm going to merge this model with the last version and retrain. The goal was to use DARE-TIES to reduce the parameters used per task vector. This model will now be merged with the last model before DARE, using TIES alone, and will subsequently be retrained.
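For readers unfamiliar with the two merge methods named above, here is a rough pure-Python sketch of the ideas on flat lists of weights. It is not the pipeline used for this checkpoint; the function names, `drop_rate`, and `keep_fraction` are all hypothetical and chosen only for illustration. DARE randomly drops entries of a task vector (fine-tuned weights minus base weights) and rescales the survivors, while TIES trims each task vector to its largest-magnitude entries, elects a sign per parameter, and averages only the entries that agree with that sign.

```python
import random


def task_vector(finetuned, base):
    """Task vector: element-wise difference between fine-tuned and base weights."""
    return [f - b for f, b in zip(finetuned, base)]


def dare(delta, drop_rate, rng):
    """DARE-style sparsification: drop each entry with probability drop_rate,
    then rescale the surviving entries by 1 / (1 - drop_rate)."""
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    scale = 1.0 / (1.0 - drop_rate)
    return [0.0 if rng.random() < drop_rate else d * scale for d in delta]


def ties_merge(deltas, keep_fraction):
    """TIES-style merge of several task vectors of equal length."""
    n = len(deltas[0])
    k = max(1, int(round(keep_fraction * n)))

    # Step 1 (trim): keep only the k largest-magnitude entries of each vector.
    trimmed = []
    for delta in deltas:
        order = sorted(range(n), key=lambda i: abs(delta[i]), reverse=True)
        keep = set(order[:k])
        trimmed.append([delta[i] if i in keep else 0.0 for i in range(n)])

    merged = []
    for i in range(n):
        column = [t[i] for t in trimmed]
        # Step 2 (elect sign): the sign of the summed values wins.
        total = sum(column)
        sign = 1.0 if total > 0 else -1.0 if total < 0 else 0.0
        # Step 3 (disjoint merge): average only entries agreeing with that sign.
        agreeing = [v for v in column if v * sign > 0]
        merged.append(sum(agreeing) / len(agreeing) if agreeing else 0.0)
    return merged


def apply_delta(base, delta):
    """Add a (merged) task vector back onto the base weights."""
    return [b + d for b, d in zip(base, delta)]
```

In this sketch, DARE-TIES would correspond to running `dare` on each task vector before passing them to `ties_merge`, and TIES alone would skip the `dare` step.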

Model size: 2.78B params (Safetensors)

Tensor type: F32

Model tree for netcat420/MFANN3bv0.16.11

Merges: 3 models
