Llama-3.2-Kapusta-3B-v8

Small and useful.

KapustaLogo256.png

This is an interesting merge of 14 cool models, created using mergekit. Enjoy exploring :)

Merge Details

Method

This model was produced by a multistep merge process, with intermediate models re-merged in several variations to get the best result.
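The multistep process can be read as a dependency graph: each intermediate merge in the Configuration section has to exist before any step that names it under `models` or `base_model`. The sketch below is only an illustration, not the author's tooling. The short names such as `Kapusta-v4A` and the helper `build_order` are hypothetical, and source models pulled from the Hub are left out because they need no build step.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Key: an intermediate merge from this card's Configuration section.
# Value: the other intermediate merges it reads (as base_model or in models).
# Names are shortened for illustration (hypothetical labels, not repo ids).
DEPENDENCIES = {
    "A-3B-v1": [],
    "B-3B-v1": [],
    "C-3B-v1": [],
    "Kapusta-v1": ["A-3B-v1", "B-3B-v1", "C-3B-v1"],
    "Kapusta-v2": ["Kapusta-v1"],
    "Kapusta-v3": ["Kapusta-v1", "Kapusta-v2"],
    "Kapusta-v4A": ["Kapusta-v1", "Kapusta-v3"],
    "Kapusta-v4B": ["Kapusta-v1", "Kapusta-v3"],
    "Kapusta-v4C": ["Kapusta-v1", "Kapusta-v3"],
    "Kapusta-v5": ["Kapusta-v3", "Kapusta-v4A", "Kapusta-v4B", "Kapusta-v4C"],
    "Kapusta-v6": ["Kapusta-v3", "Kapusta-v5"],
    "Kapusta-v7": [
        "Kapusta-v1", "Kapusta-v2", "Kapusta-v3", "Kapusta-v4A",
        "Kapusta-v4B", "Kapusta-v4C", "Kapusta-v5", "Kapusta-v6",
    ],
    "Kapusta-v8": ["Kapusta-v1", "Kapusta-v3", "Kapusta-v5", "Kapusta-v7"],
}


def build_order(dependencies=DEPENDENCIES):
    """Return one valid order in which the intermediate merges can be built."""
    # TopologicalSorter expects {node: predecessors} and yields predecessors first.
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step, name in enumerate(build_order(), start=1):
        print(f"{step:2d}. {name}")
```

Several orders are valid (for example, v4A, v4B and v4C can be built in any order relative to each other), but v8 always comes last because every other step feeds into it.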

Models

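The following models were included in the merge (as named in the configurations below):

Hastagaras/L3.2-JametMini-3B-MK.III
huihui-ai/Llama-3.2-3B-Instruct-abliterated
Lyte/Llama-3.2-3B-Overthinker
ValiantLabs/Llama3.2-3B-ShiningValiant2
bunnycore/Llama-3.2-3B-Stock
bunnycore/Llama-3.2-3B-Pure-RP
CarrotAI/Llama-3.2-Rabbit-Ko-3B-Instruct
bunnycore/Llama-3.2-3B-Mix
SaisExperiments/Evil-Alpaca-3B-L3.2
bunnycore/Llama-3.2-3B-TitanFusion-v2
Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated
bunnycore/Llama-3.2-3B-Creative
ValiantLabs/Llama3.2-3B-Enigma
passing2961/Thanos-3B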

Configuration

The following YAML configurations were used to produce this model:

# A-3B-v1
models:
  - model: Hastagaras/L3.2-JametMini-3B-MK.III
  - model: huihui-ai/Llama-3.2-3B-Instruct-abliterated
merge_method: model_stock
base_model: Lyte/Llama-3.2-3B-Overthinker
dtype: bfloat16

# B-3B-v1
models:
  - model: ValiantLabs/Llama3.2-3B-ShiningValiant2
  - model: bunnycore/Llama-3.2-3B-Stock
merge_method: model_stock
base_model: Lyte/Llama-3.2-3B-Overthinker
dtype: bfloat16

# C-3B-v1
models:
  - model: bunnycore/Llama-3.2-3B-Pure-RP
  - model: CarrotAI/Llama-3.2-Rabbit-Ko-3B-Instruct
merge_method: model_stock
base_model: ValiantLabs/Llama3.2-3B-ShiningValiant2
dtype: bfloat16

# Llama-3.2-Kapusta-3B-v1
models:
  - model: A-3B-v1
    parameters:
      density: [0.8, 0.5, 0.2]
      weight:  0.8
  - model: B-3B-v1
    parameters:
      density: [0.2, 0.8, 0.2]
      weight:  0.25
  - model: C-3B-v1
    parameters:
      density: [0.2, 0.5, 0.8]
      weight:  0.6
merge_method: ties
base_model: bunnycore/Llama-3.2-3B-Mix
dtype: bfloat16

# Llama-3.2-Kapusta-3B-v2
models:
  - model: SaisExperiments/Evil-Alpaca-3B-L3.2
  - model: bunnycore/Llama-3.2-3B-TitanFusion-v2
merge_method: model_stock
base_model: F:/3b/Llama-3.2-Kapusta-3B-v1
dtype: bfloat16

# Llama-3.2-Kapusta-3B-v3
models:
  - model: F:/3b/Llama-3.2-Kapusta-3B-v2
    parameters:
      weight:  [0.5, 0.6, 0.4, 0.7, 0.3, 0.8, 0.2, 0.9, 0.1,  0.9, 0.1,  0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4, 0.5]
      density: [0.2, 0.8, 0.2]
merge_method: della
parameters:
  epsilon: 0.1
  lambda: 0.5
base_model: F:/3b/Llama-3.2-Kapusta-3B-v1
dtype: bfloat16

# Llama-3.2-Kapusta-3B-v4A | della
models:
  - model: F:/3b/Llama-3.2-Kapusta-3B-v1
    parameters:
      weight:  0.6
      density: 0.5
  - model: Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated
    parameters:
      weight:  0.5
      density: [0.4, 0.3, 0.3, 0.3]
  - model: bunnycore/Llama-3.2-3B-Creative
    parameters:
      weight:  0.3
      density: [0.3, 0.4, 0.3, 0.3]
  - model: ValiantLabs/Llama3.2-3B-Enigma
    parameters:
      weight:  0.3
      density: [0.3, 0.3, 0.4, 0.3]
  - model: passing2961/Thanos-3B
    parameters:
      weight:  0.3
      density: [0.3, 0.3, 0.3, 0.4]
merge_method: della
parameters:
  epsilon: 0.2
  lambda:  0.5
base_model: F:/3b/Llama-3.2-Kapusta-3B-v3
dtype: bfloat16

# Llama-3.2-Kapusta-3B-v4B | breadcrumbs
models:
  - model: F:/3b/Llama-3.2-Kapusta-3B-v1
    parameters:
      weight:  0.6
      density: 0.5
  - model: Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated
    parameters:
      weight:  0.5
      density: [0.4, 0.3, 0.3, 0.3]
  - model: bunnycore/Llama-3.2-3B-Creative
    parameters:
      weight:  0.3
      density: [0.3, 0.4, 0.3, 0.3]
  - model: ValiantLabs/Llama3.2-3B-Enigma
    parameters:
      weight:  0.3
      density: [0.3, 0.3, 0.4, 0.3]
  - model: passing2961/Thanos-3B
    parameters:
      weight:  0.3
      density: [0.3, 0.3, 0.3, 0.4]
merge_method: breadcrumbs
parameters:
  gamma: 0.02
base_model: F:/3b/Llama-3.2-Kapusta-3B-v3
dtype: bfloat16

# Llama-3.2-Kapusta-3B-v4C | dare_ties
models:
  - model: F:/3b/Llama-3.2-Kapusta-3B-v1
    parameters:
      weight:  0.6
      density: 0.5
  - model: Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated
    parameters:
      weight:  0.5
      density: [0.4, 0.3, 0.3, 0.3]
  - model: bunnycore/Llama-3.2-3B-Creative
    parameters:
      weight:  0.3
      density: [0.3, 0.4, 0.3, 0.3]
  - model: ValiantLabs/Llama3.2-3B-Enigma
    parameters:
      weight:  0.3
      density: [0.3, 0.3, 0.4, 0.3]
  - model: passing2961/Thanos-3B
    parameters:
      weight:  0.3
      density: [0.3, 0.3, 0.3, 0.4]
merge_method: dare_ties
base_model: F:/3b/Llama-3.2-Kapusta-3B-v3
dtype: bfloat16

# Llama-3.2-Kapusta-3B-v5
models:
  - model: F:/3b/Llama-3.2-Kapusta-3B-v4A
  - model: F:/3b/Llama-3.2-Kapusta-3B-v4B
  - model: F:/3b/Llama-3.2-Kapusta-3B-v4C
merge_method: model_stock
base_model: F:/3b/Llama-3.2-Kapusta-3B-v3
dtype: bfloat16

# Llama-3.2-Kapusta-3B-v6
models:
  - model: F:/3b/Llama-3.2-Kapusta-3B-v3
merge_method: slerp
base_model: F:/3b/Llama-3.2-Kapusta-3B-v5
dtype: bfloat16
parameters:
  t: [0.5, 0.6, 0.4, 0.7, 0.3, 0.8, 0.2, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4, 0.5]

# Llama-3.2-Kapusta-3B-v7
models:
  - model: F:/3b/Llama-3.2-Kapusta-3B-v1
  - model: F:/3b/Llama-3.2-Kapusta-3B-v2
  - model: F:/3b/Llama-3.2-Kapusta-3B-v3
  - model: F:/3b/Llama-3.2-Kapusta-3B-v4A
  - model: F:/3b/Llama-3.2-Kapusta-3B-v4B
  - model: F:/3b/Llama-3.2-Kapusta-3B-v4C
  - model: F:/3b/Llama-3.2-Kapusta-3B-v5
merge_method: model_stock
base_model: F:/3b/Llama-3.2-Kapusta-3B-v6
dtype: bfloat16

# Llama-3.2-Kapusta-3B-v8
models:
  - model: F:/3b/Llama-3.2-Kapusta-3B-v1
  - model: F:/3b/Llama-3.2-Kapusta-3B-v3
  - model: F:/3b/Llama-3.2-Kapusta-3B-v5
merge_method: model_stock
base_model: F:/3b/Llama-3.2-Kapusta-3B-v7
dtype: bfloat16
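Several of the configurations above give `density`, `weight`, or `t` as a short list rather than a single number. mergekit treats such a list as a gradient that is interpolated across the layers of the model. The sketch below shows plain linear interpolation to make the idea concrete. The function `expand_gradient` is hypothetical, the layer count of 28 is an assumption about Llama 3.2 3B, and mergekit's exact interpolation may differ in detail.

```python
def expand_gradient(anchors, num_layers):
    """Linearly interpolate a short list of anchor values across num_layers layers.

    Illustrative only: a simple piecewise-linear scheme, not mergekit's own code.
    """
    if not anchors or num_layers < 1:
        raise ValueError("need at least one anchor value and one layer")
    if len(anchors) == 1 or num_layers == 1:
        return [float(anchors[0])] * num_layers

    values = []
    last_segment = len(anchors) - 2
    for layer in range(num_layers):
        # Map the layer index onto the anchor axis [0, len(anchors) - 1].
        pos = layer / (num_layers - 1) * (len(anchors) - 1)
        lo = min(int(pos), last_segment)
        frac = pos - lo
        values.append(anchors[lo] * (1 - frac) + anchors[lo + 1] * frac)
    return values


if __name__ == "__main__":
    NUM_LAYERS = 28  # assumption: decoder layer count of Llama 3.2 3B
    # density list used for A-3B-v1 in the Llama-3.2-Kapusta-3B-v1 step
    per_layer = expand_gradient([0.8, 0.5, 0.2], NUM_LAYERS)
    print([round(v, 3) for v in per_layer])
```

Read this way, `density: [0.8, 0.5, 0.2]` keeps more of a model's changes in the early layers and fewer in the late ones, while `[0.2, 0.5, 0.8]` does the reverse.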

My thanks to the authors of the original models. Your work is incredible. Have a good time 🖤

Model size: 3.61B params (Safetensors). Tensor type: BF16.
Repository: Khetterman/Llama-3.2-Kapusta-3B-v8