L3-8B-Lunar-Stheno-GGUF

This is a quantized version of HiroseKoichi/L3-8B-Lunar-Stheno, created using llama.cpp.

Model Description

L3-8B-Lunaris-v1 is definitely a significant improvement over L3-8B-Stheno-v3.2 in terms of situational awareness and prose, but it's not without issues: the response length can sometimes be very long, causing it to go on a rant; it tends to not take direct action, saying that it will do something but never actually doing it; and its performance outside of roleplay took a hit.

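This merge fixes all of those issues, and I'm genuinely impressed with the results. While I did use a SLERP merge to create this model, there was no blending of the models: the interpolation factor t in the merge config is either 0 or 1 for every tensor, so all I did was replace L3-8B-Stheno-v3.2's MLP weights with L3-8B-Lunaris-v1's and leave the self-attention and remaining weights as they were.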

Details

Models Used

Sao10K/L3-8B-Stheno-v3.2
Sao10K/L3-8B-Lunaris-v1

Merge Config

models:
  - model: Sao10K/L3-8B-Stheno-v3.2
  - model: Sao10K/L3-8B-Lunaris-v1
merge_method: slerp
base_model: Sao10K/L3-8B-Stheno-v3.2
parameters:
  t:
    - filter: self_attn
      value: 0
    - filter: mlp
      value: 1
    - value: 0
dtype: bfloat16
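
To make the effect of the t values concrete, here is a minimal Python sketch of how a per-tensor interpolation factor could be picked from the filters above, and why t = 0 or t = 1 selects one model's tensor outright. It is an illustration, not mergekit's actual code: the helper names (T_RULES, t_for, slerp) are hypothetical, substring matching of filters against tensor names is an assumption, and the tensor names are the usual Hugging Face Llama names rather than anything taken from this card.

```python
import math

# Filters and values copied from the merge config; the entry with no filter
# is the default. Substring matching is an assumption made for this sketch.
T_RULES = [("self_attn", 0.0), ("mlp", 1.0), (None, 0.0)]


def t_for(tensor_name):
    """Return the interpolation factor for a tensor (first matching rule wins)."""
    for pattern, value in T_RULES:
        if pattern is None or pattern in tensor_name:
            return value
    raise ValueError("no default rule configured")


def slerp(t, a, b, eps=1e-8):
    """Spherical linear interpolation between two flat weight vectors.

    t = 0 returns a (the base model's tensor), t = 1 returns b.
    """
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Angle is undefined for a zero vector; fall back to linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        # Nearly parallel vectors: linear interpolation is numerically safer.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    w_a = math.sin((1 - t) * omega) / sin_omega
    w_b = math.sin(t * omega) / sin_omega
    return [w_a * x + w_b * y for x, y in zip(a, b)]


if __name__ == "__main__":
    for name in ("model.layers.0.self_attn.q_proj.weight",
                 "model.layers.0.mlp.gate_proj.weight",
                 "model.embed_tokens.weight"):
        source = "Lunaris" if t_for(name) == 1.0 else "Stheno"
        print(f"{name}: t={t_for(name)} -> taken from {source}")
```

Under these assumptions, only tensors whose names contain "mlp" come from L3-8B-Lunaris-v1; attention, embedding, and norm tensors stay with the base model, L3-8B-Stheno-v3.2.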
GGUF
Model size: 8.03B params
Architecture: llama
Quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
