---
language:
- en
license: apache-2.0
tags:
- open-source
- code
- math
- chemistry
- biology
- text-generation
- question-answering
pipeline_tag: text-generation
---

# OpenCerebrum-2.0-7B

OpenCerebrum-2.0-7B is an open-source language model fine-tuned from the alpindale/Mistral-7B-v0.2-hf base model on a diverse dataset, with the aim of replicating the capabilities of Aether Research's proprietary Cerebrum model.

The model was fine-tuned with supervised fine-tuning (SFT) and direct preference optimization (DPO) on approximately 7,000 examples drawn from 15 data sources spanning coding, math, science, multi-turn conversation, RAG, reasoning, and general instruction-following. The goal was to assemble public datasets that could help the model achieve strong performance on the benchmarks where Cerebrum excels.
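
The authors' actual training pipeline is not published. Purely as an illustration of a two-stage SFT-then-DPO recipe, the sketch below uses Hugging Face TRL; the dataset files, column names, and hyperparameters are placeholders, and exact keyword arguments vary across `trl` versions.

```python
# Illustrative sketch only -- not the authors' actual training code.
# Assumes `transformers`, `trl`, and `datasets` are installed; dataset paths,
# column names, and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer, DPOTrainer

base = "alpindale/Mistral-7B-v0.2-hf"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Stage 1: supervised fine-tuning on instruction data (placeholder file).
sft_data = load_dataset("json", data_files="sft_mix.jsonl", split="train")
sft_trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=sft_data,
    dataset_text_field="text",   # column holding the full prompt + response
    max_seq_length=4096,
    args=TrainingArguments(output_dir="sft-out", num_train_epochs=1),
)
sft_trainer.train()

# Stage 2: DPO on (prompt, chosen, rejected) preference pairs (placeholder file).
dpo_data = load_dataset("json", data_files="dpo_pairs.jsonl", split="train")
dpo_trainer = DPOTrainer(
    model=sft_trainer.model,
    ref_model=None,              # TRL builds a frozen reference copy when None
    beta=0.1,
    train_dataset=dpo_data,
    tokenizer=tokenizer,
    args=TrainingArguments(output_dir="dpo-out", num_train_epochs=1),
)
dpo_trainer.train()
dpo_trainer.save_model("OpenCerebrum-2.0-7B-sketch")
```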

## Model Details

- **Base Model:** alpindale/Mistral-7B-v0.2-hf
- **Parameters:** 7 billion 
- **Fine-Tuning Dataset Size:** ~7,000 examples
- **Fine-Tuning Data:** Curated in-house at Cognitive Computations from 15 different data sources used for SFT and DPO.
- **Language:** English
- **License:** Apache 2.0

## Quants

### EXL2 [@bartowski](https://huggingface.co/bartowski/)

- https://huggingface.co/bartowski/OpenCerebrum-2.0-7B-exl2

### GGUF [@bartowski](https://huggingface.co/bartowski/)

- https://huggingface.co/bartowski/OpenCerebrum-2.0-7B-GGUF
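
As a hedged example of running the GGUF quant locally, the sketch below uses `llama-cpp-python`; the file name and generation settings are assumptions and depend on which quant you download.

```python
# Minimal sketch, assuming llama-cpp-python is installed and a GGUF file
# (e.g. a Q4_K_M quant from the repo above) has been downloaded locally.
from llama_cpp import Llama

llm = Llama(
    model_path="OpenCerebrum-2.0-7B-Q4_K_M.gguf",  # placeholder file name
    n_ctx=4096,          # context window; adjust to available memory
    n_gpu_layers=-1,     # offload all layers to GPU if one is available
)

out = llm(
    "Explain the difference between covalent and ionic bonds.",
    max_tokens=256,
    temperature=0.7,
)
print(out["choices"][0]["text"])
```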

## Intended Use

OpenCerebrum-2.0-7B is intended to be a powerful open-source model for coding, math, science, and general question-answering and text generation tasks. Its diverse fine-tuning data aims to equip it with broad knowledge and reasoning capabilities.

However, as an open-source replica trained on far less data than the original Cerebrum, it may not match Cerebrum's full performance. Biases and limitations of the fine-tuning data may also be reflected in the model's outputs.
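
A minimal text-generation example with 🤗 Transformers is sketched below; the repo id is a placeholder, and the plain prompt format is an assumption since the card does not specify a chat template.

```python
# Minimal sketch: plain causal generation with 🤗 Transformers.
# The repo id below is a placeholder -- substitute the actual model path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/OpenCerebrum-2.0-7B"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Question: What is the derivative of x^3 + 2x?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```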

## Limitations and Biases

- The model may have biases and limitations inherited from its fine-tuning datasets. Thorough testing is needed to characterize these.
- As a 7B-parameter model, it has capacity constraints compared to larger models.

## Evaluations

|    Tasks     |Version|Filter|n-shot|Metric|Value |   |Stderr|
|--------------|------:|------|-----:|------|-----:|---|-----:|
|truthfulqa_mc2|      2|none  |     0|acc   |0.5182|±  |0.0152|
|ai2_arc                          |N/A    |none  |     0|acc     |0.7060|±  |0.0073|
|                                 |       |none  |     0|acc_norm|0.7049|±  |0.0074|
| - arc_challenge                 |      1|none  |     0|acc     |0.5000|±  |0.0146|
|                                 |       |none  |     0|acc_norm|0.5299|±  |0.0146|
| - arc_easy                      |      1|none  |     0|acc     |0.8077|±  |0.0081|
|                                 |       |none  |     0|acc_norm|0.7912|±  |0.0083|
|agieval_nous                     |N/A    |none  |     0|acc     |0.3778|±  |0.0093|
|                                 |       |none  |     0|acc_norm|0.3574|±  |0.0093|
| - agieval_aqua_rat              |      1|none  |     0|acc     |0.2402|±  |0.0269|
|                                 |       |none  |     0|acc_norm|0.2205|±  |0.0261|
| - agieval_logiqa_en             |      1|none  |     0|acc     |0.3164|±  |0.0182|
|                                 |       |none  |     0|acc_norm|0.3656|±  |0.0189|
| - agieval_lsat_ar               |      1|none  |     0|acc     |0.2130|±  |0.0271|
|                                 |       |none  |     0|acc_norm|0.1913|±  |0.0260|
| - agieval_lsat_lr               |      1|none  |     0|acc     |0.4078|±  |0.0218|
|                                 |       |none  |     0|acc_norm|0.3647|±  |0.0213|
| - agieval_lsat_rc               |      1|none  |     0|acc     |0.4981|±  |0.0305|
|                                 |       |none  |     0|acc_norm|0.4498|±  |0.0304|
| - agieval_sat_en                |      1|none  |     0|acc     |0.6650|±  |0.0330|
|                                 |       |none  |     0|acc_norm|0.5922|±  |0.0343|
| - agieval_sat_en_without_passage|      1|none  |     0|acc     |0.4612|±  |0.0348|
|                                 |       |none  |     0|acc_norm|0.3932|±  |0.0341|
| - agieval_sat_math              |      1|none  |     0|acc     |0.3273|±  |0.0317|
|                                 |       |none  |     0|acc_norm|0.2818|±  |0.0304|
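
The scores above appear to come from EleutherAI's lm-evaluation-harness. A hedged sketch for re-running a subset of the benchmarks with the harness's Python API (v0.4-style) follows; the model path is a placeholder.

```python
# Sketch only: re-running a subset of the benchmarks with
# EleutherAI's lm-evaluation-harness (v0.4+ Python API assumed).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=path/to/OpenCerebrum-2.0-7B,dtype=float16",  # placeholder path
    tasks=["truthfulqa_mc2", "arc_easy", "arc_challenge"],
    num_fewshot=0,
    batch_size=8,
)
print(results["results"])
```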