metadata
language:
- pt
- en
license: cc
tags:
- text-generation-inference
- transformers
- mistral
- mixtral
- gguf
- brazil
- brasil
- portuguese
model-index:
- name: CabraMixtral-8x7b
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: ENEM Challenge (No Images)
type: eduagarcia/enem_challenge
split: train
args:
num_few_shot: 3
metrics:
- type: acc
value: 78.17
name: accuracy
source:
url: >-
https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=botbot-ai/CabraMixtral-8x7b
name: Open Portuguese LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: BLUEX (No Images)
type: eduagarcia-temp/BLUEX_without_images
split: train
args:
num_few_shot: 3
metrics:
- type: acc
value: 64.12
name: accuracy
source:
url: >-
https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=botbot-ai/CabraMixtral-8x7b
name: Open Portuguese LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: OAB Exams
type: eduagarcia/oab_exams
split: train
args:
num_few_shot: 3
metrics:
- type: acc
value: 55.49
name: accuracy
source:
url: >-
https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=botbot-ai/CabraMixtral-8x7b
name: Open Portuguese LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: Assin2 RTE
type: assin2
split: test
args:
num_few_shot: 15
metrics:
- type: f1_macro
value: 90.95
name: f1-macro
source:
url: >-
https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=botbot-ai/CabraMixtral-8x7b
name: Open Portuguese LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: Assin2 STS
type: eduagarcia/portuguese_benchmark
split: test
args:
num_few_shot: 15
metrics:
- type: pearson
value: 77.63
name: pearson
source:
url: >-
https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=botbot-ai/CabraMixtral-8x7b
name: Open Portuguese LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: FaQuAD NLI
type: ruanchaves/faquad-nli
split: test
args:
num_few_shot: 15
metrics:
- type: f1_macro
value: 78.93
name: f1-macro
source:
url: >-
https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=botbot-ai/CabraMixtral-8x7b
name: Open Portuguese LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: HateBR Binary
type: ruanchaves/hatebr
split: test
args:
num_few_shot: 25
metrics:
- type: f1_macro
value: 78
name: f1-macro
source:
url: >-
https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=botbot-ai/CabraMixtral-8x7b
name: Open Portuguese LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: PT Hate Speech Binary
type: hate_speech_portuguese
split: test
args:
num_few_shot: 25
metrics:
- type: f1_macro
value: 69.54
name: f1-macro
source:
url: >-
https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=botbot-ai/CabraMixtral-8x7b
name: Open Portuguese LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: tweetSentBR
type: eduagarcia/tweetsentbr_fewshot
split: test
args:
num_few_shot: 25
metrics:
- type: f1_macro
value: 72.83
name: f1-macro
source:
url: >-
https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=botbot-ai/CabraMixtral-8x7b
name: Open Portuguese LLM Leaderboard
BotBot Cabra Mixtral 8x7b
This model is a fine-tune of Mixtral 8x7b on the Cabra 30k dataset. It is optimized for Portuguese and improves on the base model across several Brazilian benchmarks.
Check out our other models: Cabra.
Dataset: Cabra 30k
An internal dataset used for fine-tuning. We plan to release it soon.
Quantization / GGUF
We have published several quantized (GGUF) versions in the "quantanization" branch.
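A minimal sketch of how one of the GGUF files could be fetched from that branch with `huggingface_hub`. The repo id is taken from the leaderboard URL in the metadata; the file names inside the branch are not listed in this card, so the `quant_hint` default below is a hypothetical placeholder and the helper names are illustrative only.

```python
from typing import Iterable, Optional

REPO_ID = "botbot-ai/CabraMixtral-8x7b"  # repo id as it appears in the leaderboard URL
BRANCH = "quantanization"                # branch name exactly as this card spells it


def pick_gguf(files: Iterable[str], quant_hint: str) -> Optional[str]:
    """Return the first .gguf filename containing quant_hint (case-insensitive)."""
    hint = quant_hint.lower()
    for name in files:
        lowered = name.lower()
        if lowered.endswith(".gguf") and hint in lowered:
            return name
    return None


def download_gguf(quant_hint: str = "q4_k_m") -> str:
    """Download one quantized file from the branch and return its local path.

    The default quant_hint is a hypothetical example; list the branch first
    to see which quantization levels are actually published.
    """
    # Imported lazily so pick_gguf works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download, list_repo_files

    files = list_repo_files(REPO_ID, revision=BRANCH)
    name = pick_gguf(files, quant_hint)
    if name is None:
        raise FileNotFoundError(f"no .gguf file matching {quant_hint!r} in {BRANCH!r}")
    return hf_hub_download(repo_id=REPO_ID, filename=name, revision=BRANCH)
```

Calling `download_gguf()` requires network access; the returned path can then be handed to any GGUF-compatible runtime.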
Example
<s> [INST] who is Elon Musk? [/INST]Elon Musk é um empreendedor, inventor e capitalista americano. Ele é o fundador, CEO e CTO da SpaceX, CEO da Neuralink e fundador do The Boring Company. Musk também é o proprietário do Twitter.</s>
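The example above follows the `[INST] ... [/INST]` instruction layout. A small sketch of how such a prompt could be assembled and the reply extracted is shown here; `build_prompt` and `extract_reply` are hypothetical helper names, and the exact spacing is copied from the example rather than from an official template. If your tokenizer already prepends `<s>`, drop it from the string to avoid a duplicated start token.

```python
BOS, EOS = "<s>", "</s>"


def build_prompt(user_message: str) -> str:
    """Wrap a single user turn in the layout shown in the example."""
    return f"{BOS} [INST] {user_message.strip()} [/INST]"


def extract_reply(generated: str) -> str:
    """Return the text after the last [/INST], without the closing </s>."""
    reply = generated.rsplit("[/INST]", 1)[-1].strip()
    if reply.endswith(EOS):
        reply = reply[: -len(EOS)]
    return reply.strip()


if __name__ == "__main__":
    print(build_prompt("who is Elon Musk?"))
```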
Usage
For now, the model is intended for research purposes. Possible research areas and tasks include:
- Research on generative models.
- Investigating and understanding the limitations and biases of generative models.
Commercial use is prohibited. Research only.
Evals
Open Portuguese LLM Leaderboard Evaluation Results
Detailed results can be found here and on the 🚀 Open Portuguese LLM Leaderboard
| Metric | Value |
|---|---|
| Average | 73.96 |
| ENEM Challenge (No Images) | 78.17 |
| BLUEX (No Images) | 64.12 |
| OAB Exams | 55.49 |
| Assin2 RTE | 90.95 |
| Assin2 STS | 77.63 |
| FaQuAD NLI | 78.93 |
| HateBR Binary | 78 |
| PT Hate Speech Binary | 69.54 |
| tweetSentBR | 72.83 |
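The Average row appears to be the unweighted mean of the nine task scores. The short check below recomputes it from the values in the table; treating it as a plain arithmetic mean is an assumption, since the card does not state how the leaderboard aggregates.

```python
# Task scores copied from the table above.
scores = {
    "ENEM Challenge (No Images)": 78.17,
    "BLUEX (No Images)": 64.12,
    "OAB Exams": 55.49,
    "Assin2 RTE": 90.95,
    "Assin2 STS": 77.63,
    "FaQuAD NLI": 78.93,
    "HateBR Binary": 78.0,
    "PT Hate Speech Binary": 69.54,
    "tweetSentBR": 72.83,
}

# Unweighted mean across tasks (assumed aggregation).
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 73.96
```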