---
license: llama2
---

# GOAT-7B-Community model

![GOAT-7B-Community](https://api-adaptive-li.s3.us-west-2.amazonaws.com/goat-ai/Comp+2_00000.png)

GOAT-7B-Community is a supervised fine-tuned (SFT) version of LLaMA 2, developed by the GOAT.AI lab on user-shared conversations from the GoatChat app.

# Model description
- **Base Architecture:** LLaMA 2 7B flavour
- **Dataset size:** 72K multi-turn dialogues
- **License:** llama2
- **Context window length:** 4096 tokens

### Learn more

- **Blog:** https://www.blog.goat.ai/goat-7b-community-tops-among-7b-models/
- **Paper:** Coming soon
- **Demo:** https://huggingface.co/spaces/goatai/GOAT-7B-Community

## Uses

The main purpose of GOAT-7B-Community is to facilitate research on large language models and chatbots. It is specifically designed for researchers and hobbyists working in the fields of natural language processing, machine learning, and artificial intelligence.

## Usage

The model can be self-hosted via `transformers` or used through the hosted Spaces demo:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Hugging Face repo id; "GOAT-AI/GOAT-7B-Community" is assumed here —
# adjust it to the actual hosted checkpoint.
model_name = "GOAT-AI/GOAT-7B-Community"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
)
```
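
Because GOAT-7B-Community is based on LLaMA-2-7b-chat-hf, its prompts presumably follow the LLaMA 2 chat template. The helper below is a minimal sketch of that format; whether GOAT-7B-Community was trained on exactly this template is an assumption, and `format_llama2_prompt` is a hypothetical helper, not part of the model's API.

```python
# Sketch of the LLaMA 2 chat prompt format used by the base model
# (Llama-2-7b-chat-hf). Assumption: GOAT-7B-Community expects the same markers.
def format_llama2_prompt(user_message: str, system_prompt: str = "") -> str:
    """Wrap a single-turn user message in LLaMA 2 chat markers."""
    if system_prompt:
        # The system prompt is embedded inside the first user turn.
        user_message = f"<<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message}"
    return f"[INST] {user_message} [/INST]"

prompt = format_llama2_prompt("What is SFT?", "You are a helpful assistant.")
```

The resulting string can be tokenized and passed to `model.generate` as usual.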

## Training dataset

The training dataset was collected from user conversations with the GoatChat app and from OpenAssistant. We will not release the dataset.

## Evaluation

GOAT-7B-Community is evaluated on common language-model benchmarks, including MMLU and BigBench Hard. We are still evaluating all our models and will share details soon.

- **MMLU:** 49.31
- **BBH:** 35.7

## License

GOAT-7B-Community is based on [Meta's LLaMA-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) and trained on our own datasets.

GOAT-7B-Community weights are released under the LLAMA-2 license. Note that accessing the GOAT-7B-Community weights requires access to the LLaMA-2 model weights. Because the model is based on LLaMA-2, it must be used according to the LLaMA-2 license.

### Risks and Biases

GOAT-7B-Community can produce factually incorrect output and should not be relied on to deliver factually accurate information. Because the model was trained on various private and public datasets, it may generate wrong, biased, or otherwise offensive outputs.