CryogenicPlanet committed on
Commit
0d05c65
1 Parent(s): c385830

Upload README.md with huggingface_hub

  1. README.md +46 -0
README.md ADDED
@@ -0,0 +1,46 @@
+ ---
+ license: apache-2.0
+ pipeline_tag: text-generation
+ language:
+ - en
+ tags:
+ - pretrained
+ inference:
+   parameters:
+     temperature: 0.7
+ ---
+
+ # Model Card for Mistral-7B-v0.1
+
+ The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters.
+ Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks we tested.
+
+ For full details of this model please read our [paper](https://arxiv.org/abs/2310.06825) and [release blog post](https://mistral.ai/news/announcing-mistral-7b/).
+
+ ## Model Architecture
+
+ Mistral-7B-v0.1 is a transformer model, with the following architecture choices:
+ - Grouped-Query Attention
+ - Sliding-Window Attention
+ - Byte-fallback BPE tokenizer
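To make the sliding-window attention item concrete, here is a minimal sketch of the kind of mask it implies. It is an illustration only, not the model's actual implementation: the function name `sliding_window_causal_mask` is hypothetical, the window of 3 is a toy value, and the boundary convention (each position sees itself plus the previous `window - 1` positions) is an assumption.

```python
from __future__ import annotations


def sliding_window_causal_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Build a causal mask restricted to a local window.

    mask[i][j] is True when query position i may attend to key position j.
    Assumed convention: position i sees positions i - window + 1 .. i.
    """
    if seq_len < 0:
        raise ValueError("seq_len must be non-negative")
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        # Causal (j <= i) and local (no more than window - 1 steps back).
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    # Print the toy mask as a grid: 'x' = may attend, '.' = masked out.
    for row in sliding_window_causal_mask(seq_len=5, window=3):
        print("".join("x" if allowed else "." for allowed in row))
```

With a window this small, later rows stop seeing the earliest positions, which is the property that bounds per-token attention cost regardless of sequence length.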
+
+ ## Troubleshooting
+
+ - If you see the following error:
+ ```
+ KeyError: 'mistral'
+ ```
+ - Or:
+ ```
+ NotImplementedError: Cannot copy out of meta tensor; no data!
+ ```
+
+ If you see either error, make sure you are using a stable release of Transformers, version 4.34.0 or newer.
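One way to check the installed Transformers version before loading the model is sketched below. It uses only the standard library; the helper names `parse_release`, `meets_minimum`, and `check_transformers` are hypothetical, and the parser is deliberately simple: it ignores pre-release and dev suffixes, so it cannot tell a stable release from a release candidate of the same number.

```python
from __future__ import annotations

from importlib import metadata

# Minimum version named in the troubleshooting note above.
MIN_TRANSFORMERS = (4, 34, 0)


def parse_release(version: str) -> tuple[int, ...]:
    """Return the leading numeric components, e.g. '4.35.0.dev0' -> (4, 35, 0)."""
    parts: list[int] = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # non-numeric segment such as 'dev0'
        parts.append(int(digits))
        if digits != piece:
            break  # trailing suffix such as '0rc1'; stop after the number
    return tuple(parts)


def meets_minimum(installed: str, minimum: tuple[int, ...] = MIN_TRANSFORMERS) -> bool:
    """Compare release tuples, padding the shorter one with zeros."""
    release = parse_release(installed)
    width = max(len(release), len(minimum))
    padded_release = release + (0,) * (width - len(release))
    padded_minimum = tuple(minimum) + (0,) * (width - len(minimum))
    return padded_release >= padded_minimum


def check_transformers() -> str:
    """Describe whether the installed transformers package is new enough."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return "transformers is not installed"
    if meets_minimum(installed):
        return f"transformers {installed} meets the 4.34.0 minimum"
    return f"transformers {installed} is older than 4.34.0; upgrade it"


if __name__ == "__main__":
    print(check_transformers())
```

If the check reports an older version, upgrading with `pip install --upgrade transformers` is the usual fix.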
+
+ ## Notice
+
+ Mistral 7B is a pretrained base model and therefore does not have any moderation mechanisms.
+
+ ## The Mistral AI Team
+
+ Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.