Commit 78da677
Parent(s): 868dab5

Update README.md

README.md CHANGED
@@ -4,3 +4,57 @@ tags:
 - text-classification
 - language-identification
 ---
+
+# OpenLID
+
+
+## Model description
+
+fastText is a library for efficient learning of word representations and sentence classification. fastText is designed to be simple to use for developers, domain experts, and students. It's dedicated to text classification and learning word representations, and was designed to allow for quick model iteration and refinement without specialized hardware. fastText models can be trained on more than a billion words on any multicore CPU in less than a few minutes.
+
+
+## Intended uses & limitations
+
+You can use pre-trained word vectors for text classification or language identification. See the [tutorials](https://fasttext.cc/docs/en/supervised-tutorial.html) and [resources](https://fasttext.cc/docs/en/english-vectors.html) on its official website to look for tasks that interest you.
+
+### How to use
+
+Here is how to use this model to detect the language of a given text:
+
+```python
+>>> import fasttext
+>>> from huggingface_hub import hf_hub_download
+
+>>> model_path = hf_hub_download(repo_id="davanstrien/OpenLID", filename="model.bin")
+>>> model = fasttext.load_model(model_path)
+>>> model.predict("Hello, world!")
+
+(('__label__eng_Latn',), array([0.81148803]))
+
+>>> model.predict("Hello, world!", k=5)
+
+(('__label__eng_Latn', '__label__vie_Latn', '__label__nld_Latn', '__label__pol_Latn', '__label__deu_Latn'),
+ array([0.61224753, 0.21323682, 0.09696738, 0.01359863, 0.01319415]))
+```
+
+### Limitations and bias
+
+[More Information needed]
+## Training data
+
+
+[More Information needed]
+## Training procedure
+
+
+### License
+
+[More Information needed]
+
+### Evaluation datasets
+
+[More Information needed]
+
+### BibTeX entry and citation info
+
+[More Information needed]
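
The `model.predict` call in the README's "How to use" section returns a tuple of label strings plus a sequence of probabilities. The following is a minimal sketch of turning that output into readable records, not part of the commit itself. The helper name `parse_prediction`, the `LanguageGuess` type, and the `min_prob` cutoff are hypothetical, and a plain Python list stands in for the NumPy array so the snippet runs without fastText installed.

```python
from typing import List, NamedTuple, Sequence


class LanguageGuess(NamedTuple):
    language: str       # language code, e.g. "eng"
    script: str         # script code, e.g. "Latn"
    probability: float


# Prefix seen on every label in the README's predict() output.
LABEL_PREFIX = "__label__"


def parse_prediction(
    labels: Sequence[str],
    probs: Sequence[float],
    min_prob: float = 0.0,  # hypothetical cutoff, not a fastText parameter
) -> List[LanguageGuess]:
    """Split labels such as '__label__eng_Latn' into language and script."""
    guesses = []
    for label, prob in zip(labels, probs):
        if prob < min_prob:
            continue  # drop low-confidence candidates
        code = label[len(LABEL_PREFIX):] if label.startswith(LABEL_PREFIX) else label
        language, _, script = code.partition("_")
        guesses.append(LanguageGuess(language, script, float(prob)))
    return guesses


# Values copied from the k=5 example in the README diff.
labels = (
    "__label__eng_Latn",
    "__label__vie_Latn",
    "__label__nld_Latn",
    "__label__pol_Latn",
    "__label__deu_Latn",
)
probs = [0.61224753, 0.21323682, 0.09696738, 0.01359863, 0.01319415]

for guess in parse_prediction(labels, probs, min_prob=0.05):
    print(f"{guess.language} ({guess.script}): {guess.probability:.3f}")
```

With `min_prob=0.05`, the sketch keeps the first three candidates from the README example and drops the two below the cutoff. With a loaded model, the two values returned by `model.predict(text, k=5)` could be passed in directly as `labels` and `probs`.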