alanakbik committed
Commit eac8f09
1 Parent(s): e7a4695

initial model commit

Files changed (4)
  1. README.md +163 -0
  2. loss.tsv +132 -0
  3. pytorch_model.bin +3 -0
  4. training.log +0 -0
README.md ADDED
@@ -0,0 +1,163 @@
---
tags:
- flair
- token-classification
- sequence-tagger-model
language: en
datasets:
- ontonotes
inference: false
---

## English Universal Part-of-Speech Tagging in Flair (fast model)

This is the fast universal part-of-speech tagging model for English that ships with [Flair](https://github.com/flairNLP/flair/).

F1-Score: **98.47** (Ontonotes)

Predicts universal POS tags:

| **tag** | **meaning** |
|---------|-------------|
| ADJ     | adjective |
| ADP     | adposition |
| ADV     | adverb |
| AUX     | auxiliary |
| CCONJ   | coordinating conjunction |
| DET     | determiner |
| INTJ    | interjection |
| NOUN    | noun |
| NUM     | numeral |
| PART    | particle |
| PRON    | pronoun |
| PROPN   | proper noun |
| PUNCT   | punctuation |
| SCONJ   | subordinating conjunction |
| SYM     | symbol |
| VERB    | verb |
| X       | other |

Based on [Flair embeddings](https://www.aclweb.org/anthology/C18-1139/) and LSTM-CRF.

---

### Demo: How to use in Flair

Requires: **[Flair](https://github.com/flairNLP/flair/)** (`pip install flair`)

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# load tagger
tagger = SequenceTagger.load("flair/upos-english-fast")

# make example sentence
sentence = Sentence("I love Berlin.")

# predict POS tags
tagger.predict(sentence)

# print sentence
print(sentence)

# print predicted POS spans
print('The following POS tags are found:')
# iterate over spans and print
for entity in sentence.get_spans('pos'):
    print(entity)

```

This yields the following output:
```
Span [1]: "I" [− Labels: PRON (0.9996)]
Span [2]: "love" [− Labels: VERB (1.0)]
Span [3]: "Berlin" [− Labels: PROPN (0.9986)]
Span [4]: "." [− Labels: PUNCT (1.0)]
```

So, the word "*I*" is labeled as a **pronoun** (PRON), "*love*" is labeled as a **verb** (VERB) and "*Berlin*" is labeled as a **proper noun** (PROPN) in the sentence "*I love Berlin*".
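
The spans returned by `get_spans('pos')` can also be consumed programmatically: each one carries its predicted label and a confidence score. Below is a minimal sketch, assuming the usual Flair `Label` API (`labels`, `value`, `score`); the exact attribute names may differ slightly between Flair versions.

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# load tagger and tag an example sentence
tagger = SequenceTagger.load("flair/upos-english-fast")
sentence = Sentence("I love Berlin.")
tagger.predict(sentence)

# collect (text, tag, confidence) triples from the predicted spans
predictions = []
for span in sentence.get_spans('pos'):
    label = span.labels[0]  # the predicted POS label attached to this span
    predictions.append((span.text, label.value, round(label.score, 4)))

print(predictions)
# e.g. [('I', 'PRON', 0.9996), ('love', 'VERB', 1.0), ('Berlin', 'PROPN', 0.9986), ('.', 'PUNCT', 1.0)]
```
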
---

### Training: Script to train this model

The following Flair script was used to train this model:

```python
from flair.data import Corpus
from flair.datasets import ColumnCorpus
from flair.embeddings import StackedEmbeddings, FlairEmbeddings

# 1. load the corpus (Ontonotes does not ship with Flair, you need to download and reformat it into a column format yourself)
corpus: Corpus = ColumnCorpus(
    "resources/tasks/onto-ner",
    column_format={0: "text", 1: "pos", 2: "upos", 3: "ner"},
    tag_to_bioes="ner",
)

# 2. what tag do we want to predict?
tag_type = 'upos'

# 3. make the tag dictionary from the corpus
tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)

# 4. initialize each embedding we use
embedding_types = [

    # contextual string embeddings, forward
    FlairEmbeddings('news-forward-fast'),

    # contextual string embeddings, backward
    FlairEmbeddings('news-backward-fast'),
]

# embedding stack consists of forward and backward Flair embeddings
embeddings = StackedEmbeddings(embeddings=embedding_types)

# 5. initialize sequence tagger
from flair.models import SequenceTagger

tagger = SequenceTagger(hidden_size=256,
                        embeddings=embeddings,
                        tag_dictionary=tag_dictionary,
                        tag_type=tag_type)

# 6. initialize trainer
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)

# 7. run training
trainer.train('resources/taggers/upos-english-fast',
              train_with_dev=True,
              max_epochs=150)
```
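
After training, the model written to the output folder can be loaded back and used exactly like the released checkpoint. A minimal sketch, assuming Flair's usual behavior of writing `final-model.pt` into the directory passed to `trainer.train(...)` (with `train_with_dev=True` there is no held-out dev set, so no separate best model is saved):

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# load the final model written by the trainer (path taken from the training call above)
tagger = SequenceTagger.load('resources/taggers/upos-english-fast/final-model.pt')

# tag a sentence with the freshly trained model
sentence = Sentence("Training has finished.")
tagger.predict(sentence)
print(sentence.to_tagged_string())
```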

---

### Cite

Please cite the following paper when using this model.

```
@inproceedings{akbik2018coling,
  title     = {Contextual String Embeddings for Sequence Labeling},
  author    = {Akbik, Alan and Blythe, Duncan and Vollgraf, Roland},
  booktitle = {{COLING} 2018, 27th International Conference on Computational Linguistics},
  pages     = {1638--1649},
  year      = {2018}
}
```

---

### Issues?

The Flair issue tracker is available [here](https://github.com/flairNLP/flair/issues/).
loss.tsv ADDED
@@ -0,0 +1,132 @@
EPOCH TIMESTAMP BAD_EPOCHS LEARNING_RATE TRAIN_LOSS
0 17:03:39 0 0.1000 3.8983645198929984
1 17:11:32 0 0.1000 2.396464760978267
2 17:19:17 0 0.1000 2.1225894081367636
3 17:27:11 0 0.1000 1.9884647013443821
4 17:35:04 0 0.1000 1.885558801484558
5 17:42:58 0 0.1000 1.8396147863370067
6 17:50:52 0 0.1000 1.7672407321210177
7 17:58:33 0 0.1000 1.7209206173217522
8 18:06:14 0 0.1000 1.684758366188913
9 18:13:54 0 0.1000 1.675738244360348
10 18:21:33 0 0.1000 1.6346709968683855
11 18:29:15 0 0.1000 1.5976093570016465
12 18:37:02 0 0.1000 1.572903771839052
13 18:44:47 0 0.1000 1.5621724839255495
14 18:52:31 0 0.1000 1.5344491067360033
15 19:00:09 0 0.1000 1.5280954311141428
16 19:07:50 0 0.1000 1.5018526020027556
17 19:15:29 1 0.1000 1.505718067726999
18 19:23:08 0 0.1000 1.4736719750233416
19 19:30:50 1 0.1000 1.4839579892720816
20 19:38:35 0 0.1000 1.4585183224250686
21 19:46:19 1 0.1000 1.4610416818902177
22 19:54:12 0 0.1000 1.437192275220493
23 20:01:52 0 0.1000 1.4222804964825793
24 20:09:33 0 0.1000 1.4003380132058882
25 20:17:17 1 0.1000 1.4161622376374479
26 20:24:54 2 0.1000 1.4020970809572149
27 20:32:32 0 0.1000 1.3987249116852598
28 20:40:11 0 0.1000 1.3625548289969283
29 20:47:51 1 0.1000 1.381220668767983
30 20:55:30 2 0.1000 1.3701620033552062
31 21:03:17 3 0.1000 1.3630763605293237
32 21:11:05 0 0.1000 1.3467498509051665
33 21:18:54 1 0.1000 1.3495412202095085
34 21:26:44 0 0.1000 1.340426192193661
35 21:34:35 0 0.1000 1.3255774740228112
36 21:42:28 1 0.1000 1.341141459649464
37 21:50:16 2 0.1000 1.3301069232652771
38 21:58:05 0 0.1000 1.3155438300011293
39 22:05:53 1 0.1000 1.3180485034101415
40 22:13:33 0 0.1000 1.3101363613583008
41 22:21:15 1 0.1000 1.3239353564212908
42 22:28:55 0 0.1000 1.2985683835677382
43 22:36:36 1 0.1000 1.2987655120300796
44 22:44:16 0 0.1000 1.293294859140549
45 22:51:55 1 0.1000 1.2934898600825724
46 22:59:35 0 0.1000 1.2742974282997959
47 23:07:16 0 0.1000 1.257929092420722
48 23:14:57 1 0.1000 1.2636124875410548
49 23:22:35 2 0.1000 1.2605103574271472
50 23:30:15 3 0.1000 1.2628181801202163
51 23:37:55 4 0.1000 1.2682071375397017
52 23:45:35 0 0.0500 1.2192658351502328
53 23:53:15 0 0.0500 1.189723878941446
54 00:00:56 0 0.0500 1.181310292977207
55 00:08:36 1 0.0500 1.1813142526599596
56 00:16:15 0 0.0500 1.1490525012646082
57 00:24:04 1 0.0500 1.150567943037681
58 00:31:59 2 0.0500 1.153844450498527
59 00:39:53 3 0.0500 1.1547138257521503
60 00:47:36 0 0.0500 1.138099388097817
61 00:55:17 1 0.0500 1.1522783655265592
62 01:03:07 0 0.0500 1.1201619118114687
63 01:11:03 1 0.0500 1.140103389699504
64 01:18:57 2 0.0500 1.1306282293909
65 01:26:43 3 0.0500 1.1392165621946442
66 01:34:30 4 0.0500 1.1320033756404553
67 01:42:13 0 0.0250 1.0931724692290683
68 01:50:05 1 0.0250 1.093446401728774
69 01:57:59 0 0.0250 1.0766996851900839
70 02:05:47 1 0.0250 1.085443768231374
71 02:13:32 2 0.0250 1.0840452198824793
72 02:21:24 3 0.0250 1.0943272675770634
73 02:29:13 0 0.0250 1.0741095490050765
74 02:36:59 1 0.0250 1.0775160627657512
75 02:44:45 0 0.0250 1.0723835660601562
76 02:52:28 0 0.0250 1.0675190647593085
77 03:00:14 0 0.0250 1.062752323026927
78 03:07:55 1 0.0250 1.0638396440924338
79 03:15:34 0 0.0250 1.0551368798849718
80 03:23:13 0 0.0250 1.0540687316993498
81 03:30:54 0 0.0250 1.0486293900575279
82 03:38:39 1 0.0250 1.0578650972190893
83 03:46:29 2 0.0250 1.050876642150699
84 03:54:21 0 0.0250 1.0444189010476166
85 04:02:17 0 0.0250 1.036741197986423
86 04:10:04 1 0.0250 1.0422700380716683
87 04:17:57 2 0.0250 1.053200504015077
88 04:25:50 3 0.0250 1.0567198398428144
89 04:33:41 4 0.0250 1.038592992784842
90 04:41:31 1 0.0125 1.0402668333278513
91 04:49:15 0 0.0125 1.0200082490354214
92 04:56:57 1 0.0125 1.0332945613703637
93 05:04:44 2 0.0125 1.0235844095023172
94 05:12:38 3 0.0125 1.030887721619516
95 05:20:33 4 0.0125 1.03034149728856
96 05:28:29 1 0.0063 1.032665410379194
97 05:36:24 0 0.0063 1.0145184545584445
98 05:44:18 0 0.0063 1.004028284752144
99 05:52:14 1 0.0063 1.0066242653356408
100 06:00:01 2 0.0063 1.0042478304876472
101 06:07:45 3 0.0063 1.0221682896141735
102 06:15:28 4 0.0063 1.0171712939975397
103 06:23:10 1 0.0031 1.0051458630696783
104 06:30:52 0 0.0031 0.9894583106828185
105 06:38:35 1 0.0031 0.9949013568826441
106 06:46:21 2 0.0031 1.0099847611166397
107 06:54:17 3 0.0031 1.0110677263646755
108 07:02:09 4 0.0031 0.9900631292529826
109 07:09:53 1 0.0016 0.9965992866034777
110 07:17:39 0 0.0016 0.988067799041856
111 07:25:26 1 0.0016 1.002729972747137
112 07:33:17 2 0.0016 1.0075599195597307
113 07:40:59 3 0.0016 0.9934051743318449
114 07:48:47 4 0.0016 0.9908848639141838
115 07:56:32 1 0.0008 0.9981947860515342
116 08:04:19 2 0.0008 0.9895183097191577
117 08:11:59 3 0.0008 0.9920743883836944
118 08:19:38 4 0.0008 0.9939175610103698
119 08:27:18 1 0.0004 1.0007407332141445
120 08:34:58 2 0.0004 1.0047922001807195
121 08:42:37 0 0.0004 0.9872947578835037
122 08:50:15 0 0.0004 0.9852443703671672
123 08:57:54 1 0.0004 0.9936418686610348
124 09:05:37 2 0.0004 0.9901605238104766
125 09:13:23 3 0.0004 0.9907275987008832
126 09:21:09 4 0.0004 0.99081547033112
127 09:28:51 1 0.0002 0.9894531191295048
128 09:36:34 2 0.0002 0.9955960737309366
129 09:44:20 3 0.0002 0.9950949703578679
130 09:52:13 4 0.0002 1.0062029107730344
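
The loss.tsv above logs one row per epoch: timestamp, count of epochs without improvement (BAD_EPOCHS), current learning rate, and training loss, so the annealing schedule can be read off directly. A minimal sketch for inspecting it, assuming the file is tab-separated (as the extension suggests) and pandas is available:

```python
import pandas as pd

# read the per-epoch training log written by the Flair trainer
log = pd.read_csv("loss.tsv", sep="\t")

# final training loss, and the epochs at which the learning rate was annealed
print(log["TRAIN_LOSS"].iloc[-1])
print(log.loc[log["LEARNING_RATE"].diff() < 0, ["EPOCH", "LEARNING_RATE"]])
```
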
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4395e8af6f93dab948fb498ab07906a987c2cb63c2fef94472a0ee26092a2023
size 75175004
training.log ADDED
The diff for this file is too large to render. See raw diff