Results and training data
Thanks for publicizing this!
In the paper (Table 2) you reported a nice F1 score of 0.8701, and it also mentions that the training was done on the NEMO corpus.
Have there been any changes to this since the paper was published? (I'm asking because I was on WhatsApp giving credit to NEMO for providing over 90% of the training data. Is there more that you are able to share?)
The reported scores in the paper were for a model trained and tested solely on the NEMO corpus.
This model was trained on a much larger corpus, of which NEMO was only a very small percentage; most of the training data was provided by the IAHLT project.
May I ask about the F1 results now?
Results are similar but harder to estimate, since the IAHLT corpus includes additional tags which aren't included in the NEMO corpus.
We are going to release a detailed document with experiments in the coming weeks. On a much larger test corpus (a subset of the IAHLT corpus) covering more domains, the overall F1 reaches 0.84.
(By contrast, the model trained on NEMO alone does significantly worse on this test corpus.)
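To illustrate why differing tag sets make the scores hard to compare, here is a minimal sketch of an entity-level micro F1 that can be restricted to the labels two corpora share. This is not the evaluation script used for the numbers above; the function name `micro_f1`, the span format, and the tag names (`PER`, `ORG`, `TIMEX`) are assumptions chosen only for illustration.

```python
def micro_f1(gold, pred, labels=None):
    """Entity-level micro F1 over exact-match spans.

    gold, pred: lists (one item per sentence) of sets of
                (start, end, label) tuples. Hypothetical format.
    labels:     optional set of labels to keep; spans with any
                other label are ignored on both sides.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if labels is not None:
            g = {span for span in g if span[2] in labels}
            p = {span for span in p if span[2] in labels}
        tp += len(g & p)   # predicted spans that match gold exactly
        fp += len(p - g)   # predicted spans with no gold match
        fn += len(g - p)   # gold spans the model missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data with made-up tags: TIMEX stands in for a tag that exists
# in the larger corpus but not in the smaller one.
gold = [{(0, 1, "PER"), (3, 4, "TIMEX")}, {(2, 3, "ORG")}]
pred = [{(0, 1, "PER")}, {(2, 3, "ORG"), (5, 6, "TIMEX")}]

f1_all = micro_f1(gold, pred)                      # every tag counts
f1_shared = micro_f1(gold, pred, {"PER", "ORG"})   # shared tags only
```

On this toy data the score over all tags is lower than the score over the shared tags, because the errors all fall on the extra tag. A model that never saw the extra tags during training is penalized for every such span when scored on the full tag set, which is one reason a single headline number does not transfer cleanly between the two corpora.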
Thank you!