eval results reproducibility

by diana-onutu - opened Sep 11, 2023

edited Sep 12, 2023

How did you handle the misalignment between tokens and NER tags that appears after tokenization? For example, if the word "Japan" has the NER tag "B-LOC", what do the tags look like once it is tokenized as "JA", "#PA", "#N"? Do you re-align the tags as "B-LOC", "I-LOC", "I-LOC"? I'm trying to reproduce your evaluation results, but most of my metrics come out between 0.5 and 0.7 (accuracy is the exception). Also, when these metrics are calculated, is performance on the "O" label evaluated as well?
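To make the question concrete, here is a minimal sketch of the two re-alignment conventions I can think of. It assumes each sub-token comes with the index of the word it belongs to (fast tokenizers in `transformers` expose this through `word_ids()`), with `None` for special tokens. The function name `align_labels` and the `propagate` flag are my own illustration, not something taken from the model card, and I don't know which convention (if either) was used for the reported results.

```python
IGNORE = -100  # index that PyTorch's cross-entropy loss ignores by default

def align_labels(word_ids, word_tags, propagate=True):
    """Map word-level NER tags onto sub-tokens (hypothetical helper).

    word_ids:  word index per sub-token, None for special tokens.
    word_tags: one tag per original word, e.g. ["B-LOC", "O"].
    propagate: True  -> later sub-tokens of a word get the I- form of its tag;
               False -> later sub-tokens are masked with IGNORE.
    """
    aligned = []
    previous = None
    for wid in word_ids:
        if wid is None:
            # Special tokens such as [CLS] / [SEP] carry no label.
            aligned.append(IGNORE)
        elif wid != previous:
            # First sub-token of a word keeps the word's own tag.
            aligned.append(word_tags[wid])
        elif propagate:
            # Continuation sub-token: B-X becomes I-X, other tags are copied.
            tag = word_tags[wid]
            aligned.append("I-" + tag[2:] if tag.startswith("B-") else tag)
        else:
            # Continuation sub-token is excluded from loss and metrics.
            aligned.append(IGNORE)
        previous = wid
    return aligned

# "Japan" split into three sub-tokens, followed by one single-token "O" word.
word_ids = [None, 0, 0, 0, 1, None]
word_tags = ["B-LOC", "O"]
print(align_labels(word_ids, word_tags, propagate=True))
# [-100, 'B-LOC', 'I-LOC', 'I-LOC', 'O', -100]
print(align_labels(word_ids, word_tags, propagate=False))
# [-100, 'B-LOC', -100, -100, 'O', -100]
```

The two conventions give different token counts at evaluation time, so a mismatch here could plausibly explain part of the gap I'm seeing.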
