# FSNER

Implemented by [sayef](https://huggingface.co/sayef).

## Overview

The FSNER model was proposed in [Example-Based Named Entity Recognition](https://arxiv.org/abs/2008.10570) by Morteza Ziyadi, Yuting Sun, Abhishek Goswami, Jade Huang, and Weizhu Chen. To identify entity spans in a new domain, it uses a train-free few-shot learning approach inspired by question-answering.

## Abstract
----
> We present a novel approach to named entity recognition (NER) in the presence of scarce data that we call example-based NER. Our train-free few-shot learning approach takes inspiration from question-answering to identify entity spans in a new and unseen domain. In comparison with the current state-of-the-art, the proposed method performs significantly better, especially when using a low number of support examples.

## Model Training Details
-----

| identifier | epochs | datasets |
| ---------- |:------:|:--------:|
| [sayef/fsner-bert-base-uncased](https://huggingface.co/sayef/fsner-bert-base-uncased) | 25 | ontonotes5, conll2003, wnut2017, mit_movie_trivia, mit_restaurant and fin (Alvarado et al.). |

## Installation and Example Usage
------

You can use the FSNER model in three ways. In each case, import the model as shown in the code example below.

1. Install directly from PyPI: `pip install fsner`.
2. Install from source: `python setup.py install`.
3. Clone the [repo](https://github.com/sayef/fsner) and add the absolute path of the `fsner/src` directory to your `PYTHONPATH`.

```python
import json

from fsner import FSNERModel, FSNERTokenizerUtils, pretty_embed

query_texts = [
    "Does Luke's serve lunch?",
    "Chang does not speak Taiwanese very well.",
    "I like Berlin."
]

# Each list in support_texts holds the examples for one entity type.
# Wrap the entity in each example with [E] and [/E].
# Each sentence should have only one pair of [E] ... [/E].

support_texts = {
    "Restaurant": [
        "What time does [E] Subway [/E] open for breakfast?",
        "Is there a [E] China Garden [/E] restaurant in newark?",
        "Does [E] Le Cirque [/E] have valet parking?",
        "Is there a [E] McDonalds [/E] on main street?",
        "Does [E] Mike's Diner [/E] offer huge portions and outdoor dining?"
    ],
    "Language": [
        "Although I understood no [E] French [/E] in those days , I was prepared to spend the whole day with Chien - chien .",
        "like what the hell 's that called in [E] English [/E] ? I have to register to be here like since I 'm a foreigner .",
        "So , I 'm also working on an [E] English [/E] degree because that 's my real interest .",
        "Al - Jazeera TV station , established in November 1996 in Qatar , is an [E] Arabic - language [/E] news TV station broadcasting global news and reports nonstop around the clock .",
        "They think it 's far better for their children to be here improving their [E] English [/E] than sitting at home in front of a TV . \"",
        "The only solution seemed to be to have her learn [E] French [/E] .",
        "I have to read sixty pages of [E] Russian [/E] today ."
    ]
}

device = 'cpu'

tokenizer = FSNERTokenizerUtils("checkpoints/model")
queries = tokenizer.tokenize(query_texts).to(device)
supports = tokenizer.tokenize(list(support_texts.values())).to(device)

model = FSNERModel("checkpoints/model")
model.to(device)

p_starts, p_ends = model.predict(queries, supports)

# The supports can be prepared once and reused with different queries:
# ------------------------------------------------------------------------------
# start_token_embeddings, end_token_embeddings = model.prepare_supports(supports)
# p_starts, p_ends = model.predict(queries, start_token_embeddings=start_token_embeddings,
#                                  end_token_embeddings=end_token_embeddings)

output = tokenizer.extract_entity_from_scores(query_texts, queries, p_starts, p_ends,
                                              entity_keys=list(support_texts.keys()), thresh=0.50)

print(json.dumps(output, indent=2))

# Install displacy for pretty embed
pretty_embed(query_texts, output, list(support_texts.keys()))
```
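
The support examples above depend on each sentence containing exactly one `[E] ... [/E]` pair. The sketch below shows one way to build and check such sentences before tokenizing them. The helpers `mark_entity`, `has_single_entity_pair`, and `invalid_supports` are hypothetical names introduced for illustration. They are not part of the `fsner` package, and they rely only on plain string operations.

```python
# Hypothetical helpers for preparing support sentences; not part of fsner.
E_OPEN, E_CLOSE = "[E]", "[/E]"


def mark_entity(sentence: str, entity: str) -> str:
    """Wrap the first occurrence of `entity` in `sentence` with [E] ... [/E]."""
    start = sentence.find(entity)
    if start == -1:
        raise ValueError(f"{entity!r} not found in {sentence!r}")
    end = start + len(entity)
    return f"{sentence[:start]}{E_OPEN} {entity} {E_CLOSE}{sentence[end:]}"


def has_single_entity_pair(sentence: str) -> bool:
    """Return True if the sentence has exactly one [E] followed by one [/E]."""
    if sentence.count(E_OPEN) != 1 or sentence.count(E_CLOSE) != 1:
        return False
    return sentence.index(E_OPEN) < sentence.index(E_CLOSE)


def invalid_supports(support_texts: dict) -> list:
    """Collect (entity_type, sentence) pairs that break the one-pair rule."""
    return [
        (label, sentence)
        for label, sentences in support_texts.items()
        for sentence in sentences
        if not has_single_entity_pair(sentence)
    ]


if __name__ == "__main__":
    marked = mark_entity("Is there a McDonalds on main street?", "McDonalds")
    print(marked)
    print(invalid_supports({"Restaurant": [marked, "No marked entity here."]}))
```

Running a check like `invalid_supports(support_texts)` before calling the tokenizer would flag malformed examples early, instead of leaving them to surface as confusing predictions later.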