
wikipedia-dpr-dkrr-tqa

Faiss FlatIP index of the Wikipedia DPR passages, encoded by the retriever model from "Distilling Knowledge from Reader to Retriever for Question Answering" trained on TriviaQA. This index was generated on 2022/02/17 on orca at commits:

with the following command to generate the embeddings (from FiD repo):

python generate_passage_embeddings.py \
  --model_path tqa_retriever \
  --passages passages.tsv \
  --output_path wikipedia_embeddings_tqa \
  --shard_id 0 \
  --num_shards 1 \
  --per_gpu_batch_size 500

and the following command to convert the embeddings to a Faiss IndexFlatIP index:

python convert_dkrr_embeddings_to_faiss.py \
  --embeddings wikipedia_embeddings_tqa \
  --output faiss-flat.wikipedia.dkrr-dpr-tqa-retriever
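
For readers unfamiliar with the index type: a flat inner-product index stores every passage vector as-is and ranks passages by their exact inner product with the query vector. The sketch below illustrates that scoring rule in plain Python. It is not the Faiss implementation and not the conversion script above; the function name `flat_ip_search` and the toy two-dimensional vectors are hypothetical, chosen only for illustration.

```python
import heapq


def flat_ip_search(query, passages, k):
    """Exhaustively score every passage vector against the query by inner
    product and return the k best as (score, passage_index) pairs."""
    scored = (
        (sum(q * p for q, p in zip(query, vec)), idx)
        for idx, vec in enumerate(passages)
    )
    # nlargest returns the pairs sorted from highest to lowest score.
    return heapq.nlargest(k, scored)


# Toy stand-ins for passage embeddings (hypothetical values).
passages = [
    [1.0, 0.0],
    [0.5, 0.25],
    [0.0, 2.0],
]
query = [1.0, 1.0]

top = flat_ip_search(query, passages, k=2)
for score, idx in top:
    print(f"passage {idx}: score {score}")
```

Because the score is a raw inner product rather than a cosine similarity, vector norms affect the ranking: in the toy data, the third passage ranks first partly because its vector is longer.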