# BEIR v1.0.0 contriever-msmarco
This index was generated on 20230124 using Tevatron with the following command:
```
python -m tevatron.driver.encode \
--output_dir=temp \
--model_name_or_path facebook/contriever-msmarco \
--fp16 \
--tokenizer_name bert-base-uncased \
--per_device_eval_batch_size 156 \
--p_max_len 512 \
--dataset_name Tevatron/beir-corpus:$subdataset \
--encoded_save_path beir_embeddings/corpus_emb.$subdataset.pkl
```
where `subdataset` is one of the BEIR datasets, e.g. `scifact`.
The embeddings are then converted to the Pyserini index format.
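
The conversion script itself is not included here. As a rough sketch of what that step could look like, the code below makes three assumptions that are not stated above: that each `.pkl` file holds a `(vectors, ids)` tuple, that a Pyserini Faiss index is a directory containing a Faiss file named `index` and a text file named `docid` with one document id per line, and that a flat inner-product index is appropriate. The function names `load_embeddings` and `write_index` are hypothetical.

```python
import os
import pickle

import numpy as np


def load_embeddings(pkl_path):
    """Load embeddings from a pickle assumed to hold a (vectors, ids) tuple."""
    with open(pkl_path, "rb") as f:
        vectors, ids = pickle.load(f)
    # Faiss expects C-contiguous float32; the encoder above ran with --fp16.
    vectors = np.ascontiguousarray(vectors, dtype=np.float32)
    ids = [str(i) for i in ids]
    if vectors.ndim != 2 or vectors.shape[0] != len(ids):
        raise ValueError("expected one embedding row per document id")
    return vectors, ids


def write_index(vectors, ids, index_dir):
    """Write a flat inner-product index plus a docid list (assumed layout)."""
    import faiss  # imported lazily so load_embeddings works without faiss

    os.makedirs(index_dir, exist_ok=True)
    index = faiss.IndexFlatIP(vectors.shape[1])  # assumption: dot-product scoring
    index.add(vectors)
    faiss.write_index(index, os.path.join(index_dir, "index"))
    with open(os.path.join(index_dir, "docid"), "w") as f:
        for docid in ids:
            f.write(docid + "\n")
```

For example, calling `write_index(*load_embeddings("beir_embeddings/corpus_emb.scifact.pkl"), "indexes/scifact")` would process the `scifact` embeddings; the output directory name is only illustrative.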