naver/trecdl22-crossencoder-rankT53b-repro


cadurosar committed on Feb 10, 2023

Commit 70f429b (1 parent: 47c0876)

Add model

Files changed (2)

README.md +52 -0
pytorch_model.bin +3 -0

README.md CHANGED

@@ -1,3 +1,55 @@
 ---
 license: cc-by-nc-sa-4.0
 ---

+Our best attempt at reproducing [RankT5 Enc-Softmax](https://arxiv.org/pdf/2210.10634.pdf), with a few important differences:
+1. We use a SPLADE first stage for the negatives, whereas the paper uses GTR.
+2. We train with PyTorch, whereas the paper uses Flax.
+3. We start from the original t5-3b, whereas the paper uses Flan T5-3b.
+
+This seems to lead to slightly worse performance (42.8 vs 43.? in the paper), and the model seems slightly worse on BEIR as well.
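The "Softmax" in Enc-Softmax refers to the listwise softmax cross-entropy loss described in the RankT5 paper: the scores of all candidates for one query are normalized together, and the loss is the negative log-probability assigned to the relevant candidates. The snippet below is a minimal sketch of that loss, not the training code used for this checkpoint. The function name and the tensor layout (one row per query, one column per candidate) are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F


def listwise_softmax_loss(scores, labels):
    """Listwise softmax cross-entropy (hypothetical helper).

    scores: float tensor [num_queries, list_size], one score per candidate.
    labels: float tensor [num_queries, list_size], 1.0 for relevant
            candidates and 0.0 for negatives.
    """
    # Normalize the scores over the candidate list of each query.
    log_probs = F.log_softmax(scores, dim=-1)
    # Negative log-likelihood of the relevant candidates, averaged over queries.
    return -(labels * log_probs).sum(dim=-1).mean()


# Toy batch: one query, one positive followed by three negatives.
example_scores = torch.tensor([[2.0, 0.5, -1.0, 0.0]])
example_labels = torch.tensor([[1.0, 0.0, 0.0, 0.0]])
example_loss = listwise_softmax_loss(example_scores, example_labels)
```

With exactly one relevant candidate per list, this reduces to ordinary cross-entropy over the candidates, with the index of the positive as the target class.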
+To use this model, first clone the Hugging Face repo:
+```
+```
+```
+import torch
+from transformers import T5EncoderModel
+from transformers.modeling_outputs import SequenceClassifierOutput
+
+
+class T5EncoderRerank(torch.nn.Module):
+    def __init__(self, model_type_or_dir, fp16=False, bf16=False):
+        """
+        model_type_or_dir is either the name of a pre-trained model (e.g. t5-3b), or the path to a
+        directory containing model weights, vocab etc.
+        fp16 and bf16 are accepted but not used in this snippet.
+        """
+        super().__init__()
+        self.model = T5EncoderModel.from_pretrained(model_type_or_dir)
+        self.config = self.model.config
+        # Scoring head: Linear -> LayerNorm -> Linear(d_model, 1)
+        self.first_transform = torch.nn.Linear(self.config.d_model, self.config.d_model)
+        self.layer_norm = torch.nn.LayerNorm(self.config.d_model, eps=1e-12)
+        self.linear = torch.nn.Linear(self.config.d_model, 1)
+
+    def forward(self, **kwargs):
+        # Use the encoder state of the first token as the sequence representation
+        result = self.model(**kwargs).last_hidden_state[:, 0, :]
+        first_transformed = self.first_transform(result)
+        layer_normed = self.layer_norm(first_transformed)
+        logits = self.linear(layer_normed)  # shape: [batch_size, 1]
+        return SequenceClassifierOutput(
+            logits=logits
+        )
+
+
+original_model = "t5-3b"
+path_checkpoint = "trecdl22-crossencoder-rankT53b-repro/pytorch_model.bin"
+print("Loading")
+model = T5EncoderRerank(original_model, bf16=True)
+# Load the fine-tuned weights on CPU first, then move the model to the GPU if one is available
+model.load_state_dict(torch.load(path_checkpoint, map_location=torch.device("cpu")))
+device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
+model.to(device)
+```
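The README stops once the model is loaded, so the following is a hedged sketch of how candidate documents might then be scored and sorted. The helper names `build_inputs` and `rerank` are hypothetical. The `Query: ... Document: ...` input template is the one described in the RankT5 paper; the README does not say which template this checkpoint was trained with, so treat it as an assumption and check before relying on the scores.

```python
import torch


def build_inputs(query, documents):
    # Assumed input template (taken from the RankT5 paper, not from this README).
    return [f"Query: {query} Document: {doc}" for doc in documents]


def rerank(model, tokenizer, query, documents, device="cpu", max_length=512):
    """Return (document, score) pairs sorted from most to least relevant."""
    texts = build_inputs(query, documents)
    batch = tokenizer(
        texts,
        padding=True,
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )
    # Move every tensor produced by the tokenizer to the model's device.
    batch = {name: tensor.to(device) for name, tensor in batch.items()}
    model.eval()
    with torch.no_grad():
        logits = model(**batch).logits  # expected shape: [num_documents, 1]
    scores = logits.squeeze(-1).float().tolist()
    return sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)
```

With the model loaded as in the README, the tokenizer would presumably come from `AutoTokenizer.from_pretrained("t5-3b")`, and the call would look like `rerank(model, tokenizer, "some query", list_of_passages, device=device)`.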

pytorch_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e54e0aa04314ffc5aa2a07a23e890a932078254b88d2570797ffd15e0057e64e
+size 4967947259
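
The three lines above are a Git LFS pointer: `oid` is the SHA-256 digest of the real file and `size` is its length in bytes. After cloning, a downloaded `pytorch_model.bin` can be checked against those two values. The sketch below assumes the file has already been fetched through Git LFS; the helper names are hypothetical.

```python
import hashlib
import os

# Values copied from the Git LFS pointer in this commit.
EXPECTED_SHA256 = "e54e0aa04314ffc5aa2a07a23e890a932078254b88d2570797ffd15e0057e64e"
EXPECTED_SIZE = 4967947259


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so a multi-gigabyte checkpoint is never held in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pointer(path, expected_sha256=EXPECTED_SHA256, expected_size=EXPECTED_SIZE):
    """Return True if the file has the size and digest recorded in the pointer."""
    # Compare the size first: it is cheap and catches un-fetched pointer files.
    if os.path.getsize(path) != expected_size:
        return False
    return sha256_of_file(path) == expected_sha256
```

A size mismatch of only a few hundred bytes usually means the clone contains the pointer file itself rather than the weights, in which case `git lfs pull` is needed.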