Commit 1371dfa by chainyo (1 parent: 586a876)

Update README.md

Files changed (1)
  1. README.md +46 -0
README.md CHANGED
@@ -1,3 +1,49 @@
  ---
+ language: en
  license: apache-2.0
+ datasets:
+ - bookcorpus
+ - wikipedia
+ - cc_news
  ---
+
+ # BigBird base model
+
+ BigBird is a sparse-attention-based transformer that extends Transformer-based models such as BERT to much longer sequences. BigBird also comes with a theoretical analysis of which capabilities of a full-attention transformer the sparse model retains.
+
+ It is a model pretrained on English text using a masked language modeling (MLM) objective. It was introduced in this [paper](https://arxiv.org/abs/2007.14062) and first released in this [repository](https://github.com/google-research/bigbird).
+
+ ## Model description
+
+ BigBird relies on **block sparse attention** instead of full attention (i.e. BERT's attention) and can handle sequences of up to 4096 tokens at a much lower compute cost than BERT. It has achieved state-of-the-art results on various tasks involving very long sequences, such as long-document summarization and question answering with long contexts.
+
+ ## Original implementation
+
+ Follow [this link](https://huggingface.co/google/bigbird-roberta-base) to see the original implementation.
+
+ ## How to use
+
+ Download the model by cloning the repository via `git clone https://huggingface.co/OWG/bigbird-roberta-base`.
+
+ Then you can use the model with the following code:
+
+ ```python
+ from onnxruntime import InferenceSession, SessionOptions, GraphOptimizationLevel
+ from transformers import AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("google/bigbird-roberta-base")  # BigBird uses a SentencePiece tokenizer, not BERT's WordPiece
+
+ options = SessionOptions()
+ options.graph_optimization_level = GraphOptimizationLevel.ORT_ENABLE_ALL
+
+ session = InferenceSession("path/to/model.onnx", sess_options=options, providers=["CPUExecutionProvider"])
+ session.disable_fallback()
+
+ text = "Replace me by any text you want to encode."
+
+ encoding = tokenizer(text, return_tensors="np", return_attention_mask=True)
+ inputs = {k: v.astype("int64") for k, v in encoding.items()}  # exported models typically expect int64 inputs
+
+ output_name = session.get_outputs()[0].name
+ outputs = session.run(output_names=[output_name], input_feed=inputs)
+ ```
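
As a follow-up to the snippet in the How to use section, one common way to turn the returned token vectors into a single fixed-size embedding is attention-masked mean pooling. The sketch below is an illustration and not part of the original README: `mean_pool` is a hypothetical helper, and it assumes the first ONNX output is the last hidden state with shape `(batch, seq_len, hidden)`, which should be checked with `session.get_outputs()[0].shape` for the exported model. Synthetic arrays stand in for `outputs[0]` and `inputs["attention_mask"]` so the block runs without the model file.

```python
import numpy as np


def mean_pool(last_hidden_state: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors over non-padding positions.

    Assumption: last_hidden_state has shape (batch, seq_len, hidden) and
    attention_mask has shape (batch, seq_len), with 1 marking real tokens.
    """
    # Broadcast the mask over the hidden dimension.
    mask = attention_mask[..., None].astype(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(axis=1)
    # Guard against division by zero for rows that are entirely padding.
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts


# Synthetic stand-ins for outputs[0] and inputs["attention_mask"].
hidden = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]], dtype=np.float32)
mask = np.array([[1, 1, 0]], dtype=np.int64)

embedding = mean_pool(hidden, mask)
print(embedding.shape)
```

With a real session, the call would be `mean_pool(outputs[0], inputs["attention_mask"])`, provided the output shape assumption above holds. Masking matters for batched inputs, where padding tokens would otherwise skew the average.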