batterydata committed on
Commit
5cf7334
1 Parent(s): 10ccdcc

Update README.md

Files changed (1)
  1. README.md +60 -0
README.md CHANGED
@@ -1,3 +1,63 @@
  ---
+ language: en
+ tags: question answering
  license: apache-2.0
+ datasets:
+ - squad
+ - batterydata/battery-device-data-qa
+ metrics: squad
  ---
+
+ # BatteryBERT-uncased for QA
+ **Language model:** batterybert-uncased
+ **Language:** English
+ **Downstream-task:** Extractive QA
+ **Training data:** SQuAD v1
+ **Eval data:** SQuAD v1
+ **Code:** See [example](https://github.com/ShuHuang/batterybert)
+ **Infrastructure**: 8x DGX A100
+ ## Hyperparameters
20
+ ```
21
+ batch_size = 32
22
+ n_epochs = 3
23
+ base_LM_model = "batterybert-uncased"
24
+ max_seq_len = 386
25
+ learning_rate = 3e-5
26
+ doc_stride=128
27
+ max_query_length=64
28
+ ```
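The three length settings above interact: a context that does not fit into one model input is split into overlapping windows. The sketch below shows how many windows a given context would produce. It assumes the convention of the original BERT `run_squad.py` script, where `doc_stride` is the step between window starts and three positions are reserved for `[CLS]` and two `[SEP]` tokens. Whether this model's training code follows exactly that convention is an assumption, and `sliding_windows` is a hypothetical helper name, not part of any library.

```python
def sliding_windows(n_doc_tokens, n_query_tokens,
                    max_seq_len=386, doc_stride=128, max_query_length=64):
    """Return (start, length) pairs of context windows, in token units.

    Assumption: doc_stride is the step between window starts, as in the
    original BERT run_squad.py; other tools treat stride as the overlap.
    """
    # The question is truncated to max_query_length tokens.
    n_query = min(n_query_tokens, max_query_length)
    # Reserve three positions for [CLS], [SEP], [SEP].
    max_tokens_for_doc = max_seq_len - n_query - 3
    if max_tokens_for_doc <= 0:
        raise ValueError("max_seq_len leaves no room for context tokens")

    spans = []
    start = 0
    while start < n_doc_tokens:
        length = min(n_doc_tokens - start, max_tokens_for_doc)
        spans.append((start, length))
        if start + length == n_doc_tokens:
            break  # the last window reached the end of the context
        start += min(length, doc_stride)
    return spans


# A 500-token context with a 10-token question needs two windows.
print(sliding_windows(500, 10))  # [(0, 373), (128, 372)]
```

Under these assumptions, any context of up to 319 tokens fits in a single window even when the question uses the full 64-token budget.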
+ ## Performance
+ Evaluated on the SQuAD v1.0 dev set.
+ ```
+ "exact": 81.08,
+ "f1": 88.41,
+ ```
+ Evaluated on the battery device dataset.
+ ```
+ "precision": 68.27,
+ "recall": 80.88,
+ ```
+ ## Usage
+ ### In Transformers
+ ```python
+ from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline
+
+ model_name = "batterydata/batterybert-uncased-squad-v1"
+ # a) Get predictions
+ nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
+ QA_input = {
+     'question': 'What is the electrolyte?',
+     'context': 'The typical non-aqueous electrolyte for commercial Li-ion cells is a solution of LiPF6 in linear and cyclic carbonates.'
+ }
+ res = nlp(QA_input)
+ # b) Load model & tokenizer
+ model = AutoModelForQuestionAnswering.from_pretrained(model_name)
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ ```
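Step (b) loads the model and tokenizer but stops before decoding an answer. An extractive QA model returns one start score and one end score per token, and the predicted answer is the span whose start and end scores sum highest, with the start not after the end. The sketch below illustrates only that selection step, with plain Python lists standing in for the model's `start_logits` and `end_logits`. `best_span` and `max_answer_len` are hypothetical names chosen for illustration; the `pipeline` in step (a) does more than this (for example, it excludes question tokens and converts scores to probabilities).

```python
def best_span(start_logits, end_logits, max_answer_len=30):
    """Pick the (start, end) token indices with the highest summed score.

    Illustrative only: the inputs are plain lists of floats, one entry
    per token, standing in for the model's start and end logits.
    """
    best = None
    best_score = float("-inf")
    for s, s_logit in enumerate(start_logits):
        # Only consider ends at or after the start, within the length cap.
        last = min(s + max_answer_len, len(end_logits))
        for e in range(s, last):
            score = s_logit + end_logits[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best, best_score


# Toy scores for a four-token input.
span, score = best_span([0.1, 2.0, 0.3, 0.0], [0.0, 0.5, 3.0, 0.2])
print(span)  # (1, 2)
```

The token indices returned here would then be mapped back to text with the tokenizer; that step is omitted because it depends on how the input was encoded.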
+ ## Authors
+ Shu Huang: `sh2009 [at] cam.ac.uk`
+
+ Jacqueline Cole: `jmc61 [at] cam.ac.uk`
+
+ ## Citation
+ BatteryBERT: A Pre-trained Language Model for Battery Database Enhancement