Commit 7f8f579 by MatanBenChorin (1 parent: 487f9e0)

Update README.md

Files changed (1)
  1. README.md +38 -4
README.md CHANGED
@@ -36,9 +36,6 @@ should probably proofread and complete it, then remove this comment. -->
36
  # hebert-finetuned-hebrew-squad
37
 
38
  This model fine-tunes avichr/heBERT on a version of the SQuAD dataset automatically translated to Hebrew.
39
- ## Model description
40
-
41
- More information needed
42
 
43
  ## Intended uses & limitations
44
 
@@ -46,7 +43,12 @@ Hebrew SQuAD
46
 
47
  ## Training and evaluation data
48
 
49
- More information needed
50
 
51
  ## Training procedure
52
 
@@ -61,6 +63,8 @@ The following hyperparameters were used during training:
61
  - lr_scheduler_type: linear
62
  - num_epochs: 15
63
 
64
  ### Framework versions
65
 
66
  - Transformers 4.17.0
@@ -68,6 +72,36 @@ The following hyperparameters were used during training:
68
  - Datasets 1.18.4
69
  - Tokenizers 0.11.6
70
 
71
  ### About Us
72
  Created by Matan Ben-chorin and May Flaster, guided by Dr. Oren Mishali.
73
  This is our final project for our Computer Engineering B.Sc. studies in the Faculty of Electrical Engineering, combined with Computer Science, at the Technion, Israel Institute of Technology.
 
36
  # hebert-finetuned-hebrew-squad
37
 
38
  This model fine-tunes avichr/heBERT on a version of the SQuAD dataset automatically translated to Hebrew.
39
 
40
  ## Intended uses & limitations
41
 
 
43
 
44
  ## Training and evaluation data
45
 
46
+
47
+ | Dataset | Split | # samples |
48
+ | -------- | ----- | --------- |
49
+ | Hebrew_Squad_v1 | train | 52,405 |
50
+ | Hebrew_Squad_v1 | validation | 7,455 |
51
+
52
 
53
  ## Training procedure
54
 
 
63
  - lr_scheduler_type: linear
64
  - num_epochs: 15
65
 
66
+ Training took about 8 hours to complete.
67
+
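With `lr_scheduler_type: linear`, the learning rate typically decays in a straight line from its initial value to zero over the whole run. The following is only a minimal sketch of that rule, assuming no warmup steps. The sample count and epoch count come from the README; `BATCH_SIZE` and `BASE_LR` are hypothetical placeholders, since this diff does not show the actual values. The real number of optimizer steps may also differ, because long contexts are usually split into several training features.

```python
import math


def linear_lr(step, total_steps, base_lr):
    """Learning rate after `step` optimizer updates, decaying linearly to zero (no warmup)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


# Taken from the README: 52,405 training samples, 15 epochs.
NUM_TRAIN_SAMPLES = 52_405
NUM_EPOCHS = 15

# Hypothetical values for illustration only; not shown in this diff.
BATCH_SIZE = 16
BASE_LR = 2e-5

steps_per_epoch = math.ceil(NUM_TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS

# Print the learning rate at the start of a few epochs and at the end of training.
for epoch in (0, 5, 10, 15):
    step = epoch * steps_per_epoch
    print(f"epoch {epoch:2d}: lr = {linear_lr(step, total_steps, BASE_LR):.2e}")
```

Under these assumptions the rate falls by one third of its initial value every five epochs and reaches zero at the final step.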
68
  ### Framework versions
69
 
70
  - Transformers 4.17.0
 
72
  - Datasets 1.18.4
73
  - Tokenizers 0.11.6
74
 
75
+ ### Results
76
+
77
+ **Model size**: `418M`
78
+
79
+ | Metric | Value | Original BERT-base on English SQuAD v1.1 ([Table 2](https://www.aclweb.org/anthology/N19-1423.pdf)) |
80
+ | ------ | --------- | --------- |
81
+ | **Exact Match** | **42.6** | **80.8** |
82
+ | **F1** | **55.9** | **88.5** |
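
Exact Match and F1 are the standard SQuAD metrics. This commit does not show the evaluation code, so the following is only a minimal sketch of how such scores are commonly computed for a single prediction. The `normalize` helper and its rules (lowercasing, stripping ASCII punctuation, collapsing whitespace) are assumptions; the official English script also removes the articles a/an/the, which is omitted here because it does not apply to Hebrew.

```python
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop ASCII punctuation, and collapse whitespace (assumed rules)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# A partial answer gets no Exact Match credit but a non-zero F1.
print(exact_match("עיר הבירה", "עיר הבירה של ישראל"))
print(f1_score("עיר הבירה", "עיר הבירה של ישראל"))
```

Dataset-level scores are usually the mean of these per-example values, taking the maximum over the reference answers when a question has more than one, and reported as percentages.
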
83
+ ## Example Usage
84
+
85
+
86
+ ```python
87
+ from transformers import pipeline
88
+
89
+ model_checkpoint = "tdklab/hebert-finetuned-hebrew-squad"
90
+ qa_pipeline = pipeline(
91
+     "question-answering",
92
+     model=model_checkpoint,
93
+ )
94
+
95
+ predictions = qa_pipeline({
96
+     'context': "ירושלים היא עיר הבירה של מדינת ישראל , והעיר הגדולה ביותר בישראל בגודל האוכלוסייה. נכון לשנת 2021, מתגוררים בה כ-957 אלף תושבים. בירושלים שוכנים מוסדות הממשל של ישראל: הכנסת, בית המשפט העליון, משכן הנשיא, בית ראש הממשלה ורוב משרדי הממשלה. ירושלים שוכנת בהרי יהודה, על קו פרשת המים הארצי של ארץ ישראל, בין הים התיכון וים המלח, ברום של 570 עד 857 מטרים מעל פני הים.",
97
+     'question': "מהי עיר הבירה של מדינת ישראל?"
98
+ })
99
+
100
+ print(predictions)
101
+ # output:
102
+ # {'score': 0.9999890327453613, 'start': 0, 'end': 7, 'answer': 'ירושלים'}
103
+ ```
104
+
105
  ### About Us
106
  Created by Matan Ben-chorin and May Flaster, guided by Dr. Oren Mishali.
107
  This is our final project for our Computer Engineering B.Sc. studies in the Faculty of Electrical Engineering, combined with Computer Science, at the Technion, Israel Institute of Technology.