Text Classification
TF-Keras
English
stormsidali2001 committed on
Commit
17d0414
1 Parent(s): e9ce464

Update README.md

Files changed (1)
  1. README.md +93 -3
README.md CHANGED
@@ -1,3 +1,93 @@
- ---
- license: mit
- ---
+ ---
+ library_name: tf-keras
+ pipeline_tag: text-classification
+ widget:
+ - text: "Climate change is a pressing global issue with far-reaching consequences for ecosystems and human societies."
+   output:
+   - label: Show that the research area is important, problematic, or relevant in some way
+     score: 0.95
+   - label: Introduce and review previous research in the field
+     score: 0.05
+ - text: "Numerous studies have investigated the impact of rising temperatures on marine biodiversity."
+   output:
+   - label: Show that the research area is important, problematic, or relevant in some way
+     score: 0.1
+   - label: Introduce and review previous research in the field
+     score: 0.9
+ - text: "Despite its importance, the specific role of ocean currents in mitigating climate change remains poorly understood."
+   output:
+   - label: Show that the research area is important, problematic, or relevant in some way
+     score: 0.55
+   - label: Introduce and review previous research in the field
+     score: 0.45
+ license: mit
+ datasets:
+ - stormsidali2001/IMRAD-introduction-sentences-moves-sub-moves-dataset
+ language:
+ - en
+ metrics:
+ - f1
+ - accuracy
+ base_model: google/bert-base-cased
+ ---
+
+ ## IMRaD Introduction Move 0 Sub-move Classifier
+
+ This model is a fine-tuned BERT model specialized in classifying sentences from the "Establishing a Research Territory" (Move 0) section of scientific research paper introductions into their corresponding sub-moves:
+
+ * **Show that the research area is important, problematic, or relevant in some way:** Highlighting the significance, issues, or relevance of the research topic.
+ * **Introduce and review previous research in the field:** Presenting a brief overview of existing work and studies related to the topic.
+
+ **Parent Classifier:**
+
+ This model is designed to be used in conjunction with the main IMRaD Introduction Move Classifier: [https://huggingface.co/stormsidali2001/IMRAD_introduction_moves_classifier](https://huggingface.co/stormsidali2001/IMRAD_introduction_moves_classifier).
+
+ The parent classifier identifies the overall IMRaD move for each sentence. If a sentence is classified as "Establishing a Research Territory" (Move 0), this sub-move classifier can be used to further analyze the specific purpose of that sentence within Move 0.
+
+ ## Intended Uses & Limitations
+
+ **Intended Uses:**
+
+ * **Scientific Writing Assistance:** Help researchers and students understand and refine the structure of their "Establishing a Research Territory" section.
+ * **Literature Review Analysis:** Quickly identify how authors establish the context and background in research paper introductions.
+ * **Educational Tool:** Illustrate the different sub-moves used to establish a research territory in scientific writing.
+
+ **Limitations:**
+
+ * **Domain Specificity:** The model was trained on scientific research papers and may not be as accurate on other types of text.
+ * **Accuracy:** While the model has good performance, it is not perfect. Predictions should be carefully reviewed.
+ * **Sentence-Level Classification:** The model classifies individual sentences and does not provide an analysis of the entire "Establishing a Research Territory" section as a whole.
+
+ ## Training and Evaluation Data
+
+ This model was trained and evaluated on a subset of the "IMRAD Introduction Sentences Moves & Sub-moves Dataset" available on Hugging Face: [https://huggingface.co/datasets/stormsidali2001/IMRAD-introduction-sentences-moves-sub-moves-dataset](https://huggingface.co/datasets/stormsidali2001/IMRAD-introduction-sentences-moves-sub-moves-dataset)
+
+ The dataset includes sentences specifically from Move 0 of introductions, labeled with their respective sub-moves.
+
+ **Training Details:**
+
+ * **Base Model:** `google/bert-base-cased`
+ * **Implementation:** TensorFlow/Keras
+ * **Evaluation Metrics:** F1 score and accuracy
+
+ ## How to Use
+
+ ```python
+ from transformers import pipeline
+
+ # Load the parent classifier
+ move_classifier = pipeline("text-classification", model="stormsidali2001/IMRAD_introduction_moves_classifier")
+
+ # Load the sub-move classifier for Move 0
+ submove_classifier_0 = pipeline("text-classification", model="stormsidali2001/IMRAD-introduction-move-zero-sub-moves-classifier")
+
+ sentence = "Electronic cigarettes were introduced into the US market in 2007."
+
+ # First, classify the move
+ move_result = move_classifier(sentence)
+ move = move_result[0]['label']
+
+ if move == "Establishing a Research Territory":
+     # If Move 0, classify the sub-move
+     submove_result = submove_classifier_0(sentence)
+     print(submove_result)
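
The two-stage flow in the README's "How to Use" snippet (parent classifier first, sub-move classifier only for Move 0 sentences) could be wrapped for several sentences roughly as follows. This is a minimal sketch and not part of the commit: `classify_with_submove` and the `Classifier` alias are hypothetical names, and it assumes each classifier returns a list of `{"label", "score"}` dicts for a single string and that the parent classifier emits the exact label string used in the README snippet.

```python
from typing import Callable, Dict, List, Optional

# A classifier here is any callable that maps one sentence to a list of
# {"label": str, "score": float} dicts (assumed output shape).
Classifier = Callable[[str], List[Dict[str, object]]]

# Label string copied from the README snippet; that the parent classifier
# emits exactly this string is an assumption.
MOVE_0_LABEL = "Establishing a Research Territory"


def classify_with_submove(
    sentences: List[str],
    move_clf: Classifier,
    submove_clf: Classifier,
    move_0_label: str = MOVE_0_LABEL,
) -> List[Dict[str, Optional[str]]]:
    """Label every sentence with its move; add a sub-move only for Move 0."""
    results = []
    for sentence in sentences:
        # Top-ranked label from the parent classifier.
        move = move_clf(sentence)[0]["label"]
        submove = None
        if move == move_0_label:
            # Only Move 0 sentences are passed to the sub-move classifier.
            submove = submove_clf(sentence)[0]["label"]
        results.append({"sentence": sentence, "move": move, "submove": submove})
    return results
```

With the two pipelines from the README loaded, a call would look like `classify_with_submove(sentences, move_classifier, submove_classifier_0)`. Taking the classifiers as plain callables keeps the routing logic checkable with stubs, without downloading either model.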