---
title: AI Tutor BERT
emoji: π
colorFrom: red
colorTo: indigo
sdk: gradio
sdk_version: 4.1.2
app_file: app.py
pinned: false
license: apache-2.0
---
# AI Tutor BERT
AI Tutor BERT is a BERT model fine-tuned on artificial intelligence (AI) terminology and explanations.
As interest in artificial intelligence grows, many people are taking AI courses and working on AI projects. Yet, as a graduate student in the field, I have found that there are few resources AI beginners can easily understand, and personalized lessons tailored to each learner's level and background are even scarcer, which makes it hard for many people to get started. To address this, our team built a language model that acts as a tutor for AI terminology. The model type, training dataset, and usage are explained below; please read on and be sure to try it out.
## Model
https://huggingface.co/bert-base-uncased
For the model, I used BERT, one of the best-known natural language processing models, developed by Google; see the link above for details. To make the question answering feel like one-on-one tutoring, I used the question-answering variant of BERT, `BertForQuestionAnswering`. It can be loaded as follows:
```
from transformers import BertForQuestionAnswering

# BERT with a span-prediction head for extractive question answering
model = BertForQuestionAnswering.from_pretrained("bert-base-uncased")
```
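For reference, here is a minimal sketch of how an extractive QA model of this kind answers a question: it scores every token of the context as a possible start and end of the answer span. The tokenizer choice and the example strings are illustrative assumptions, not code from the notebook.
```
import torch
from transformers import BertForQuestionAnswering, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForQuestionAnswering.from_pretrained("bert-base-uncased")

# Illustrative inputs; any question/context pair works the same way.
question = "What is feature engineering?"
context = (
    "Feature engineering is the process of extracting features "
    "(characteristics, properties, attributes) from raw data."
)

# Encode question and context together as one sequence pair.
inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# The highest-scoring start and end tokens bound the predicted answer span.
start = outputs.start_logits.argmax()
end = outputs.end_logits.argmax() + 1
print(tokenizer.decode(inputs["input_ids"][0][start:end]))
```
Without fine-tuning, the base checkpoint's answers are essentially random; the fine-tuned weights from the training section below are what make the predicted spans meaningful.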
## Dataset
### Wikipedia
https://en.wikipedia.org/wiki/Main_Page
### activeloop
https://www.activeloop.ai/resources/glossary/arima-models/
### Adrien Beaulieu
https://product.house/100-ai-glossary-terms-explained-to-the-rest-of-us/
```
Context: 'Feature engineering or feature extraction or feature discovery is the process of extracting features (characteristics, properties, attributes) from raw data. Due to deep learning networks, such as convolutional neural networks, that are able to learn features by themselves, domain-specific-based feature engineering has become obsolete for vision and speech processing. Other examples of features in physics include the construction of dimensionless numbers such as Reynolds number in fluid dynamics; then Nusselt number in heat transfer; Archimedes number in sedimentation; construction of first approximations of the solution such as analytical strength of materials solutions in mechanics, etc..'
Question: 'What is large language model?'
Answer: 'A large language model (LLM) is a type of language model notable for its ability to achieve general-purpose language understanding and generation.'
```
The training dataset consists of three components, all related to artificial intelligence: contexts, questions, and answers. Each answer is contained within its context, and the sentence order of the contexts was rearranged to augment the dataset. The questions focus on AI terminology; the example above shows the format. In total there are over 3,300 examples, stored as pickle files in the 'data' folder. The data was extracted from the HTML of Wikipedia and the other sites listed above, then cleaned and processed.
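As a rough illustration of the sentence-order augmentation described above (the file name, data layout, and sentence splitting are assumptions, not the repository's actual code):
```
import pickle
import random

# Hypothetical layout: a list of (context, question, answer) triples.
with open("data/dataset.pkl", "rb") as f:
    samples = pickle.load(f)

def shuffle_sentences(context: str) -> str:
    """Rearrange sentence order to produce an augmented copy of a context."""
    sentences = [s.strip() for s in context.split(". ") if s.strip()]
    random.shuffle(sentences)
    return ". ".join(sentences) + "."

augmented = [(shuffle_sentences(c), q, a) for c, q, a in samples]
```
Because each answer is a span of its context, shuffling whole sentences keeps the answer text intact while changing where it appears.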
## Training and Results
https://github.com/CountingMstar/AI_BERT/blob/main/MY_AI_BERT_final.ipynb
Training loads the data from the 'data' folder and fine-tunes the BERT question-answering model on it. Detailed instructions for training and usage can be found in the notebook linked above.
```
from torch.optim import AdamW  # Adam with decoupled weight decay

N_EPOCHS = 10
optim = AdamW(model.parameters(), lr=5e-5)
```
Training ran for 10 epochs, using the AdamW optimizer with a learning rate of 5e-5.
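A condensed sketch of what one fine-tuning epoch looks like with this setup; the DataLoader name and batch fields are assumptions (see the notebook for the actual loop):
```
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.train()

for epoch in range(N_EPOCHS):
    for batch in train_loader:  # assumed DataLoader of tokenized QA examples
        optim.zero_grad()
        outputs = model(
            input_ids=batch["input_ids"].to(device),
            attention_mask=batch["attention_mask"].to(device),
            start_positions=batch["start_positions"].to(device),
            end_positions=batch["end_positions"].to(device),
        )
        # BertForQuestionAnswering returns the span loss when labels are given.
        outputs.loss.backward()
        optim.step()
```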
<img src="https://github.com/CountingMstar/AI_BERT/assets/90711707/72142ff8-f5c8-47ea-9f19-1e6abb4072cd" width="500" height="400"/>
<img src="https://github.com/CountingMstar/AI_BERT/assets/90711707/2dd78573-34eb-4ce9-ad4d-2237fc7a5b1e" width="500" height="400"/>
As the graphs above show, at the final epoch the loss is about 6.92 and the accuracy is about 0.982, indicating that the model trained effectively.
## How to use
https://github.com/CountingMstar/AI_BERT/blob/main/MY_AI_BERT_final.ipynb
You can load the model trained in the process described above and run question answering with it. The checkpoint below is the one saved at the end of training:
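```
import torch

# Load the fine-tuned checkpoint saved during training.
model = torch.load("./models/AI_BERT_final_10.pth")
```
Inference then works as in the sketch shown in the Model section.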
Thank you.
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference