---
tags:
- generated_from_trainer
metrics:
- accuracy
- f1
- recall
- precision
model-index:
- name: dit-base-Document_Classification-Desafio_1
  results:
  - task:
      name: Image Classification
      type: image-classification
    dataset:
      name: imagefolder
      type: imagefolder
      config: validation
      split: train
      args: validation
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9865
language:
- en
---

# dit-base-Document_Classification-Desafio_1

This model is a fine-tuned version of [microsoft/dit-base](https://huggingface.co/microsoft/dit-base).

It achieves the following results on the evaluation set:
- Loss: 0.0436
- Accuracy: 0.9865
- F1
  - Weighted: 0.9865
  - Micro: 0.9865
  - Macro: 0.9863
- Recall
  - Weighted: 0.9865
  - Micro: 0.9865
  - Macro: 0.9861
- Precision
  - Weighted: 0.9869
  - Micro: 0.9865
  - Macro: 0.9870
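
The Micro rows equal Accuracy because, in single-label multiclass classification, micro-averaged precision, recall, and F1 all reduce to accuracy. The sketch below illustrates this on a made-up 3-class confusion matrix; the matrix values and the function name `averaged_f1` are illustrative assumptions, not taken from the evaluation run.

```python
def averaged_f1(confusion):
    """Return (accuracy, micro_f1, macro_f1, weighted_f1).

    confusion[i][j] = number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    true_pos = [confusion[k][k] for k in range(n)]
    support = [sum(confusion[k]) for k in range(n)]  # true samples per class
    predicted = [sum(confusion[i][k] for i in range(n)) for k in range(n)]

    def f1(precision, recall):
        denom = precision + recall
        return 2 * precision * recall / denom if denom else 0.0

    per_class_f1 = []
    for k in range(n):
        precision = true_pos[k] / predicted[k] if predicted[k] else 0.0
        recall = true_pos[k] / support[k] if support[k] else 0.0
        per_class_f1.append(f1(precision, recall))

    accuracy = sum(true_pos) / total
    # Micro: pool counts over all classes before computing the ratios.
    micro = f1(sum(true_pos) / sum(predicted), sum(true_pos) / sum(support))
    # Macro: unweighted mean of the per-class scores.
    macro = sum(per_class_f1) / n
    # Weighted: mean of the per-class scores, weighted by class support.
    weighted = sum(f * s for f, s in zip(per_class_f1, support)) / total
    return accuracy, micro, macro, weighted


# Hypothetical confusion matrix for three document classes.
toy_confusion = [
    [50, 2, 0],
    [3, 30, 1],
    [0, 1, 13],
]
accuracy, micro, macro, weighted = averaged_f1(toy_confusion)
print(f"accuracy={accuracy:.4f} micro={micro:.4f} "
      f"macro={macro:.4f} weighted={weighted:.4f}")
```

Macro and weighted averages differ from accuracy only when per-class scores differ, which is why those rows above deviate slightly from 0.9865.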

## Model description

The notebook used to create this model documents the full process: https://github.com/DunnBC22/Vision_Audio_and_Multimodal_Projects/blob/main/Document%20AI/Multiclass%20Classification/Document%20Classification%20-%20Desafio%201/Document%20Classification%20-%20Desafio%201.ipynb

## Intended uses & limitations

This model is intended to demonstrate my ability to solve a complex problem using technology.
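
A minimal inference sketch, assuming the checkpoint is loaded through the Transformers `image-classification` pipeline. The function names, the `model_id` argument (the Hub repository ID or local path of this checkpoint, which you must supply), and the image path are placeholders, not values taken from this card.

```python
def top_prediction(predictions):
    """Pick the highest-scoring entry from a list of
    {"label": ..., "score": ...} dicts, the format the
    image-classification pipeline returns for one image."""
    return max(predictions, key=lambda p: p["score"])


def classify_document(image_path, model_id):
    """Classify one document image.

    model_id is an assumption: pass the Hub repo ID or a local
    directory that contains this fine-tuned checkpoint.
    """
    # Imported lazily so the helper above works without transformers.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_id)
    return top_prediction(classifier(image_path))


# Example call (placeholders, not real values):
# result = classify_document("scanned_page.png", "<repo-id-or-local-path>")
# print(result["label"], result["score"])
```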

## Training and evaluation data

Dataset Source: https://www.kaggle.com/datasets/rywgar/document-classification-desafio-1

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 8
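
As a rough sketch, the values above map onto Transformers `TrainingArguments` keyword names as shown below. Two assumptions are made: training ran on a single device (consistent with 32 × 4 = 128), and the total number of optimizer steps is 496, the final step in the training results table. The `linear_lr` function is an illustrative reimplementation of a linear warmup and decay schedule, not the library's own code.

```python
import math

# Hyperparameters from the list above, keyed by TrainingArguments names.
training_kwargs = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 32,
    "seed": 42,
    "gradient_accumulation_steps": 4,
    "lr_scheduler_type": "linear",
    "warmup_ratio": 0.1,
    "num_train_epochs": 8,
}

NUM_DEVICES = 1    # assumption: single GPU
TOTAL_STEPS = 496  # assumption: last step in the training results table

# Effective batch size per optimizer step.
total_train_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * NUM_DEVICES
)

# Number of warmup steps implied by the warmup ratio.
warmup_steps = math.ceil(TOTAL_STEPS * training_kwargs["warmup_ratio"])


def linear_lr(step):
    """Learning rate at a given optimizer step: linear ramp from 0 to the
    peak during warmup, then linear decay back to 0 at TOTAL_STEPS."""
    peak = training_kwargs["learning_rate"]
    if step < warmup_steps:
        return peak * step / warmup_steps
    remaining = max(0, TOTAL_STEPS - step)
    return peak * remaining / (TOTAL_STEPS - warmup_steps)


print("total_train_batch_size:", total_train_batch_size)
print("warmup_steps:", warmup_steps)
for step in (0, warmup_steps, 250, TOTAL_STEPS):
    print(step, linear_lr(step))
```

With `transformers` installed, the dictionary could be passed as `TrainingArguments(output_dir="...", **training_kwargs)`; the output directory is left for you to choose.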

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Weighted F1 | Micro F1 | Macro F1 | Weighted Recall | Micro Recall | Macro Recall | Weighted Precision | Micro Precision | Macro Precision |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:-----------:|:--------:|:--------:|:---------------:|:------------:|:------------:|:------------------:|:---------------:|:---------------:|
| 0.8316        | 0.99  | 62   | 0.7519          | 0.743    | 0.7020      | 0.743    | 0.7015   | 0.743           | 0.743        | 0.7430       | 0.6827             | 0.743           | 0.6819          |
| 0.3561        | 2.0   | 125  | 0.2302          | 0.9395   | 0.9401      | 0.9395   | 0.9400   | 0.9395          | 0.9395       | 0.9394       | 0.9482             | 0.9395          | 0.9480          |
| 0.2222        | 2.99  | 187  | 0.1350          | 0.956    | 0.9564      | 0.956    | 0.9561   | 0.956           | 0.956        | 0.9551       | 0.9598             | 0.956           | 0.9600          |
| 0.1705        | 4.0   | 250  | 0.0873          | 0.9725   | 0.9727      | 0.9725   | 0.9725   | 0.9725          | 0.9725       | 0.9721       | 0.9740             | 0.9725          | 0.9740          |
| 0.1541        | 4.99  | 312  | 0.0642          | 0.9825   | 0.9825      | 0.9825   | 0.9824   | 0.9825          | 0.9825       | 0.9822       | 0.9830             | 0.9825          | 0.9830          |
| 0.1253        | 6.0   | 375  | 0.0330          | 0.9915   | 0.9915      | 0.9915   | 0.9914   | 0.9915          | 0.9915       | 0.9913       | 0.9916             | 0.9915          | 0.9916          |
| 0.1196        | 6.99  | 437  | 0.0524          | 0.982    | 0.9822      | 0.982    | 0.9820   | 0.982           | 0.982        | 0.9817       | 0.9832             | 0.982           | 0.9832          |
| 0.0896        | 7.94  | 496  | 0.0436          | 0.9865   | 0.9865      | 0.9865   | 0.9863   | 0.9865          | 0.9865       | 0.9861       | 0.9869             | 0.9865          | 0.9870          |
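
Validation loss is not monotonic in the table above: it is lowest at step 375 and slightly higher in the last two rows, while the results reported at the top of this card match the final row (step 496). A small sketch that locates the best row, using four columns copied from the table (the variable names are illustrative):

```python
# (epoch, step, validation_loss, accuracy) copied from the table above.
history = [
    (0.99, 62, 0.7519, 0.7430),
    (2.00, 125, 0.2302, 0.9395),
    (2.99, 187, 0.1350, 0.9560),
    (4.00, 250, 0.0873, 0.9725),
    (4.99, 312, 0.0642, 0.9825),
    (6.00, 375, 0.0330, 0.9915),
    (6.99, 437, 0.0524, 0.9820),
    (7.94, 496, 0.0436, 0.9865),
]

# Row with the lowest validation loss and row with the highest accuracy.
best_by_loss = min(history, key=lambda row: row[2])
best_by_accuracy = max(history, key=lambda row: row[3])
final_row = history[-1]

print("lowest validation loss at step", best_by_loss[1])
print("highest accuracy at step", best_by_accuracy[1])
print("final row is step", final_row[1])
```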


### Framework versions

- Transformers 4.28.1
- PyTorch 2.0.0
- Datasets 2.11.0
- Tokenizers 0.13.3