AkshatSurolia/DeiT-FaceMask-Finetuned

Image Classification

AkshatSurolia committed on Feb 18, 2022

Commit 7a7f81c • 1 Parent(s): a3b4dff

Update README.md

Files changed (1)

README.md +33 -1

README.md CHANGED

@@ -1,3 +1,35 @@
 ---
-license: mit
 ---

 ---
+license: apache-2.0
+tags:
+- image-classification
+datasets:
+- Face-Mask18K
 ---
+# Distilled Data-efficient Image Transformer for Face Mask Detection
+Distilled Data-efficient Image Transformer (DeiT) model pre-trained and fine-tuned on the self-curated custom Face-Mask18K dataset (18k images, 2 classes) at resolution 224x224. DeiT was first introduced in the paper Training data-efficient image transformers & distillation through attention by Touvron et al.
+## Model description
+This model is a distilled Vision Transformer (ViT). In addition to the class token, it uses a distillation token to learn from a teacher (CNN) during both pre-training and fine-tuning. The distillation token is learned through backpropagation by interacting with the class ([CLS]) and patch tokens through the self-attention layers.
+Images are presented to the model as a sequence of fixed-size patches (resolution 16x16), which are linearly embedded.
+## Training Metrics
+    epoch                    =          2.0
+    total_flos               = 2078245655GF
+    train_loss               =       0.0438
+    train_runtime            =   1:37:16.87
+    train_samples_per_second =        9.887
+    train_steps_per_second   =        0.309
+---
+## Evaluation Metrics
+    epoch                   =        2.0
+    eval_accuracy           =     0.9922
+    eval_loss               =     0.0271
+    eval_runtime            = 0:03:17.36
+    eval_samples_per_second =      18.22
+    eval_steps_per_second   =       2.28
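
The model description in this README implies a fixed input sequence length: a 224x224 image cut into 16x16 patches, plus the class and distillation tokens. The sketch below works through that arithmetic. It is not part of the commit; the function name `deit_sequence_length` is hypothetical, and it assumes square images and non-overlapping patches.

```python
def deit_sequence_length(image_size: int = 224, patch_size: int = 16) -> int:
    """Token count seen by a distilled DeiT encoder (illustrative helper).

    Assumes a square image split into non-overlapping square patches,
    plus one class ([CLS]) token and one distillation token.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    patches_per_side = image_size // patch_size  # 224 // 16 = 14
    num_patches = patches_per_side ** 2          # 14 * 14 = 196
    return num_patches + 2                       # + class token + distillation token


if __name__ == "__main__":
    print(deit_sequence_length())  # 198
```

With the README's values (224x224 input, 16x16 patches), this gives 196 patch tokens and 198 tokens in total.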