# imgutils-models
This repository includes all the models in deepghs/imgutils.
## LPIPS
This model is used for clustering anime images (called 差分, literally "differences", in Chinese), based on richzhang/PerceptualSimilarity and trained on the dataset deepghs/chafen_arknights (private). With a threshold of 0.45, the adjusted Rand score reaches 0.995.
File list:

- `lpips_diff.onnx`: computes the feature difference.
- `lpips_feature.onnx`: extracts the features.
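For reference, here is a minimal sketch of how the two files might be wired together with `onnxruntime`. The preprocessing, input/output layout, and the helpers `load_image`, `extract_features`, and `lpips_distance` are assumptions made for illustration; the actual graphs in this repository may differ, so check them before relying on this.

```python
import numpy as np
import onnxruntime
from PIL import Image

# Assumption: lpips_feature.onnx maps one image to a set of feature tensors,
# and lpips_diff.onnx maps the features of two images to a scalar distance.
feature_sess = onnxruntime.InferenceSession("lpips_feature.onnx")
diff_sess = onnxruntime.InferenceSession("lpips_diff.onnx")


def load_image(path, size=(224, 224)):
    """Hypothetical preprocessing: RGB, resized, NCHW float32 in [0, 1]."""
    img = Image.open(path).convert("RGB").resize(size)
    arr = np.asarray(img, dtype=np.float32) / 255.0
    return arr.transpose(2, 0, 1)[None]


def extract_features(path):
    name = feature_sess.get_inputs()[0].name
    return feature_sess.run(None, {name: load_image(path)})


def lpips_distance(feats_a, feats_b):
    # Assumption: the diff graph expects both images' feature tensors,
    # in the order the feature model emitted them.
    feed = {
        inp.name: tensor
        for inp, tensor in zip(diff_sess.get_inputs(), feats_a + feats_b)
    }
    return float(diff_sess.run(None, feed)[0])


# Two images would belong to the same 差分 cluster when their distance
# falls below the 0.45 threshold mentioned above.
d = lpips_distance(extract_features("a.png"), extract_features("b.png"))
print("same cluster" if d < 0.45 else "different clusters")
```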
## Monochrome
These models are used for monochrome image classification; they are based on CNNs and Transformers and trained on the dataset deepghs/monochrome_danbooru (private).
The following are the checkpoints that have been formally put into use, all based on the Caformer architecture:
Checkpoint | Algorithm | Safe Level | Accuracy | False Negative | False Positive |
---|---|---|---|---|---|
monochrome-caformer-40 | caformer | 0 | 96.41% | 2.69% | 0.89% |
monochrome-caformer-110 | caformer | 0 | 96.97% | 1.57% | 1.46% |
monochrome-caformer_safe2-80 | caformer | 2 | 94.84% | 1.12% | 4.03% |
monochrome-caformer_safe4-70 | caformer | 4 | 94.28% | 0.67% | 5.04% |
`monochrome-caformer-110` has the best overall accuracy among them. However, this model is often used to screen out monochrome images, and we want to catch as many of them as possible. We therefore also provide weighted checkpoints (`safe2` and `safe4`): although their overall accuracy is slightly lower, their False Negative rate (misidentifying a monochrome image as a colored one) is also lower, which makes them better suited for batch screening.
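As a sketch, one of these checkpoints could be queried like this once exported to ONNX. The file name, the input size, and the meaning of the two output logits are assumptions for illustration, not guaranteed properties of the actual checkpoints:

```python
import numpy as np
import onnxruntime
from PIL import Image

# Assumptions: the checkpoint is available as an ONNX file, takes a
# 384x384 NCHW float image, and outputs two logits (monochrome, colored).
sess = onnxruntime.InferenceSession("monochrome-caformer_safe4-70.onnx")


def is_monochrome(path, size=(384, 384)):
    img = Image.open(path).convert("RGB").resize(size)
    x = (np.asarray(img, dtype=np.float32) / 255.0).transpose(2, 0, 1)[None]
    logits = sess.run(None, {sess.get_inputs()[0].name: x})[0][0]
    probs = np.exp(logits) / np.exp(logits).sum()  # softmax over 2 classes
    return probs[0] > probs[1]  # assumed class order: [monochrome, colored]


# Batch screening: the safe4 checkpoint trades some overall accuracy for a
# lower False Negative rate, so fewer monochrome images slip through.
monochrome_files = [p for p in ["1.png", "2.png"] if is_monochrome(p)]
```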
## Deepdanbooru
`deepdanbooru` is a model for tagging anime images. Here, we provide a tag classification table, `deepdanbooru_tags.csv`, as well as an ONNX model (from chinoll/deepdanbooru).
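A sketch of how the ONNX model and the tag table might be combined: DeepDanbooru conventionally takes a 512x512 RGB image scaled to [0, 1] in NHWC layout and outputs one sigmoid score per tag, but the file name, the exact layout of this export, and the column structure of `deepdanbooru_tags.csv` are assumptions here.

```python
import csv

import numpy as np
import onnxruntime
from PIL import Image

sess = onnxruntime.InferenceSession("deepdanbooru.onnx")  # assumed file name

# Assumption: the first CSV column holds the tag name, in model output order
# (drop the header row first if the file has one).
with open("deepdanbooru_tags.csv", newline="", encoding="utf-8") as f:
    tags = [row[0] for row in csv.reader(f)]


def tag_image(path, threshold=0.5):
    # Conventional DeepDanbooru preprocessing: 512x512 RGB, [0, 1], NHWC.
    img = Image.open(path).convert("RGB").resize((512, 512))
    x = np.asarray(img, dtype=np.float32)[None] / 255.0
    scores = sess.run(None, {sess.get_inputs()[0].name: x})[0][0]
    return {t: float(s) for t, s in zip(tags, scores) if s >= threshold}


print(tag_image("sample.png"))
```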
It's worth noting that, due to the poor quality of the deepdanbooru model itself and its relatively old training data, it is intended for testing only and is not recommended as the main tagging model. We recommend the wd14 model instead; see: