license: apache-2.0

GitHub issues classifier (using zero-shot classification)

This model was trained with the zero-shot classifier distillation method, using BART-large-mnli as the teacher model, to classify GitHub issues from the GitHub Issues Prediction dataset.
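As a rough illustration of the distillation idea, the sketch below turns a teacher's per-label entailment logits into soft targets and scores a student against them with a soft-target cross-entropy. It is a minimal pure-Python sketch, not the training code used for this model: the function names and the logit values are made up, and it assumes single-label classification where the teacher's entailment logits are normalised across labels with a softmax.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(student_logits: List[float], teacher_probs: List[float]) -> float:
    """Cross-entropy between the teacher's soft targets and the student's prediction."""
    student_probs = softmax(student_logits)
    return -sum(p * math.log(q) for p, q in zip(teacher_probs, student_probs))


# Hypothetical teacher entailment logits for one issue title,
# one logit per candidate label (issue, feature request, question).
teacher_entailment_logits = [2.0, 0.5, -1.0]
teacher_probs = softmax(teacher_entailment_logits)

# An untrained student that scores every label equally.
untrained_loss = soft_cross_entropy([0.0, 0.0, 0.0], teacher_probs)
# A student that reproduces the teacher exactly reaches the minimum loss.
matched_loss = soft_cross_entropy(teacher_entailment_logits, teacher_probs)

print(untrained_loss > matched_loss)
```

Because the targets are probabilities rather than hard labels, the student can learn from unlabelled text: the teacher supplies the supervision.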

Labels

Following the dataset's Kaggle competition, the classifier predicts whether an issue is a bug, a feature, or a question. After experimenting with different label names before training, I settled on a different label mapping that yielded better predictions (see notebook here for details). The labels are:

issue
feature request
question
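If predictions need to be compared against the competition's original classes, the model's labels can be mapped back. The sketch below assumes the correspondence bug → issue, feature → feature request, question → question, which the label order suggests but which should be checked against the notebook. The helper name and the scores are hypothetical.

```python
from typing import Dict

# Assumed correspondence between the dataset's classes and the model's labels.
DATASET_TO_MODEL_LABEL: Dict[str, str] = {
    "bug": "issue",
    "feature": "feature request",
    "question": "question",
}
MODEL_TO_DATASET_LABEL: Dict[str, str] = {
    model: dataset for dataset, model in DATASET_TO_MODEL_LABEL.items()
}


def to_dataset_label(scores: Dict[str, float]) -> str:
    """Pick the highest-scoring model label and map it back to the dataset's class."""
    best = max(scores, key=scores.get)
    return MODEL_TO_DATASET_LABEL[best]


# Made-up scores for a single issue title, keyed by the model's labels.
example_scores = {"issue": 0.71, "feature request": 0.18, "question": 0.11}
print(to_dataset_label(example_scores))  # prints "bug"
```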

Training data

This model was trained on 5k issue titles (unlabelled, as distillation requires) from the training set of the GitHub Issues Prediction dataset.
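One way to prepare such an unlabelled file is sketched below: draw a fixed number of titles and write them one per line, without labels. This is an assumption-laden sketch rather than the exact preprocessing used here: how the 5k titles were chosen is not stated, so random sampling with a fixed seed is a guess, and the helper names and the example titles are hypothetical.

```python
import random
import tempfile
from pathlib import Path
from typing import Iterable, List


def sample_titles(titles: Iterable[str], n: int = 5000, seed: int = 0) -> List[str]:
    """Return up to n non-empty titles, each collapsed onto a single line."""
    cleaned = [" ".join(t.split()) for t in titles if t and t.strip()]
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(cleaned, min(n, len(cleaned)))


def write_unlabelled(titles: Iterable[str], path: Path) -> None:
    """Write one title per line with no label column."""
    path.write_text("\n".join(titles) + "\n", encoding="utf-8")


# Hypothetical titles standing in for the dataset's title column.
all_titles = [
    "App crashes on startup",
    "Add dark mode\nto settings",
    "",
    "How do I configure the proxy?",
]
sampled = sample_titles(all_titles, n=2)

with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "unlabelled_titles.txt"
    write_unlabelled(sampled, out_path)
    written_lines = out_path.read_text(encoding="utf-8").splitlines()

print(len(written_lines))  # prints 2
```

Collapsing whitespace keeps each title on its own line, which matters when the file format is one example per line.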

Acknowledgements

Joe Davison and his article on Zero-Shot Learning in Modern NLP
Jeremy Evans and his notebook on Iterate like a grandmaster