license: apache-2.0
GitHub issues classifier (using zero-shot classification)
This model was trained with the zero-shot classifier distillation method, using the BART-large-mnli model as the teacher, to classify GitHub issues from the GitHub Issues Prediction dataset.
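As a rough illustration of the distillation idea, the sketch below shows how a teacher's per-label entailment scores could be turned into soft targets for a student classifier. It is not the training code used for this model: the hypothesis template, the logit values, and the helper names (`build_hypotheses`, `soft_cross_entropy`) are assumptions made for the example.

```python
import math

# Candidate labels taken from this model card.
LABELS = ["issue", "feature request", "question"]

# Assumed hypothesis template; the template actually used is not stated here.
HYPOTHESIS_TEMPLATE = "This example is {}."


def build_hypotheses(labels, template=HYPOTHESIS_TEMPLATE):
    """Build one NLI hypothesis per candidate label."""
    return [template.format(label) for label in labels]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(student_logits, teacher_probs):
    """Cross-entropy between the teacher's soft targets and the student's prediction."""
    student_log_probs = [math.log(p) for p in softmax(student_logits)]
    return -sum(t * lp for t, lp in zip(teacher_probs, student_log_probs))


# Hypothetical entailment logits the teacher might assign to one issue title,
# one value per label. Real values would come from the NLI teacher model.
teacher_entailment_logits = [2.0, 0.5, -1.0]

# Soft targets: a distribution over labels, so no human annotation is needed.
teacher_probs = softmax(teacher_entailment_logits)

# An untrained student with uniform predictions; training would lower this loss.
student_logits = [0.0, 0.0, 0.0]
loss = soft_cross_entropy(student_logits, teacher_probs)
```

Because the student learns from the teacher's distribution rather than from gold labels, the training titles can stay unlabelled, and the resulting classifier needs a single forward pass per title instead of one pass per candidate label.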
Labels
As in the dataset's Kaggle competition, the classifier predicts whether an issue is a bug, feature or question. After experimenting with different labels prior to training, I used a different mapping of labels that yielded better predictions (see notebook here for details). The labels are:
- issue
- feature request
- question
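The relationship between the competition's classes and the labels above can be sketched as a simple lookup. The exact mapping lives in the notebook; the pairing below (bug to issue, feature to feature request) is an assumption inferred from the label names, and the scores are made-up values used only to show how a prediction would be read.

```python
# Assumed mapping from the Kaggle competition's classes to this model's labels.
KAGGLE_TO_MODEL_LABEL = {
    "bug": "issue",
    "feature": "feature request",
    "question": "question",
}

# Inverse lookup, to report predictions in the competition's vocabulary.
MODEL_TO_KAGGLE_LABEL = {v: k for k, v in KAGGLE_TO_MODEL_LABEL.items()}


def top_label(scores):
    """Return the label with the highest score from a {label: score} dict."""
    return max(scores, key=scores.get)


# Hypothetical classifier scores for a single issue title.
example_scores = {"issue": 0.71, "feature request": 0.18, "question": 0.11}

predicted = top_label(example_scores)
kaggle_class = MODEL_TO_KAGGLE_LABEL[predicted]
```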
Training data
This model was trained on 5k titles (unlabelled, as the distillation method requires) from the training set of the GitHub Issues Prediction dataset.
Acknowledgements
- Joe Davison and his article on Zero-Shot Learning in Modern NLP
- Jeremy Howard and his notebook on Iterate like a grandmaster