---
license: mit
---

**Note: please check [DeepKPG](https://github.com/uclanlp/DeepKPG#scibart) for using this model in huggingface, including setting up the newly trained tokenizer.**

Paper: [Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study](https://arxiv.org/abs/2212.10233)

```
@article{https://doi.org/10.48550/arxiv.2212.10233,
  doi       = {10.48550/ARXIV.2212.10233},
  url       = {https://arxiv.org/abs/2212.10233},
  author    = {Wu, Di and Ahmad, Wasi Uddin and Chang, Kai-Wei},
  keywords  = {Computation and Language (cs.CL), FOS: Computer and information sciences},
  title     = {Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study},
  publisher = {arXiv},
  year      = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
```

Pre-training Corpus: [S2ORC (titles and abstracts)](https://github.com/allenai/s2orc)

Pre-training Details:
- Pre-trained **from scratch** with a science vocabulary
- Batch size: 2048
- Total steps: 250k
- Learning rate: 3e-4
- LR schedule: polynomial decay with 10k warmup steps
- Masking ratio: 30%, Poisson lambda = 3.5
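
Below is a minimal usage sketch, not the official DeepKPG setup: it assumes you have already followed the DeepKPG instructions to set up the newly trained science tokenizer and have the converted checkpoint and tokenizer files in a local directory. The directory path, example input, and generation settings are placeholders for illustration only.

```python
# Minimal sketch (assumptions: DeepKPG tokenizer setup is done; weights and
# tokenizer files live in a local directory). Uses the standard transformers API.
from transformers import AutoTokenizer, BartForConditionalGeneration

model_dir = "path/to/scibart"  # placeholder: converted checkpoint + science tokenizer

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = BartForConditionalGeneration.from_pretrained(model_dir)

text = "We present a thorough empirical study of pre-trained language models for keyphrase generation."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Note: the raw pre-trained checkpoint is a denoising model; generate keyphrases
# only after fine-tuning on a keyphrase generation dataset (see DeepKPG).
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```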