Discarded basic tokenization to better fit our vocabulary acc64ce Andrey Kutuzov committed on Feb 26, 2021