abinayam/gpt-2-tamil · limited pre-training and guard rails

Hi Abinaya and team,
This is a great effort. I think you should write a paper on this topic and post it to arXiv, since it is a significant contribution for Tamil.
Can you release the methodology for training, tokenization, and encoding representations?

However, the model seems to have only limited self-correction and guard rails, and cleanup of personally identifiable information appears limited. These limitations should be mentioned in the announcement and user guide. More guard rails should be added to this model, and the risk of harmful content generation should be listed.

Thank you
-Muthu Annamalai