Fixes #16, which showed that one of the merges is missing, so the tokenizer produces different outputs. The complication is that existing fine-tuned models were trained using the tokenizer as it is now, without that merge.
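
For context, here is a toy sketch of how a single missing merge changes BPE output. It is not the tokenizer code or the merge list from this repo: the `bpe` helper, the merge pairs, and the example word are all hypothetical and only illustrate the mechanism.

```python
def bpe(word, merges):
    """Toy BPE: repeatedly apply the highest-priority merge present in the word.

    `merges` is an ordered list of symbol pairs; an earlier position means
    a higher priority. This is an illustrative helper, not a library API.
    """
    ranks = {pair: i for i, pair in enumerate(merges)}
    symbols = list(word)
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no applicable merge left
        merged, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


# Hypothetical merge table, and the same table with one entry dropped.
full_merges = [("l", "o"), ("lo", "w"), ("e", "r"), ("low", "er")]
broken_merges = [m for m in full_merges if m != ("lo", "w")]

tokens_full = bpe("lower", full_merges)      # ['lower']
tokens_broken = bpe("lower", broken_merges)  # ['lo', 'w', 'er']
print(tokens_full, tokens_broken)
```

Dropping one merge also blocks every later merge that depends on it, which is why the outputs diverge, and why a model fine-tuned on one segmentation sees different tokens once the merge is restored.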

@sanchit-gandhi if you can merge this one!

Thanks for the fix, @ArthurZ!

