Problem:
While using this model in my Streamlit app, I got the error below:
"""
Gemma's activation function should be approximate GeLU and not exact GeLU.
Changing the activation function to gelu_pytorch_tanh.if you want to use the legacy gelu, edit the model.config to set hidden_activation=gelu instead of hidden_act.
"""
Solution:
So, to use the legacy 'gelu', I propose editing the model.config to rename the 'hidden_act' key to 'hidden_activation'.

Thanks.

Google org

Hello @risaxD19! First, note that the message is a warning, not an error :) The model was actually trained with an approximate GeLU function, which is represented in the configuration by gelu_pytorch_tanh. The gelu value is there for historical reasons, but you should really use gelu_pytorch_tanh to use the model the way it was designed. What the warning is saying is that the activation function in hidden_act will be ignored, and it tells you how to override that in case you need to, for fine-tuning or other purposes.
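
To make the difference between the two activations concrete, here is a minimal sketch using only the Python standard library. It uses the standard textbook formulas for exact GeLU and its tanh approximation; the function names `gelu_exact` and `gelu_tanh` are made up for this illustration and are not taken from the model code.

```python
import math

def gelu_exact(x: float) -> float:
    # Exact GeLU: x * Phi(x), where Phi is the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: float) -> float:
    # Tanh approximation of GeLU (the variant that gelu_pytorch_tanh refers to).
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))

# Compare the two on a few sample inputs.
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"x={x:+.1f}  exact={gelu_exact(x):.6f}  tanh={gelu_tanh(x):.6f}")
```

The two curves are very close but not identical, which is why it matters to use the same variant the model was trained with.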

One way to suppress the warning would be to set "hidden_activation": "gelu_pytorch_tanh" in the configuration.
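
As a rough sketch of that change, the snippet below patches a config dictionary so that `hidden_activation` is set explicitly. The helper name `with_hidden_activation` and the trimmed-down `example_config` are hypothetical stand-ins for illustration; a real config.json has many more keys.

```python
import json

def with_hidden_activation(config: dict, activation: str = "gelu_pytorch_tanh") -> dict:
    """Return a copy of a config dict with `hidden_activation` set explicitly."""
    patched = dict(config)  # shallow copy so the original dict is left untouched
    patched["hidden_activation"] = activation
    return patched

# Hypothetical, heavily trimmed stand-in for the model's config.json.
example_config = {"model_type": "gemma", "hidden_act": "gelu"}

patched = with_hidden_activation(example_config)
print(json.dumps(patched, indent=2))
```

The same helper could be applied to a dictionary loaded from a local config.json with `json.load` and written back with `json.dump`. Passing `"gelu"` as the second argument would select the legacy activation instead.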
