This model has 1 file scanned as unsafe.
- attn_loss_fn=None, attn_weight=0, gradient_accumulation_steps=1, hs_loss_fn=mse, hs_weight=2.0, learning_rate=0.0004, lr_scheduler_kwargs=__num_cycles___4_, lr_scheduler_type=cosine_with_restarts, max
- attn_loss_fn=None, attn_weight=0, gradient_accumulation_steps=1, hs_loss_fn=mse, hs_weight=2.0, learning_rate=0.0004, lr_scheduler_kwargs=__num_cycles___8_, lr_scheduler_type=cosine_with_restarts, max
- attn_loss_fn=None, attn_weight=0, gradient_accumulation_steps=1, hs_loss_fn=mse, hs_weight=2.0, learning_rate=0.0004, lr_scheduler_type=cosine_with_restarts, max_grad_norm=None, num_cycles=4, optim=pa
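
The entries above differ mainly in the number of restart cycles (4 or 8) of the `cosine_with_restarts` schedule, all starting from `learning_rate=0.0004`. As a rough illustration of what that setting implies, here is a minimal sketch of a cosine schedule with hard restarts, modeled on the formula commonly used for this scheduler type in Hugging Face Transformers. The total step count and warmup length are not visible in the truncated names, so `TOTAL_STEPS` and `warmup_steps` are hypothetical placeholders, and the helper names are illustrative only.

```python
import math

BASE_LR = 0.0004     # learning_rate from the listed runs
TOTAL_STEPS = 1000   # hypothetical: the real step count is not shown in the listing


def cosine_with_restarts_factor(step, total_steps, num_cycles, warmup_steps=0):
    """Multiplier applied to the base learning rate at a given step.

    Assumes a linear warmup followed by `num_cycles` cosine decays,
    each restarting from the full base learning rate.
    """
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    # Position inside the current cycle, in [0, 1)
    cycle_position = (num_cycles * progress) % 1.0
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * cycle_position)))


def lr_at(step, num_cycles, total_steps=TOTAL_STEPS):
    """Learning rate at `step` for the assumed schedule."""
    return BASE_LR * cosine_with_restarts_factor(step, total_steps, num_cycles)


if __name__ == "__main__":
    # Compare the 4-cycle and 8-cycle variants at a few sample steps.
    for step in (0, 125, 250, 500, 999):
        print(step, lr_at(step, num_cycles=4), lr_at(step, num_cycles=8))
```

Under these assumptions, doubling `num_cycles` from 4 to 8 halves the length of each decay, so the 8-cycle run returns to the full learning rate twice as often over the same number of steps.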