training with forecast_length = 1

#7
by Vanin - opened

Hi everyone!
I'm trying out different configurations. I only need a 1-step forecast in my app, and if I train the model with forecast_length = 1 instead of 96, I get a much better loss on the same dataset:
forecast_length = 96:
{'loss': 0.5172, 'grad_norm': 1.2684060335159302, 'learning_rate': 0.0007707993110636555, 'epoch': 1.0}
{'loss': 0.3486, 'grad_norm': 0.30313369631767273, 'learning_rate': 0.0009468165328964254, 'epoch': 2.0}
vs forecast_length = 1:
{'loss': 0.3329, 'grad_norm': 0.9496201872825623, 'learning_rate': 0.0007706020808719667, 'epoch': 1.0}
{'loss': 0.1024, 'grad_norm': 0.27570515871047974, 'learning_rate': 0.0009468843306027785, 'epoch': 2.0}

But I also get the following size-mismatch warning with forecast_length = 1:
Using a target size (torch.Size([64, 1, 16])) that is different to the input size (torch.Size([64, 96, 16])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.

So the question is:
to get a more accurate forecast for the first step, should I train the model with forecast_length = 1, or should I always use forecast_length = 96?
Thanks in advance!

IBM Granite org

Kindly do not change the prediction length value.

There is another parameter, prediction_filter_length, which you should set to 1 so that the fine-tuning loss is applied only to the first forecast point.
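For reference, a minimal sketch of how that might look when loading a TTM checkpoint for fine-tuning. The import path and checkpoint id below are assumptions based on the public tsfm_public examples, not confirmed by this thread; substitute whatever you are actually fine-tuning:

```python
# Minimal sketch, assuming the tsfm_public (granite-tsfm) TinyTimeMixer API.
from tsfm_public.models.tinytimemixer import TinyTimeMixerForPrediction

model = TinyTimeMixerForPrediction.from_pretrained(
    "ibm-granite/granite-timeseries-ttm-r1",  # assumed checkpoint id
    prediction_filter_length=1,  # apply the fine-tuning loss only to the first forecast point
)
```

This keeps the original 96-step prediction setup intact while focusing the optimization on the first step, so you avoid the shape mismatch that comes from retraining with forecast_length = 1.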

Thanks, Vijay!
