
multiple description candidates to facilitate DDR and HDR training

#1
by Arsenever - opened

Thanks for your good work!

  1. When will the repo release the multiple description candidates that facilitate DDR and HDR training?
  2. In some JSON files, I find 5 long captions that seem to differ very little. Which one should I use as the ground truth (GT)?

[Attached screenshot: 屏幕截图 2024-10-11 214520.png]

Alibaba-PAI org

Hi, thanks for your attention to our work.

  1. As stated in the paper, we train the HDR and DDR tasks on data samples that are synthesized randomly online, so there is no fixed dataset to release.
  2. These sentences are different expressions of the same semantic meaning, produced by the Desc. Rewrite step described in the paper. You can randomly choose one of them each time during training.
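
The random choice in point 2 could be sketched roughly as follows. This is only an illustration, not the repository's actual loader: the field name `captions`, the record layout, and the helper `pick_caption` are assumptions made for the example, so adapt them to the real JSON files.

```python
import json
import random


def pick_caption(sample, rng=random):
    """Return one caption chosen uniformly at random from a sample's candidates."""
    captions = sample["captions"]  # hypothetical field name, check the real JSON
    if not captions:
        raise ValueError("sample has no caption candidates")
    return rng.choice(captions)


# Hypothetical record standing in for a JSON file with five rewritten captions.
sample = json.loads(
    '{"image": "example.jpg", "captions": ["c0", "c1", "c2", "c3", "c4"]}'
)

rng = random.Random(0)  # seeded only so this demo is repeatable
for epoch in range(3):
    # Draw again on every visit, so all rewrites get used over the course of training.
    print(epoch, pick_caption(sample, rng))
```

Calling something like this inside the dataset's per-item loading code, rather than once at load time, gives a fresh caption on every pass over the data.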

Thanks for your reply! Could you tell me how long training takes and how many A100s it requires?

Alibaba-PAI org

We use 16 A100-80G GPUs and it takes about 10 hours for pre-training.

jpWang changed discussion status to closed
