Question regarding the training set
#1 opened by verbrannter
Hello,
You said you trained with your own proprietary dataset, but can you talk about how your dataset was structured?
I want to train something similar to your model, maybe with more fields, but I'm not quite sure how to structure my training dataset.
Kind regards
lars
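For reference, the public Donut repository documents a dataset layout with one metadata.jsonl file per split, where each line holds a "file_name" and a "ground_truth" JSON string containing a "gt_parse" object. Whether the proprietary dataset discussed here followed that layout is not stated in this thread. A minimal sketch of writing one such line, where the helper name, the image file name, and the invoice fields are all hypothetical:

```python
import json


def make_metadata_line(file_name, fields):
    """Build one metadata.jsonl line in the layout described in the Donut README.

    `fields` is the structured target for the image. The field names used
    below are invented for illustration and are not from the thread.
    """
    # "ground_truth" is itself a JSON string, so the parse is encoded twice.
    ground_truth = json.dumps({"gt_parse": fields})
    return json.dumps({"file_name": file_name, "ground_truth": ground_truth})


# Hypothetical invoice sample; add more keys here to extract more fields.
example_line = make_metadata_line(
    "invoice_0001.png",
    {
        "invoice_number": "INV-0001",
        "invoice_date": "2022-01-31",
        "total": "119.00",
    },
)
print(example_line)
```

Each split folder (train, validation, test) would hold its images next to its own metadata.jsonl, with one such line per image.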
to-be changed discussion status to closed
Hello @to-be, thank you for the demo. Since the dataset is confidential, I have two questions about the training you did:
- What hyperparameters did you choose for training? (I have a similar dataset, judging by your 3 test invoices.)
- How were you able to use a different input size? (I get a size-mismatch error when I change the default input size used in Donut.)
Thank you in advance.
train_batch_sizes:
  - 1
val_batch_sizes:
  - 2
input_size:
  - 1600
  - 1280
max_length: 256
align_long_axis: False
num_nodes: 1
seed: 2022
lr: 3e-05
warmup_steps: 300
num_training_samples_per_epoch: 1200
max_epochs: 100
max_steps: -1
num_workers: 4
val_check_interval: 1.0
check_val_every_n_epoch: 3
gradient_clip_val: 1.0
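To get a feel for what these values imply, here is a rough sketch of the schedule arithmetic. It assumes a single GPU (the device count is not stated in the post) and that the number of optimizer steps per epoch is the sample count divided by the effective batch size. The variable `num_devices` is an assumption and not part of the config above.

```python
# Values copied from the config above.
train_batch_size = 1
num_nodes = 1
num_training_samples_per_epoch = 1200
max_epochs = 100
warmup_steps = 300
check_val_every_n_epoch = 3

# Assumption: one GPU per node; the thread does not say how many were used.
num_devices = 1

effective_batch_size = train_batch_size * num_devices * num_nodes
steps_per_epoch = num_training_samples_per_epoch // effective_batch_size
total_steps = steps_per_epoch * max_epochs

# Share of the run spent in learning-rate warmup.
warmup_fraction = warmup_steps / total_steps

# Validation runs once every `check_val_every_n_epoch` epochs.
validation_runs = max_epochs // check_val_every_n_epoch

print(steps_per_epoch, total_steps, warmup_fraction, validation_runs)
```

Under these assumptions the warmup covers only a small part of the first epoch, and with more GPUs the step counts shrink proportionally.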
from transformers import VisionEncoderDecoderConfig

max_length = 768
image_size = [1920, 1280]
# image_size = [1280, 960]

# update image_size of the encoder
# during pre-training, a larger image size was used
config = VisionEncoderDecoderConfig.from_pretrained("naver-clova-ix/donut-base")
config.encoder.image_size = image_size  # (height, width)
# update max_length of the decoder (for generation)
config.decoder.max_length = max_length
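The snippet above only edits the config; the modified config still has to be passed in when the weights are loaded, otherwise the checkpoint's default image size is used. A minimal sketch of that step with the Hugging Face classes, where the helper name `load_donut_with_size` is hypothetical and the thread does not confirm that the author loaded the model exactly this way:

```python
def load_donut_with_size(image_size, max_length,
                         checkpoint="naver-clova-ix/donut-base"):
    """Load a Donut checkpoint with a custom encoder image size.

    Hypothetical helper for illustration. `image_size` is (height, width).
    """
    # Imported here so that defining the helper does not require transformers.
    from transformers import (
        VisionEncoderDecoderConfig,
        VisionEncoderDecoderModel,
    )

    config = VisionEncoderDecoderConfig.from_pretrained(checkpoint)
    config.encoder.image_size = list(image_size)  # (height, width)
    config.decoder.max_length = max_length        # used for generation

    # Passing the edited config makes the encoder use the new image size.
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint, config=config)
    return model, config
```

Calling `load_donut_with_size([1920, 1280], 768)` downloads the checkpoint on first use. The image processor also has to resize inputs to the same height and width; how that is configured depends on the installed transformers version.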