  • This model continues pre-training RoBERTa-base on discharge summaries from the MIMIC-III dataset.

  • Details can be found in the following paper:

Xiang Dai, Ilias Chalkidis, Sune Darkner, and Desmond Elliott. 2022. Revisiting Transformer-based Models for Long Document Classification. https://arxiv.org/abs/2204.06683

  • Important hyper-parameters:

Max sequence length: 4096
Batch size: 8
Learning rate: 5e-5
Training epochs: 6
Training time: 130 GPU-hours
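
For readers who want to reuse these settings, the following is a minimal sketch that records the hyper-parameters listed above in a small Python config object. The class name `PretrainConfig`, its field names, and the helper `max_tokens_per_batch` are hypothetical and chosen only for illustration; they do not come from the authors' code. The card does not say whether the batch size is per device or global, so the token count below is simply the product of the two listed values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PretrainConfig:
    """Hyper-parameters copied from the model card (field names are illustrative)."""

    max_seq_length: int = 4096   # "Max sequence" on the card
    batch_size: int = 8          # per-device vs. global is not stated on the card
    learning_rate: float = 5e-5
    num_epochs: int = 6

    def max_tokens_per_batch(self) -> int:
        # Upper bound on tokens in one batch if every sequence is full length.
        return self.max_seq_length * self.batch_size


config = PretrainConfig()
print(config.max_tokens_per_batch())  # 4096 * 8 = 32768
```

The values could then be passed to whichever training framework is in use; how the fields map to that framework's options is left to the reader.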