
Description

This is a Polish GPT-2 model in the small architecture variant.

This model was released on 30.11.2023 and is the newest version of radlab/polish-gpt2-small (https://huggingface.co/radlab/polish-gpt2-small).
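Since the model follows the standard GPT-2 setup, it can presumably be loaded with the Hugging Face transformers library. A minimal sketch (the repository id is taken from this card; the prompt is an arbitrary example):

```python
def generate(prompt: str, max_new_tokens: int = 50) -> str:
    """Generate a Polish continuation of `prompt` with the model."""
    # Standard causal-LM loading; assumes the checkpoint on the Hub
    # ships a GPT-2 tokenizer and model config.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("radlab/polish-gpt2-small-v2")
    model = AutoModelForCausalLM.from_pretrained("radlab/polish-gpt2-small-v2")

    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Example prompt: "Poland is a country that..."
    print(generate("Polska to kraj, który"))
```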

Datasets

Data used to train this model:

  • clarin-knext/msmarco-pl
  • clarin-knext/nq-pl
  • clarin-knext/hotpotqa-pl
  • clarin-knext/scidocs-pl
  • clarin-knext/nfcorpus-pl
  • clarin-knext/dbpedia-pl
  • clarin-knext/trec-covid-pl
  • clarin-knext/quora-pl
  • clarin-knext/arguana-pl
  • clarin-knext/fiqa-pl
  • radlab/wikipedia-pl
  • radlab/legal-mc4-pl
  • our own corpora, not yet published

This amounts to about 30.5 GB of data, roughly three times more than the previous version.
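The public corpora listed above can presumably be fetched from the Hugging Face Hub with the datasets library. A sketch (the unpublished in-house corpora are omitted, since they have no public ids):

```python
def load_training_corpora(ids=None):
    """Download each public training corpus from the Hugging Face Hub."""
    from datasets import load_dataset

    if ids is None:
        ids = DATASET_IDS
    return {name: load_dataset(name) for name in ids}


# Dataset identifiers listed in the model card.
DATASET_IDS = [
    "clarin-knext/msmarco-pl",
    "clarin-knext/nq-pl",
    "clarin-knext/hotpotqa-pl",
    "clarin-knext/scidocs-pl",
    "clarin-knext/nfcorpus-pl",
    "clarin-knext/dbpedia-pl",
    "clarin-knext/trec-covid-pl",
    "clarin-knext/quora-pl",
    "clarin-knext/arguana-pl",
    "clarin-knext/fiqa-pl",
    "radlab/wikipedia-pl",
    "radlab/legal-mc4-pl",
]
```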

Metrics from W&B

(Training metric plots exported from Weights & Biases; images not reproduced here.)

Changelog

  • 2023.11.30 - new dataset
Model size: 126M params (F32, Safetensors).
