This model continues pretraining from scibert-base with a masked language modeling (MLM) loss. The training corpus consists of abstracts from roughly 270,000 earth science publications.
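For readers unfamiliar with the MLM objective, here is a minimal sketch of how input tokens are usually corrupted before the model is asked to recover them. It assumes the standard BERT recipe (select about 15% of positions; of those, 80% become `[MASK]`, 10% become a random token, 10% stay unchanged). The card does not state which masking settings were used for this model, so the function name `mask_tokens`, the `mask_prob` default, and the toy vocabulary are illustrative assumptions only.

```python
import random

MASK_TOKEN = "[MASK]"  # BERT-style mask token; assumed, not confirmed by the card


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Return (corrupted_tokens, labels) following the usual BERT MLM recipe.

    labels[i] holds the original token at positions the model must predict,
    and None at positions that do not contribute to the loss.
    """
    rng = random.Random(seed)
    corrupted = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        # Skip positions that are not selected for prediction.
        if rng.random() >= mask_prob:
            continue
        labels[i] = tok
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK_TOKEN         # 80%: replace with [MASK]
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token in place
    return corrupted, labels


# Toy example with a hypothetical earth science sentence.
tokens = "basalt forms when lava cools rapidly at the surface".split()
vocab = sorted(set(tokens))
corrupted, labels = mask_tokens(tokens, vocab, seed=0)
```

The loss is computed only at positions where `labels` is not `None`, which is why unselected tokens pass through untouched.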

The tokenizer was trained on the same corpus and can be loaded with AutoTokenizer.
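A minimal sketch of loading the tokenizer and model with the `transformers` library and running a fill-mask query. The repository id below is a hypothetical placeholder, since this card does not state the model's Hub id; replace it with the actual one. The helper name `top_predictions` is also illustrative.

```python
# Hypothetical placeholder: substitute the real Hub repository id of this model.
MODEL_ID = "your-org/your-earth-science-scibert"


def top_predictions(text_with_blank, model_id=MODEL_ID, top_k=5):
    """Fill the '<blank>' marker in the text and return (token, score) pairs.

    Requires the `transformers` package and network access to download the model.
    """
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForMaskedLM.from_pretrained(model_id)
    fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)

    # Use the tokenizer's own mask token rather than hard-coding "[MASK]".
    text = text_with_blank.replace("<blank>", tokenizer.mask_token)
    results = fill_mask(text, top_k=top_k)
    return [(r["token_str"], r["score"]) for r in results]
```

Usage, once `MODEL_ID` points at the real repository: `top_predictions("The mantle is composed mainly of <blank> rock.")`.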

Stay tuned for further downstream task tests and updates to the model.

In the works:

MLM + NSP task loss
Add more data sources for training
Test using downstream tasks

Model size: 110M params (Safetensors; tensor types I64, F32)
