This model is based on bigscience/bloom-1b7.

We pruned its vocabulary from 250880 to 46145 with Chinese corpus to reduce GPU memory usage. So the total parameter is 1.4b now.

How to use

from transformers import BloomTokenizerFast, BloomForCausalLM

tokenizer = BloomTokenizerFast.from_pretrained('Langboat/bloom-1b4-zh')
model = BloomForCausalLM.from_pretrained('Langboat/bloom-1b4-zh')

print(tokenizer.batch_decode(model.generate(tokenizer.encode('中国的首都是', return_tensors='pt'))))

Downloads last month: 5,435

Safetensors

Model size

1.4B params

Tensor type

F32

Inference Examples

Text Generation

This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Langboat
/

bloom-1b4-zh

How to use

Spaces using Langboat/bloom-1b4-zh 3