Invalid weights don't match the modeling code
#3
opened by winglian
The modeling code this model references has split the expert weights, but this model's checkpoint isn't split that way:
size mismatch for transformer.blocks.38.ffn.experts.mlp.9.v1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([10752, 6144]).
This is a converted version of the original model that changes some architecture components in order to enable bnb quantization (see https://huggingface.co/databricks/dbrx-instruct/discussions/10#660921b553b869c928b0c5d0).
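For context, here is a minimal sketch of why the shapes disagree and how a converted checkpoint like this one is typically loaded. The arithmetic follows from bitsandbytes 4-bit packing (two 4-bit values per uint8 byte); the repo id and the use of trust_remote_code below are illustrative assumptions, not taken from this thread.

```python
# Why the checkpoint tensor is [33030144, 1] while the dense module expects [10752, 6144]:
rows, cols = 10752, 6144            # dense shape expected by the unconverted modeling code
dense_elems = rows * cols           # 66,060,288 weight values
packed_bytes = dense_elems // 2     # bnb 4-bit packs two 4-bit values into each uint8 byte
assert packed_bytes == 33_030_144   # matches the flat [33030144, 1] tensor in the error

# Loading sketch, assuming the converted checkpoint ships its own modified modeling
# code. The repo id is a placeholder; substitute the actual model id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PrunaAI/dbrx-instruct-bnb-4bit"  # assumption, not confirmed in this thread

# The packed 4-bit shapes only line up with the repo's own modeling code,
# so load with trust_remote_code=True rather than the upstream DBRX classes.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    device_map="auto",
)
```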
johnrachwanpruna changed discussion status to closed