Invalid weights don't match the modeling code
#3
opened by winglian
The modeling code this model references has split the expert weights, but this model's checkpoint isn't split that way:
size mismatch for transformer.blocks.38.ffn.experts.mlp.9.v1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([10752, 6144]).
This is a converted version of the original model that changes some architecture components in order to enable bnb quantization (see https://huggingface.co/databricks/dbrx-instruct/discussions/10#660921b553b869c928b0c5d0).
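For context, here is a minimal sketch of why the shapes disagree and how a converted checkpoint like this one is typically loaded. The arithmetic follows from bitsandbytes 4-bit packing (two 4-bit values per uint8 byte); the repo id and the use of trust_remote_code below are illustrative assumptions, not taken from this thread.

```python
# Why the checkpoint tensor is [33030144, 1] while the dense module expects [10752, 6144]:
rows, cols = 10752, 6144            # dense shape expected by the unconverted modeling code
dense_elems = rows * cols           # 66,060,288 weight values
packed_bytes = dense_elems // 2     # bnb 4-bit packs two 4-bit values into each uint8 byte
assert packed_bytes == 33_030_144   # matches the flat [33030144, 1] tensor in the error

# Loading sketch, assuming the converted checkpoint ships its own modified modeling
# code. The repo id is a placeholder; substitute the actual model id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PrunaAI/dbrx-instruct-bnb-4bit"  # assumption, not confirmed in this thread

# The packed 4-bit shapes only line up with the repo's own modeling code,
# so load with trust_remote_code=True rather than the upstream DBRX classes.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    device_map="auto",
)
```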
johnrachwanpruna changed discussion status to closed