THUDM/chatglm2-6b

Model parallelism raises an error; proposed fix

#54

by yuanzhoulvpi - opened Jul 17, 2023

base: refs/heads/main ← from: refs/pr/54

modeling_chatglm.py +1 -1

@@ -952,7 +952,7 @@ class ChatGLMForConditionalGeneration(ChatGLMPreTrainedModel):
             # Shift so that tokens < n predict n
             shift_logits = lm_logits[..., :-1, :].contiguous()
-            shift_labels = labels[..., 1:].contiguous()
+            shift_labels = labels[..., 1:].contiguous().to(shift_logits.device)
             # Flatten the tokens
             loss_fct = CrossEntropyLoss(ignore_index=-100)
             loss = loss_fct(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))
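
The change can be tried outside the model with a small standalone sketch. It assumes the error comes from `labels` living on a different device than `lm_logits` when the model is split across devices; the helper name `shifted_lm_loss` and the toy shapes are hypothetical and are not part of `modeling_chatglm.py`. With fewer than two GPUs everything stays on the CPU and the `.to(...)` call is a no-op, so the snippet still runs.

```python
import torch
from torch.nn import CrossEntropyLoss


def shifted_lm_loss(lm_logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper mirroring the patched loss computation."""
    # Shift so that tokens < n predict n
    shift_logits = lm_logits[..., :-1, :].contiguous()
    # Move the labels to wherever the logits ended up (the fix in this PR)
    shift_labels = labels[..., 1:].contiguous().to(shift_logits.device)
    # Flatten the tokens
    loss_fct = CrossEntropyLoss(ignore_index=-100)
    return loss_fct(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))


torch.manual_seed(0)

# Assumption: with model parallelism the logits come from the last device.
# Mimic that with a second GPU when one exists, otherwise fall back to the CPU.
if torch.cuda.device_count() > 1:
    logits_device = torch.device("cuda:1")
else:
    logits_device = torch.device("cpu")

# Toy shapes (hypothetical): batch 2, sequence length 5, vocabulary size 11.
lm_logits = torch.randn(2, 5, 11, device=logits_device)
labels = torch.randint(0, 11, (2, 5))  # labels start on the CPU, like a typical batch
labels[0, :2] = -100                   # positions ignored by the loss

loss = shifted_lm_loss(lm_logits, labels)
print(loss.device, loss.item())
```

Moving the labels rather than the logits keeps the large `[batch, seq, vocab]` tensor where it is and copies only the small integer tensor.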