Why are the logits converted to float before computing softmax?
#10 opened 5 months ago
by
yuanshuai
How did you pretrain the tokenizer using tiktoken?
#9 opened 6 months ago
by
StephennFernandes
Can it run on two GPUs of different models?
#8 opened 8 months ago
by
XCZDH
Adding Evaluation Results
#7 opened 9 months ago
by
leaderboard-pr-bot
How many English tokens was the model trained on?
3
#5 opened 10 months ago
by
aslawliet
_set_gradient_checkpointing() got an unexpected keyword argument 'enable'
2
#3 opened 12 months ago
by
ehartford