Update RotaryEmbedding caching
#33 opened by beibin79
- Save computation when calls alternate between sequence lengths: a sequence length that was seen before reuses its cached values instead of being recomputed.
- Bug fix: the cos_sin(...) function could return (None, None) under multi-threading on some hardware. The new dict-based cache design resolves this issue.
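
The dict-based caching described above could look roughly like the sketch below. It is not the code from this PR. The class name `RotaryCacheSketch`, the lock, and the `computations` counter are assumptions made for illustration, and plain Python lists stand in for tensors. The idea is that each sequence length gets its own dict entry, and an entry is only stored once the (cos, sin) pair is fully built, so a caller never observes a half-initialized (None, None) state.

```python
import math
import threading


class RotaryCacheSketch:
    """Hypothetical stand-in for a rotary embedding's cos/sin cache."""

    def __init__(self, dim, base=10000.0):
        # Standard RoPE inverse frequencies, one per pair of channels.
        self.inv_freq = [base ** (-(2 * i) / dim) for i in range(dim // 2)]
        self._cache = {}               # seq_len -> (cos, sin)
        self._lock = threading.Lock()  # assumption: serializes cache fills
        self.computations = 0          # illustration only: counts real work

    def _compute(self, seq_len):
        self.computations += 1
        cos = [[math.cos(t * f) for f in self.inv_freq] for t in range(seq_len)]
        sin = [[math.sin(t * f) for f in self.inv_freq] for t in range(seq_len)]
        return cos, sin

    def cos_sin(self, seq_len):
        entry = self._cache.get(seq_len)
        if entry is None:
            with self._lock:
                # Re-check inside the lock: another thread may have filled it.
                entry = self._cache.get(seq_len)
                if entry is None:
                    entry = self._compute(seq_len)
                    # Store the complete pair in a single assignment.
                    self._cache[seq_len] = entry
        return entry


rope = RotaryCacheSketch(dim=4)
for n in (8, 16, 8, 16):   # alternating sequence lengths
    cos, sin = rope.cos_sin(n)
print(rope.computations)   # each distinct length is computed once
```

A single-slot cache that remembers only the most recent sequence length would recompute on every call in the alternating pattern above, whereas the dict keeps one entry per length. A real implementation would likely also need to account for device and dtype in the cache key, which this sketch leaves out.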