responses frequently cut off?

#8
by fffantasy - opened

Hi, I really like this model, and when it works, it works really well. However, I frequently run into a problem where the model's output cuts off before the end of the response (for example, in the middle of a sentence). By my estimate, it happens about 2 out of 5 times.

For what it's worth, my use case is story writing, and the prompts tend to be fairly long. I'm also using Alpaca-style prompting as described here: https://huggingface.co/lizpreciatior/lzlv_70b_fp16_hf/discussions/2#6548df6b5491ffcf2e4433c2
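For reference, this is roughly how I assemble the prompt (a minimal sketch of the generic Alpaca template; the exact wording used in the linked thread may differ):

```python
# Sketch of generic Alpaca-style prompt assembly. The preamble and
# section markers below follow the widely used Alpaca format; the
# linked discussion's exact phrasing may vary.

def build_alpaca_prompt(instruction: str, user_input: str = "") -> str:
    """Assemble a single-turn Alpaca-style prompt string."""
    prompt = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
    )
    if user_input:
        prompt += f"### Input:\n{user_input}\n\n"
    prompt += "### Response:\n"
    return prompt

print(build_alpaca_prompt("Continue the story.", "Once upon a time..."))
```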

The cutoff doesn't seem to be caused by the context window, either; most of the time the total input + output is well below 4k tokens.
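This is the kind of sanity check I'm doing to stay under the window (a crude word-count heuristic, not the model's actual tokenizer, so the numbers are only approximate):

```python
# Crude token estimate used only as a sanity check; the model's real
# tokenizer will count differently, but this gives a rough upper bound.
def estimate_tokens(text: str) -> int:
    # Rule of thumb: roughly 1.3 tokens per whitespace-separated word.
    return int(len(text.split()) * 1.3)

CONTEXT_WINDOW = 4096  # assumed 4k window, per the numbers above

prompt = "A long story-writing prompt... " * 50   # stand-in input
response = "The model's reply so far... " * 20    # stand-in output
total = estimate_tokens(prompt) + estimate_tokens(response)
print(f"~{total} of {CONTEXT_WINDOW} tokens used")
# Well under the window, so the cutoff shouldn't be context truncation.
assert total < CONTEXT_WINDOW
```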

Any tips on how to make this happen less often? Is this a known issue? Thanks!
