GPU memory increases after each prompt
#7 by thanhpt93 - opened
Hi, I loaded the model with 15 GB of GPU memory, but after each prompt the GPU memory usage increases. How can I release the memory? Thank you very much!
I suggest calling torch.cuda.empty_cache() after every 2-3 prompts. It may not be the most effective solution, especially if the questions are not connected and the model needs to maintain context between them. Using a stronger GPU such as an A100, A40, or L40 is a much better approach.
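A minimal sketch of how this could look in a generation loop, assuming a Hugging Face transformers causal LM (the model ID and generation settings are placeholders, not from this thread): drop references to the per-prompt tensors, run garbage collection, then call torch.cuda.empty_cache() so the allocator returns cached blocks to the driver.

```python
# Sketch only: free per-prompt GPU memory between unrelated prompts.
import gc
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-model-id"  # placeholder, replace with the model you loaded
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

def generate(prompt: str) -> str:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=256)
    text = tokenizer.decode(output_ids[0], skip_special_tokens=True)

    # Drop references to the input/output tensors so their memory can be reused,
    # then release cached allocator blocks back to the driver.
    del inputs, output_ids
    gc.collect()
    torch.cuda.empty_cache()
    return text

# Optional check: see how much memory is still allocated after a prompt.
print(torch.cuda.memory_allocated() / 1e9, "GB allocated")
```

Note that this only clears the PyTorch caching allocator; memory held by live objects (e.g. an accumulating conversation or stored past_key_values) is not freed until those references are dropped.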
thanhpt93 changed discussion status to closed