Great work!

#1
by jukofyork - opened

Just had a look through your code and I think you might even be able to implement an idea I had that may make the control vectors work even better:

Calculate the L2-norm of the hidden state before adding the control vector(s), add on whatever combination of control vectors you want, measure the new L2-norm, then scale the result by the ratio of the two norms so it goes back to the original L2-norm.
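As a rough illustration of the steps above, here is a minimal pure-Python sketch. The function names (`l2_norm`, `apply_control_vectors`), the `weights` parameter, and the `eps` guard are all hypothetical and not taken from the repo or from llama.cpp; real code would do the same thing with tensor ops on each layer's hidden state.

```python
import math

def l2_norm(v):
    """Euclidean (L2) norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def apply_control_vectors(hidden, control_vectors, weights, eps=1e-12):
    """Add weighted control vectors, then rescale to the original L2-norm.

    hidden, control_vectors and weights are illustrative names (assumptions).
    """
    original_norm = l2_norm(hidden)

    # Add on whatever combination of control vectors is wanted.
    steered = list(hidden)
    for vec, w in zip(control_vectors, weights):
        steered = [s + w * c for s, c in zip(steered, vec)]

    # Re-measure the norm and scale back by the ratio of old to new.
    new_norm = l2_norm(steered)
    if new_norm < eps:
        # Degenerate case: the vectors cancelled the hidden state entirely,
        # so there is no direction left to rescale.
        return steered
    scale = original_norm / new_norm
    return [s * scale for s in steered]

# Toy example: a 2-D "hidden state" with norm 5 and two control vectors.
hidden = [3.0, 4.0]
steered = apply_control_vectors(hidden, [[2.0, 0.0], [0.0, 2.0]], [1.0, 1.0])
```

In the toy example the direction of `steered` changes but its norm stays equal to that of `hidden`, so only the ratios between the weights matter, not their overall size.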

Mathematically this makes more sense when combining multiple control vectors:

  • It acts more like a rotation / SLERP.
  • Some of the 8 sets of control vectors are (much) more correlated than others (even though I tried hard to reduce this via the prompt wording), and combining them can be very hit or miss.
  • It lets the user concentrate on just choosing the ratios of the different control vectors.
  • It greatly reduces the chance of "overcooking" the hidden states' magnitudes and causing gibberish output.

It may even work better for single control vectors too!?

The only reason I didn't try this in llama.cpp is that the ggml code is a complete nightmare to work with (mainly due to llama.h being C-only, yet common.h and common.cpp are C++), and also every different architecture is implemented as a separate function with lots of duplicated code... :/

Thank you :) You're right, I was wondering about that as well, especially when adding multiple vectors with weight 1. I'll try that and commit if it works fine.

I've implemented the change. It seems to work fine, but I haven't done any comparisons yet.
