(Partial) Feedback
I haven't had time to evaluate this properly yet (e.g. long RP sessions), but my initial impression is that the A version holds a lot more coherence and logic than the B version. What I have tested so far seems to be a pretty good model, and I hope to put in the time to really test it soon.
As a side note, Stheno has a v3.2 version, and the changes don't seem to be minor. It's supposed to be a LOT better, particularly in general logic. Since it's one of the components of these models, it might be worth redoing both with the 3.2 version to see what happens. My bet is it would improve both.
EDIT: OK, I had time for a more proper session. RoPE was indeed set wrong before. Not sure how that happened, but I think it was way off. The GradientAI numbers seem to be best. At 16K context it needs a RoPE base somewhere around 1776948.8, give or take. (On my hardware exactly 2000000 gave me the best PPL results.) Once I tried again with it set correctly, it gave me some pretty good quality writing. RP output was quite detailed and felt pretty decent. I'm not sure yet about other details like character card adherence, but at least the basic RP writing quality is good.
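For anyone wondering why the base value matters so much, here is a rough sketch of what it changes. It assumes a head dimension of 128 and a stock base of 500000, which are my assumptions for a Llama 3 8B style model, not something confirmed for these particular models. It only shows how the rotation wavelengths stretch as the base goes up; it does not reproduce how the GradientAI numbers were derived.

```python
import math

def rope_wavelengths(theta: float, head_dim: int = 128) -> list[float]:
    """Wavelength, in tokens, of each RoPE frequency pair for a given base.

    Pair i rotates by position * theta**(-2*i/head_dim) radians, so one full
    turn takes 2*pi * theta**(2*i/head_dim) tokens.
    head_dim=128 is an assumed value, not taken from the models discussed.
    """
    return [2 * math.pi * theta ** (2 * i / head_dim) for i in range(head_dim // 2)]

def longest_wavelength(theta: float, head_dim: int = 128) -> float:
    """Wavelength of the slowest-rotating pair."""
    return rope_wavelengths(theta, head_dim)[-1]

# 500000 is the assumed stock base; the other two are the values from this post.
for theta in (500_000.0, 1_776_948.8, 2_000_000.0):
    print(f"base={theta:>12,.1f}  longest wavelength ~ {longest_wavelength(theta):,.0f} tokens")
```

A larger base makes the slow pairs rotate even more slowly, so positions further apart still get distinguishable angles. That is the general idea behind raising the base for longer context, though the best value still seems to need testing per setup.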
I'm definitely even more curious about what a Stheno v3.2 version might yield.
Unfortunately, it still seems to break down for me at around 10K context. At least so far.