arxiv:2310.08164
Abdullah
amirabdullah19852020
AI & ML interests
Mechanistic interpretability, high-dimensional geometry, persona role-playing.
Papers
1
Models
16
amirabdullah19852020/interpreting_reward_models
amirabdullah19852020/test
Text Generation
18
amirabdullah19852020/gpt-neo-125m_hh_reward
Text Generation
29
amirabdullah19852020/gpt-neo-125m_utility_reward
Reinforcement Learning
14
amirabdullah19852020/pythia-70m_sentiment_reward
Reinforcement Learning
32
amirabdullah19852020/pythia-160m_sentiment_reward
Reinforcement Learning
9
amirabdullah19852020/gpt-neo-125m_sentiment_reward
Reinforcement Learning
13
amirabdullah19852020/pythia-160m_utility_reward
Reinforcement Learning
13
amirabdullah19852020/pythia-70m_utility_reward
Reinforcement Learning
15
amirabdullah19852020/gpt-j-6b-sharded-bf16_sentiment_reward
Reinforcement Learning