🔍 Today's pick in Interpretability & Analysis of LMs: Context versus Prior Knowledge in Language Models by @kdu4108, @vesteinn, @niklasstoehr, J. C. White, A. Schein and @rcotterell
This work examines the influence of context versus memorized knowledge in LMs by measuring how contexts of varying informativeness shift the model's predictive distribution. Understanding this difference is especially important when memorized and contextual information conflict.
The authors propose disentangling context influence into "persuasion", i.e. how strongly including a given context changes the answers for a query/entity pair, and "susceptibility", i.e. how easily the answers for a query/entity pair are swayed by the presence of context. Both concepts are operationalized with information-theoretic measures akin to mutual information.
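To make the two notions concrete, here is a minimal Python sketch over toy answer distributions. It is an illustration under stated assumptions, not the paper's exact definitions or the measureLM code: persuasion is approximated as the KL divergence between the answer distribution with a context and a baseline distribution, and susceptibility as the mutual information between context and answer, computed as the weighted average KL from each per-context distribution to their mixture. The function names and the uniform weighting over contexts are hypothetical choices.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same answers.

    Assumes q[i] > 0 wherever p[i] > 0 (otherwise the divergence is infinite).
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def persuasion(p_with_context, p_baseline):
    """Hypothetical persuasion score of one context for one query/entity pair:
    how far the context moves the answer distribution away from a baseline."""
    return kl_divergence(p_with_context, p_baseline)


def susceptibility(answer_dists_per_context, context_weights=None):
    """Hypothetical susceptibility score of one query/entity pair:
    mutual information between the context and the answer, i.e. the
    weighted average KL from each per-context answer distribution to
    the mixture over all contexts."""
    n = len(answer_dists_per_context)
    # Assumption: contexts are weighted uniformly unless weights are given.
    weights = context_weights or [1.0 / n] * n
    num_answers = len(answer_dists_per_context[0])
    mixture = [
        sum(w * dist[a] for w, dist in zip(weights, answer_dists_per_context))
        for a in range(num_answers)
    ]
    return sum(
        w * kl_divergence(dist, mixture)
        for w, dist in zip(weights, answer_dists_per_context)
    )


# Toy example: two possible answers, two contexts.
stubborn = [[0.9, 0.1], [0.9, 0.1]]   # answers ignore the context
swayable = [[1.0, 0.0], [0.0, 1.0]]   # answers follow the context entirely

print(susceptibility(stubborn))  # no dependence on context
print(susceptibility(swayable))  # maximal dependence for two answers
print(persuasion([1.0, 0.0], [0.5, 0.5]))
```

In this toy setup, a pair whose answers never move with the context gets a susceptibility of zero, while a pair whose answers flip entirely with the context reaches the maximum for two equally weighted contexts (log 2 nats). With a real model, the distributions would come from the model's next-token probabilities over candidate answers.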
The two metrics are validated using a synthetic dataset sourced from a knowledge graph. Analysis shows that:
- The persuasiveness of relevant contexts grows with model size (interesting implications here for the jailbreaking of LLMs!)
- Assertive contexts tend to be more persuasive for closed (yes/no) queries and for mid-sized models
- Negation affects context persuasiveness
- Familiar entities (explored as real vs. fake, more frequent in training data and more connected in the KG) are less susceptible to context influence
Finally, the authors suggest applying the persuasion/susceptibility framing to social science analyses and gender bias evaluation.
💻 Code: https://github.com/kdu4108/measureLM
📄 Paper: Context versus Prior Knowledge in Language Models (2404.04633)
🔍 All daily picks: https://huggingface.co/collections/gsarti/daily-picks-in-interpretability-and-analysis-ofc-lms-65ae3339949c5675d25de2f9