Do LLM Shibboleths Affect Calibration?
reading time: 0.28 mins
status: question
published: 2026-05-09
updated: 2026-06-29
A research question about whether ideological shibboleths, beliefs, or preferences in LLMs affect scoring and calibration.
Operative Question
Do LLMs typically exhibit shibboleths, ideological beliefs, or stable preferences? If those are trained out, does the model become better calibrated when it scores arguments, predictions, actions, or evidence?
Hypotheses
Blank for collaborative planning.
Experiments
Blank for collaborative planning.
Results
Blank for results.
Notes
Any study of this question should probably distinguish at least three things:
- surface shibboleths in language,
- deeper preferences or policy tendencies,
- scoring/calibration behavior on downstream tasks.
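For the third item, it may help to pin down what "calibration" would be measured as. The note does not name a metric, so the following is only a minimal sketch under assumptions: the model emits a probability in [0, 1] for each scored item, each item has a known binary outcome, and calibration is summarized by the Brier score and a binned expected calibration error (ECE). The function names and the choice of 10 equal-width bins are illustrative, not part of the original question.

```python
from typing import List, Sequence, Tuple


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared gap between stated probability and the 0/1 outcome."""
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """Binned ECE: weighted average of |mean confidence - hit rate| per bin.

    Assumption: equal-width bins over [0, 1]; other binning schemes exist.
    """
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        # A probability of exactly 1.0 goes into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for items in bins:
        if not items:
            continue
        mean_conf = sum(p for p, _ in items) / len(items)
        hit_rate = sum(y for _, y in items) / len(items)
        ece += (len(items) / total) * abs(mean_conf - hit_rate)
    return ece


if __name__ == "__main__":
    # Made-up numbers for illustration only; not experimental results.
    outcomes = [1, 0, 1, 1, 0]
    scores_before = [0.9, 0.8, 0.6, 0.7, 0.4]
    scores_after = [0.8, 0.3, 0.7, 0.8, 0.2]
    for label, scores in [("before", scores_before), ("after", scores_after)]:
        print(label, brier_score(scores, outcomes),
              expected_calibration_error(scores, outcomes))
```

Under this framing, the operative question becomes a comparison of these numbers on the same items before and after the training intervention, ideally alongside separate measures for the first two items so that a change in surface language is not mistaken for a change in calibration.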