Do LLM Shibboleths Affect Calibration?

status: question

published: 2026-05-09

updated: 2026-06-29

A research question about whether ideological shibboleths, beliefs, or preferences in LLMs affect scoring and calibration.

Operative Question

Do LLMs typically exhibit shibboleths, ideological beliefs, or stable preferences? If those are trained out, does calibration improve when the model scores arguments, predictions, actions, or evidence?

Hypotheses

Blank for collaborative planning.

Experiments

Blank for collaborative planning.

Results

Blank for results.

Notes

The investigation should probably distinguish at least three things:

surface shibboleths in language,
deeper preferences or policy tendencies,
scoring/calibration behavior on downstream tasks.
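
The third item, scoring/calibration behavior, could be made measurable with standard calibration metrics. The sketch below is one possible starting point, not a chosen method: it assumes the model's scores can be read as probabilities in [0, 1] and that each scored item has a binary ground-truth outcome. The function names and the 10-bin default are illustrative choices, not part of this page's plan.

```python
from typing import Sequence


def _check(probs: Sequence[float], outcomes: Sequence[int]) -> None:
    """Validate inputs shared by both metrics."""
    if len(probs) == 0:
        raise ValueError("need at least one prediction")
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have the same length")


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between predicted probability and 0/1 outcome.

    Lower is better; 0.0 means every prediction was certain and correct.
    """
    _check(probs, outcomes)
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """Equal-width-bin expected calibration error (ECE).

    Predictions are grouped by predicted probability; within each bin the
    mean prediction is compared with the observed frequency of outcome 1.
    The per-bin gaps are averaged, weighted by bin size.
    """
    _check(probs, outcomes)
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        # Clamp so that p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, o))

    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        observed = sum(o for _, o in members) / len(members)
        ece += (len(members) / total) * abs(mean_prob - observed)
    return ece
```

Under these assumptions, the same held-out set of scored items could be evaluated before and after an intervention that removes shibboleths or preferences, and the two metric values compared. Whether probabilities can be elicited reliably from the model is itself an open question for the experiments.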