Trust Calibration
Canonical: Confidence
Cognitive Load: Low
Evidence: Production validated
Impact: Feature
Ethical Guardrail
Never present all outputs with the same confidence level. Never use vague language like "pretty sure." Never hide low-confidence outputs behind optimistic framing.
Design Intent
Trust Calibration dynamically signals exactly how much the user should trust any given agent output, so that the scrutiny a user applies tracks the output's actual reliability instead of defaulting to blanket acceptance or blanket doubt.
Psychology Principle
Trust is not binary -- it must be calibrated to reality. Over-trust leads to blind acceptance; under-trust leads to constant overrides.
Description
Dynamically signal exactly how much users should trust each agent output using visual weight, language, and evidence linkage.
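One way to represent such a signal in a front end, as a minimal TypeScript sketch. The type names, fields, and band thresholds below are illustrative assumptions, not part of this pattern's specification.

```typescript
// Hypothetical shape for a calibrated trust signal.
type ConfidenceBand = "high" | "medium" | "low";

interface EvidenceLink {
  source: string;      // e.g. "past-project:invoice-redesign"
  relevance: number;   // 0..1, how strongly this evidence supports the output
}

interface TrustSignal {
  confidence: number;        // raw model confidence, 0..1
  band: ConfidenceBand;      // discretized for consistent visual weight
  evidence: EvidenceLink[];  // ties the signal to specific reasons
  recommendHumanReview: boolean;
}

// Discretize raw confidence into a band; thresholds are assumptions
// to be tuned per domain.
function toBand(confidence: number): ConfidenceBand {
  if (confidence >= 0.85) return "high";
  if (confidence >= 0.6) return "medium";
  return "low";
}
```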
When to use
Every agent output or recommendation -- especially when confidence varies across outputs.
Example
Claude Response: "High confidence (92%) -- pulled from 8 of your past projects. Low confidence on legal nuance -- recommend human review."
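A sketch of how copy like the example above could be generated. The function name and inputs (sourceCount, caveat) are hypothetical.

```typescript
// Render calibrated copy from a raw confidence score; phrasing and
// percentages mirror the example above.
function renderTrustCopy(
  confidence: number,   // 0..1
  sourceCount: number,  // evidence items backing the output
  caveat?: string       // known weak spot, e.g. "legal nuance"
): string {
  const pct = Math.round(confidence * 100);
  const band =
    confidence >= 0.85 ? "High" : confidence >= 0.6 ? "Medium" : "Low";
  let copy = `${band} confidence (${pct}%) -- pulled from ${sourceCount} of your past projects.`;
  if (caveat) {
    copy += ` Low confidence on ${caveat} -- recommend human review.`;
  }
  return copy;
}

// renderTrustCopy(0.92, 8, "legal nuance")
// -> "High confidence (92%) -- pulled from 8 of your past projects.
//     Low confidence on legal nuance -- recommend human review."
```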
Autonomy Compatibility
Behavioral Objective
Users apply the appropriate level of scrutiny to each agent output based on the visible trust signal.
- More accurate final decisions
- Reduced over-reliance and under-reliance
- Faster development of an accurate mental model of the agent
Target Actor
role: Everyday user
environment: Mixed-confidence AI recommendations
emotional baseline: Needs to know when to trust vs. verify
ai familiarity: Medium
risk tolerance: Medium
Execution Model
signal_design: Use consistent visual + textual cues for confidence.
behavioral_signal: User cannot quickly tell the confidence level.
evidence_linkage: Tie the signal to specific reasons.
behavioral_signal: The signal feels arbitrary.
adaptive_learning: Let the system improve its calibration based on user feedback (see the sketch after this list).
behavioral_signal: Calibration never improves for the individual user.
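A minimal sketch of the adaptive_learning element, assuming feedback arrives as per-output correct/incorrect judgments: a per-user histogram-binning calibrator that replaces raw confidence with the accuracy this user has actually observed. Class and method names are illustrative.

```typescript
// Per-user recalibration from feedback. Keeps a running correct/total
// history per confidence bucket and maps raw confidence to the
// empirically observed accuracy in that bucket.
class UserCalibrator {
  private hits: number[] = new Array(10).fill(0);
  private total: number[] = new Array(10).fill(0);

  // Record whether an output at `confidence` turned out correct.
  recordFeedback(confidence: number, wasCorrect: boolean): void {
    const bin = Math.min(9, Math.floor(confidence * 10));
    this.total[bin] += 1;
    if (wasCorrect) this.hits[bin] += 1;
  }

  // Replace raw confidence with this user's observed accuracy in the
  // same bucket; fall back to the raw value until enough data exists.
  calibrate(confidence: number): number {
    const bin = Math.min(9, Math.floor(confidence * 10));
    return this.total[bin] >= 5 ? this.hits[bin] / this.total[bin] : confidence;
  }
}
```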
Failure Modes
- Signals are ignored over time -> Keep them subtle yet consistent and actionable.
- Over-confident language on uncertain outputs -> Always be conservatively honest.
- Low-confidence outputs are hidden -> Surface them with clear warnings.
- Calibration feels static -> Personalize over time with user feedback.
- Cultural differences in trust signaling -> Allow user preference for signal style (sketched below).
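A hypothetical per-user preference object for that last failure mode; every field here is an assumption about what a signal style setting might expose.

```typescript
// Per-user preference for how trust signals are styled, so the same
// underlying confidence can be presented in culturally and personally
// appropriate ways.
interface SignalStylePreference {
  numeric: boolean;      // show raw percentages (e.g. "92%")
  verbal: boolean;       // show band words ("High confidence")
  visualWeight: "subtle" | "standard" | "prominent";
  locale?: string;       // e.g. "ja-JP"; affects phrasing conventions
}

const defaultPreference: SignalStylePreference = {
  numeric: true,
  verbal: true,
  visualWeight: "standard",
};
```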
Agent Decision Protocol
Triggers
- Any output or recommendation is generated
- Confidence is not 100%
- User has previously corrected similar outputs
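The triggers above could combine into a single check; the context shape is an assumption about what the agent runtime exposes.

```typescript
// A trust signal accompanies every generated output; it is emphasized
// whenever confidence is below 1.0 or the user has corrected similar
// outputs before.
interface OutputContext {
  confidence: number;            // 0..1
  priorUserCorrections: number;  // corrections on similar past outputs
}

function shouldEmphasizeTrustSignal(ctx: OutputContext): boolean {
  return ctx.confidence < 1.0 || ctx.priorUserCorrections > 0;
}
```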
Escalation Strategy
L1: Diagnose the failing element via behavioral_signals
L2: Nudge -- adjust copy, timing, or visual salience
L3: Restructure -- simplify flow, add progressive disclosure, restructure form
L4: Constrain -- lock Autonomy Dial to confirm_execution, add Strategic Friction
L5: Yield -- flag for human designer or domain expert review
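An illustrative mapping from diagnosed state to escalation level. The diagnosis fields and thresholds are assumptions; the five levels mirror the strategy above.

```typescript
type EscalationLevel = "L1" | "L2" | "L3" | "L4" | "L5";

interface Diagnosis {
  signalIgnoredRate: number;       // share of outputs where scrutiny mismatched the signal
  repeatedAfterNudge: boolean;     // mismatch persisted after copy/salience changes
  repeatedAfterRestructure: boolean;
  harmRisk: "low" | "medium" | "high";
}

function escalate(d: Diagnosis): EscalationLevel {
  if (d.harmRisk === "high" && d.repeatedAfterRestructure) return "L5"; // yield to human review
  if (d.repeatedAfterRestructure) return "L4"; // constrain autonomy
  if (d.repeatedAfterNudge) return "L3";       // restructure the flow
  if (d.signalIgnoredRate > 0.3) return "L2";  // nudge copy, timing, salience
  return "L1";                                 // keep diagnosing
}
```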
Example
Agent suggests a section change -> Medium confidence (64%) -- limited historical matches in your domain. Review recommended.
Behavioral KPIs
Primary
- Alignment between signal and user scrutiny level
- Accuracy of final decisions after calibrated outputs
- User calibration score (one way to compute it is sketched after this KPI list)
Risk
- Over-acceptance of low-confidence outputs
- Over-rejection of high-confidence outputs
Trust
- User-reported "I know exactly when to trust the agent"
- Autonomy Dial adjustments based on trust signals
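One way to compute the user calibration score named in the Primary list: score the share of decisions where the user's scrutiny matched the signaled band. The matching rule here is an assumption.

```typescript
// Calibration = scrutiny tracks the signal: verify on low-confidence
// outputs, accept on high-confidence ones; mid-band allows either.
interface Decision {
  band: "high" | "medium" | "low"; // signaled confidence band
  userVerified: boolean;           // did the user apply extra scrutiny?
}

function calibrationScore(decisions: Decision[]): number {
  if (decisions.length === 0) return 0;
  const aligned = decisions.filter(
    (d) =>
      (d.band === "low" && d.userVerified) ||
      (d.band === "high" && !d.userVerified) ||
      d.band === "medium"
  ).length;
  return aligned / decisions.length;
}
```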
Decay Monitoring
Revalidate when
- Model accuracy or data quality changes
- User expertise level evolves
- New output types are introduced
Decay signals
- Misalignment between signals and user behavior
- Drop in user trust accuracy
- Feedback that the confidence signals don't match reality
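A sketch of one decay check: the gap between signaled confidence and observed accuracy over a rolling window. Window size and threshold are assumptions to tune.

```typescript
// Flag revalidation when mean signaled confidence drifts away from
// observed accuracy on recent outputs.
function needsRevalidation(
  recent: { confidence: number; wasCorrect: boolean }[],
  window = 50,
  maxGap = 0.15
): boolean {
  const slice = recent.slice(-window);
  if (slice.length < window) return false; // not enough evidence yet
  const meanConfidence =
    slice.reduce((sum, r) => sum + r.confidence, 0) / slice.length;
  const accuracy = slice.filter((r) => r.wasCorrect).length / slice.length;
  return Math.abs(meanConfidence - accuracy) > maxGap;
}
```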
Pattern Relationships
Amplifies
Conflicts with