Trust Calibration

Canonical · Agentic UX

Confidence: 67%
Cognitive Load: Low
Evidence: Production validated
Impact: Feature

Ethical Guardrail

Never present all outputs with the same confidence level. Never use vague language like "pretty sure." Never hide low-confidence outputs behind optimistic framing.

Design Intent

Trust Calibration dynamically signals exactly how much the user should trust any given agent output, so that the scrutiny a user applies matches the output's actual reliability.

Psychology Principle

Trust is not binary -- it must be calibrated to reality. Over-trust leads to blind acceptance of errors; under-trust leads to constant manual overrides.

Description

Dynamically signal exactly how much users should trust each agent output using visual weight, language, and evidence linkage.
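
A minimal sketch of what a per-output trust signal might carry, assuming a TypeScript front end; the TrustSignal shape, field names, and band thresholds below are illustrative assumptions, not part of the pattern spec:

    // Hypothetical per-output trust signal; thresholds are illustrative.
    type ConfidenceBand = "high" | "medium" | "low";

    interface TrustSignal {
      confidence: number;        // 0..1, model- or heuristic-derived
      band: ConfidenceBand;      // drives visual weight and language
      evidence: string[];        // human-readable reasons, e.g. "8 of your past projects"
      recommendReview: boolean;  // surfaced verbatim in the UI copy
    }

    function toBand(confidence: number): ConfidenceBand {
      if (confidence >= 0.85) return "high";
      if (confidence >= 0.5) return "medium";
      return "low";
    }

Deriving the band in one place is what keeps the visual and textual cues consistent across outputs (step 1 of the Execution Model below).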

When to use

Every agent output or recommendation -- especially when confidence varies across outputs.

Example

Claude Response: "High confidence (92%) -- pulled from 8 of your past projects. Low confidence on legal nuance -- recommend human review."

Autonomy Compatibility

Suggest · Confirm · Auto

Behavioral Objective

Users apply the appropriate level of scrutiny to each agent output based on the visible trust signal.

  • More accurate final decisions
  • Reduced over-reliance and under-reliance
  • Faster development of accurate mental model of the agent

Target Actor

Role: Everyday user
Environment: Mixed-confidence AI recommendations
Emotional baseline: Needs to know when to trust vs. verify
AI familiarity: Medium
Risk tolerance: Medium

Execution Model

1. signal_design -- Use consistent visual + textual cues for confidence. Failure signal: the user cannot quickly tell the confidence level.

2. evidence_linkage -- Tie the signal to specific reasons. Failure signal: the signal feels arbitrary.

3. adaptive_learning -- Let the system improve its calibration based on user feedback. Failure signal: calibration never improves for the individual user.
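
A sketch of how the three steps might compose in code; the Signal shape, renderSignalCopy, and recordFeedback names are hypothetical, and a production system would calibrate per output type rather than with a single per-user offset:

    // Minimal signal shape (see the fuller sketch under Description).
    interface Signal {
      confidence: number;              // 0..1
      band: "high" | "medium" | "low";
      evidence: string[];              // human-readable reasons
      recommendReview: boolean;
    }

    // Step 1: signal_design -- one consistent renderer for every output.
    function renderSignalCopy(s: Signal): string {
      const pct = Math.round(s.confidence * 100);
      const head = `${s.band[0].toUpperCase()}${s.band.slice(1)} confidence (${pct}%)`;
      // Step 2: evidence_linkage -- the signal always names its reasons.
      const why = s.evidence.join("; ");
      return `${head} -- ${why}.${s.recommendReview ? " Review recommended." : ""}`;
    }

    // Step 3: adaptive_learning -- nudge calibration from user feedback.
    const offsets = new Map<string, number>(); // userId -> confidence offset

    function recordFeedback(userId: string, s: Signal, userAgreed: boolean): void {
      // Shift future confidence down when users correct confident outputs,
      // up when they accept low-confidence ones.
      const error = (userAgreed ? 1 : 0) - s.confidence;
      offsets.set(userId, (offsets.get(userId) ?? 0) + 0.1 * error);
    }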

Failure Modes

  • Signals are ignored over time -> Keep them subtle yet consistent and actionable. (micro)
  • Over-confident language on uncertain outputs -> Always be conservatively honest. (micro)
  • Low-confidence outputs are hidden -> Surface them with clear warnings. (feature)
  • Calibration feels static -> Personalize over time with user feedback. (feature)
  • Cultural differences in trust signaling -> Allow user preference for signal style. (feature)

Agent Decision Protocol

Triggers

  • Any output or recommendation is generated
  • Confidence is not 100%
  • User has previously corrected similar outputs

Escalation Strategy

L1: Diagnose the failing element via behavioral_signals

L2: Nudge -- adjust copy, timing, or visual salience

L3: Restructure -- simplify flow, add progressive disclosure, restructure form

L4: Constrain -- lock Autonomy Dial to confirm_execution, add Strategic Friction

L5: Yield -- flag for human designer or domain expert review
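
One way to express the trigger check and the L1-L5 ladder; the names and the one-level-per-unresolved-failure progression are illustrative assumptions, not a prescribed protocol:

    // Hypothetical trigger check: show a trust signal whenever any trigger fires.
    interface OutputContext {
      confidence: number;            // 0..1
      userCorrectedSimilar: boolean; // from interaction history
    }

    function shouldShowTrustSignal(ctx: OutputContext): boolean {
      return ctx.confidence < 1 || ctx.userCorrectedSimilar;
    }

    // Hypothetical ladder, mirroring L1-L5 above; called once a
    // trust-signal failure has been detected.
    type EscalationLevel = "diagnose" | "nudge" | "restructure" | "constrain" | "yield";
    const LADDER: EscalationLevel[] = ["diagnose", "nudge", "restructure", "constrain", "yield"];

    function escalate(unresolvedFailures: number): EscalationLevel {
      return LADDER[Math.min(unresolvedFailures, LADDER.length - 1)];
    }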

Example

Agent suggests a section change -> "Medium confidence (64%) -- limited historical matches in your domain. Review recommended."

Behavioral KPIs

Primary

  • Alignment between signal and user scrutiny level
  • Accuracy of final decisions after calibrated outputs
  • User calibration score
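
One plausible reading of the user calibration score is a Brier-style gap between signaled confidence and observed correctness; this sketch assumes per-output logs of (confidence, wasCorrect) and is an illustration, not a prescribed metric:

    // Brier-style calibration error: 0 is perfectly calibrated, 1 is worst.
    interface OutcomeLog {
      confidence: number;  // what the signal claimed, 0..1
      wasCorrect: boolean; // what actually happened
    }

    function calibrationError(logs: OutcomeLog[]): number {
      if (logs.length === 0) return 0;
      const sum = logs.reduce(
        (acc, l) => acc + (l.confidence - (l.wasCorrect ? 1 : 0)) ** 2,
        0,
      );
      return sum / logs.length;
    }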

Risk

  • Over-acceptance of low-confidence outputs
  • Over-rejection of high-confidence outputs

Trust

  • User-reported I know exactly when to trust the agent
  • Autonomy Dial adjustments based on trust signals

Decay Monitoring

Revalidate when

  • Model accuracy or data quality changes
  • User expertise level evolves
  • New output types are introduced

Decay signals

  • Misalignment between signals and user behavior
  • Drop in user trust accuracy
  • Feedback that the confidence signals don't match reality
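
A sketch of how the first decay signal might be detected, reusing the Brier-style error idea above over a rolling window; the window size and threshold are illustrative assumptions:

    // Flag decay when recent calibration error drifts past a threshold.
    interface SignalLog {
      confidence: number;  // 0..1
      wasCorrect: boolean;
    }

    function rollingCalibrationError(logs: SignalLog[], window = 200): number {
      const recent = logs.slice(-window);
      if (recent.length === 0) return 0;
      return (
        recent.reduce((acc, l) => acc + (l.confidence - (l.wasCorrect ? 1 : 0)) ** 2, 0) /
        recent.length
      );
    }

    function hasCalibrationDecayed(logs: SignalLog[], threshold = 0.15): boolean {
      return rollingCalibrationError(logs) > threshold;
    }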

Pattern Relationships

Related Patterns

Canonical Implementation

Claude Response: "High confidence (92%) -- pulled from 8 past projects. Low confidence on legal nuance -- recommend human review."

Telemetry Hooks

  • trust_signal_shown
  • user_scrutiny_applied
  • calibration_feedback_given
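
A sketch of how these hooks might be emitted, assuming a generic track(event, payload) call in place of a real telemetry client; the handler names and payload fields are illustrative:

    // Hypothetical analytics shim; swap in your telemetry client.
    function track(event: string, payload: Record<string, unknown>): void {
      console.log(event, payload); // placeholder transport
    }

    function onSignalShown(outputId: string, band: string, confidence: number): void {
      track("trust_signal_shown", { outputId, band, confidence });
    }

    function onScrutinyApplied(outputId: string, secondsSpent: number): void {
      track("user_scrutiny_applied", { outputId, secondsSpent });
    }

    function onCalibrationFeedback(outputId: string, userAgreed: boolean): void {
      track("calibration_feedback_given", { outputId, userAgreed });
    }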

Tags

agentic-ux · trust · calibration