Trust Calibration
Canonical: Confidence
Cognitive Load: Low
Evidence: Production validated
Impact: Feature
Ethical Guardrail
Never present all outputs with the same confidence level. Never use vague language like "pretty sure." Never hide low-confidence outputs behind optimistic framing.
Design Intent
Trust Calibration dynamically signals exactly how much the user should trust any given agent output, so that the scrutiny a user applies tracks the output's actual reliability instead of defaulting to blanket acceptance or blanket doubt.
Psychology Principle
Trust is not binary -- it must be calibrated to reality. Over-trust leads to blind acceptance; under-trust leads to constant overrides.
Description
Dynamically signal exactly how much users should trust each agent output using visual weight, language, and evidence linkage.
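One way to represent such a signal in a front end, as a minimal TypeScript sketch. The type names, fields, and band thresholds below are illustrative assumptions, not part of this pattern's specification.

```typescript
// Hypothetical shape for a calibrated trust signal.
type ConfidenceBand = "high" | "medium" | "low";

interface EvidenceLink {
  source: string;      // e.g. "past-project:invoice-redesign"
  relevance: number;   // 0..1, how strongly this evidence supports the output
}

interface TrustSignal {
  confidence: number;        // raw model confidence, 0..1
  band: ConfidenceBand;      // discretized for consistent visual weight
  evidence: EvidenceLink[];  // ties the signal to specific reasons
  recommendHumanReview: boolean;
}

// Discretize raw confidence into a band; thresholds are assumptions
// to be tuned per domain.
function toBand(confidence: number): ConfidenceBand {
  if (confidence >= 0.85) return "high";
  if (confidence >= 0.6) return "medium";
  return "low";
}
```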
When to use
Every agent output or recommendation -- especially when confidence varies across outputs.
Example
Claude Response: "High confidence (92%) -- pulled from 8 of your past projects. Low confidence on legal nuance -- recommend human review."
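A sketch of how copy like the example above could be generated. The function name and inputs (sourceCount, caveat) are hypothetical.

```typescript
// Render calibrated copy from a raw confidence score; phrasing and
// percentages mirror the example above.
function renderTrustCopy(
  confidence: number,   // 0..1
  sourceCount: number,  // evidence items backing the output
  caveat?: string       // known weak spot, e.g. "legal nuance"
): string {
  const pct = Math.round(confidence * 100);
  const band =
    confidence >= 0.85 ? "High" : confidence >= 0.6 ? "Medium" : "Low";
  let copy = `${band} confidence (${pct}%) -- pulled from ${sourceCount} of your past projects.`;
  if (caveat) {
    copy += ` Low confidence on ${caveat} -- recommend human review.`;
  }
  return copy;
}

// renderTrustCopy(0.92, 8, "legal nuance")
// -> "High confidence (92%) -- pulled from 8 of your past projects.
//     Low confidence on legal nuance -- recommend human review."
```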
Autonomy Compatibility
Behavioral Objective
Users apply the appropriate level of scrutiny to each agent output based on the visible trust signal.
- More accurate final decisions
- Reduced over-reliance and under-reliance
- Faster development of an accurate mental model of the agent
Target Actor
role: Everyday user
environment: Mixed-confidence AI recommendations
emotional baseline: Needs to know when to trust vs. verify
ai familiarity: Medium
risk tolerance: Medium
Execution Model
signal_design: Use consistent visual + textual cues for confidence.
behavioral_signal: User cannot quickly tell the confidence level.
evidence_linkage: Tie the signal to specific reasons.
behavioral_signal: The signal feels arbitrary.
adaptive_learning: Let the system improve its calibration based on user feedback (see the sketch after this list).
behavioral_signal: Calibration never improves for the individual user.
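A minimal sketch of the adaptive_learning element, assuming feedback arrives as per-output correct/incorrect judgments: a per-user histogram-binning calibrator that replaces raw confidence with the accuracy this user has actually observed. Class and method names are illustrative.

```typescript
// Per-user recalibration from feedback. Keeps a running correct/total
// history per confidence bucket and maps raw confidence to the
// empirically observed accuracy in that bucket.
class UserCalibrator {
  private hits: number[] = new Array(10).fill(0);
  private total: number[] = new Array(10).fill(0);

  // Record whether an output at `confidence` turned out correct.
  recordFeedback(confidence: number, wasCorrect: boolean): void {
    const bin = Math.min(9, Math.floor(confidence * 10));
    this.total[bin] += 1;
    if (wasCorrect) this.hits[bin] += 1;
  }

  // Replace raw confidence with this user's observed accuracy in the
  // same bucket; fall back to the raw value until enough data exists.
  calibrate(confidence: number): number {
    const bin = Math.min(9, Math.floor(confidence * 10));
    return this.total[bin] >= 5 ? this.hits[bin] / this.total[bin] : confidence;
  }
}
```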
Failure Modes
- Signals are ignored over time -> Keep them subtle yet consistent and actionable.
- Over-confident language on uncertain outputs -> Always be conservatively honest.
- Low-confidence outputs are hidden -> Surface them with clear warnings.
- Calibration feels static -> Personalize over time with user feedback.
- Cultural differences in trust signaling -> Allow user preference for signal style (sketched below).
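A hypothetical per-user preference object for that last failure mode; every field here is an assumption about what a signal style setting might expose.

```typescript
// Per-user preference for how trust signals are styled, so the same
// underlying confidence can be presented in culturally and personally
// appropriate ways.
interface SignalStylePreference {
  numeric: boolean;      // show raw percentages (e.g. "92%")
  verbal: boolean;       // show band words ("High confidence")
  visualWeight: "subtle" | "standard" | "prominent";
  locale?: string;       // e.g. "ja-JP"; affects phrasing conventions
}

const defaultPreference: SignalStylePreference = {
  numeric: true,
  verbal: true,
  visualWeight: "standard",
};
```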
Agent Decision Protocol
Triggers
- Any output or recommendation is generated
- Confidence is not 100%
- User has previously corrected similar outputs
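The triggers above could combine into a single check; the context shape is an assumption about what the agent runtime exposes.

```typescript
// A trust signal accompanies every generated output; it is emphasized
// whenever confidence is below 1.0 or the user has corrected similar
// outputs before.
interface OutputContext {
  confidence: number;            // 0..1
  priorUserCorrections: number;  // corrections on similar past outputs
}

function shouldEmphasizeTrustSignal(ctx: OutputContext): boolean {
  return ctx.confidence < 1.0 || ctx.priorUserCorrections > 0;
}
```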
Escalation Strategy
L1: Diagnose the failing element via behavioral_signals
L2: Nudge -- adjust copy, timing, or visual salience
L3: Restructure -- simplify flow, add progressive disclosure, restructure form
L4: Constrain -- lock Autonomy Dial to confirm_execution, add Strategic Friction
L5: Yield -- flag for human designer or domain expert review
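An illustrative mapping from diagnosed state to escalation level. The diagnosis fields and thresholds are assumptions; the five levels mirror the strategy above.

```typescript
type EscalationLevel = "L1" | "L2" | "L3" | "L4" | "L5";

interface Diagnosis {
  signalIgnoredRate: number;       // share of outputs where scrutiny mismatched the signal
  repeatedAfterNudge: boolean;     // mismatch persisted after copy/salience changes
  repeatedAfterRestructure: boolean;
  harmRisk: "low" | "medium" | "high";
}

function escalate(d: Diagnosis): EscalationLevel {
  if (d.harmRisk === "high" && d.repeatedAfterRestructure) return "L5"; // yield to human review
  if (d.repeatedAfterRestructure) return "L4"; // constrain autonomy
  if (d.repeatedAfterNudge) return "L3";       // restructure the flow
  if (d.signalIgnoredRate > 0.3) return "L2";  // nudge copy, timing, salience
  return "L1";                                 // keep diagnosing
}
```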
Example
Agent suggests a section change -> Medium confidence (64%) -- limited historical matches in your domain. Review recommended.
Behavioral KPIs
Primary
- Alignment between signal and user scrutiny level
- Accuracy of final decisions after calibrated outputs
- User calibration score (one way to compute it is sketched after this KPI list)
Risk
- Over-acceptance of low-confidence outputs
- Over-rejection of high-confidence outputs
Trust
- User-reported "I know exactly when to trust the agent"
- Autonomy Dial adjustments based on trust signals
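One way to compute the user calibration score named in the Primary list: score the share of decisions where the user's scrutiny matched the signaled band. The matching rule here is an assumption.

```typescript
// Calibration = scrutiny tracks the signal: verify on low-confidence
// outputs, accept on high-confidence ones; mid-band allows either.
interface Decision {
  band: "high" | "medium" | "low"; // signaled confidence band
  userVerified: boolean;           // did the user apply extra scrutiny?
}

function calibrationScore(decisions: Decision[]): number {
  if (decisions.length === 0) return 0;
  const aligned = decisions.filter(
    (d) =>
      (d.band === "low" && d.userVerified) ||
      (d.band === "high" && !d.userVerified) ||
      d.band === "medium"
  ).length;
  return aligned / decisions.length;
}
```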
Decay Monitoring
Revalidate when
- Model accuracy or data quality changes
- User expertise level evolves
- New output types are introduced
Decay signals
- Misalignment between signals and user behavior
- Drop in user trust accuracy
- Feedback that the confidence signals don't match reality
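A sketch of one decay check: the gap between signaled confidence and observed accuracy over a rolling window. Window size and threshold are assumptions to tune.

```typescript
// Flag revalidation when mean signaled confidence drifts away from
// observed accuracy on recent outputs.
function needsRevalidation(
  recent: { confidence: number; wasCorrect: boolean }[],
  window = 50,
  maxGap = 0.15
): boolean {
  const slice = recent.slice(-window);
  if (slice.length < window) return false; // not enough evidence yet
  const meanConfidence =
    slice.reduce((sum, r) => sum + r.confidence, 0) / slice.length;
  const accuracy = slice.filter((r) => r.wasCorrect).length / slice.length;
  return Math.abs(meanConfidence - accuracy) > maxGap;
}
```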
Pattern Relationships
Amplifies
Conflicts with