Trust Calibration
What is Trust Calibration?
Users either over-trust or under-trust AI agents. Over-trust leads to passive reliance on inaccurate outputs: users stop checking, and mistakes compound. Under-trust means users micromanage every action, defeating the purpose of delegation. Trust calibration is the design challenge of aligning a user's perception of the agent's reliability with its actual performance over time. Unlike a one-time confidence score, this is a relationship that evolves - the agent earns more or less trust based on its track record with that specific user. The pattern starts agents in a supervised, high-visibility mode, shows per-domain track records, proactively repairs trust after mistakes, and offers autonomy upgrades only once they are earned. Trust builds slowly and breaks quickly, and the design must account for this asymmetry.
Problem
Users either over-trust or under-trust AI agents. Over-trust leads to missed errors; under-trust leads to micromanagement. Trust calibration aligns user perception of agent reliability with actual performance - an alignment that evolves over time and differs by domain.
Solution
Build appropriate trust through demonstrated competence: start supervised, show per-domain track records, celebrate milestones, proactively repair trust after errors, and only offer autonomy upgrades when performance warrants it.
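The per-domain track record and earned-autonomy steps above can be sketched in code. This is a minimal illustration, not a prescribed implementation: the class names, the 95% accuracy threshold, and the 50-task minimum are all assumptions chosen for the example.

```python
from collections import defaultdict

# Illustrative thresholds (assumptions, not part of the pattern):
AUTONOMY_THRESHOLD = 0.95  # accuracy required before offering more autonomy
MIN_TASKS = 50             # never offer an upgrade on a thin track record

class DomainRecord:
    """Accuracy history for one task type (e.g., scheduling, finance)."""
    def __init__(self):
        self.completed = 0
        self.correct = 0

    def record(self, was_correct: bool):
        self.completed += 1
        self.correct += int(was_correct)

    @property
    def accuracy(self) -> float:
        return self.correct / self.completed if self.completed else 0.0

class TrustTracker:
    """Tracks performance separately per domain, as the pattern requires."""
    def __init__(self):
        self.domains = defaultdict(DomainRecord)

    def record(self, domain: str, was_correct: bool):
        self.domains[domain].record(was_correct)

    def autonomy_upgrade_available(self, domain: str) -> bool:
        # Offer, never auto-apply: the user must consciously opt in.
        rec = self.domains[domain]
        return rec.completed >= MIN_TASKS and rec.accuracy >= AUTONOMY_THRESHOLD
```

Note that `autonomy_upgrade_available` only signals that an offer may be shown; actually raising autonomy remains a user decision, per the first implementation guideline.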
Guidelines & Considerations
Implementation Guidelines
Never increase autonomy without asking. Even if the agent has been 100% accurate, the user should consciously opt into higher autonomy.
Make the agent's confidence visible, not just its outputs. 'I'm very confident about this' vs. 'I'm guessing here' helps users calibrate their own trust.
After errors, show corrective learning. 'I made an error with X. I've adjusted my approach - here's what I'll do differently.'
Provide a trust dashboard for power users - accuracy by domain, error log, escalation history.
Celebrate milestones: 'I've completed 100 tasks for you with 97% accuracy.' This reinforces appropriate trust.
Calibrate trust per domain - an agent might be reliable for scheduling but unreliable for financial analysis.
Design for trust asymmetry: trust builds slowly and breaks quickly. A single visible failure should trigger proportional, not total, trust reduction.
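The asymmetry guideline above - trust builds slowly, breaks quickly, and a single failure should trigger a proportional rather than total reduction - can be expressed as an update rule. The specific rates below are illustrative assumptions, not values from the pattern.

```python
# Illustrative rates (assumptions): a success nudges trust up slightly;
# a visible failure cuts it more sharply, but proportionally, never to zero.
GAIN = 0.02   # small upward step per success
LOSS = 0.15   # larger, but bounded, downward step per failure

def update_trust(trust: float, success: bool) -> float:
    """Return the new trust score in [0, 1] after one observed outcome."""
    if success:
        trust += GAIN * (1.0 - trust)   # diminishing gains near full trust
    else:
        trust -= LOSS * trust           # proportional, not total, reduction
    return min(max(trust, 0.0), 1.0)
```

With these rates, one failure erases more trust than several successes build, which matches the asymmetry the guideline describes.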
Design Considerations
Trust alignment score: whether users' trust levels match actual agent performance, measured by comparing trust surveys against accuracy metrics
Autonomy progression rate: how quickly users move to higher autonomy levels over time
Trust recovery time: after an error, how long until the user returns to the same autonomy level
Over-trust detection: users who stop checking outputs may need periodic trust recalibration prompts
Under-trust detection: users who reject accurate outputs consistently may benefit from track record visibility
Domain-specific trust scores require the agent to track performance separately for each task type
Proactive trust repair must feel genuine, not formulaic - the same apology repeated loses effectiveness
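The alignment, over-trust, and under-trust considerations above suggest a simple per-domain classifier. This is a hypothetical sketch: the function name, the 0-1 scales, and the tolerance band are assumptions made for illustration.

```python
def trust_alignment(surveyed_trust: float, measured_accuracy: float,
                    tolerance: float = 0.1) -> str:
    """Classify calibration in one domain from surveyed trust (0-1)
    and measured accuracy (0-1), using an illustrative tolerance band."""
    gap = surveyed_trust - measured_accuracy
    if gap > tolerance:
        return "over-trust"    # candidate for recalibration prompts
    if gap < -tolerance:
        return "under-trust"   # candidate for track-record visibility
    return "calibrated"
```

A dashboard could run this per domain, pairing "over-trust" users with periodic recalibration prompts and "under-trust" users with track-record visibility, as the considerations above propose.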