Safety & Harm Prevention

Crisis Detection & Escalation

Detect crisis signals and immediately provide professional resources.

What is Crisis Detection & Escalation?

Crisis Detection & Escalation identifies when users express harmful intent or are in crisis, then immediately provides professional resources. Rather than continuing a normal conversation in a dangerous situation, the AI applies several detection layers to catch crisis signals. The pattern is essential for conversational AI, mental health apps, and any system accessible to vulnerable users. In response to incidents in which AI provided harmful encouragement, systems that adopt this pattern detect suicidal intent through keywords, context, and behavior, and escalate to crisis resources.

Problem

AI systems can fail to respond appropriately to crisis signals, sometimes providing harmful encouragement instead of resources. Real case: Zane Shamblin chatted with ChatGPT for hours while expressing suicidal intent; the bot responded encouragingly instead of escalating.

Solution

Use multi-layer detection (keywords, context, behavior, manipulation attempts) so that a crisis signal missed by one layer can still be caught by another, and provide resources as soon as any layer fires.
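The layering described above can be sketched in a few lines of Python. This is a minimal illustration, not a production detector: the phrase lists, the `BEHAVIOR_THRESHOLD` value, and the names `detect` and `DetectionResult` are all assumptions made for this example, and a real system would rely on clinically reviewed lexicons and trained classifiers rather than a handful of regular expressions.

```python
import re
from dataclasses import dataclass, field

# Illustrative phrase lists only (assumption); not a reviewed lexicon.
KEYWORD_PATTERNS = [r"\bkill myself\b", r"\bend my life\b", r"\bsuicid(?:e|al)\b"]
CONTEXT_PATTERNS = [r"\bno reason to go on\b", r"\bbetter off without me\b"]
FRAMING_PATTERNS = [r"\bhypothetically\b", r"\bfor a story\b", r"\bpretend\b"]
BEHAVIOR_THRESHOLD = 2  # hypothetical: earlier flagged turns needed to fire the behavior layer


@dataclass
class DetectionResult:
    layers: list[str] = field(default_factory=list)

    @property
    def escalate(self) -> bool:
        # Any layer firing is enough to escalate.
        return bool(self.layers)


def _matches(patterns: list[str], text: str) -> bool:
    return any(re.search(p, text, re.IGNORECASE) for p in patterns)


def detect(message: str, history: list[str]) -> DetectionResult:
    """Run each detection layer and record which ones fired."""
    result = DetectionResult()
    risk_patterns = KEYWORD_PATTERNS + CONTEXT_PATTERNS

    # Layer 1: explicit keywords in the current message.
    if _matches(KEYWORD_PATTERNS, message):
        result.layers.append("keyword")
    # Layer 2: indirect phrasing that signals risk without explicit keywords.
    if _matches(CONTEXT_PATTERNS, message):
        result.layers.append("context")
    # Layer 3: behavior across the conversation (repeated risk signals in earlier turns).
    prior_hits = sum(1 for turn in history if _matches(risk_patterns, turn))
    if prior_hits >= BEHAVIOR_THRESHOLD:
        result.layers.append("behavior")
    # Layer 4: fictional or hypothetical framing wrapped around risk language.
    # The framing is recorded as a signal; it never suppresses escalation.
    if _matches(FRAMING_PATTERNS, message) and (result.layers or prior_hits > 0):
        result.layers.append("manipulation")
    return result
```

Calling `detect("Hypothetically, what if I feel suicidal?", [])` records both the keyword and manipulation layers, so the hypothetical framing does not let the message through. Simple pattern matching like this will also fire on legitimate mental health discussions, so the lists and threshold would need tuning against real data.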

Guidelines & Considerations

Implementation Guidelines

1. Zero tolerance: detect and stop self-harm content immediately
2. Provide crisis resources FIRST, non-dismissible and prominent
3. Multi-layer detection: keywords, context, behavior, manipulation attempts
4. Lock engagement on harmful topics - don't negotiate
5. Keep resource database current and location-aware
6. ANTI-PATTERN - DON'T: Continue conversations on self-harm topics, provide affirmative responses to harmful statements, or let guardrails degrade over time
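As a rough sketch of how the guidelines above might translate into a response payload, the Python below puts resources first, marks the panel as non-dismissible, locks the topic, and picks resources by location. The field names (`dismissible`, `topic_locked`), the function `build_crisis_response`, and the two-entry database are assumptions made for this example; contact details must be verified against official sources before any real use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CrisisResource:
    name: str
    contact: str


# Tiny illustrative database keyed by ISO 3166-1 alpha-2 country code.
# Verify every entry against official sources before shipping.
RESOURCES = {
    "US": [CrisisResource("988 Suicide & Crisis Lifeline", "Call or text 988")],
    "GB": [CrisisResource("Samaritans", "Call 116 123")],
}
# Used when the location is unknown or not covered (assumption).
FALLBACK = [CrisisResource("Local emergency services", "Contact your local emergency number")]


def build_crisis_response(country_code: Optional[str]) -> dict:
    """Build the payload shown once a crisis signal has been detected."""
    key = (country_code or "").upper()
    resources = RESOURCES.get(key, FALLBACK)
    return {
        # Resources come first so the UI renders them before any other text.
        "resources": [{"name": r.name, "contact": r.contact} for r in resources],
        # The UI should not offer a close or skip control for this panel.
        "dismissible": False,
        # The assistant stops engaging with the harmful topic instead of negotiating.
        "topic_locked": True,
        "message": "You deserve support from a trained person right now.",
    }
```

For instance, `build_crisis_response("us")` returns the US entry, while `build_crisis_response(None)` falls back to the generic emergency-services entry. Setting the flags in one place, rather than re-deriving them on each turn, is one way to keep guardrails from degrading over a long conversation.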

Design Considerations

1. False positives: legitimate mental health discussions may trigger
2. Language/culture variations require contextual analysis
3. Speed vs accuracy: detect instantly but minimize false alerts
4. Legal duty of care varies by jurisdiction
5. Keep resources current - outdated hotlines cause secondary harm
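One way to act on the last consideration is a periodic staleness check over the resource database. The sketch below assumes each resource carries a last-verified date; the function name `stale_resources` and the 90-day default are hypothetical choices for illustration, not a recommended review interval.

```python
from datetime import date, timedelta


def stale_resources(
    last_verified: dict[str, date],
    today: date,
    max_age_days: int = 90,  # hypothetical review interval
) -> list[str]:
    """Return the names of resources not verified within max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, checked in last_verified.items() if checked < cutoff)
```

A scheduled job could call this function and alert a maintainer whenever the returned list is non-empty, so an outdated hotline is re-verified before it is shown to a user in crisis.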

Related Patterns