aiux
Advanced Patterns - Lesson 8 of 11

Voice Interface Design Patterns

4 min read - Conversational UI for Designers - Updated Apr 2, 2026

Voice interfaces break most of the assumptions text chat makes. Responses need to be two or three sentences, not paragraphs; users will interrupt mid-reply; and screen-based feedback shifts to audio cues plus a pulsing-orb affordance. This lesson covers the patterns unique to voice.

Voice-Specific Design Principles

  • Keep responses short - 2-3 sentences max. If the answer is complex, chunk it: "Here's a quick answer. Want me to go deeper?"
  • Confirm before acting - Speech recognition makes mistakes, and the user can't see what was misheard. For destructive actions, confirm first: "I'll delete the Monday meeting. Should I go ahead?"
  • Handle interruptions - Users will cut the AI off mid-sentence. Stop immediately and listen - don't finish the response first.
  • Provide visual feedback - Even voice-first interfaces need visual cues: a pulsing orb while listening, a spinner while processing, text transcription of what was heard.
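The first principle above can be sketched as a small helper: split a long answer at sentence boundaries, speak only the first couple of sentences, and offer to go deeper. This is a minimal sketch in plain JavaScript; the sentence-splitting regex and the two-sentence cap are illustrative assumptions, not part of the lesson.

```javascript
// chunkForVoice: cap a spoken response at `maxSentences` sentences and
// signal whether there is more detail to offer.
function chunkForVoice(text, maxSentences = 2) {
  // Naive split on ., !, ? followed by whitespace or end-of-string
  // (an assumption; a real system would use a proper sentence tokenizer).
  const sentences =
    text.match(/[^.!?]+[.!?]+(\s|$)/g)?.map(s => s.trim()) ?? [text.trim()];
  const spoken = sentences.slice(0, maxSentences).join(' ');
  const hasMore = sentences.length > maxSentences;
  return {
    spoken: hasMore ? `${spoken} Want me to go deeper?` : spoken,
    hasMore,
  };
}
```

The follow-up question keeps the user in control of depth instead of forcing them to sit through a paragraph of speech.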

The Voice Interaction Loop

  1. Wake / Trigger - User activates voice input via button press, wake word, or always-on listening.
  2. Listening - Show visual feedback (pulsing animation, waveform). Capture audio.
  3. Transcription - Show the transcribed text so users can verify what was heard.
  4. Processing - Show a "thinking" state. Keep it short - voice users are less patient than text users.
  5. Response - Speak the response and show a visual companion (text, card, image).
  6. Follow-up - Offer next actions or stay in listening mode for follow-up.

Design each state explicitly. The transition between listening, processing, and responding should feel smooth, not jarring.
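One way to design each state explicitly is to model the loop as a small finite state machine, so every transition (including barge-in from responding straight back to listening) is a deliberate edge rather than an accident. A sketch in plain JavaScript; the state and event names are illustrative assumptions.

```javascript
// Explicit state machine for the voice interaction loop. Each state maps
// the events it accepts to the state that event leads to.
const VOICE_LOOP = {
  idle:         { wake: 'listening' },
  listening:    { speechEnded: 'transcribing', cancel: 'idle' },
  transcribing: { transcribed: 'processing' },
  processing:   { responseReady: 'responding' },
  responding:   { done: 'followup', bargeIn: 'listening' }, // user interrupts: stop and listen
  followup:     { wake: 'listening', timeout: 'idle' },
};

// transition: return the next state, or throw on an undesigned transition
// so unplanned, jarring state jumps fail loudly during development.
function transition(state, event) {
  const next = VOICE_LOOP[state]?.[event];
  if (!next) throw new Error(`No transition for "${event}" in state "${state}"`);
  return next;
}
```

Note that interruption handling falls out naturally: `transition('responding', 'bargeIn')` returns `'listening'`, so stopping playback and re-opening the mic is one explicit edge, not a special case bolted on later.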

When Voice Needs a Visual Companion

Some information doesn't work in voice-only:

  • Lists longer than 3 items ("Here are the 7 restaurants near you..." - no one remembers all 7)
  • Anything with numbers, URLs, or code
  • Comparisons ("Option A costs $45/month with 10GB, Option B costs...")

For these, the voice says a summary and the visual shows the detail: "I found 7 restaurants nearby. Here they are on your screen." This is the pattern Siri and Google Assistant use - voice for the headline, screen for the data.
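The headline-for-voice, detail-for-screen split can be made concrete: speak short lists in full, and fall back to a spoken summary plus an on-screen payload when the list is long. A minimal sketch; the three-item threshold and the exact phrasing are assumptions for illustration.

```javascript
// presentResults: decide what the voice says vs. what the screen shows.
// Short lists are spoken in full; long lists get a spoken headline and
// a visual payload carrying the detail.
function presentResults(kind, items, maxSpoken = 3) {
  if (items.length <= maxSpoken) {
    return {
      speech: `I found ${items.length} ${kind}: ${items.join(', ')}.`,
      display: null, // nothing extra to render; voice carries it all
    };
  }
  return {
    speech: `I found ${items.length} ${kind} nearby. Here they are on your screen.`,
    display: items, // rendered as a list or cards, not read aloud
  };
}
```

The same function works for comparisons or anything number-heavy: the voice output stays memorable, and the screen carries what ears can't hold.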
