The Coming Phase Shift
AR visors represent the next evolution of computing, the point at which computers become truly three-dimensional and seamlessly integrated with our daily lives. But designing interfaces that people will actually want to use requires understanding how human vision works and respecting our natural attention patterns.
Understanding Human Vision
Before diving into design principles, we need to grasp some key aspects of how we see:
The Fovea is the small central area of the retina responsible for sharp, detailed vision. It's what you're using to read these words right now, yet it covers only about 2 degrees of your visual field.
Saccades are the rapid eye movements we make 3-4 times per second to redirect our fovea to different points of interest. Microsaccades are tiny movements that occur even when we think we're staring at one spot, and they can reveal what we're subconsciously interested in looking at next.
Peripheral vision surrounds the fovea and, while poor at detecting details and colors, is extremely sensitive to movement and brightness changes.
Learn more from Tobii, the eye-tracking company: tobii.com/resource-center/learn-articles/types-of-eye-movements
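To make these numbers concrete, here's a rough sketch of how a visor's eye tracker might tell fixations from saccades using a simple velocity threshold (the classic I-VT approach). The sample format and the 30-degrees-per-second cutoff are illustrative assumptions, not any particular headset's API.

```typescript
// Minimal velocity-threshold (I-VT) classifier sketch.
// GazeSample and the threshold are illustrative assumptions.
interface GazeSample {
  x: number;         // horizontal gaze angle, degrees
  y: number;         // vertical gaze angle, degrees
  timestamp: number; // milliseconds
}

type EyeEvent = "fixation" | "saccade";

const SACCADE_VELOCITY_DEG_PER_S = 30; // a typical I-VT cutoff

function classifySamples(samples: GazeSample[]): EyeEvent[] {
  const events: EyeEvent[] = [];
  for (let i = 1; i < samples.length; i++) {
    const dt = (samples[i].timestamp - samples[i - 1].timestamp) / 1000;
    const dx = samples[i].x - samples[i - 1].x;
    const dy = samples[i].y - samples[i - 1].y;
    const velocity = Math.hypot(dx, dy) / dt; // degrees per second
    events.push(velocity > SACCADE_VELOCITY_DEG_PER_S ? "saccade" : "fixation");
  }
  return events;
}
```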
The Notification Revolution
Traditional notifications grab attention by suddenly appearing—often rudely interrupting whatever you're doing. AR visors need a more sophisticated approach that respects human attention patterns.
Peripheral Placement: Notifications should gently slide into peripheral vision rather than appearing directly in your foveal view. Sudden appearances trigger reflexive attention shifts that break concentration.
Brightness Hierarchy: Since peripheral vision can't distinguish colors well but is sensitive to brightness, brighter notifications can signal higher importance while dimmer ones indicate lower priority.
Spatial Coding: A notification's position in the 360-degree space around your head can indicate its source: messages from your partner might always appear in the upper right, while work notifications slide in from the upper left.
Distance Matters: High-priority notifications can edge closer to your central vision, making them easier and faster to focus on when you choose to engage. (A sketch of this placement scheme follows below.)
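Put together, these four principles suggest a placement policy along these lines. The sources, priority levels, and numeric values are all illustrative assumptions:

```typescript
// Sketch of a placement policy combining the four principles above.
type Source = "partner" | "work" | "news";
type Priority = 1 | 2 | 3; // 3 = highest

// Spatial coding: each source owns a fixed region of the space around you.
const SOURCE_ANCHORS: Record<Source, { azimuthDeg: number; elevationDeg: number }> = {
  partner: { azimuthDeg: 40, elevationDeg: 25 },  // upper right
  work:    { azimuthDeg: -40, elevationDeg: 25 }, // upper left
  news:    { azimuthDeg: 0, elevationDeg: -30 },  // lower center
};

function placeNotification(source: Source, priority: Priority) {
  const anchor = SOURCE_ANCHORS[source];
  // Brightness hierarchy: brighter signals more important.
  const brightness = 0.3 + 0.2 * priority;       // 0.5 .. 0.9
  // Distance matters: higher priority edges closer to the fovea.
  const eccentricityScale = 1 - 0.15 * priority; // 0.85 .. 0.55
  return {
    azimuthDeg: anchor.azimuthDeg * eccentricityScale,
    elevationDeg: anchor.elevationDeg * eccentricityScale,
    brightness,
    slideIn: true, // peripheral placement: glide in, never pop
  };
}
```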
Respecting the Hierarchy of Engagement
The key principle is letting users choose what to focus on rather than forcing attention. When you want to check a notification, you perform a saccade (quick eye movement) to bring it into your foveal vision. This maintains user agency over their attention.
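One way to honor that agency in code is to treat a saccade onto a notification, held for a brief dwell, as the opt-in signal. Here's a minimal sketch; the hit radius and dwell time are assumed values, not validated thresholds.

```typescript
// Engagement begins only when the user's gaze lands on a notification
// and dwells there briefly. All names and values are assumptions.
interface ARNotification { id: string; azimuthDeg: number; elevationDeg: number; }

const HIT_RADIUS_DEG = 3; // roughly foveal extent plus a little slack
const DWELL_MS = 150;     // long enough to rule out a passing glance

function makeEngagementDetector(target: ARNotification) {
  let dwellStart: number | null = null;
  // Feed gaze samples in; returns true once the user has foveated the
  // notification long enough to count as a deliberate choice.
  return (azimuthDeg: number, elevationDeg: number, now: number): boolean => {
    const dist = Math.hypot(
      azimuthDeg - target.azimuthDeg,
      elevationDeg - target.elevationDeg
    );
    if (dist > HIT_RADIUS_DEG) {
      dwellStart = null; // gaze left the target; reset the dwell timer
      return false;
    }
    if (dwellStart === null) dwellStart = now;
    return now - dwellStart >= DWELL_MS;
  };
}
```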
The Balloon Technique: For content that needs to expand (like videos), provide a central reference dot for the user's fovea to rest on while the content grows around it. This prevents the jarring experience of chasing expanding borders.
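A minimal sketch of the balloon technique might look like this, with the content rectangle pinned to the reference dot while it grows. The types and the easing curve are assumptions:

```typescript
// The content rectangle grows around a fixed reference dot, so the
// fovea never has to chase a moving edge.
interface Rect { cx: number; cy: number; width: number; height: number; }

function balloonExpand(
  dot: { x: number; y: number },                // where the fovea is parked
  finalSize: { width: number; height: number },
  t: number                                     // animation progress, 0..1
): Rect {
  const eased = 1 - Math.pow(1 - t, 3); // ease-out cubic for a gentle finish
  return {
    cx: dot.x, // center stays pinned to the dot throughout the animation
    cy: dot.y,
    width: finalSize.width * eased,
    height: finalSize.height * eased,
  };
}
```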
Binocular Mirroring: Unlike Google Glass's single-eye approach, true AR visors should display information to both eyes simultaneously to avoid the brain strain that comes from processing conflicting visual information.
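For a rough illustration of binocular mirroring, here's how a WebXR render loop (one existing browser API for binocular headsets) draws the same content once per eye view each frame. The drawScene helper is hypothetical, and the snippet assumes WebXR type definitions are available:

```typescript
// The same UI layer is drawn into both eye views every frame, so the
// eyes never receive conflicting content.
declare function drawScene(viewport: XRViewport, view: XRView): void; // hypothetical renderer

function startBinocularLoop(session: XRSession, refSpace: XRReferenceSpace) {
  const onFrame: XRFrameRequestCallback = (_time, frame) => {
    session.requestAnimationFrame(onFrame); // keep the loop going
    const pose = frame.getViewerPose(refSpace);
    const glLayer = session.renderState.baseLayer;
    if (!pose || !glLayer) return;
    for (const view of pose.views) {           // left eye, then right eye
      const viewport = glLayer.getViewport(view);
      if (viewport) drawScene(viewport, view); // identical content per eye
    }
  };
  session.requestAnimationFrame(onFrame);
}
```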

Interaction Challenges
Engaging with AR interfaces presents unique challenges. Traditional solutions like constant blinking, winking, or gross motor movements (large arm gestures like in "Minority Report") break the core promise of AR—augmenting reality without detracting from it.
Microsaccade Inference: By analyzing the direction and distance of microsaccade patterns, we might infer user intent before they consciously direct their attention somewhere.
Pupil and Eyelid Tracking: Pupil dilation and eyelid behavior can provide additional context about user interest and intent.
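As a purely speculative sketch, a visor might pool the directions of recent microsaccades into a drift estimate and scale its confidence by pupil dilation. None of this is a shipping API; every name and threshold below is assumed:

```typescript
// Speculative: a consistent bias in microsaccade directions hints at
// where attention may go next; pupil dilation scales confidence.
interface Microsaccade { dxDeg: number; dyDeg: number; } // tiny gaze displacement

function inferIntent(
  recent: Microsaccade[], // e.g. the last second of microsaccades
  pupilDilation: number   // normalized: 0 = baseline, 1 = fully dilated
): { directionDeg: number; confidence: number } | null {
  if (recent.length === 0) return null;
  let sumX = 0;
  let sumY = 0;
  for (const m of recent) {
    sumX += m.dxDeg;
    sumY += m.dyDeg;
  }
  // Random microsaccades cancel out; a consistent bias leaves a residual.
  const bias = Math.hypot(sumX, sumY);
  if (bias < 0.05) return null; // no consistent drift, so no inference
  return {
    directionDeg: (Math.atan2(sumY, sumX) * 180) / Math.PI,
    confidence: Math.min(1, bias) * (0.5 + 0.5 * pupilDilation),
  };
}
```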
Text Input Dilemma: This remains unsolved. Virtual keyboards are clunky, while speech-to-text is not private, is error-prone, and is annoying for everyone nearby. Even the promising Project Orion neural wristband will still need some interface layer for drawing characters, selecting from on-screen options, or similar methods.
Subvocalization (moving the vocal cords without making sound) is promising but underdeveloped, and EEG-based thought reading remains science fiction. The companion smartphone may handle text input for several generations.
The Audio Component
Visual AR is only half the equation. Bone conduction audio (like Google Glass used) transmits sound through skull vibrations rather than air, but many find the sensation uncomfortable, even ticklish.
The future lies in personal audio bubbles—technology that creates controlled sound spaces around the user's head, delivering private audio without blocking environmental sounds or requiring uncomfortable bone conduction.
The Bigger Picture
AR interface design isn't just about placing digital elements in 3D space—it's about creating a computational management layer that fades into the background. The interface should feel natural and unburdening, leaving maximum mental energy for exploring content and engaging with the augmented world.
Spatial computing might sound like it's all about 3D positioning, but the most important aspect is spatio-temporal—knowing when and how to present information to avoid overwhelming users.
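A spatio-temporal delivery policy could be as simple as gating notifications on the user's current attentional state. The states, thresholds, and staleness override below are illustrative assumptions:

```typescript
// Sketch of spatio-temporal delivery: *when* matters as much as *where*.
type AttentionState = "idle" | "conversing" | "focused";

interface Pending { id: string; priority: number; queuedAt: number; }

const MAX_DEFER_MS = 5 * 60 * 1000; // never hold anything past 5 minutes

function shouldDeliver(item: Pending, state: AttentionState, now: number): boolean {
  if (now - item.queuedAt > MAX_DEFER_MS) return true; // staleness override
  switch (state) {
    case "idle":       return true;               // any time is fine
    case "conversing": return item.priority >= 2; // only important items
    case "focused":    return item.priority >= 3; // only urgent items
  }
}
```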
The companies that master this delicate balance between information availability and attention respect will create the AR visors people actually want to wear. Those that don't will join the pile of abandoned wearable tech gathering dust in drawers.
The future of computing is coming to our faces. Let's make sure it's a future we actually want to look at.