Why We Hate Being on Video Calls (The Science of Zoom Fatigue)

What Zoom Did to the Face

The video call was supposed to be the closest thing to presence that remote communication could offer. You can see the person. They can see you. Facial expressions are visible, something voice calls cannot provide. When remote work became suddenly mandatory in 2020 for large portions of the professional population, video conferencing platforms absorbed billions of hours of human attention, and within months a collective complaint had crystallized: this is exhausting in a way that phone calls and in-person meetings are not. Researchers named it Zoom fatigue, and the explanation turned out to be more interesting than simple overuse.

The Mirror Problem

One of the most thoroughly documented contributors to video call fatigue is something no other communication format has: the persistent visibility of your own face. In-person conversation does not provide this. Phone calls do not. But video calls typically display a thumbnail of your own camera feed in a corner of the screen, and many people find themselves glancing at it throughout the call, monitoring their own appearance and expression in real time. Seeing your own face continuously triggers a kind of self-evaluation that would not otherwise be active. Research from Stanford's Virtual Human Interaction Lab, conducted by Jeremy Bailenson and colleagues, found that the presence of a self-view on video calls was one of the primary independent contributors to fatigue, emotional exhaustion, and reduced attentiveness over the course of a day of video meetings. When participants had their self-view disabled (a simple toggle available in most platforms), these effects, including post-call fatigue, were measurably reduced.

Nonverbal Communication Under Constraint

Human nonverbal communication is adapted to three-dimensional physical space. When two people share a room, the full system is active: posture, gesture, proximity, movement, gaze direction, micro-expressions, and dozens of other signals are continuously generated and interpreted without conscious effort. Video calls collapse this into a two-dimensional cropped rectangle, transmitted with slight latency, often from a camera positioned at desk level rather than eye level. The result is a communication channel that looks like it should be rich but is systematically impoverished in ways that require compensatory effort. The brain spends real cognitive resources trying to interpret degraded signals, trying to generate signals that will read correctly through the medium, and managing the uncanny-valley quality of a face that is present but not quite present in the way evolved social cognition expects. There is also a link to older research: psychologists have long documented that sustained, unbroken eye contact activates threat responses in humans and many other primates, and video calls effectively enforce that kind of gaze, because every participant's face stays turned toward the screen for the whole meeting. In person, eye contact is frequent but intermittent; it ebbs and flows naturally. On video, the gaze is fixed in a way that has no natural equivalent in embodied interaction, and this contributes to a low-level discomfort that accumulates across hours of meetings.

Cognitive Load and Temporal Delay

Even small amounts of audio and video latency, as little as one-tenth of a second, disrupt the natural flow of conversation in measurable ways. Research from University College London has documented that delays of under 200 milliseconds are sufficient to alter turn-taking patterns, cause people to perceive their conversation partner as less attentive or less friendly, and increase conversational hesitation. The brain is calibrated for the immediate feedback of in-person speech and interprets even tiny delays as social signals, usually negative ones. The combined cognitive load of managing degraded nonverbal signals, compensating for latency, monitoring yourself through the persistent self-view, and sustaining attentiveness through a screen rather than a shared physical environment adds up to something meaningfully more taxing than equivalent in-person interaction. This is not a character failing. It is a mismatch between the tool and the social hardware being asked to use it.

What Actually Helps

The research that has accumulated since 2020 converges on several practical interventions. Disabling the self-view reduces fatigue with essentially no cost to communication quality. Scheduling breaks between video calls rather than back-to-back meetings is more effective than most people expect. Audio-only calls for meetings that do not require visual collaboration reduce cognitive load substantially and, counterintuitively, often produce better conversational outcomes than video for purely verbal discussions. The option to walk during a call — to be on audio while moving through a physical space rather than sitting motionless in front of a camera — has been shown in multiple studies to improve both cognitive performance and mood during and after the call. The body's evolved expectation is that social interaction involves physical presence, and movement while talking is one of the closest available approximations to meeting that expectation. Zoom fatigue is real. It has a real explanation. And unlike most modern complaints about technology, it has specific, evidence-supported remedies that do not require abandoning the tool entirely.
