On Addiction: Screens vs. Voice

The evidence is compelling: screens are addictive. Yes, of course, one can always say that addiction is not an inherent attribute of a product or a substance as such, but rather a relationship between the product or substance and a person who uses it irresponsibly, i.e., a person who gets addicted to it. In other words, since we human beings are not without agency (we are not helpless and can, and often do, help ourselves), one should not talk about something being addictive or not addictive without pulling in the human user and assigning to them some, if not most, of the blame for the predicament.

Which is all well and good. But obviously, one can push this line only so far before one runs afoul of some basic reality. People may initially start with a choice: to use or not to use. But many quickly lose the ability to make that choice: they lose control. Making a choice becomes harder and harder the more they use, so that talking about agency loses meaning when the addict is unable, hard and earnestly as they may try, to kick the habit and get on the wagon. Moreover, reality also tells us that some substances are, in and of themselves, more addictive than others: cocaine is addictive, and so are cigarettes and alcohol; but pistachios or potato chips, while highly enjoyable and easy to binge on, are certainly not addictive in the same way. One can binge on Netflix or spend the whole of Saturday and Sunday watching football games, but one is not addicted to those activities in the same way as one would be to, say, gambling, or playing video games, or bottomless scrolling of a Facebook feed.

Reality also tells us that placing the agency of the user at the center of the action distracts us from a basic fact: heavily financed work is being done by monied interests who know what they are doing and know what they want, which is to maximize the amount of time a user spends with the products they are building and the depth of that user’s engagement. Facebook, Twitter, Google (YouTube), Amazon, Microsoft, Apple, to name just a few of the culprit companies, want you to spend as much time as possible on their platforms because the longer you are there, the more ads you will watch or the more things you will end up buying.

Now let’s consider the sonic dimension (voice and audio): can someone get addicted to, say, listening to podcasts in the same way that one can get addicted to scrolling Facebook or Twitter? Can one get addicted to talking over the phone or over the new sensation on the block, Clubhouse?

I believe the answer is no. And here’s why.

First, voice and audio are tightly coupled with time. The experience of engaging with voice (hearing someone speak a sentence) literally has to take its time to happen. You can try to speed things up, but only to a certain degree, and at the risk of seriously disrupting the experience.

Second, with voice and audio, you need to pay attention to what is being said, and you need to be cognitively engaged. Otherwise, you will lose your moorings (where you are in the flow of things), and with voice, it’s hard to reposition yourself without active effort. With the smartphone, if you go too far down in your scroll, you can just flick your way back up to the right spot; with voice, the cost of correction is high, and so one needs to pay sustained attention. Paying sustained attention requires effort and consumes energy, so one tires fairly quickly.

Third, let’s face it: unlike a video game, which also consumes a lot of energy, voice or audio alone does not give you that re-energizing adrenaline rush that you get from playing a game. So no matter how much you are loving what you are listening to, your battery, so to speak, is going to deplete as time passes.

And last, unlike the smartphone, which lives under your thumb and upon which you can force your will as you please, with voice, you can’t be so domineering, so willful: your relationship with the medium is that of a partner, at best, but mainly that of a subordinate. Perhaps a good metaphor would be the car: with the smartphone, you are the driver; with the smart speaker (or voice and audio in general), you are a passenger, sitting in the back, right side, while your chauffeur is taking you where you want to go. Sometimes, you ask the chauffeur to stop here or there for you to pick something up, or you suggest a turn here or there for a faster route, or you ask them to please slow down if you feel they are speeding, but by and large, you are a passenger, and the driver is driving. This means that the power to manipulate and dictate, and the thrill of being in charge and imposing your will, are not there. And one can take being a subordinate for only so long before the will begins to agitate for some freedom and some action.

Having said all this, and having witnessed how the attention-exploitation industry has managed to rapidly grow and thrive over the last decade or so, with its must-read books on how to hook users to products, one would be wise to hold off final judgment on the proposition that voice and audio have less of a potential to be addictive. After all, if the exponential growth in the adoption of audio and voice continues apace, the lure of the dollar will be far too strong for the merchants of attention to pass up the opportunity to solve “the problem” of the nonaddictive nature of voice.