Why don't we talk to our computers?
The technology has arrived. Our comfort levels haven't caught up.
In Star Trek IV: The Voyage Home there’s a scene that I still remember well. Scotty, having been transported back to the mid-1980s, sits down in front of a computer and speaks to it. Nothing happens. He’s offered a mouse, picks it up and cheerfully holds it to his mouth.
“Hello, computer.”
Still nothing, until he’s offered a keyboard.
The gag works because it implies that speech is the obvious interface of the future. Only primitive technology would still need us to type, right?
That movie was nearly forty years ago. We have natural language AI, speech recognition, microphones, earbuds, and powerful processing models. Most of that future is already here.
And when it’s time to interact with our computers, we still place our hands on the keyboard. To type, write, and edit. Talking to our computers feels kinda weird, and it’s almost never the default (we think it strange if someone else strays from that default). Especially outside our homes.
Speech technology hasn’t failed. We use it all the time. But it seems like it’s mostly in private. I tell Alexa to set a timer while I’m cooking. We change the music, adjust the thermostat, or dim the lights. All very domestic interactions, whispered between our own walls. Nobody’s watching, and the stakes are low. And nobody really cares what we’re saying to our appliances.
But the moment we move into shared spaces, we fall silent. Very few people dictate text messages on the train (and we often look strangely at the people who do). We’re not seeing people walking into a library and saying “summarize this article” in a clear and confident tone. Even in our most open-plan, designed-for-collaboration offices, everyone is quietly typing.
We’re speaking to machines at home. We’re typing to them in public.
So if it’s not a technological problem, it must be a cultural one.
Speech is expressive, and it’s social. Sometimes it can be as likely to reveal uncertainty as intent. Typing is private, and it lets us hesitate, erase. If I mistype a sentence it can be erased, but saying something dumb by accident can hang in the air. Speech is performing, typing is more concealed. And I think for most people, professional lives are more than a little concealed.
I’ve heard the argument that speech interfaces lack precision. Work needs exactness, and language is messy. But this age of AI interaction - where our engagement can be imprecise and metaphorical - seems to disprove that argument. We use adjectives, mood, fragments. It’s one of the biggest reasons why AI can feel approachable. We tell it what we want, not the specifics of how to do it. And the language models can cope by inferring, even guessing.
My guess is that speaking leaves us feeling more exposed. When we’re talking to a machine, other people might overhear. And that’s more embarrassing, somehow, than if we’re overheard talking to another person. We worry about sounding foolish - not to the computer we’re talking to, but to the real person who’s walking past at the same time.
Technological improvement isn’t going to help speech become the dominant interface. It needs our relationship to machines to change. I converse with people, I instruct objects. We avoid speech because computers still feel like tools. If they begin to feel like collaborators, speech might follow. But it might also raise further anxiety.
Scotty’s joke wasn’t about the keyboard or the mouse. It was about a future in which speaking to a computer wasn’t just possible, it was entirely unremarkable. We’ve got that technology. We don’t have anything like that level of comfort. Talking to a machine still feels like talking to yourself. We can do it, we just don’t like to do it where anyone else can hear.
Further reading
Banks, David. Why We Are Uncomfortable Talking to Our Computers. The Society Pages (Univ. of Minnesota), Oct 2016.
Great Moments in Star Trek History - Hello, Computer (YouTube)
