Spoken Programming Languages?
Haskell may not be a language in the same way that French, Swahili, or Quechua are languages, but spoken languages and programming languages nonetheless have much in common.
However, traditional languages are much better adapted for speech than for writing; after all, we humans have been speaking for far longer than we have been writing. Language has been shaped enormously by what is and is not easy to convey with the sounds our mouths make, and text is at best a very lossy technique for recording those sounds: the content conveyed by tone, cadence, and emotion is largely lost.
If humans are naturally better adapted to speaking than writing or typing, could there perhaps be something useful or at least interesting to gain from exploring spoken code?
This kind of problem has arisen a few times before. Sometimes a blind person tries to write code. Sometimes it's someone with a wrist injury. While it's impressive that people can hack tools together to make such things work, I seriously doubt that pushing Java or Python through a screen reader is the best possible implementation of the idea. More likely it's a crude implementation by someone who just needed something that worked, without first having to solve the catch-22 of writing a ton of code to solve the problem of not being able to write any code.
And would a spoken programming language really only benefit the disabled, or could such a technology benefit everyone? I personally think most clearly and deeply when pacing around a room, and in conversation I naturally talk a lot with my hands. Sitting for extended periods of time benefits no one's health, and many of those with the aforementioned wrist injuries acquired them from excessive typing.
Meanwhile many people find it perfectly possible to talk nonstop without impact to their health.
Another potential motivation is multisensory computing; using different senses uses different parts of the brain. Using more senses makes use of more of the brain. Neurological multicore!
I've made the argument in the past that your desire to listen to music while working probably isn't due to music making you more productive “just because”, but is instead because the brain is adapted for highly multisensory environments where ignoring a sensory domain makes it harder to find your next meal and easier to wind up as someone else’s. Music for the sake of productivity is junk food for the brain; it's complex enough to satisfy this itch but is otherwise useless, having no relation to your work.
This suggests that a hybrid approach mixing auditory and visual programming could prove superior to either on its own. At the very least, we'd increase the number of meaningful bits per second streaming into the brain and get a different set of tradeoffs in how they could be used.
While a spoken programming language is on my list of things I'd like to build eventually, I'm not here to announce such a project today (I may have an unrelated project to announce next week though). Rather, I'd like to at least spark some interest in this idea in my readers and provide some suggestions of what such a language may look like so as to not leave anyone in search of a new side project completely in the dark.
There are of course challenges that must be creatively solved for such a language. Not all of these are obvious. I'm sure someone could spend an incredible amount of time designing a pronounceable syntax only to find in hindsight that the real effort should have gone toward a more obscure problem. Pronunciation seems to me to be by far the simplest part of this whole thing.
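As a toy illustration of why pronunciation is the easy part, here's a minimal sketch of a pronounceable surface syntax in Python. The vocabulary and the `transcribe` helper are purely hypothetical; the point is only that awkward-to-say symbols can be given one-to-one spoken names:

```python
# A minimal sketch of a pronounceable surface syntax: each spoken word maps
# one-to-one onto a token, sidestepping symbols that are awkward to say aloud.
# This vocabulary is purely hypothetical.
SPOKEN_TOKENS = {
    "plus": "+", "minus": "-", "times": "*", "over": "/",
    "equals": "=", "open": "(", "close": ")",
}

def transcribe(utterance: str) -> str:
    """Turn a spoken phrase into a line of code, word by word."""
    return " ".join(SPOKEN_TOKENS.get(word, word) for word in utterance.split())

print(transcribe("total equals price times count plus tax"))
# total = price * count + tax
```

A real design would need to handle grouping, multi-word identifiers, and homophones, which is where the harder problems begin.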
The most obvious challenge in my view is the loss of periphery. A single glance at a text editor can portray the rough structure of a hundred lines of code in an instant. You might only comprehend the little bit of code your fovea is scanning in the current saccade, but the more peripheral parts of your vision still serve a crucial role in the efficient navigation of code.
Then again, ordinary text-based code provides a shockingly poor periphery as well; that hundred lines of code visible in your editor at any moment is probably a pretty small part of your entire codebase. No wonder people spend so much more time reading code than writing it; we're trying to piece together the appearance of a mountain by inspecting every stone one at a time.
So this is a place where old-fashioned textual code sucks too; it just sucks a bit less.
Another potential challenge is ambiguity and verbosity. Natural language makes use of a lot of contextual inference, and requiring the explicitness of your typical programming language might feel awkward and uncomfortable. I'd expect a spoken programming language would at a minimum make heavy use of type inference, and it may even need to go above and beyond the usual features in this regard.
With all that said, audio offers a vast and rich possibility space, and any proper solution to the challenges of spoken programming will come from exploiting it.
Degrees of Freedom
You could ask the computer to recite the definition of a function, or at least a line or two (in a spoken language, perhaps these are better thought of as verses?). But you could also ask for the type of a variable, ask if a function contains any if statements or loops, or a variety of other questions.
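Queries like these are quite tractable with today's tooling. As a sketch, assuming Python as the language being queried, the standard `ast` module can already answer "does this function contain any if statements or loops?" The `contains` helper and the sample function below are my own illustrations, not part of any existing spoken-code tool:

```python
# A sketch of answering spoken queries about code without reciting it all,
# using Python's standard ast module.
import ast

SOURCE = """
def mean(xs):
    if not xs:
        return 0
    total = 0
    for x in xs:
        total += x
    return total / len(xs)
"""

def contains(func_name: str, node_types) -> bool:
    """Does the named function contain any node of the given AST types?"""
    tree = ast.parse(SOURCE)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            return any(isinstance(n, node_types) for n in ast.walk(node))
    return False

print(contains("mean", ast.If))                # "does mean contain an if?"
print(contains("mean", (ast.For, ast.While)))  # "does mean contain any loops?"
```

A spoken interface would sit on top of queries like these, turning "does mean have any loops?" into the appropriate AST walk and speaking the answer back.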
Another new freedom provided by speech is all the expressivity of speech beyond mere words. Does the intonation and cadence affect the meaning? How do stressed and emphasized words affect the meaning? They certainly affect the meaning in a conversation. What about when the computer reads the code back to you or replies to your questions? How does it use these features of language?
Then, what of the wide range of sounds beyond speech? Could there be useful sonifications of code? In the same way that syntax highlighting and indentation can help us visualize the structure of code, could the pitch, rhythm, and timbre of computer-generated noises portray such structure too?
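As one hypothetical sketch of such a sonification, nesting depth could be mapped to pitch, so that deeply nested code literally sounds higher. The `pitch_contour` helper and the whole-tone scale below are arbitrary choices of mine, just to make the idea concrete:

```python
# A sketch of sonifying code structure: map each line's nesting depth to a
# pitch, so deeper indentation sounds higher. The base pitch (220 Hz) and
# whole-tone step are arbitrary choices for illustration.
SOURCE = [
    "def f(xs):",
    "    for x in xs:",
    "        if x > 0:",
    "            yield x",
    "    return",
]

def pitch_contour(lines, base_hz=220.0, step=2 ** (2 / 12)):
    """One frequency per line: base pitch raised a whole tone per indent level."""
    contour = []
    for line in lines:
        depth = (len(line) - len(line.lstrip())) // 4  # 4-space indents
        contour.append(round(base_hz * step ** depth, 1))
    return contour

print(pitch_contour(SOURCE))
# [220.0, 246.9, 277.2, 311.1, 246.9]
```

Feeding such a contour to a tone generator would give each function a characteristic melodic shape, an auditory analogue of indentation.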
How might this scale? Could you listen to the sound of a thousand lines/verses of code per second and still make sense of it?
What about noise and filtering? The human brain is pretty good at picking voices out of noisy environments, but would adding a little bit of meaningful background noise help? Different environments echo and absorb different frequency ranges, and this can help distinguish details about the room around you. Speaking in a cave, or an auditorium, or a forest, or your living room all sound noticeably different. There's a certain silence that comes with freshly fallen snow. Could some contextual information be portrayed by mimicking this effect with a little audio filtering? There are a lot of possibilities here.
Spoken programming is a challenge in search of engineers with great creativity and lateral thinking. I hope I've highlighted the potential here, and I hope we can someday break out of the current ubiquity of C-style textual programming languages and get some much richer variety beyond the usual syntax quirks and type systems.
Thank you for reading this week's Bzogramming article. If a project like this interests you and you'd like to dig deeper into these questions, feel free to DM me on Twitter at @bzogrammer.
I'm trying this week to write two shorter articles rather than one long one. We'll see how that goes, as I'll also be moving across the country this week and will need to find time to write in between 20 hours of driving.
If you enjoyed this article, and find ideas like this interesting, consider sharing and subscribing to my newsletter. Most articles are free, though the occasional paid article slips in from time to time. Paid subscriptions are $7 per month or $75 per year.