Post Road Magazine #35

Hearing the Text

Ran Keren

The Ancients said that the animals are taught through their organs; let me add to this, so are men, but they have the advantage of teaching their organs in return.

From Goethe’s last letter, dated March 17, 1832


Every time I enter a professor’s office, I am intrigued by the books arranged on the shelves. The number, type, order—and whether Marx’s Capital gets an honorary location, if it appears at all—are some of the details that cross my mind. I’m a graduate student, and so I do a lot of thinking about books: those I’m required to read, and those I hope to write. Bookshelves reveal histories and tell stories even before the text between spines is revealed. For a long time, I planned the bookshelves of my imaginary future office. But this plan shifted dramatically some years ago when I almost completely ceased acquiring books in their material form. I now complete almost all my reading—or, more accurately, my listening—through audio narration. I amuse myself with the idea of arranging dust jackets side-by-side on my future bookshelves as an apologetic representation of the books I audio-read, avoiding the awkwardness of a naked office.

With time and use, I have learned a new way of reading (and learning) that began as a matter of efficiency but has gradually taken on qualitative value. While this highly efficient and widely available tool of engaging with text through audio has emerged in an era when academics suffer from a chronic shortage of time, very few have adopted the practice as routine. Perhaps the use of new technologies and the energy required to invest in new practices (and reading is no doubt a fundamental one) keep most from making the transition.

Assuming you are visually absorbing this text rather than using audio narration, let me guess your response to this changeover. “This is not for me, I prefer a good old-fashioned book, the smell and the touch of the paper, and the micro-sonic boom wave when flipping a page.” You may also think that the listening mode fits some better than others; that it does not suit serious reading; and that you are among those who prefer to encounter the text with their eyes. Besides, why audio at a time when we are witnessing the rise of the “visual” and “visual learning”?[1] I myself have a soft spot for books made of paper. Other options were hardly available when I was growing up. And reading is not only an intellectual activity but also a phenomenological one, deeply imprinted into experience. But now, whether for my studies or to satisfy my desire to read beyond scholarly work, audio narration provides me with such a powerful tool that sometimes there is an overwhelming sensation of cheating. This makes writing about it here feel like an act of full disclosure.

My belief, and my main premise in this essay, is that reading practices can be accumulative rather than a zero-sum game. I suggest that the rule of the logos might be disrupted, not from an inside act of deconstruction, but from the outside. If we have learned anything from the exponential growth in media technologies and platforms, it is that these forms tend to proliferate rather than cancel each other out. The phone and the videoconference have not obliterated written correspondence or face-to-face interactions; theatre endures despite cinema and television; newspapers, magazines, and books, whether in physical, digital, or audio form, seem to be proliferating almost synergistically with traditional print, rather than in conflict with it, as some of the more pessimistic among us predicted. We simply read more.

A skilled reader may gain a new reading ability, similar in essence, but also different in some ways. The meaning and consequences of the text as medium have occupied philosophers, writers, and thinkers from Plato through Dante and Derrida. For them and many others, language was at once a working and a thinking tool, a form of expression, and a system of communication, and therefore had a crucial role, almost totalitarian, in the shaping of cognition and social life. This was a role they explored, dissected, fought against, and attempted to expose, bend, or break altogether. Here I’ll focus on those who paid particular attention to the medium in which language operates and to the possible relationships between the text’s form, its content, and its function.


New Technologies of Audio Narration – A Non-exhaustive Overview  

The first commercial text-to-speech conversion systems emerged at the end of the 1970s and were designed to help visually and vocally impaired people communicate, obtain information, and operate computers.[2] These software programs do more than simply provide a narration of text; they also provide instructions and guidance to help users navigate and operate computers without using sight. New versions of these interactive technologies are currently experiencing a renaissance with the operation of smart devices, smart houses, and (not very smart) customer service systems.

There is also technology focused on the capability of generating an audio narration instantly using a computerized system that is able to “read” a digital text. It is the sort of technology that became well-known—and relentlessly satirized—in association with Stephen Hawking’s speech synthesizer. The narration is not yet at a level that matches the subtleties of human speech, but it has been gradually improving in quality, flow, and ability to consider the grammar and the punctuation of a text. Equivalent technologies, capable of generating a digital soundtrack, either in real time or by creating a file, are already available in many of the computers and devices commonly used today.

The reverse order, the use of speech to generate text, is worthy of consideration as well. The technology for such a process has been progressing rapidly. New software programs are not only capable of generating digital text from human speech, but also of “learning,” and can adjust to the narrator’s accent. Generating text from speech is different from writing, both cognitively and expressively. Jean-Jacques Rousseau strongly criticized the act of writing as a corruption of speech, which he perceived as the more spontaneous and genuine form of the two,[3] but like most authors of past centuries, he ultimately used a conventional process of writing as his chief mode of expression. Still, there are a few noteworthy cases in which authors preferred dictation. Frantz Fanon[4], Goethe in old age, Walter Scott, and Henry James used dictation, though in James’ case, as René Wellek and Austin Warren confirm, “the structure [had] been thought out in advance, [and] the verbal texture [was] extemporized” (1949: 82). John Milton dictated Paradise Lost after going blind, and Mark Twain also sometimes relied on dictation. Finally, Adam Smith (whose preference for spontaneity over routine became clear in his debate with Denis Diderot) used to “walk up and down his room dictating to his secretary” (Harding 2012 [1940]: 40), which irresistibly raises a question regarding the origins of his idea of “the invisible hand.”[5]

For today’s practices of reading, newly developed apps are capable of identifying and ignoring information like headers and page numbers, in order to deliver a “filtered” audio version of the text. Similarly, apps for podcasts, another rapidly growing audio genre, are able to remove pauses and gaps in the audio track and deliver a more concise version of it. Recently, publishers and distributors of fiction, academic literature, and news have started offering audio narration either by using technologies that are built into digital devices, like the Kindle, or through their online platforms. Some of these technologies allow a manipulation of the style and speed of the narration, or the ability to save a soundtrack of the text in a digital file, like an MP3. One of the most useful technologies allows the generation of audio files from digital texts, like a webpage, an email inbox, or a PDF document. Once created as a file, an audio track can be easily transferred onto a small device that can be carried and used comfortably during daily activities. Text, then, can be engaged during times in which reading is not possible.[6]

These new technologies join the already wide use of human narration of audiobooks, which in many cases is performed by the text’s author, a media persona, or an actor. Audiobooks represent one of the fastest growing markets in recent years. According to the Association of American Publishers, while eBook sales declined from 2015 to 2016, audiobooks continued their double-digit growth that year.[7] Admittedly, audiobooks still comprise a relatively small fraction of overall book sales, but the estimates ignore the use of synthetic text-to-speech (the built-in capability of electronic readers to generate a soundtrack from a digital text) and therefore cannot take into account books that are purchased as eBooks but read using the audio narration function: there might be even more audio-readers out there than we think.

Among these technologies, I have found the mechanically generated audio files to be the most helpful. With the assistance of a miniature iPod, texts have accompanied me, whispered into my ears while I walked the dogs, commuted, and gardened.

There are some downsides. Adding written notes or markings is possible only with an available version of a written text. I usually turn to my computer to mark and annotate the digital file, and I occasionally keep a hard copy nearby. Another problem is that charts, pictures, and illustrations cannot be transformed into audio. Text is fundamentally linear and moves through time, while illustrations tend to rely on space. And I have not yet found a way to skim text with the ear. Some technologies allow running a text at three or four times the original speed, but the effect is hardly similar to skimming. Most of the materials we read are essentially linear. Words accumulate into paragraphs that accumulate into chapters, which move, somewhat urgently, forward to form a text with a corresponding mental imagery. What happens when the eyes are released from the duty to lead the way forward, and give way to the ears?


Lessons from Hearing Texts

Some texts might demand a high degree of internalization that cannot be easily obtained while listening and performing other activities at the same time. The level of abstraction and density of meaning are important, and the sound of text tends to make words and ideas more remote, especially to untrained ears, or when the narration is non-human. However, meaning somehow gains body and form with practice and time, a process that might be rooted more deeply than in mere habituation. Recent studies have found that blind and visually impaired people can effectively listen to ultra-fast synthetic speech, far beyond the highest ability of trained sighted people. This by itself might not be a surprise, but more fascinating is the finding that blind people recruit the visual cortex for this activity.[8] This is only one example in an increasing understanding of the flexible relationships between our cognitive and sensory functions. As Oliver Sacks puts it, there is no “purely visual, or purely auditory, or purely anything.”[9] Yet, until habituation, compensation, or both kick in to make one an efficient listener, there are other ways to benefit from the practice of listening.

Audio reading can be used as an additional, rather than substitute, engagement with the text. Such a supplemental practice can be useful in acquiring an understanding of the main argument, and for picking up some details and contexts. In other words, it can be used to sketch a preliminary cognitive roadmap of the text. An additional function is using listening together with visual reading, which could be particularly helpful for non-native English speakers and slow readers, or simply to ease the pressure on the eyes.

Further advantages, beyond a supportive role, can emerge. The first and most obvious advantage is the capability of reading a large quantity of text with relatively little effort, especially during times that do not regularly allow visual reading. Surprisingly, I have found that devoting time solely to listening, without having any simultaneous engagement, is less effective. For some reason, when the body and sight are stilled and the entirety of cognition is dedicated to listening, there is a tendency to lose the thread, similar to the phenomenon of reading a text mechanically with no actual internalization of the content (daydreaming), which then requires going back to the point at which our attention lapsed. Therefore, if my experience is any indication, listening while walking, training, commuting, or engaging in some activity might be more effective than only listening. This could mean nothing less than revolutionary time management for heavy readers. The effort of reading decreases and the degree of control over the whole process increases. A crucial consequence of this efficiency is the ease of engaging with a text repeatedly, through a more cyclical rather than merely linear process, adding layers to the cognitive map generated as we read.

Repetition, as well as memorization, of text is a mostly forgotten technique. The rise of literacy (print, and now digital) has practically eliminated the need to use our brains as storage units for the transmission and preservation of texts. Yet repetition has constructive qualitative effects for text comprehension. This phenomenon is termed “The Repetition Effect”:[10] the cognitive map generated by the text becomes clearer and richer with each engagement, making the underlying message more coherent. In Mark Bruhn’s updating of theories of “time as space” using insights from contemporary research on cognition, he accounts repetition “the simplest or at least most obvious variety of spatial form... In terms of its cognitive effect, verbal repetition refreshes semantic traces that were previously activated in working memory but that are no longer focalized and are therefore in process of 'decaying,' until by repetition they are 'recalled/ To yet a second and a second life.'” (2015: 599-600).[11] On this account, the effect of repetition can be understood as a way to transform the horizontal linear reading into a spatial formation and solidify interconnections in the text. But repetition is only effective as long as the text is rich and coherent; if it is lacking, repetitions reveal contradictions and inconsistencies. The more complex and rich a text is, the higher the significance of repetitions, which ultimately lead to a better understanding of the author’s message and the architecture that supports it.

For me, audio narration makes repetition an almost effortless task and is particularly helpful when repeatedly reviewing my own writing. I assume I am not the only one who has experienced the difficulty of returning to a stubborn manuscript that awaits yet another round of revision. For most of us, a thoughtful and well-written text is the result of a refining process, which means numerous drafts. Listening to our own draft, by placing us closer to the position of a reader than of a writer, has the unique quality of dislocating us from the high intensity in which we tend to engage with our own text in visual reading, providing a certain distance and a new outlook on our own text.

In this endeavor, there are also some technical advantages to audio “reading.” Take for example a relatively mechanical activity like proofreading: listening helps to detect errors that we tend to ignore when using our vision. If we write a word merely similar to the one we actually meant, our cognition, which always runs forward at its own pace, sometimes reinforces expectation over the actuality of text, causing us to miss errors. The term “typoglycemia” identifies this phenomenon (it appears that no phenomenon nowadays remains un-tagged by a fancy scientific title). The following excerpt, taken from a text that circulated online in 2003 and has been famous since, provides an example.[12] “Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn’t mttaer in waht oredr the ltteers in a wrod are, the olny iprmoatnt tihng is taht the frist and lsat ltteer be in the rghit pclae”. While this text can be read without special effort, its soundtrack is totally incomprehensible (and I tried multiple times). The ear is more sensitive to typos that the eyes are prone to miss.


Text as a Medium

As to the social meaning of engaging text through audio: is it merely another tool in the tyranny of language, or could it be a democratizing force? It is possible to imagine the existence of language and communication based on only one sensory dimension. Language was developed long before text was; moreover, no one would argue that blind or deaf people do not use language. Therefore, there might be something peculiar in the insistence of Ferdinand de Saussure, the founding father of modern linguistics, both on the “sound-image” as one of two necessary sides in the completion of the sign and, like Rousseau, on the superiority of speech over writing. Meaning can certainly exist without sight or without hearing. Jacques Derrida, who scrutinized de Saussure’s theoretical formulation, criticized not only the idea that writing is subordinate to speech but also the idea that meaning can somehow be realized within a firm structure and in a stable form. Debunking de Saussure’s signifier/signified schema, Derrida identifies the existence of a chain of signifiers[13] rather than a “transcendental signified” as a firm anchor of meaning. But deconstruction, the mode of analysis Derrida popularized, is “an inside work,” turning the text against itself and defeating the medium with its own device, as Derrida explains: “The movements of deconstruction do not destroy structures from the outside. They are not possible and effective, nor can they take accurate aim, except by inhabiting those structures” (1976: 24). This attack on de Saussure was only a specific case in Derrida’s larger positioning against “logocentrism,” the idea that language can reflect reality and truth. Perhaps it was Derrida’s specific method of “inhabiting” the text that diverted his attention from looking across mediums, in the fashion made famous by Marshall McLuhan’s statement that “The Medium is the Message” (1967).[14]

Since it is possible that new technologies will soon allow all communication, including all “reading” and “writing,” to be based on audio and speech, it is particularly important to identify the distinct ways in which mediums of text operate. For example, what if in a futuristic world ruled by audio, the written word is hidden, coded in digital devices? And we need not think in futuristic terms in order to imagine a language with a concealed visual dimension, which functions as a mediating hidden form. It exists and is composed of the two digits “0” and “1” in what is called binary code, and it generates more text than any other language today. Those who know how to access, read, write, and manipulate this language hold a great deal of power. In this case the future Derrida must be a computer hacker. The point here is that careful attention must be given to the medium, since different mediums are embedded in different configurations and networks of authority, meaning, knowledge, and power relations.

Vilém Flusser’s account of text as a medium in his book Does Writing Have a Future? (1987) identifies a dialectical process through which culture and cognition interact with forms of text. Writing emerged in response to the rule of image, Flusser suggests, emphasizing that the most fundamental effect of using written text stems from its linear nature. This linearity reinforces a strong sense of time, rejecting the cyclical or frozen time-perception associated with images. Therefore, written text marks the beginning of history not because of its inherent ability for documentation—the more prevalent application that comes to mind—but instead because of the strong sense of time the written text induces. 

Mikhail Bakhtin arrived at a very similar conclusion, but by a different route. For Bakhtin, the ancient epic, a form anchored in oral culture, represented the certainty of destiny, while the novel, a form anchored in modern written text, opened up the future as a more democratic zone of possibilities.[15] A paradigmatic example of the epic can be found in the Iliad’s opening. “Achilles’ rage” marks the hero’s inevitable destiny; the ending is folded back to the opening. Marcel Proust’s writings represent the opposite pole, demonstrating how the modern novel summons the hero onto a road marked by junctions and shifts, in which the future is never clear; the road takes a more central role than the place to which it leads. In Bakhtin’s words, “the novelist is drawn toward everything that is not yet completed” (1981: 27).[16]

Therefore, for Flusser, the written text has given us a collective historical consciousness, replacing a cyclical sense of existence. Following this logic, it can be said that the formation of text is like attaching wheels to time, allowing us to navigate (or to “write”) the path on which we march, that is, to write our history. Similarly, for Bakhtin, modern forms of text represent an opening of the future to multiple possible trajectories. However, while this process is potentially democratizing, it can impose fast changes and an overload of possibilities that can be confusing. It might not be a coincidence that on the verge of modernity Miguel de Cervantes wrote about a hero who read too much until “[this] poor gentleman lost his wits.” While oral culture and the epic were used for social synchronization, like a moral torch, or perhaps a torch with morals, modern written literature, in particular the novel and fiction, functions more as cognitive play that provides the individual with a platform for exercising the social-cognitive muscle.[17] Modern fiction has an inherently democratic potential. The reader of the novel benefits from a playful game of predictions that might or might not turn out to be accurate. Released from the grip in which religious authority held it during the Middle Ages, the written text broke into modernity with a liberating potential that has resulted in powerful consequences but also challenges.

One challenge stems from our relationship with doubt. We tend to forget that the basic foundation of the scientific method was indeed doubt, not certainty. Modernity has generated ongoing experimentation and re-evaluations, utilizing the scientific method to design our environment and material lives. But armed with this additional form of power and authority, as Max Weber keenly observed, rationality threatens the whole modern project, which seems to lose its original flexibility, forming an iron cage made of reason instead. In the realm of meaning and ideas, the limitations of the Enlightenment project were most forcefully exposed by the critique of postmodern theories, which have seemed to rob us of modernity’s playfulness, or perhaps to make the point that playfulness, not control, is all we have. However, the over-attention to the structure of meaning-making may have neglected the importance of the form and the medium, which may carry their own power to open up new ways of expression and conceptualization. New mediums can emerge not only from human creativity but also from technology, and if we fail to pay attention to these technologies we may well fall under Bruno Latour’s indictment that in fact We Have Never Been Modern, the title of his 1991 book.

Directly addressing these issues, Flusser suggests that writing is no longer capable of keeping up with the development of human life and cognition, and therefore new forms must emerge in two possible directions. The less probable for him was a return to the image, with the more likely path being the formation of digitally-based language, which can be found in the increasing intervention of algorithms in human lives. However, can there be other possible paths? The accumulation of forms or the creation of hybrids can provide us with new tools of expression and thinking. Future literacy will mean mastering multiple mediums, which enable engaging ideas through multiple channels, either separately or in tandem. We are already capable of listening to a text while reading it, or engaging with the same text through different mediums. We talk about the importance of visual literacy, but we need to add auditory literacy to the discussion, and maybe even hybrid literacy, and ask questions such as: What are the consequences of the fact that audio narration is available in only a few languages? How could we harness new technologies and mediums to disrupt the rule of the logos?

There are different ways in which the use of this new technology can play out. A new generation may discover the great efficiency and the ease of audio narration at a young age and make this practice their default option as a matter of convenience, and this in turn could make them one-dimensional readers. But still, perhaps they will also be capable of “super speed-reading,” as many blind people already are, and master text in its audio form. If so, perhaps like Adam Smith they will practice dictation, rather than the writing of texts. I wonder what their essays will be like. One thing is certain: more students would actually be able to complete their assigned reading.

Perhaps repetitive listening will break the rigidity and linearity of academic text and bring it closer to a democratic reading of fiction, adding thickness and form through a cyclical process. This democratization of text is more likely to occur under a process of proliferation rather than a dialectical shift in reading method. Is this what we are about to experience? Some claim that the emoji is now the fastest growing language in history.[18] Surprised? This would have certainly surprised Flusser if he were still with us today. Whoever is interested in this accumulation of communication forms might benefit from the process of expanding beyond familiar reading habits. Take it from Goethe that we can teach our organs and not only be taught by them. In return it would help us to gain another viewpoint, one important for critical thinking.


Bakhtin, Mikhail. 1981. The Dialogic Imagination: Four Essays by Mikhail Bakhtin. Edited by Michael Holquist and translated by Caryl Emerson and Michael Holquist. University of Texas Press.

Bruhn, Mark J. 2015. "Time as Space in the Structure of Literary Experience: The Prelude." In The Oxford Handbook of Cognitive Literary Studies, Lisa Zunshine (ed.), 593-612. Oxford University Press.

Chandler, Daniel. 1995. The Act of Writing. University of Wales, Aberystwyth.

Cherki, Alice. 2006. Frantz Fanon: A Portrait. Cornell University Press.

Cioletti, Amanda. 2016. A Universal Language. License Global. 19(3), 194–197.

Derrida, Jacques. 1976. Of Grammatology. Trans. Gayatri Chakravorty Spivak. The Johns Hopkins University Press.

Drucker, Johanna. 2014. Graphesis: Visual Forms of Knowledge Production. Harvard University Press.

Elkins, James (ed.). 2009. Visual Literacy. Routledge.

Harding, E. M. Rosamond. 2012 [1940]. Anatomy of Inspiration. Routledge.

Hertrich, Ingo, Susanne Dietrich, Anja Moos, Jürgen Trouvain and Hermann Ackermann. 2009. “Enhanced Speech Perception Capabilities in a Blind Listener are Associated with Activation of Fusiform Gyrus and Primary Visual Cortex.” Neurocase 15 (2), 163-170.

Kuzmičová, Anežka. 2016. "Audiobooks and Print Narrative: Similarities in Text Experience." In Jarmila Mildorf and Till Kinzel (eds.), Audionarratology: Interfaces of Sound and Narrative, 217-238.

Latour, Bruno. 2012. We Have Never Been Modern. Harvard University Press.

McLuhan, Marshall and Quentin Fiore. 1967. The Medium is the Massage: An Inventory of Effects. Bantam Books.

Wellek, René and Austin Warren. 1963. Theory of Literature. New York, Harcourt [La Théorie littéraire. 1949].

Wheaton, Oliver. 2015. “Emoji is the Fastest Growing ‘Language’ in the UK.” BBC.



[1]See Elkins (ed. 2009) and Drucker (2014).

[2]See Klatt, Dennis H. 1987. "Review of Text‐to‐speech Conversion for English." The Journal of the Acoustical Society of America 82.3: 737-793.

[3]This argument was a major subject for scrutiny by Derrida in Of Grammatology (1976).

[4]See Cherki (2006).

[5]The term “invisible hand,” coined by Adam Smith, has become the paradigmatic metaphor for the idea of a self-regulated market. In fact, it was mentioned only three times throughout Smith’s writings and only once in the context commonly used today.

[6]As Kuzmičová observes, “A printed book is portable, but the audiobook is essentially defined by its portability” (2016: 219).



[8]See Hertrich et al. (2009). See also the story of Isaac Lidsky (who, after losing his eyesight, developed the ability to read 700 words per minute, compared with 200 on average).

[9]In Sacks, Oliver. 2003. "The Mind’s Eye." New Yorker 28: 48-59.

[10]The existence of this phenomenon is widely acknowledged, but its mechanism is still unclear. See Ch. 1 in Collins, W. Matthew. 2008. “Memorial Influences on the Fluency of Text Processing” (PhD thesis).

[11]The last phrase is borrowed from Wordsworth’s “The Recluse” (in The Poetical Works of William Wordsworth: The Excursion; The Recluse, Part I, Book I, 1949).


[13]Derrida writes, “Through this sequence of supplements a necessity is announced: that of an infinite chain, ineluctably multiplying the supplementary mediations that produce the sense of the very thing they defer: the mirage of the thing itself, of immediate presence, of originary perception” (Of Grammatology 1976: 157).

[14]See Bernard Stiegler’s criticism on this point in Echographies of Television: Filmed Interviews. 2002.

[15]Bakhtin writes, "Prophecy is characteristic for the epic, prediction for the novel" (1981: 31).     

[16]Marx also identified the centrality of time in modern society, which he perceived as one of the consequences of capitalism. As he writes, “Capital by its nature drives beyond every spatial barrier....[therefore] the annihilation of space by time—becomes an extraordinary necessity for it” (Grundrisse 1973 [1858]: 539).

[17]See Zunshine, Lisa. 2006. Why We Read Fiction: Theory of Mind and the Novel. Ohio State University Press.

[18]See Wheaton (2015) and Cioletti (2016).

 Copyright © 2018 | Post Road Magazine | All Rights Reserved