Linguistics 001     Lecture 22     Reading and Writing

What is writing?

Writing is not language, but merely a way of recording language by visible marks.

      -Leonard Bloomfield, Language (1933)

Some version of this is clearly true, as we can see by looking at the history of the human species and of each human individual. Writing is an "optional feature" of human culture, consciously invented and developed in a few relatively recent societies, whereas every historically documented human group has always had a spoken language. Spoken language is learned in the cradle by all normal human children, without any apparent organized instruction, whereas writing is learned later in life, only by some, and through explicit instruction. There are good reasons to think that the human species has evolved to make spoken language more natural and more effective; there is no reason to think that biological evolution has been affected by the existence of writing.

Another way to express Bloomfield's point is to say that writing is parasitic on speech, expressing some but not all of the things that speech expresses. Specifically, writing systems convey the sequence of known morphemes in a real or hypothetical utterance, and indicate (usually somewhat less well) the pronunciation of morphemes not already known to the reader. Aspects of speech that writing leaves out include emphasis, intonation, tone of voice, accent or dialect, and individual characteristics.

Some caveats are in order. In the first place, writing is usually not used for "recording language" in the sense of transcribing speech. Writing may substitute for speech, as in a letter, or may deploy the expressive resources of spoken language in visual structures (such as tables) that can't easily be replicated in spoken form at all.

In the second place, writing systems may include some conventions that are substantially autonomous of speech. For example, Geoff Nunberg has argued that punctuation in English is not only or even primarily a representation of the phrasing and intonation of spoken English, but rather an autonomous system for indicating certain kinds of textual relationships. Words that are pronounced the same way ("homophones") may not be written the same way ("homographs"), and vice versa.

Still, Bloomfield was basically correct: writing is a way of using "visible marks" to point to pieces of real or hypothetical spoken language.

Types of writing

There are no pure systems of writing, just as there are no pure races in anthropology and no pure languages in linguistics.

-I.J. Gelb, A Study of Writing (1963)

We asserted above that writing systems convey the sequence of known morphemes in a real or hypothetical utterance, and indicate (usually somewhat less well) the pronunciation of morphemes not already known to the reader. This is true of all conventional orthographic systems, present and past, but these systems accomplish more or less the same thing in various rather different ways.

In discussions of writing systems, you will sometimes see typological terminology like the following, with particular writing systems given as examples of each type:

Type of writing    Meaning
Pictographic       Elements are pictures, combined in graphically-interpretable patterns (e.g. temporal sequence or spatial relationship)
Ideographic        Elements denote ideas, combined in a logical fashion
Logographic        Elements denote words or morphemes, combined morphosyntactically
Syllabic           Elements denote syllables, combined phonologically
Moraic             Like syllabic elements, but units are a bit smaller (see discussion of Japanese kana below)
Alphabetic         Elements denote phonemes (more or less), combined phonologically
Featural           Elements denote distinctive features of phonemes (such as voicing or place of articulation), combined phonologically

This typology seems very rational, but in fact it is misleading, as rational taxonomies often are. All documented writing systems are a mixture of two or (usually) more of these categories, and all include a significant phonological aspect. This critique has been most fully developed by John DeFrancis in his book Visible Speech, from which most of the examples below have been taken.

Given the definitions of writing we've given so far, pictographic and ideographic systems would not be included, since they are not ways of "recording language," but rather ways of directly picturing things, events and their relationships. Interestingly, as a matter of empirical fact, it seems that pictographic and ideographic systems have never really developed fully as such. This is not to say that people have never conveyed information with pictures, nor that sets of conventional icons standing for language-independent ideas have never been developed and used. In fact, pictographic and ideographic signs played a central role in the (various) inventions of writing.

However, pictographic or ideographic systems as such have never developed into a form fully capable of conveying unlimited messages from one person to another. Instead, they either remain as limited systems operating within a highly restricted application -- say to keep warehouse records -- or else they develop into a genuine writing system, capable of conveying any linguistic message. In the second case, the process of development into a genuine writing system always involves adding some phonological aspects, in ways we'll describe shortly.

Origins of writing

When they appear in the archeological record about 5,500 years ago, the Sumerians had developed a system of icons inscribed on clay tablets for keeping temple records. A typical example includes icons for "two", "sheep", "temple/house", and the gods "An" and "Inanna". The meaning might be "two sheep received from the temple of An and Inanna", or "two sheep delivered to the temple of An and Inanna", or perhaps something else entirely.

The tablet whose picture is shown here shows a more sophisticated use of a numbering system, as well as a way of specifying the accounting period, but the basic principles are similar.

These marks constituted a limited notation system, which in the beginning may only have served to remind the writer of what he had once already known. However, as long as agreed-on standards were obeyed, another person could also read the record in the same way. In this respect, they were similar to systems for record-keeping, based on symbolic tokens of many sorts, developed over and over again in many cultures over the millennia -- marks on stone or bone, clay figurines, even knots in cords. As civilizations become more complex, record-keeping of this kind becomes increasingly important in order to keep commercial transactions straight. The ability of trained third parties to read such records in a consistent way becomes increasingly important as systems for mediating or adjudicating disputes in non-violent ways come into use. However, most such systems remained limited in their expressive capacity.

In the case of the Sumerian record-keeping system, two crucial innovations led (over a few hundred years) to a full writing system, capable of expressing anything that could be expressed in the (written) words of the Sumerian language.

The first innovation was the Rebus Principle: if you can't make a picture of something, use a picture of something with the same sound. The first clear example of this is in a tablet from Jemdet Nasr, dated to around 2900 BC, in which a pictograph of a reed (GI in Sumerian) is used to mean "reimburse" (also pronounced GI).

The second innovation was what we might call the Charades Principle: if you combine an ambiguous or vague picture of the meaning of a word with a little information about what the word sounds like, you can get a more effective communication of the identity of the word than if you tried to use only imperfect information about meaning, or imperfect information about sound. To give an example from Sumerian, a particular symbol having a meaning something like "leg" might be combined with a symbol pronounced "ba" to give the word GUB "to stand"; the same "leg" symbol, combined with a symbol pronounced "na", gave the word GIN "to go"; and combined with a symbol pronounced "ma", it gave the word TUM "to bring." Thus a Sumerian reader was in effect being asked to play a sort of game of charades: what word has something to do with "leg" and ends in the initial sound of "ba"? -- why of course, that's GUB, "to stand", what else! These combinations became conventionalized, resulting in a system that was presumably somewhat easier to learn to read than to learn to write, but was not very efficient in either direction.
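
The charades game described above can be sketched as a small lookup. This is only a toy illustration: the three words and their "leg" / "ba" / "na" / "ma" hints come from the Sumerian example in the text, while the LEXICON layout and the function names are hypothetical and make no claim about how real sign lists are organized.

```python
# Toy model of the "Charades Principle": meaning hint + sound hint -> word.
# The three entries are the Sumerian examples from the text; the data
# structure and function names are illustrative assumptions.
LEXICON = {
    "GUB": {"gloss": "to stand", "semantic": "leg"},
    "GIN": {"gloss": "to go", "semantic": "leg"},
    "TUM": {"gloss": "to bring", "semantic": "leg"},
}


def by_meaning(semantic_hint):
    """Words compatible with the semantic sign alone (usually ambiguous)."""
    return [w for w, info in LEXICON.items() if info["semantic"] == semantic_hint]


def by_meaning_and_sound(semantic_hint, phonetic_hint):
    """Keep only words that end in the initial sound of the phonetic sign."""
    first_sound = phonetic_hint[0].upper()
    return [w for w in by_meaning(semantic_hint) if w.endswith(first_sound)]


print(by_meaning("leg"))                  # all three words: the picture alone is vague
print(by_meaning_and_sound("leg", "ba"))  # only GUB, "to stand"
```

Neither hint is sufficient by itself (the sound hint "ba" alone would fit any word ending in B), but together they pick out a single word in this toy lexicon.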

Still, the result was a complete writing system, in which the Sumerians wrote down not just warehouse records, but poems, diplomatic treaties, letters, contracts and judicial decisions, dictionaries, and epic myths.

We can see a modern version of a similar system in Chinese characters. Most characters can be analyzed as containing two elements, one of which provides semantic information, while the other provides phonological information. The following small table (from DeFrancis) illustrates this with a set of four semantic elements crossed with a set of four phonological (or as DeFrancis calls it "phonetic") elements. The numbering of the semantic elements is taken from a standard set of 214 that have been recognized at least since the Kang Xi dictionary of the 18th century, while the numbering of the phonetic elements is taken from a list of 895 compiled by Soothill.

It is clearly inappropriate to call the Chinese system "ideographic", as is sometimes done. Chinese characters refer to morphemes, not ideas. However, to the extent that the pattern in the table above is taken as typical (and DeFrancis claims that about 75% of all Chinese characters work like these examples), Chinese characters are simultaneously a kind of syllabic writing. DeFrancis suggests the term "morpho-syllabic" to describe it.

It can be argued that the degree of phonological information found in the Chinese writing system is not radically different from what is found in English. English spelling usually tells us what the morphemes are, but unless we know in advance, it gives us only imperfect information about pronunciation. We can be sure that "tough" will not be pronounced "congressional" or "halter", but only knowledge of the word itself tells us that it rhymes with "rough" and not with "dough" or "through" or "plough".
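
As a rough illustration of this point, the sketch below groups the five "-ough" words by rhyme. The broad IPA transcriptions are an assumption added here (they are not in the original notes), and the rhyme function is a deliberately crude helper that only handles these five words.

```python
# Same spelling pattern, different rhymes: the five "-ough" words from the text.
# The broad IPA transcriptions are added for illustration (assumption).
OUGH_WORDS = {
    "tough": "tʌf",
    "rough": "rʌf",
    "dough": "doʊ",
    "through": "θru",
    "plough": "plaʊ",
}

VOWEL_SYMBOLS = set("ʌoʊua")  # just enough vowel symbols for these five words


def rhyme(ipa):
    """Crude rhyme: everything from the first vowel symbol onward."""
    for i, ch in enumerate(ipa):
        if ch in VOWEL_SYMBOLS:
            return ipa[i:]
    return ipa


groups = {}
for word, ipa in OUGH_WORDS.items():
    groups.setdefault(rhyme(ipa), []).append(word)

for r, words in groups.items():
    print(r, words)
```

Only "tough" and "rough" end up in the same group; the shared spelling "-ough" maps to four different rhymes, so the spelling alone cannot tell a reader which one applies.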

Egyptian hieroglyphics also combined pictographic and phonological aspects, often in complicated ways, as the example below suggests. This is the word hememu "humanity". It starts with four symbols denoting the four consonants in the word (the symbol glossed with /u/ is actually /w/). It ends with three semantic determinatives: a seated man, a seated woman, and a set of three lines indicating that multiple entities are referenced.

This approach to writing produced a small number of symbols with simple phonetic values -- Egyptian had 24 simple consonant symbols, shown below -- and led naturally to the development of alphabetic writing systems.

Why pictographic/ideographic writing is not practical

No one has ever developed a full communications system based on pictographic or ideographic principles, although people have often surmised that this would be useful, because it would (or at least could) be universal. The problem is that universality means only that it is equally hard for everyone to develop and learn such a system. If it is feasible to design such a system at all, it is at least very, very difficult. Since everyone already knows at least one ordinary spoken language, practical people will always tend to give up on the ideographic system and start using a written form of their speech, as soon as they can figure out how to do this.

For an amusing myth about this process, check out the story of How the first letter was written, from Rudyard Kipling's Just So Stories.

Why phonological writing is (eventually) practical

It is rather difficult to get enough conscious access to the phonological structure of speech to design an alphabetic writing system, and very few languages have small enough inventories of syllables for a syllabic system to be an easy place to start. More important, the idea of constructing a full writing system (on any basis, phonological or otherwise) is not at all an obvious one.

So writing seems to have started with pictograms for mnemonic aids in record keeping, or as vehicles of insight in divination. As the inventory of signs increases, the possibility arises to begin using some of the signs as rebuses or as phonological/semantic combinations. This is much more efficient than trying to design a new symbol for every word or morpheme. Once this meaning-plus-sound process begins, it can develop into a full (if complex and inefficient) writing system, able to encode any passage in the language. This development seems to have occurred independently at least three times: in the Middle East; in China; and in Mexico.

Various other developments are then logically possible. The Chinese (and other cultures influenced by them, including Japan) developed a meaning-plus-sound system based on the syllabic unit. The Mayans did the same. A logical next step is to increase efficiency by doing away with some or all of the meaning-related units, in favor of a consistent syllabary of some sort. Such syllabaries were developed throughout the Far East, but in most cases they did not displace the meaning-plus-sound elements. Instead they supplemented them for certain uses (such as the encoding of grammatical particles in Japanese) or for certain populations (such as women in some places and periods in China).

By contrast, the Egyptians (and speakers of various Semitic languages) developed a meaning-plus-sound system based primarily or solely on consonants. This naturally led to purely consonantal writing systems for some of the Semitic languages (such as Phoenician), which shared with Egyptian the property of changing vowels extensively for morphological purposes. To give an example from Hebrew, the root /ktb/ can have among many other forms katav "I wrote", kotav "I write", katoov "written", kitav "letters", katban "scribe". In a language that works this way, it's natural to factor words into consonants and vowels, and to start with a sort of acronym-like use of pictographs to denote their initial consonant, in a meaning-plus-sound system based on consonants only. In the case of most if not all Semitic languages, it has turned out to be usually possible to figure out the vowels from context, even without adding semantic determinatives. This made it possible to abandon the semantic determinatives without giving up general writing.
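
A minimal sketch of what consonant-only writing does to the Hebrew forms listed above. It works on the Roman transliterations given in the text, not on Hebrew script, and the skeleton helper is a hypothetical name. Note that the root's /b/ surfaces as "v" in several of these transliterations, although Hebrew script uses the same letter for both sounds.

```python
# Strip vowels from the transliterated forms of the root /ktb/ given in the text,
# to mimic what a consonant-only script records. Operating on Roman
# transliterations (not Hebrew script) is a simplifying assumption.
VOWELS = set("aeiou")

FORMS = ["katav", "kotav", "katoov", "kitav", "katban"]


def skeleton(word):
    """Return the consonant skeleton of a transliterated word."""
    return "".join(ch for ch in word.lower() if ch not in VOWELS)


by_skeleton = {}
for form in FORMS:
    by_skeleton.setdefault(skeleton(form), []).append(form)

for consonants, forms in by_skeleton.items():
    print(consonants, forms)
```

Four of the five forms collapse to the same skeleton "ktv", so a reader of the consonant-only spelling has to rely on context to recover which form is meant, as the paragraph above describes.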

Alphabetic systems seem to be rather unnatural, and have arguably been developed only once, by the Greeks when they adapted the Semitic consonant-only system to their language, which couldn't so easily be written without vowels. It is possible that this invention would not have happened at all without this particular historical sequence.

Part of the reason for the success of meaning-plus-sound systems is that two kinds of evidence are always better than one: two fairly lousy systems can be combined into one decent one. But there is another reason as well. Sound systems are made up of quite limited materials. In many languages, the number of distinct syllables is not terribly great; and the number is much reduced if natural equivalence classes (such as "starts with" or "rhymes with") are used. Once one gets started down the sound-system road, it is tempting to go all the way, since it makes the practical training of scribes more efficient to have to learn only a few dozen symbols, rather than several thousand. Of course, the Scribe's Guild may think it's just fine to limit the number of literate people, and to leave large barriers in place, blocking entry to skill in their profession.

Why is reading hard to learn?

For the same reasons that writing was hard to invent, reading is hard to learn. Neither reading nor writing is a biologically natural process. Alphabetic writing systems are in principle the most efficient, since they require learning the smallest number of symbols. It's unlikely that anyone would design a writing system today on any other basis. However, alphabetic systems seem to impose a special burden on learners, because they require understanding a level of analysis -- phonemic analysis -- that is relatively inaccessible to introspective scrutiny. (And getting past the cognitive barriers to "Phonological Awareness" has been shown to be a key factor in learning to read in an alphabetic orthography.)

The orthographic systems of some languages -- notably English -- also have many morpheme-related idiosyncrasies, which may eventually make it easier to recognize visually-presented words (just as the Chinese logographic system does), but which also may obscure the alphabetic principle for early learners.

Over the years, many long-term longitudinal studies have shown that about 60% of American children find it difficult to learn to read, and that 20-30% fall seriously behind or fail entirely. The reasons for these problems, and the best ways to deal with them, are a matter of great controversy. A great deal depends on the answers.

The opening salvo in one of this war's battles was fired more than 50 years ago by Rudolf Flesch in his 1955 book Why Johnny Can't Read.

In an attempt to present an authoritative consensus on this important topic, these lecture notes used to start with material from 25 years ago, in the form of quotes from congressional testimony given in 1998 and 1999 by Dr. G. Reid Lyon, Chief of the Child Development and Behavior Branch of the National Institute of Child Health and Human Development, which in turn is part of the National Institutes of Health (NIH). You can read those quotes in the 2022 version of this page.

A more recent exposition of the same ideas can be found in a Psychology Today article from September of this year, "Elite Universities Call for Change in Reading Education", 9/20/2023 -- showing that the "Reading Wars" have continued over the subsequent decades. For a more general summary of recent battles in the "Reading Wars", see Sarah Mervosh, "'Kids Can't Read': The Revolt That Is Taking On the Education Establishment".

A sample of links to other recent mass-media articles on the topic, from various perspectives, can be found here.

And for an in-depth survey of the science, see Mark Seidenberg's 2018 book "Language at the Speed of Sight: How We Read, Why So Many Can't, and What Can Be Done About It". As you can learn from that book, or from your own review of the literature, the key scientific foundations have been around for many decades -- which raises the interesting socio-cultural question of why educational practice (at least in the U.S.) has been so resistant to implementing the conclusions.


