Writing and Reading

1. The History of Writing

The dating of the beginning of human language is not easy, but we have a clear picture of the relevant interval for the upper and lower boundaries. There were hominids with a human-like vocal tract as early as 200,000 B.C., but they probably did not have a sufficiently developed nervous system to control it until about 100,000 B.C. Evidence regarding Neanderthals (70,000-35,000 B.C.) is not clear as to human language capacities; most experts believe that the essential features of human language were in place at least by the time of Cro-Magnon (35,000 B.C.). Crystal (p. 293) gives the "window of likelihood" for the evolution of spoken human language as being between 50,000 and 30,000 B.C.

Nothing that we can call writing, however, evolved before about 3000 B.C. In other words, spoken human language seems to have been around for roughly 30,000 to 50,000 years before writing was invented. The domestication of plants and animals, the invention of pottery making, the development of new technologies of grinding and polishing in the manufacture of stone tools c. 8000-9000 B.C. -- all of these occurred some five or six thousand years before writing was invented. In this historical and evolutionary sense, then, spoken language is prior to written language. It is true, too, that writing systems were based on spoken languages -- initially, in an attempt to capture meaning via graphic representation. Spoken language is prior to written language as well in the life of every human being who becomes literate: the ability to produce and comprehend written language comes later than these abilities in the spoken language. Further, whereas all human beings of even quite low I.Q. become competent native speakers, not everyone is able to acquire similar competence in the derivative, written, medium. Spoken language does not have to be taught; written language, by and large, does.

Pictographic Writing

The major division among types of writing systems is the division between phonologically-based systems (where the written symbols represent sounds of the language) and non-phonologically-based systems (where the written symbols represent meaning). The earliest writing systems developed out of pictorial representations of objects, and "reading" initially meant simply recognizing the symbols. Egyptian and Mesopotamian pictograms date from about 3000 B.C.; pictograms in China (an independent development) have been dated at about 1500 B.C. Pictograms slowly became conventionalized, and developed into ideographic writing systems.

Ideographic Writing

Ideograms, as Crystal explains, have "an abstract or conventional meaning, no longer displaying a clear pictorial representation of an object". In addition, symbols in ideographic systems seem to have been used for abstractions like "dark" (from a picture of a starry sky). Ideographic elements are found in the writing systems that developed in the Near East (among the Sumerians, Babylonians, Assyrians, Hittites, and Egyptians) during the Bronze Age -- roughly, between about 2500 B.C. and the first century B.C. The technique widely used for actually performing the writing was the cuneiform method, whereby a wedge-shaped stylus was used to press the imprints into soft clay tablets. Most of the writing systems from this era combine ideographic elements with other principles, including some symbols (phonograms) that represented sounds of the language. This was true of the best known such system, the Egyptian hieroglyphic (or "sacred writing") system. Hieroglyphs included ideograms, phonograms, and determinative symbols that were paired with meaning symbols and used to identify the meaning category of the symbol they attached to. Egyptian hieroglyphics were first deciphered during the 19th century thanks to the discovery of the Rosetta Stone, carved with three writing systems including ancient Greek.

Logographic Writing

Logographic writing is a further development towards abstraction, in which the graphemes represent words. Logograms or characters (best known from the Chinese or Japanese writing systems) refer to linguistic units, often morphemes that are parts of words rather than whole words. Basic literacy in Chinese today is considered to be mastery of approximately 2000 characters.

Phonological Writing: Syllabaries and Alphabets

Syllabaries are phonologically-based writing systems that represent syllables, usually CV, rather than individual vowels or consonants like alphabets, or meaning units like ideographic or logographic systems. Graphemic inventories of syllabaries, which seem to have been independently developed in several widely scattered areas of the world, typically include from about 50 to several hundred units. "Linear B", the written language of texts recently deciphered and identified as a very ancient Greek (from about 1300 B.C.) was largely organized as a syllabary, as was the Cypriot Greek writing system dating from the 3rd century B.C. Meanwhile, the ancient Greeks in the center of Greek civilization had borrowed another, simpler writing system from its Semitic inventors via their trading partners the Phoenicians. This alphabetic system of writing was used to immortalize the epic Greek poems the Iliad and the Odyssey in the Classical Greek period (see below).
Other examples of syllabaries include the Japanese kana script and the Native American Cherokee script. Alphabets, the most recently invented writing systems, most closely mirror spoken language in containing a relatively small set of non-meaning-bearing symbols that represent (roughly) the phonemes of the language. They usually number from 30 to 50, paralleling the typical number of phonemes in languages.
As opposed to other types of writing systems including syllabaries, the alphabetic principle seems to have been invented only once, by North Semitic peoples living in Palestine and Syria in c. 1700 B.C. This alphabet represented only consonants (22 consonants were initially represented). The principle was diffused rapidly, but as symbols were passed along, they also changed such that many of today's alphabetic writing systems are unrecognizable as "daughter" systems deriving from this one brilliant idea.
The North Semitic alphabet was used to represent Aramaic and Hebrew, and was borrowed by the Phoenicians in approx. 1000 B.C., being passed on by them to the Greeks, who added vowels, and thence to the Etruscans in about 800 B.C. The Etruscan alphabet was the source of the Roman alphabet that has since been adopted for use in many languages around the world.
The Cyrillic alphabet used in writing Russian and some other Slavic languages, was devised in the 9th century by St. Cyril, based on the Greek alphabet. The Devanagari script used originally in writing Sanskrit, the ancient sacred language of India, was derived from the ancient North Semitic script.

Devising New Writing Systems

Throughout the course of history (only itself actually possible via the invention of writing!), writing has been used for languages considered to be special, or sacred, or worthy. Writing has also tended to enshrine dialects or languages that have fallen into disuse in the oral tradition as the only proper vehicles of religious scriptures. It was only with St. Jerome's translation of the Bible into Latin in the 4th century A.D. (the version known as the Vulgate) that Greek was abandoned by European Christians as the language of liturgy. In turn, during most of the Middle Ages, in areas where people were speaking "vernaculars" derived from Latin, Latin remained the only written language. When the Finnish linguist Arto Anttila visited Penn last week, he explained that until the 19th century in Finland, Latin and Swedish, but not Finnish, were used in Finnish universities.
Being written in itself came to symbolize prestige for -- even confer prestige on -- languages. During the 20th century, writing has been diffused to many languages that were previously unwritten, often via the efforts of linguists interested in documenting these languages, or of missionaries who wanted to teach literacy in order to make the Bible accessible to more people. The Summer Institute of Linguistics (S.I.L.) has for many years been the most important institution for this effort. When I [GS] first went as a graduate student to do research on the Buang language in 1966, there was a missionary linguist from the S.I.L. (Bruce Hooley, who subsequently received a Ph.D. from Penn) working on an adjacent dialect. I soon discovered that people in the dialect I chose to research (the "Headwaters" dialect) considered it very important that their own dialect also be written, as they were convinced it was superior to the "Central" or "Mapos" dialect Hooley was writing. People were anxious to help me, and the many Buang villagers who had received basic literacy training in other languages they knew were able to transfer this knowledge immediately to writing their own language. One man, who had taught himself to read, starting with deciphering what was written on coins, presented me with a bilingual dictionary he had made to help me in my work.
What are the decisions that have to be made in devising or adapting a writing system to a new language?
1. First, should every language be written?

This is basically a decision to be made by the speakers of the language. Many linguists have been asked by representatives of languages whose speakers number only in the hundreds to help in the creation of writing systems and dictionaries. Most such people in the world today are bilingual in some other language(s) of wider currency, in which they may find more reading materials; however, such people are also in a good position to realize that the very existence of their language may be threatened due to failure to pass it on to the next generation. They may be all the more interested in documentation via writing.
2. What writing system should be used?
Although the Roman alphabet has been the most popular and successful for some time, speakers of other languages may choose other alphabets due to religious or cultural affiliations with speakers of the languages those writing systems are associated with. Thus, for example, Hindi uses the Devanagari alphabet inherited from Sanskrit, whereas the closely related Indic language Urdu uses the Arabic alphabet in general use among Muslims.
3. Should a strict phonemic principle be observed?
Once again, there may be cultural reasons for speakers' wishes to override the simplest solution to reducing sounds to writing. Many speakers of Haitian Creole rejected the use of the symbol /k/ for the [k] sound, because /k/ seemed to be too "English" a letter, and they wanted to represent their historic relationship to French by using /qu/ to represent the [k] sound, as French does. Further, it may be preferable to use only one symbol for two distinct phonemes if to do otherwise would mean inventing a new symbol or using diacritics. (The IPA represents many fine distinctions, but it is not easy to read!)
4. Should morphology be taken into account?
If different sounds are represented in the same way, decoding may be more difficult for the unskilled reader. However, there is an argument to be made that giving a unique representation to the same morphological item helps readers identify related words. Thus the English long and short a, e, i, o, and u are related across hundreds of word sets in which the alternation is always the same. Representing the vowel differently would result in the loss of the information about morphological unity that is captured in the spelling.

/a/ sane     sanity
/e/ serene   serenity
/i/ divide   division
/o/ tone     tonic
/u/ reduce   reduction
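
The trade-off can be made concrete with a small Python sketch. It compares how much of the written form is shared within each pair of related words under conventional spelling and under a sound-based respelling. The word pairs are standard examples of the long/short alternation; the respellings are informal inventions for illustration only (they are not IPA and not a proposed orthography), and "shared opening letters" is just a rough stand-in for what a reader can see at a glance.

```python
def shared_prefix(a: str, b: str) -> str:
    """Return the run of opening letters that two written forms have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]

# Conventional spellings of related words (long vowel, short vowel).
SPELLED = [
    ("sane", "sanity"),
    ("serene", "serenity"),
    ("divide", "division"),
    ("tone", "tonic"),
    ("reduce", "reduction"),
]

# Hypothetical sound-based respellings of the same pairs (illustrative only).
RESPELLED = [
    ("sayn", "sanitee"),
    ("sereen", "serenitee"),
    ("divyd", "divizhun"),
    ("tohn", "tonik"),
    ("redoos", "reduckshun"),
]

spelled_stems = [shared_prefix(a, b) for a, b in SPELLED]
respelled_stems = [shared_prefix(a, b) for a, b in RESPELLED]

for (a, b), stem, respelled in zip(SPELLED, spelled_stems, respelled_stems):
    print(f"{a}/{b}: shared as spelled = {stem!r}, shared as respelled = {respelled!r}")
```

In every pair the conventional spelling keeps the alternating vowel letter inside the shared stem, while the sound-based respelling breaks the stem off at exactly that vowel.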

Proposed spelling reforms for established languages run into at least as many obstacles as are involved in devising a new orthography for a previously unwritten language. Cultural factors, including a reverence for the past, and a preference for preserving morphological unity in the spelling at the cost of phonetic irregularity, typically intervene, along with many others.

2. Literacy in the World Today

By no means all of the languages spoken in the world today have been "reduced to writing". However, almost all of those with speakers numbering in the millions have been written. Yet even among the languages with the most speakers, literacy across countries and across languages is very uneven.

In trying to understand what proportion of the world's population is literate, we can take Crystal's top 40 languages, in terms of population (Crystal, p. 289), and discover a good deal about literacy in these languages by looking at the "Ethnologue" web pages. The Ethnologue is a catalogue of more than 6,700 languages spoken in 228 countries. The Ethnologue Name Index lists over 39,000 language names, dialect names, and alternate names. The Ethnologue Language Family Index organizes languages according to language families.

In the following list, the number of native speakers only, rounded to millions, that Crystal cites, is given in brackets after the language. The Ethnologue base is more up to date, and where its numbers differ, they are included in the information following the title line.


1. MANDARIN CHINESE [726 million native speakers] Sino-Tibetan.

China. 836,000,000 native and second language speakers in mainland China, 70% of the population (1990 census). Han Chinese population: 1,033,057,000 or 93.5%. Official language taught in all schools. Literacy rate 73% to 76.5%.
Also spoken in Taiwan, Singapore, Malaysia, Indonesia, Russia, USA, Mongolia, Viet Nam, Brunei, South Africa, Thailand, Laos, Cambodia, Hong Kong, United Kingdom, and Mauritius.

2. ENGLISH [427 million native speakers]. The biggest concentrations are in the following countries:

United States of America. 261,000,000 total population (1994 US Census Bureau). 210,000,000 first language speakers in USA (1984 estimate). Literacy rate 95% to 99%.

United Kingdom. 58,210,000 total population (1995). United Kingdom of Great Britain and Northern Ireland. 55,000,000 first language speakers in United Kingdom (1984 estimate). Literacy rate 97% to 99%.

Canada. 27,567,000 total population (1993). 14,122,770 mother tongue speakers in Canada (1976 Govt. report; probably should be corrected to c. 17 - 18 million now [GS]). Literacy rate 96% to 99%.

Australia. Total population 17,690,000 (1995). Commonwealth of Australia. Literacy rate 99%. 15,682,000 in Australia (1987), 95% of population. 170,000 people or about 1% of the population is of Aboriginal descent, of whom 47,000 have some knowledge of an Aboriginal language.

New Zealand. Total population 3,507,000 (1995). Literacy rate 99%. 3,213,000 native speakers in New Zealand (1987), 90% of the population. The number of languages listed for New Zealand is 4. Of those, 3 are living languages and 1 is a second language with no mother tongue speakers.

3. SPANISH [266 million native speakers]. Indo-European.

28,173,600 in Spain, 72.8% of the population (1986). Literacy 93% - 97%.
81,174,760 in Mexico (literacy rate 87% to 88%) and Central America;
18,154,926 in the Caribbean;
89,569,500 in South America [Peru: 25,123,000 (1995). Literacy rate 67% to 79%; Argentina: 34,264,000 (1995). Literacy rate 92% to 95%];
22,400,000 in USA (1990 census);
50,000 to 60,000 in Israel;
134,000 in Germany;
23,815 in Canada (1971 census);
500,000 in Philippines (nearly all second language);
100,000 in Africa;
Also spoken in Norway and Jamaica; 4,444 in U.S. Virgin Islands (1970);
266,000,000 in all countries first language speakers.


The total population of India was 904,800,000 in 1995. Literacy rate 36% to 52%. The number of languages listed for India is 418. Of those, 407 are living languages and 11 are extinct. After giving the information on Hindi, the 4th largest language, we'll consider the 15 other very large languages of India: 11 from the Indo-Aryan family and 4 Dravidian languages from the south of India. Many of them have their own individual writing systems. If specific literacy rates are not mentioned for the language, the default for India as a whole is the best guess.

4. HINDI [182 million native speakers - with Urdu, 223 million]. Indo-Aryan. State language of Delhi, Uttar Pradesh, Rajasthan, Punjab, Madhya Pradesh, Bihar, Haryana, Himachal Pradesh. Devanagari writing system, "and formal vocabulary is borrowed from Sanskrit, de-Persianized, de-Arabicized."
180,000,000 in India (1991); Overall Indian literacy rate: 36% - 52%.
346,000 in Bangladesh (1993); Overall literacy rate: 24% - 25%.
685,170 in Mauritius;
890,292 in South Africa;
232,760 in Yemen;
147,000 in Uganda;
(Also in USA; Singapore; Nepal; New Zealand; Germany)
182,000,000 in all countries; 418,000,000 including second language users (1995 WA).

7. BENGALI [162 million native speakers]. Indo-Aryan. Bengali script.
67,200,000 in India. Overall Indian literacy rate: 36% - 52%.
100,000,000 native speakers in Bangladesh. Overall literacy rate: 24% - 25%.
70,000 in United Arab Emirates (1986)

15. PANJABI [60] Indo-Aryan.
25,742,000 in India (1994 IMA);
30,000,000 to 45,000,000 in Pakistan (1981 census). Literacy rate 26%. "Perso-Arabic script is used, but not often written in Pakistan." [Also 43,000 in Malaysia (1993); 10,000 in Kenya (1995); 9,677 in Bangladesh (1961); 1,167 in Fiji]

16. MARATHI [58 million native speakers] Indo-Aryan. Devanagari script.
64,783,000 (1994). Maharashtra and adjacent states. 34% literacy rate.

18. TELUGU [55 million native speakers]. Dravidian. Telugu script.
66,318,000 in India (1994 IMA);
30,000 in Malaysia (1993);
2,008 in Fiji;
300 in Singapore (1970);
73,000,000 in all countries (1995 WA).
State language of Andhra Pradesh.

24. GUJARATI [36]. Indo-Aryan. Gujarati script.
43,312,000 in India (1994 IMA) State language of Gujarat. 30% literate (1974).
140,000 in United Kingdom (1979);
147,000 in Uganda (1986);
50,000 in Kenya (1995);
12,000 in Zambia (1985);
9,600 in Zimbabwe (1973);
6,203 in Fiji;
5,000 in Malawi (1993);
800 in Singapore (1985); 44,000,000 in all countries.

25. MALAYALAM [30] Dravidian. Malayalam script.
33,667,000 in India (1994 IMA);
300,000 in United Arab Emirates (1986);
37,000 in Malaysia;
10,000 in Singapore (1987); 34,014,000 in all countries.
Also in Fiji, United Kingdom, Bahrain, Qatar.

26. KANNADA [26] Dravidian. Kannada script; similar to Telugu script. 60% literate. State language of Karnataka. 44,000,000 including second language users (1995 WA).

20. TAMIL [49] Dravidian. Tamil script. 62,000,000 or more in all countries first language speakers
58,597,000 in India (1994 IMA). State language of Tamil Nadu;
3,000,000 in Sri Lanka (1993);
250,000 in South Africa;
274,218 in Malaysia (1970 census);
191,200 in Singapore (1980);
35,000 in Germany;
22,000 in Mauritius (1993);
Also spoken in the Netherlands and Fiji.

23. BHOJPURI [41] Indo-Aryan. Kaithi script. 50% to 75% literate.
23,375,000 in India (1994 IMA);
1,370,000 in Nepal (1993); 25,000,000 in all countries. Indo-European

28. MAITHILI [24] Indo-Aryan.
22,000,000 in India. Spoken by Brahmans and other high caste or educated Hindus. There is a Maithili Academy and Dictionary. 25% to 50% literate.

30. ORIYA [24] Indo-Aryan. Oriya script.
30,158,000 in India
13,299 in Bangladesh (1961 census);
31,000,000 in all countries. State language of Orissa. 25% to 50% literate.

35. AWADHI [20 million speakers]. Indo-Aryan.
20,000,000 in India (1951 census);
540,000 in Nepal (1993 Johnstone);
20,316,950 in all countries. Awadhi is the standard for literature. There is considerable epic literature. 50% to 75% literate.

38. NEPALI [18] Indo-Aryan. 25% to 75% literate.
6,000,000 in India (1984 Far Eastern Economic Review);
300,000 in Bhutan (1973 Dorji);
9,900,800 in Nepal (1993);
16,200,000 in all countries.

40. ASSAMESE [15 million native speakers] Indo-Aryan.
State language of Assam. Bengali script.
14,604,000 in India (1994 IMA); a few in Bangladesh (1991)


5. ARABIC Semitic. Arabic script. Estimated in all countries first language speakers of all Arabic varieties 202,000,000 (1991). These include:
Algeria. 29,306,000 (1995). Literacy rate 50% - 52%. (14% Berber speakers).
Bahrain. 600,000 - Literacy rate 40% to 75%.
Egypt. 60,470,000 - 55% literacy (1993)
Iran. 1,400,000 native Arabic speakers in a population of 64,525,000 (1995). Overall literacy rate 48% to 52%.
Iraq. 22,411,000 (1995). Literacy rate 60% to 70%
Jordan. 3,888,000 (1991 govt. report). Literacy rate 71% to 80%.
Kuwait 1,300,000 (1995). Literacy rate 71% to 79%.
Libya. 5,445,000 (1995); Literacy rate 22% to 60%.
Oman 2,018,000 (1993), of which 535,000 are expatriates. Literacy rate 60%.
Palestinian West Bank and Gaza 1,800,000 or more (1995).
Qatar 516,000 (1995). Literacy rate 60% to 76%.
Saudi Arabia 17,000,000 (1995). Literacy rate 38% (50% men)
Syria. 14,904,000 (1995). Literacy rate 65%, 78% males.
U.A.E. 2,176,000 (1995) Literacy rate 68% to 73%.
Yemen 15,700,000 (1995) Literacy rate 25% to 39%.


6. PORTUGUESE [165 million native speakers] Indo-European. Roman alphabet.
Brazil. 153,725,670 (1995) Literacy rate 76% (1989 WA).
Portugal 10,429,000 (1995). Literacy rate 83% to 84%.

8. RUSSIAN [158 million native speakers]. Indo-European. Cyrillic script.
Russia 153,646,000 incl. Europe and Asia regions, incl. the former republics of the U.S.S.R. (1995). Literacy rate 98% to 99%.

10. GERMAN [98,000,000 native speakers in all countries]. Indo-European. Roman alphabet.
(121,000,000 including second language speakers and Low German (1995 WA)).
75,300,000 in Germany (1990) - 99% literacy rate
7,500,000 in Austria;
150,000 in Belgium;
896,000 in Russia;
958,000 in Kazakhstan;
101,057 in Kyrghyzstan;
500,000 in Romania;
250,000 in Hungary;
200,000 in Czech Republic;
1,400,000 in Poland;
20,000 in Slovenia;
1,500,000 in Brazil;
400,000 in Argentina;
28,000 in Uruguay;
45,000 in South Africa;
35,000 in Chile;
32,000 in Ecuador;
6,093,054 in USA (1970 census);
135,000 in Australia;
40,000 in Uzbekistan;
(Also in Puerto Rico, Liechtenstein, Moldova, Namibia, United Arab Emirates)

11. FRENCH [72,000,000 speakers in all countries, mother tongue]. Indo-European. Roman alphabet.
57,188,000 (1995).
51,000,000 first language speakers in France - Literacy rate 99% (1991 WA).

6,000,000 in Canada (1988);
1,100,000 in USA (1989);
40,000 in Israel;
124,000,000 in all countries including second language speakers (1995 WA). Also in Belgium, Switzerland, Italy, Haiti, French Guiana, Monaco, Austria, Africa, Southwest Asia, French Polynesia, other former colonies.

14. ITALIAN [65 million native speakers] Indo-European. Roman alphabet.
Italy. 57,592,000 (1995). Literacy rate 97% to 98%.

21. UKRAINIAN [45 million native speakers] Indo-European. Cyrillic script.
53,770,000 (1995). Literacy rate 99%.

22. POLISH [42 million native speakers] Indo-European. Roman alphabet.
39,365,000 (1995). Republic of Poland. Literacy rate 98% to 99%.

33. DUTCH [21 million native speakers] Indo-European. Roman alphabet. Literacy rate 95% to 99%.
13,400,000 in the Netherlands (1976 WA);
90,000 in France;
101,000 in Germany;
159,165 in Canada (1971 census);
5,640,150 in Belgium (1990 WA);
412,637 in USA (1970 census);
47,955 in Australia;
1,680 in Israel (1961);
1,000 or more in Surinam;
21,000,000 in all countries (1995 WA).

31. PERSIAN (FARSI) [22 million native speakers] Indo-Iranian. Arabic script.
Literacy 48%-52% in Iran as a whole.
25,300,000 in Iran, 50.2% of the population (1993)
26,523,000 in all countries, including Khorasan, Tajikistan, Turkey, Turkmenistan, Uzbekistan, Qatar, Bahrain, Iraq, Oman (1993), USA, Austria, Canada, Germany, Greece, Saudi Arabia, United Arab Emirates, Denmark, Netherlands, United Kingdom, Israel.


9. JAPANESE [124 million native speakers] Possibly related to Korean. Hiragana, Katakana, and Kanji (Chinese character) writing systems used. Literacy rate 99% to 100%. 126,319,000 native speakers (1995).

13. KOREAN [66 million speakers] Language isolate.
42,000,000 in South Korea (1986) - Literacy rate 92%;
20,000,000 in North Korea (1986) - Literacy rate 91% - 99%.

12. JAVANESE [75 million speakers] Austronesian. Traditional Javanese script.
75,200,000 native speakers in Indonesia; 42% of the population (1989)
(Overall literacy rate in Indonesia: 78% - 85%.)
High Javanese (Jawa Halus) is the language of religion, but the number of people that can control that form is diminishing.

26. SUNDA [26 million speakers] Austronesian
13.6% of the population of Indonesia (1990 Clynes). Western third of Java Island.

37. MALAY [19] Austronesian. Roman and Arabic (Jawi) scripts.
7,181,000 or 47% of the population (1986)
12,000 in Hong Kong;
6,253 in USA (1970 census);
10,000,000 in Indonesia;
396,000 in Singapore;
21,000 in Myanmar;
4,200 in United Arab Emirates;
17,600,000 or more in all countries first language speakers.
Literacy rate 72% (1980 government report); 62% literate in Sarawak
Over 80% cognate with Indonesian.

INDONESIAN (BAHASA INDONESIA). Austronesian. Roman and Arabic scripts.
17,000,000 to 30,000,000 mother tongue speakers (over 140,000,000 second language users with varying levels of speaking and reading proficiency; 1993 Moeliono and C. Grimes);
37,000 in Saudi Arabia (1993);
8,000 in Singapore (1993);
2,520 in USA (1975 govt. report);
10,000 in Netherlands

17. VIETNAMESE [57 million native speakers] Austro-Asiatic. Roman alphabet.
65,051,000 in Viet Nam, 86.7% of the population (1993); 75,030,000 (1995). Literacy rate 78% to 88%.
76,000 in Laos (1993); 770 in Vanuatu (1993); 600,000 to 1,000,000 in Cambodia; 60,000 in Germany; 10,000 in France (1975); 6,000 in China (1990); 330 in Martinique; 5,000 in New Caledonia (1984); 8,000 in Netherlands; 99,000 in Norway; 22,000 in United Kingdom; 60,000 in Canada; 35,000 in Australia; 66,897,000 in all countries.

31. BURMESE [22 million native speakers] Sino-Tibetan. Burmese script.
21,553,000 first language speakers (1986), 58.41% of the population of Myanmar (formerly Burma). Literacy rate 66% to 78%; 78.5% over 15 years old (1991).

33. THAI [21 million native speakers] Daic. Thai script.
20,000,000 to 25,000,000 in Thailand, including 4,704,000 mother-tongue Thai speakers who are ethnic Chinese, or 80% of the Chinese (1984); 14,416 in USA (1970 census); 3,000 in United Arab Emirates (1986); 30,000 in Singapore (1993); 21,000,000 or more in all countries. Literacy rate 89%.


19. TURKISH [53 million native speakers] Altaic. Roman script now used.
61,151,000 (1995). Literacy rate 76% to 90%.
46,278,000 in Turkey, 90% of the population (1987);
845,550 in Bulgaria (1986);
19,000 in Uzbekistan, Kazakhstan, Kyrghyzstan, and Tajikistan (1979 estimate);
18,000 in Azerbaijan;
120,000 in Cyprus;
128,380 in Greece (1976 WA);
63,600 in Belgium (1984 Time);
1,552,300 in Germany
150,000 in Romania (1993);
250,000 in Macedonia and Yugoslavia (1982);
24,123 in USA (1970 census);
8,863 in Canada (1974 govt. statistics);
135,000 in France (1984 Time);
192,000 in Netherlands (1984 Time);
67,000 in Austria (1993);
20,000 in Sweden (1993); 3,102 in Georgia;
30,000 in Denmark
53,000 in Switzerland; 60,000 in United Kingdom;
Also in United Arab Emirates, El Salvador, Honduras, Finland, Iran, Iraq
59,000,000 in all countries (1995 WA).

39. UZBEK [17 million native speakers] Altaic. Arabic and Roman scripts used formerly, now Cyrillic script is used.
23,377,000 (1995). Literacy rate 99%.


28. HAUSA [24 million native speakers] Afro-Asiatic.
22,000,000 in all countries (1991); 38,000,000 first and second language speakers.
Mainly in Nigeria. Also in Niger, Cameroon, Togo, Benin, Chad, Sudan, Burkina Faso.
The trade language of northern Ghana.

35. YORUBA [20 million native speakers] Niger-Congo.
465,000 in Benin (1993 Johnstone); 20,000,000 in all countries (1991 UBS). Zou and Ouéme provinces. Also Togo, primarily Nigeria. 1% to 30% literate in Yoruba.

Obviously, in presenting these estimates of literacy for speakers of the world's largest languages, the object is not to have you memorize the figures for the various languages, nor even to memorize what the 40 languages are! You should go over this data with an eye to understanding the great diversity in access to literacy, and to education in general.

Whereas almost all the European countries report over 95% literacy, the lesser figures in most of the rest of the world represent not failure of individuals to learn to read, but lack of exposure to education.
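
One way to see the spread at a glance is to tabulate a handful of the country-level ranges quoted in the list above. The Python sketch below copies a few of those ranges and ranks the countries. Summarizing each range by its midpoint is an assumption made here for convenience (the sources give only a low and a high estimate), and the selection of countries is arbitrary.

```python
# Literacy ranges (low %, high %) copied from the entries in the list above.
# A single reported figure is entered as a range with equal ends.
LITERACY = {
    "United States": (95, 99),
    "United Kingdom": (97, 99),
    "Germany": (99, 99),
    "Japan": (99, 100),
    "Brazil": (76, 76),
    "Egypt": (55, 55),
    "India": (36, 52),
    "Yemen": (25, 39),
    "Bangladesh": (24, 25),
}

def midpoint(rng):
    """Middle of a (low, high) range; a convenience summary, not a reported figure."""
    low, high = rng
    return (low + high) / 2

# Rank countries from highest to lowest midpoint.
ranked = sorted(LITERACY, key=lambda c: midpoint(LITERACY[c]), reverse=True)
spread = midpoint(LITERACY[ranked[0]]) - midpoint(LITERACY[ranked[-1]])

for country in ranked:
    low, high = LITERACY[country]
    print(f"{country:15} {low:3d}-{high:3d}%   midpoint {midpoint((low, high)):5.1f}")
print(f"Gap between highest and lowest midpoint: {spread:.1f} percentage points")
```

Even with this small sample, the gap between the top and the bottom of the ranking is about three quarters of the whole scale.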

3. Reading as a Problem

Though there are obviously neurological problems that can interfere with many aspects of linguistic processing and speech production, it is clear that there is a fundamental distinction between human acquisition of language in the oral-aural mode, and in the written mode. The first, although it must be learned and practised, develops without specific instruction, whereas the second is almost always acquired via explicit instruction.

Crystal presents a useful discussion of the "eye" vs. the "ear" in learning to read, and in the act of reading, and it seems clear that both must be involved. Rather than devoting lecture time to going over the sensible things he has to say on the subject (which you should read!), it seems more useful to look at the way a parallel debate has played out in the U.S. among educators responsible for teaching reading. Over several decades, educators have differed in their views on whether or not "phonics" should be taught as the fundamental basis of reading. The "whole language" method, introduced to counter a mechanical approach to teaching children the basic letter-sound correspondences, has been adopted widely, but many children are still not learning to read under this approach. The consensus among psychologists and linguists who study reading now seems to be building toward the view that without what psychologist I.Y. Liberman called "phonemic awareness", children's learning how to decode the words on the printed page will be very much impeded. Reading specialists who are building an understanding of the correspondence between the spoken and the written language into the teaching of reading are showing more promising results.