Notes on iconicity and arbitrariness

Imagine you are a small child who wants a cookie. You have no particular cookie or even type of cookie in mind; you would settle for an Oreo or a chocolate chip cookie or even just a plain butter cookie. You are too little to get to the cabinet over the counter where you know they keep the cookies. In order to get what you want, you need to communicate your desire to someone who can reach the cabinet. The main problem is that the desire that you experience and the object of your desire are imperceptible to your fellow humans. Fortunately, your species has a way of making imperceptible mental constructs (concepts or meanings) perceivable by a code based on a simple yet ingenious principle: the imperceptible constructs in your mind are associated with entities that are perceivable by others (as well as by yourself). The perceivable entities (forms) represent the imperceptible constructs (meanings). Each form-meaning pair is a symbol.1 In ordinary terms, we call such a code a language and the symbols words (or more technically, morphemes). (Languages also have rules for combining the symbols, but these do not concern us for the moment.) Humans as a species all share this encoding ability, and children easily acquire basic vocabularies in their first years of life without special instruction.

Most of the world's languages are spoken languages; that is, their speakers use acoustic signals produced by the vocal tract to represent meanings. Different groups of humans use different vocal signals. For instance, among English speakers in America, the vocal signal for the concept the child in our example has in mind sounds roughly like /kuki/; among English speakers in Europe, roughly like /biskit/; among German speakers, roughly like /keks/. In general, the association between the vocal signal and the concept is arbitrary; in other words, nothing in the form suggests the associated meaning. For instance, the concept of silence is not expressed by a period of silence, nor are 'whisper' or 'shout' necessarily expressed in a whisper or a shout, or even quietly or loudly. There are some exceptions to the arbitrary character of morphemes. For instance, if the concepts to be encoded are a sneeze or the characteristic sound of a cat, the vocal signals /etSu/ or /miao/ mimic real-world instances of those concepts. But such examples of sound mimicry (also known as onomatopoeia) are fairly narrowly restricted, and they are not central to spoken languages.

Acquiring spoken language depends not only on the ability to produce vocal signals, but on the ability to perceive them as well. There are visual and other sensory cues associated with at least some vocal signals (notably the ones produced towards the front of the mouth), but the most reliable cues are acoustic. Therefore, if a child has trouble perceiving or otherwise interpreting the acoustic cues for vocal signals, the acquisition of spoken language is difficult, if not impossible. However, like biological instincts in many other species (for instance, the imprinting in the geese described by Konrad Lorenz), the human language instinct is abstract and therefore superficially malleable. In particular, the encoding ability (the ability to pair forms with meanings) does not require the linguistic forms to be audible signals produced by movements of articulators in the vocal tract. Instead, the forms can be visible signals produced by movements of other articulators, notably of the hands and face. Languages that use such visible signals are known as sign languages. For people with neither sight nor hearing, there are even languages that use tactile signals.

As mentioned earlier, the form-meaning association for morphemes in spoken languages is generally arbitrary, and any non-arbitrary origins are lost in the proverbial mists of time. Sign languages are more recent, and the form-meaning associations are often more accessible to us. This is because signers, faced with the communicative problem of encoding mental concepts by means of visual signals, naturally tend to choose signals whose visual form suggests the intended meaning. Such signs are called iconic (because the form-meaning relation is reminiscent of the relation between an icon and what the icon represents) or motivated (because the form-meaning relation is motivated by analogy).

It is worth examining in a bit more detail how iconic signs might emerge. We begin by considering an apparently unrelated problem - a classic problem in game theory. The point of the game is to meet someone you don't know tomorrow in New York City. You and your partner (let's call him Jim) need to coordinate a time and place to meet, but you are not able to communicate before you meet. If you meet, you both win the game; if not, you lose. What time and place do you choose? What are your considerations?

Here is Jim's solution. Jim has been to New York once in his life. This was on the occasion of his engagement to his significant other a few years back, which took place at a little restaurant in Greenwich Village at 6 p.m. He fondly recalls that dinner as one of the high points of his life, and so he chooses the restaurant and 6 p.m. as his meeting coordinates.

Your solution is probably not as foolish as Jim's. Clearly, in choosing your coordinates, you need to make choices that are salient, but not just to you. You need to also take into account your partner's state of mind, including that person's expectations about your own state of mind. Jim is completely unrealistic to expect a complete stranger to know details about his engagement. As it turns out, most people are not like Jim and are instead quite successful at finding points of convergence even in the absence of communication. Thomas Schelling, the originator of the "meet me in New York" game and a central figure in the history of game theory (Schelling 1960), called such points focal points; they are also called Schelling points in his honor.

Schelling points are not necessarily unique. Indeed, if they were, the "meet me in New York" game would be pointless. Instead, it does have and make a point - namely, that given the vast number of times and places available to meet in principle, people are able to severely restrict the set of possible options based on what they think another person might be thinking.
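
The narrowing effect can be made concrete with a little arithmetic. The sketch below is an illustration only, not part of Schelling's own analysis. It assumes that both players independently pick a meeting spot according to the same salience weights, in which case the chance of meeting is the sum of the squared choice probabilities. The function name, the spot labels, and all of the weights are made up for the example.

```python
def meeting_probability(salience):
    """Chance that two players who choose independently, each in
    proportion to the same salience weights, pick the same option."""
    total = sum(salience.values())
    return sum((weight / total) ** 2 for weight in salience.values())


# Hypothetical weights: a thousand equally unremarkable spots.
uniform = {f"spot-{i}": 1.0 for i in range(1000)}

# Hypothetical weights: two salient landmarks attract 90% of the choices,
# and the remaining 10% is spread evenly over 998 unremarkable spots.
salient = {"landmark-A": 60.0, "landmark-B": 30.0}
salient.update({f"spot-{i}": 10.0 / 998 for i in range(998)})

p_uniform = meeting_probability(uniform)
p_salient = meeting_probability(salient)

print(f"no shared salience: {p_uniform:.4f}")
print(f"shared salience:    {p_salient:.4f}")
```

Under these invented weights, two players choosing blindly among a thousand equally likely spots meet about once in a thousand tries, while players who both lean towards the same two landmarks meet a little under half the time.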

Now imagine another coordination game. Here, the meeting point is not in physical space, but in conceptual space. The point of the game is to get your partner to meet you at a concept of your choice. In other words, you think of a concept and try to get your partner to think of the same concept. How would you do this for "candle"? Or "pig"? Or "three"? Or "trash"? All sorts of variants of the game are possible. The game is trivial if you get to use a language you share with your partner. But imagine that you are playing with a partner with whom you don't share a language, or that neither you nor your partner has any language at all. Imagine further that the rules of the game don't permit you to draw. This last variant would virtually force you to use iconic gestures as Schelling points.

Now consider what would happen in a community of players who are repeatedly playing the "meet me at my concept" game with many different partners over time. The players in the community might be expected to converge on particular forms for the various concepts that they wish to convey, especially if they get bonus points for how quickly they can get their partner to home in on their concept. In other words, a community playing "meet me at my concept" will develop symbols that are conventional.
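
The convergence described above can be simulated with what the language-evolution literature calls a minimal naming game. The sketch below is an illustration under simplifying assumptions, not a model proposed in these notes: there is a single concept, forms are opaque labels such as form-0 rather than iconic gestures, a failed exchange simply adds the speaker's form to the hearer's inventory, and a successful exchange makes both players discard every competing form. The function name and its parameters are hypothetical.

```python
import random


def play_naming_game(num_players=20, max_rounds=200_000, seed=0):
    """Simulate repeated pairwise exchanges about one concept.

    Returns (rounds_played, shared_form) once every player holds exactly
    the same single form, or (None, None) if max_rounds is exhausted.
    """
    rng = random.Random(seed)
    inventories = [set() for _ in range(num_players)]
    next_form = 0

    for round_no in range(1, max_rounds + 1):
        speaker, hearer = rng.sample(range(num_players), 2)

        # A speaker with no form for the concept invents a new one.
        if not inventories[speaker]:
            inventories[speaker].add(f"form-{next_form}")
            next_form += 1

        form = rng.choice(sorted(inventories[speaker]))

        if form in inventories[hearer]:
            # Success: both players keep only the form that worked.
            inventories[speaker] = {form}
            inventories[hearer] = {form}
        else:
            # Failure: the hearer remembers the form for future rounds.
            inventories[hearer].add(form)

        # Stop when the whole community shares one and the same form.
        first = inventories[0]
        if len(first) == 1 and all(inv == first for inv in inventories):
            return round_no, next(iter(first))

    return None, None


rounds, shared_form = play_naming_game()
print(f"converged on {shared_form} after {rounds} exchanges")
```

Which form wins depends on the random seed, and that is the point of the sketch: nothing about the winning label makes it better than its competitors, yet once the community has settled on it, no individual player can profitably use anything else.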

What does it mean for a symbol to be conventional? It means that the association between the symbol's form and its meaning is fixed; individual language users cannot change the associations at their pleasure, or even for good reason. In this respect, there is no difference between an iconic symbol that has become conventional and a purely arbitrary symbol. It is clear that in an effective system of communication, arbitrary symbols must be conventional - a point humorously illustrated in the passage from Lewis Carroll's Through the looking-glass where Humpty Dumpty assumes the right to associate ordinary forms like glory with non-conventional meanings like 'nice knock-down argument', much to Alice's puzzlement. What we are saying here is that something like the converse is also true - that with respect to a language user's freedom to associate form and content, conventionality is tantamount to arbitrariness. Before an iconic symbol becomes conventional, its symbolic force derives from the iconic character of the form-meaning association. In the emergence of a large-scale system of conventional iconic symbols, the relation among the various symbols becomes increasingly important. Specifically, the particular form of two symbols becomes less important than the fact that they are in contrast (= formally distinct from one another). We might therefore expect conventional iconic symbols to become subject to the same kinds of linguistic pressures of production and perception as are arbitrary symbols. Recall further that children are masters at acquiring large numbers of arbitrary symbols. The fact that humans are extremely comfortable with arbitrary symbols is underscored by the likely prevalence of multilingualism throughout most of human history. For instance, Jared Diamond describes the Papua New Guinean companions on his expeditions as speaking an average of five languages (Diamond 2012). So once a language acquires native speakers, the role of iconicity is weakened yet further.

Iconicity can weaken in two basic ways. First, a once-iconic feature (such as a sign's handshape or location) may undergo change, in some cases to the point of complete deletion. A case in point is the ASL sign for HOME, a compound formed from EAT and SLEEP. SLEEP, originally a B handshape at the ear, has assimilated to the flat O handshape of EAT. This type of loss of iconicity is described extensively in Frishberg 1975. Second, an iconic feature can be maintained in the sign's form but become opaque as a result of cultural or technological change. For instance, ASL has a sign (specifically, a classifier sign) used to represent a semantic class of vehicles including cars, trucks, motorcycles, boats, and submarines, though not airplanes. The handshape for the sign is the 3 handshape, canonically oriented with the middle finger horizontal and the thumb vertical. Originally, the sign depicted a sailing vessel, with the thumb and index fingers for masts and the middle finger for the hull. The decline of sailing vessels and the rise of motor vehicles have rendered the sign completely arbitrary for any contemporary learner of ASL. Another example is MILK, in which a curved 5 repeatedly changes to S. Depending on a signer's familiarity with animal husbandry, the sign has iconic features (depicting the action of milking) or not.

The relatively weak role of iconicity for native signers is illustrated by anecdotal evidence from our own experience.2 One of us (Jami) is a right-dominant native signer who learned ASL before English; the other (Beatrice) is a left-dominant very late learner of ASL. Learning the sign for JOT, Beatrice was struck by its iconicity: PUT (flat O hand) is directed towards the classifier for FLAT-SURFACE (B hand). This was news to Jami, who had never realized that the sign was a compound. Her realization is exactly analogous to Beatrice's realization as an adult that always is a compound of all and ways. Another example concerns signs related to clocks and calendars. HOUR is produced with a dominant G hand moving in a circle against an upright B hand. For right-dominant signers, the dominant hand's index finger moves outward from the signer, and the tip of the index finger traces the path of the hour hand of an analog clock. What happens for left-dominant signers? If iconicity trumps outward movement, the index finger should move inward; if outward movement trumps iconicity, the sign is no longer iconic. When Beatrice (using her rudimentary ASL) asked Jami how to produce the sign as a left-dominant signer, it took a while to get across the point of the question. To a native signer like Jami, the only feature of the sign that was salient was the articulatory feature 'outward'; the fact that outward movement combined with left dominance results in a countericonic sign was completely irrelevant to Jami. The same issue arises in connection with a sign like WEEK, produced by moving a dominant G hand across the upward-facing palm of a nondominant B hand from the base of the palm to fingertips. Assuming a conventional calendar layout, the sign is iconic for right-dominant signers in that the dominant hand moves from early days of the week to later ones. For left-dominant signers, the sign is once again countericonic, since the dominant hand moves from right (late) to left (early). In EVERY-MONDAY (M moves straight down) followed by EVERY-TUESDAY (T moves straight down), EVERY-TUESDAY is signed slightly outside of EVERY-MONDAY. As in the previous examples, the sequence is iconic for right-dominant signers and countericonic for left-dominant signers.

We have just seen that the articulatory feature 'outward movement' trumps iconic considerations in the case of clock and calendar signs. It is worth noting that these signs are relatively weakly iconic. What we mean is that neither the layout of calendars nor the direction of writing more generally is necessarily left to right. Nor is it logically necessary for the hands of a clock to move in a clockwise direction. The reason they do is that a clock's hour hand mimics the shadow cast by the gnomon of a sundial, and that shadow moves in a clockwise direction because of the direction that the sun appears to move in as a result of the direction that the earth happens to spin in. Some signs, though, are strongly iconic. For instance, in the signs for UP and DOWN, the tip of the index finger points up and down, respectively. In principle, it would be possible for the sign for UP to point down, and vice versa, but this would be confusing at a very basic level. One might be tempted to conclude from this that UP and DOWN are purely iconic signs, but that conclusion would not be well-founded. In this connection, consider Table 1.

Table 1: The Stroop effect with congruent and incongruent colors
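Row 1 (congruent: each word is shown in the color it names): blue green red
Row 2 (incongruent: each word is shown in a color other than the one it names): blue green red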

Both rows contain the arbitrary symbols blue, green, and red. There is no question whether the forms of the words are iconically related to the color they represent; they aren't. However, in the first row, the color of the symbol's written form is congruent with its meaning; in the second row, that is not the case. A classic experiment (Jaensch 1929, Stroop 1935) shows that people are faster and more accurate at reading color words and naming their colors in the congruent condition, an effect known as the Stroop effect. The original reports of the Stroop effect involved color, as in Table 1, but the effect has been replicated for other properties as well, including spatial orientation. Given the strength of the Stroop effect, we would expect sign languages to avoid non-congruent signs such as a downward-pointing sign for 'up'. In the unlikely event that such signs were created by some signer, they would die a-borning - they would be rejected by the community as terrible candidates for Schelling points. We will refer to this consequence of the Stroop effect as the constraint against countericonicity.

At one time, it was thought that arbitrariness was a necessary property of human language (Hockett 1960). If this were true, the iconic aspects of signs like UP and DOWN would compromise the status of sign languages as true human languages. Among sign language linguists, one strategy in response to this threat has been to emphasize the role of onomatopoeia in spoken languages. We consider this a weak strategy, since onomatopoeia in spoken languages is marginal. In the discussion above, we have outlined what we consider a stronger strategy - one that relies on iconic symbols becoming structurally equivalent to arbitrary symbols in the emergence of a large-scale system of conventional symbols.


1. As used here, the term 'symbol' corresponds to Saussure's 'sign', and 'form' and 'meaning' correspond to his signifier (signifiant) and signified (signifié). In ordinary usage, 'symbol' is ambiguous between the form-meaning association and the form itself. 'Sign' is ambiguous in the same way. If we need to disambiguate, we will distinguish between Saussurean symbols and symbol forms, and analogously for signs. Thus, the ordinary question "What's the sign for 'sleep'?" only makes sense when it means "What's the sign form for 'sleep'?" The alternative paraphrase "What's the Saussurean sign for 'sleep'?" is incoherent.

2. We are perfectly willing to grant iconicity an important role in the non-native acquisition of ASL. Such a role is consistent with the anecdotal evidence we present. It is supported by the fact that ASL learning sites like Signing Savvy charge a premium for giving users memory aids for the signs on the site, which make reference to iconic properties.