Notes on quotation, demonstration, and iconicity

Background material to: Davidson, Kathryn. 2015. Quotation, demonstration, and iconicity. Linguistics and philosophy 38:477–520. DOI 10.1007/s10988-015-9180-1

The semantic representations in Kate Davidson's paper on quotation, demonstration, and iconicity look (and are) highly technical, but the basic issues that she's grappling with are not. Or in any event, they don't require a background in formal semantics to understand. The following lecture notes are intended as accompaniment to the paper, highlighting the basic issues and largely ignoring parts that are less important for the purposes of this class.

The basic insights of the paper are as follows:

  1. All human languages provide a way of reporting speech that involves demonstrating (= performing or re-enacting) the reported speech.
  2. This same strategy of demonstration is the basis of classifier constructions in ASL (and other sign languages).
  3. Since classifier constructions are otherwise unrelated to reported speech, the two uses of demonstration should be able to occur together. And so it is, argues Davidson, in the sign language phenomena called role shift and constructed action.

If Davidson were writing the paper for our class, she might start off by reviewing classifier constructions, because we've covered that topic earlier in this class, and only then might she turn to the (to us new) topic of reported speech. But the paper appeared in a journal called "Linguistics and philosophy", whose readers would be more familiar with the philosophy-of-language literature on quotations, so she leads with that topic before easing those readers into the (to them) new topic of classifiers. To minimize confusion (I hope), I'm following her order of exposition, but you should be able to read the sections on quotation and on classifiers below in either order.

Quotation and related issues

Indexical expressions

Imagine a situation where a person called Eva orders a whole bunch of stuff, including a month's supply of cat food. All of the stuff arrives at her house (at 6128 Lone Oak Drive in Bethesda, Maryland) on March 15, 2020, and she puts the cat food down in the basement out of the way. Later that day, she informs her family by saying (1).

(1)     Oh, by the way, the big Amazon order arrived at 6128 Lone Oak Drive in Bethesda, Maryland on March 15, 2020, and Eva put the cat food in the basement.

Hmmm... No. People don't talk like that. Instead, what Eva really says is something like (2) (in actual fact, she would probably omit here and today because they're obvious from the discourse context; I'm including them here because I want to discuss them later on.)

(2) a.   Oh, by the way, the big Amazon order arrived here today.
b.   I put the cat food in the basement.

Unlike (1), the report in (2) contains indexical expressions: here and today in (2a), and I in (2b). They are special in that their reference isn't constant and unchanging. Rather, it is determined relative to the sayer's identity and coordinates at the time of saying. (I'm using 'sayer' as a cover term for speakers or signers. I could also use 'sender', but that might be confused here with Amazon.) In the case at hand, this works out as in (3).

(3)     Deictic expression    Deictic anchor      Referent
        I                     Sayer               Eva
        here                  Sayer's location    6128 Lone Oak Drive ...
        today                 Day of saying       March 15, 2020
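
One way to read the table in (3) is as a small lookup procedure: each indexical names one coordinate of the speech context, and its referent is whatever value that coordinate has at the time of saying. Here is a minimal Python sketch of that idea. The names Context, ANCHORS, and resolve are hypothetical, chosen only for illustration; they are not notation from Davidson's paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Coordinates of one speech event (illustrative names, not Davidson's)."""
    sayer: str
    location: str
    day: str


# Each indexical is tied to one coordinate of the context: its deictic anchor.
ANCHORS = {"I": "sayer", "here": "location", "today": "day"}


def resolve(word: str, context: Context) -> str:
    """Return the referent of an indexical relative to a context.

    Words that are not indexicals are returned unchanged, because their
    reference does not depend on who is saying them, where, or when.
    """
    anchor = ANCHORS.get(word)
    return getattr(context, anchor) if anchor else word


# The context of Eva's original report in (2).
eva_context = Context(
    sayer="Eva",
    location="6128 Lone Oak Drive, Bethesda, Maryland",
    day="March 15, 2020",
)

referents = {word: resolve(word, eva_context) for word in ANCHORS}
print(referents)
```

The point of the sketch is only that the same word yields a different referent when a different context is passed in, which is exactly what sets indexicals apart from expressions like "Eva" or "March 15, 2020".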

Indirect quotation

So far, so good. Now what if someone was there when Eva said (2), and they want to pass the information along to someone who wasn't? One way they can report what Eva said is as in (4). For simplicity, I'll just give the report of (2b).

(4)     Eva said that she put the cat food in the basement.

(2) is the report of certain events (the arriving and storing events). So (4) is the report of the report of an event. In this recursive report, she replaces the original indexical I. Unlike with I, the reference of she is computed independently of the current sayer's (= not Eva's!) identity. (It's true the reference is computed indirectly, via its anaphoric relation with Eva, but the computation of the referent doesn't rely on a deictic anchor.) This type of report is traditionally called indirect speech or indirect quotation.

Direct quotation

There is another way of reporting (2b), which is shown in (5).

(5)     Eva said, "I put the cat food in the basement".

The type of report in (5) is called direct speech or direct quotation because it directly reflects the form of the original report in (2). This direct correspondence between the relevant parts of (2) and (5) can be thought of as iconic, as Davidson notes. (This fact is utterly obvious once attention is drawn to it, but traditionally, it has been ignored in linguistics and philosophy - not actively denied, but simply not noticed. Perhaps, paradoxically, this is because the correspondence is not just similarity, but complete identity. Conversely, examples not conforming to this so-called verbatim assumption would traditionally not have been considered direct quotation.)

Shifted indexicals

Notice that the deictic expression I in (5) is interpreted as referring to Eva, just as it was in the original report in (2). But that's not expected at all given what we just said about indexical expressions being anchored to the (current) sayer. All other things being equal, the expected interpretation for (5) would therefore be (6), which is not true!

(6)   # Eva said that the current sayer (the reporter of the event) put the cat food in the basement.

But all other things aren't equal. What happens in direct quotations is that the deictic anchor shifts from the current sayer to the sayer to whom the quotation is attributed (that is, the subject of the verb introducing the quotation).

The way I've just put it is the usual way indexical shift is described, and it is a fine description as far as it goes. But it is actually a bit unsatisfying because it doesn't tell us anything about why shifts should happen, and happen in the way that they do. Davidson helps us out by making the following observation towards the end of the paper (p. 508): "in reported speech, one becomes the other speaker ..." This made the shifting indexicals come to life for me in a way that they didn't before. It wasn't really clear to me why there should be a rule that says, "Shift the deictic anchor to the subject of the verb that introduces the quotation". It seems much more natural that there might be a communicative strategy that says, "That person that you're about to report what they said? Become that person! Show us what that person said or did!" - we know from personal experience that people learn by watching other people do things, and not so much from listening to what they say.

As Davidson points out, in direct quotations with more than one indexical expression like (7), the reference shifts in tandem for all of them (at least in the ordinary case). Exceptions apparently occur in sign languages (p. 501), but in general indexicals obey a "shift together" constraint, requiring all of them within a single sentence to share the same deictic anchor. In other words, if the current sayer is in Philadelphia on April 10, 2020, here and today refer to Eva's values (Bethesda, March 15) rather than the current sayer's (Philadelphia, April 10).

(7)     Eva said, "I put the cat food here today".
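
The "shift together" behavior in (7) can be sketched in a few lines of Python. The sketch assumes, purely for illustration, that a context is a bundle of three coordinates and that a quotation is interpreted against exactly one such bundle; the names Context, ANCHORS, and interpret are hypothetical, and "the reporter" is a placeholder for the unnamed current sayer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Coordinates of one speech event (illustrative names only)."""
    sayer: str
    location: str
    day: str


# Which coordinate of the context each indexical is anchored to.
ANCHORS = {"I": "sayer", "here": "location", "today": "day"}


def interpret(words, context):
    """Resolve every indexical in `words` against one and the same context.

    Because a single context is passed in, there is no way for "I" to be
    anchored to one sayer while "here" or "today" is anchored to another.
    That is the "shift together" constraint in miniature.
    """
    return [getattr(context, ANCHORS[w]) if w in ANCHORS else w for w in words]


# Eva's original context and the context of the later report.
eva = Context(sayer="Eva", location="Bethesda", day="March 15, 2020")
reporter = Context(sayer="the reporter", location="Philadelphia", day="April 10, 2020")

quoted = ["I", "put", "the", "cat", "food", "here", "today"]

# Direct quotation: the anchor shifts to Eva for all three indexicals.
shifted = interpret(quoted, eva)

# Without a shift, all three would pick out the reporter's coordinates.
unshifted = interpret(quoted, reporter)

print(" ".join(shifted))
print(" ".join(unshifted))
```

A language or construction that allowed mixed anchoring would need a different design, one in which each indexical is resolved against its own context, and that is the kind of exception Davidson mentions for sign languages.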

Traditionally, direct and indirect speech have been held to be semantically equivalent. In particular, both (4) and (5) are held to entail (= have as their necessary consequence) that Eva did indeed put away the cat food. But as Davidson points out, direct quotations like (5) can be - and actually are - better understood as reports of events themselves rather than reports of reports. This is because the assumed entailment parallel doesn't actually hold. Imagine that Eva's family has a parrot called Polly, and imagine that Polly utters (2b). This state of affairs can then be reported as in (8a), but not as in (8b).

(8) a.   Polly said, "I put the cat food in the basement".
b.   Polly said that she put the cat food in the basement.

Moreover, (8a) certainly does not entail, as (8b) does, that the parrot put the cat food in the basement. All (8a) entails is that there was an event of Polly uttering certain sounds. For this reason, Davidson suggests that direct quotations are demonstrations (we could also say performances or (re-)enactments). In other words, what (8a) means is something along the lines of (9). In what follows, I'll indicate the scope of quotation with quotation marks as usual, and then additionally indicate non-verbal demonstrations with square brackets.

(9) Polly said something, and I will now demonstrate that something: " [ insert demonstration here ] "

This way of thinking extends to the ordinary (non-parrot) cases as well. In other words, if (8a) involves a demonstration, there's nothing to stop us from saying that direct quotation does so in general.

In writing, the beginning and end of the demonstration are signalled by quotation marks. In spoken language, the scope of the demonstration is indicated in other ways. These include prosodic means such as shifts in voice quality, changes in rhythm, etc. or (in some literate societies and cultures like ours) means that are parasitic on writing conventions (air quotes). But all that is needed for the demonstration to be recognized as such is some more or less conventional signal. In sign languages, these signals are typically changes in eye contact and body orientation (you may remember these strategies from the story about the lumberjack and the deaf tree, where the signer takes on the persona of the lumberjack and the tree doctor).

The be like construction

I have to say: I find it highly amusing to see this very vernacular, often deprecated construction cast in a starring, or at least supporting, role in a paper with so many Greek letters in it!

I'm hoping that Davidson's discussion of the be like construction is clear. Continuing with her main idea that direct quotations are demonstrations of speech events, she uses the be like construction as a concrete illustration that speakers of English are already familiar with. A useful way of thinking about the faithfulness of quotations to the original speech event is to split this apparently atomic property into two main types: faithfulness to truth-conditional meaning and faithfulness to other properties. What I mean by "truth-conditional meaning" is that part of a sentence's meaning that, intuitively speaking, is invariant and doesn't change depending on other properties of a speech event. The other properties might need to be split further (into loudness, whininess, co-presence of grin, etc.), but for the moment let's keep it simple and focus on what vs. how (as in "I appreciate what you said, but I sure don't like how you said it").

For the spoken modality, the distinction might be called 'verbal' vs. 'non-verbal'. I don't want to use those terms here, though, because it is too easy to confuse 'verbal' with 'vocal', and that would lead to grave confusion when the discussion turns to sign languages. So instead, I'll call these two properties the content property and the manner properties.

We could then say that written verbatim quotations are performances that are basically 100% content-faithful, but they don't even try to be manner-faithful. Indeed, how could it be otherwise, given that the written modality is not set up for demonstrations? Writing provides only very limited dedicated tools for representing the manner of speech events without quotations - a standard keyboard gives you a bit of punctuation and all caps, and there are ways to use different fonts, and underlining, and the recent invention of emojis, but that's about it. In fact, the popularity of emojis is surely a clear indication that people would like to be more manner-faithful than a standard alphabet allows them to be. The to-linguists-mysterious use of quotation marks may belong here too (I mean the quotation marks you see in corner convenience stores and laundromats: Be an "angel" - put your laundry "HERE").

Of course, it is possible to represent various manners of saying by describing them, but if the manner isn't relevant, description is overkill since it is hugely laborious to specify. "How exactly would you describe the tone of voice he used? Is sneering the right word? Or do I mean snarky? And what exactly is the difference between sneering and snarky anyway?"

By contrast, ordinary face-to-face communication imposes no constraints on how far you can go in the direction of manner-faithfulness, so that part of the reported material can become increasingly prominent, to the extent of eliminating description completely. This is evident from the fact that the be like construction can be used not only for reports of speech and attitudes, but for reports of actions as well. (Conceivably, there is a trade-off between content-faithfulness and manner-faithfulness. The example in (10c) might be very faithful to the manner of the cat's behavior, but if the sayer's intent is to unambiguously convey the message "My cat was very hungry", the demonstration is probably not as faithful to that content as the description just given, since the cat might be clawing, yowling, and so on for some other reason. But of course, that would be easy to fix by adding "(because it was) hungry" to the message. That way, the sayer could get the best of both communicative strategies.)

(10) a.   My cat was all like, "Feed me!" (spoken in a regular voice)
b.   My cat was all like, "Feed me!" (spoken in a high-pitched yowl-like voice)
c.   My cat was all like, " [ demonstration of cat clawing on owner's leg, along with yowling, staring, etc.]"

In order to distinguish regular use of language from demonstrations it is necessary to mark or set off the demonstrations in some way. This setting off is different for the different modalities:

Question: (11) gives some variously complicated examples of the be like construction along with related examples. What's going on in these examples?

Hint: Recall the three semiotic/communicative strategies of describing (via symbols), demonstrating (via icons), and indicating (via points). Points are irrelevant here.

(11) a.   My cat was all like, " [ Y-O-O-W-L ] ". ← (long-drawn out yowl sound)
b.   My cat was all like, " yowl ". ← (the word "yowl" itself, said more or less yowlingly)
c.   My cat was all like, " yowling ". ← (said in a yowling way)
d.   My cat was yowling.

Another such example is (12). (12b) would likely be accompanied by the appropriate demonstration; cf. the discussion of bimodal bilingualism in Section 3.2. Again, how could we describe the difference between (12a) and (12b)?

(12) a.   Pure demonstration: My friend was like, " [ wide grin, moving shoulders as if strutting ]" ← silent demonstration
b.   Description of demonstration: My friend was all like, "grin, strut". ← words (accompanied by optional demonstration)

In concluding this section, a parallel that comes to mind in connection with Davidson's discussion of written vs. spoken modality comes from various musical traditions. Since about 1750, the Western classical tradition has increasingly insisted on verbatim performances of musical scores (the written modality for music), but most musical traditions, including the Western classical tradition before that time, were much more flexible and improvisational. Moreover, no matter what the tradition, many questions arise in the domain of music that seem analogous to the ones raised in Davidson's paper for language. There has been some work in a Festschrift for Ursula Bellugi and Edward Klima on conductors demonstrating the kind of sound they want from an orchestra by using crossmodal manual icons ("F" handshape for "thin" sounds, large "C" handshape for "full" or "round" sounds, reminiscent of the Emmorey and Herzig experiment). The conductor's baton of course also provides clear visual icons concerning tempo and related musical properties. Because these icons are silent, they can accompany performances before an audience without distracting from the music. In rehearsal, conductors and performers are free to employ acoustic icons as well, by humming, singing, or otherwise producing sounds to convey desired properties of rhythm, attack, tempo, "color", and so on. Glenn Gould was infamous for continuing to use these icons in public performances and in the studio; the sound engineers would try to edit them out, with mixed success.

Excursus on cadenzas: The insistence on verbatim performances can reach comical heights, as for instance when a conductor insists that a pianist perform the cadenza that he wants rather than the one she wants, and when she refuses, he refuses to perform with her. In the time of Bach, virtuoso performers were expected to be virtuoso improvisers, and cadenzas were a showcase of both talents within the confines of the concerto form. For musicians of that time, a fight over which cadenza to play verbatim would have been incomprehensible, because cadenzas by definition were not written down or fixed. Eventually, cadenzas did get written down, and eventually it got to the point where performers played not their own cadenzas, but some other composer's, and that's where we are today. Beethoven was the last to write his own.

Monsters and supermonsters

I'm including a discussion of monsters and supermonsters here. In the paper, these notions become relevant in Section 4.4, which I've asked you to skip for this class. So you should feel free to skip this section too and proceed straight to the discussion on Classifiers.

Consider the examples with direct quotation in (13).

(13) a. Eva-i said, "I-i put away the cat food".
b. * Eva-i said, "She-i put away the cat food".

In (13a), I refers to Eva, as indicated by the suffixed index "i". In (13b), She cannot refer to Eva, as indicated by the asterisk preceding the entire example. This is because the direct quotation context shifts the deictic anchor to Eva, and she would not refer to herself as 'she'.

Now consider the indirect quotation counterparts to (13) in (14).

(14) a. * Eva-i said that I-i put away the cat food.
b. Eva-i said that she-i put away the cat food.

Here, the facts are the converse. In (14a), the deictic anchor is the current sayer, not Eva, so I must refer to the current sayer and so can't share the referential index with Eva. For the same reason, she is free to refer to Eva in (14b). (In both (13b) and (14b), she can refer to some other female individual. We would indicate this by using a different index - she-j. The interpretation, though available, is irrelevant for present purposes.)
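
The pattern of judgments in (13) and (14) can be summarized as a tiny decision procedure. The Python sketch below is only a restatement of the English facts just described, under the assumption that the current sayer is someone other than Eva; the function name and the labels "direct" and "indirect" are hypothetical.

```python
def can_refer_to_subject(pronoun: str, quotation: str) -> bool:
    """Can `pronoun` inside the report refer to the subject of 'said' (Eva)?

    `quotation` is "direct" for the examples in (13) and "indirect" for
    the examples in (14). Assumes the current sayer is not Eva herself.
    """
    if quotation == "direct":
        # The deictic anchor has shifted to Eva, so only first person
        # picks her out; she would not call herself 'she'.
        return pronoun == "I"
    if quotation == "indirect":
        # The deictic anchor stays with the current sayer, so 'I' is the
        # reporter and only third person can pick out Eva.
        return pronoun == "she"
    raise ValueError(f"unknown quotation type: {quotation!r}")


# True corresponds to an acceptable example, False to a starred one.
judgments = {
    "(13a)": can_refer_to_subject("I", "direct"),
    "(13b)": can_refer_to_subject("she", "direct"),
    "(14a)": can_refer_to_subject("I", "indirect"),
    "(14b)": can_refer_to_subject("she", "indirect"),
}
print(judgments)
```

In these terms, a language with shifted indexicals in indirect quotation would be one in which the "indirect" branch also returned True for I.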

It seems almost inconceivable that languages should exist that allow (14a), and therefore indexicals that shift reference in indirect quotations have been termed 'monsters'. However, as Davidson notes, it has been reported that Amharic allows such monsters. In other words, the equivalent of (14a) is reported to be possible with the intended interpretation (p. 590, example 72a).

Bizarre as the Amharic case sounds (if the claim holds up), at least the verb of saying gives the recipient a warning that a quotation with potential indexical shift is coming up. That is, the monster is not completely unexpected.

Even worse would be the possibility of sentences with shifted reference that contain no contextual cue or warning. So imagine someone (not a dentist) came up to you and said I like milk chocolate best, but what they meant was The dentist likes milk chocolate best! That use of an indexical is what is called a supermonster, and it has been claimed to exist in sign languages. On the face of it, that sounds truly crazy. How would reliable communication function if supermonsters were possible? This is the impetus for Davidson to come up with an analysis of such examples that doesn't involve supermonsters. As mentioned earlier, this issue is beyond the scope of the class, but it is discussed in Davidson's paper in Section 4.4.

Classifiers and classifier constructions

I'm hoping that the discussion of classifiers and classifier constructions will have the feel of a review (we've read and discussed some of the literature that Davidson cites).

The general idea is that classifier constructions involve light verbs. This is just another term for what we called verbal roots earlier in the class. I'll use the term 'light verbs' here to be consistent with Davidson's usage. Following other researchers, Davidson assumes four such light verbs (in connection with role shift, discussed later on, she introduces a further light verb, IMITATE, but we will ignore that one for now).

These light verbs have semantic content, but they can't be expressed in isolation because they are not themselves specified for phonological features. For one thing, they need to be associated with a handshape - this is provided by whole entity or handling classifiers. In addition, other phonological parameters (notably, orientation and path movement) need to be specified as well. These last two are filled in, Davidson suggests, by the signer demonstrating contextually variable, iconically appropriate material for the light verb at issue:

These demonstrations modify the light verbs. In other words, they answer how or where questions (how/where is it located, how did the thing move, how does it extend, how did the agent handle the thing?). By contrast, quotation demonstrations answer a what question (what did the sayer say or express?). The what vs. how difference does not affect the overall point, though, that both direct quotations and classifier constructions involve demonstrations.

In connection with Section 3.1, recall from our own discussion of classifiers in class that the first three light verbs (LOCATE, MOVE, EXTEND) imply an argument (a participant in the event) that is an undergoer in Johnson and Schembri's terms. (Davidson calls the argument in question an experiencer, but it is more conventional to reserve that term for animate entities that undergo an experience, not inanimate entities like books that undergo pure movement.) The last light verb (MANIPULATE) implies an agent (a participant doing the manipulating). This light verb combines with handling classifiers, which can iconically evoke the agent and the undergoer simultaneously by demonstrating the way the agent's hand would interact with the undergoer, the object being handled.

The examples in Davidson's (50) and (51) are similar to the ones we discussed in class.

Role shift

This section is very brief, I know. I will try to beef it up in response to questions in the Canvas Threaded Discussion, but for the moment, it should be better than nothing.

In her section on role shift, Davidson introduces three types of reports:

  • Language reports
  • Attitude reports
  • Action reports

She argues that the first two types are ordinary quotations. It doesn't make sense to call the third type quotations, and so Davidson proposes to revive an analysis originally proposed by Supalla 1982, according to which action reports are classifier constructions with an underlying light verb IMITATE; this light verb is then modified in the usual way by a demonstration.

In Davidson's analysis, ASL action reports closely resemble the English be like construction. The two constructions differ in that the demonstration functions as the object of the English preposition like, whereas it functions as a modifier of the ASL light verb IMITATE.