R. Van Hout and F. Hinskens (eds.),
Proceedings of the International Workshop on
Language Variation and Linguistic Theory,
Nijmegen, September 1995
William Labov, University of Pennsylvania
This is a paper about a negative result, but it is not, I think, a negative paper. It investigates the possibility that a process of resyllabification will account for the sonority hierarchy in the constraint of a following segment on -t,d deletion. This has appeared to be an attractive, even irresistible idea for many researchers, one that would link empirical studies of variation to formal phonological theory. I have not been convinced that there were strong grounds for this belief, and this paper reports the results of efforts to find out what resyllabification actually takes place. At the outset, I thought that this was a case of formal theory leading empirical researchers astray; in the end, I discovered that it was the other way around. At the end of the paper, I will address the question: if not resyllabification, what then?
My own conception of theory is somewhat different from that of many of my colleagues, who equate the re-organizing activity of theory with the construction of models intended to represent language as a whole. This is certainly a valuable and useful activity, and as I will argue, an essential one. But it is not the only way in which we can go about increasing the generality and depth of our understanding of language. An alternative and complementary approach to theory construction is an inferential and inductive procedure, which builds in a cumulative manner on what we already know, generalizing from the known to the unknown. At various stages in the development of a field, the inferential approach to general principles may be more useful than the deductive approach; but this is a matter of judgment. Since there are far more languages and ways of using language than can ever be described, a fruitful investigation will begin by asking which observations and which experiments will eventually lead to a more fruitful and more rapid accumulation of knowledge. This planning will be guided by a detailed mapping of what is already known and not yet known, in order to define the frontiers of knowledge that are to be advanced. But it also needs guidance from an overall theory of what languages are like.
Over the past several decades, the study of linguistic change and variation, more or less embedded in quantitative sociolinguistics, has been guided by the inferential approach to theory. The results have shown, I believe, a cumulative character. The presupposition of this conference is that the relations between variation studies and linguistic theory have not been healthy: theoreticians have ignored the data developed by those studying variation in the use of language, and students of variation have failed to keep abreast of linguistic theory. The major topic of my presentation here, the study of (t,d) deletion, shows many grounds for this dissatisfaction. The brief sketch of the history of this miniature field will show that the original formulation of (t,d) deletion has been criticized because it was a modification of the SPE format (Chomsky and Halle 1968), based on a conception of rule that is now outmoded in linguistic theory. A more recent finding, the exponential model of Guy, was formulated in terms of autosegmental theory and the lexical phonology of the late 1980's. Because these models are no longer current, students of the 1990's are attempting to reformulate the findings of variation studies in terms of optimality theory. Those whose primary interest is the description of languages, as well as some of those concerned with uncovering general principles of the language faculty, have asked whether such repeated reformulations do represent a cumulative increase in our knowledge.
I believe that mistrust of formal linguistic theory on the basis of its instability is misguided. Formal theorists cannot be committed to the preservation of continuity with past descriptions of language, nor accept the responsibility of maintaining downward compatibility with preceding theories. Formal linguistic theory is not a cumulative activity in this sense. The chief value of formal models, I believe, is to draw the attention of empirical investigators to undetermined relationships and unanswered questions that they have overlooked. The discussion to follow will show many examples of such constructive intervention. Once such questions have been raised, and clearly formulated, the chief purpose of the model has been achieved. It may then fruitfully be dissolved and replaced by other models, which will reveal new aspects to be investigated. The cumulative character of the enterprise lies not in the models, but in the gradual development of our knowledge through further inference and investigation. Formal linguistic theory is thus an indexical rather than a substantive activity. Insofar as it draws attention to unperceived relationships, it is an essential component of any research that would contribute to general linguistics.
It is generally understood that linguistics is a search for invariance, that is, an effort to remove variation through the formulation of invariant rules (Jakobson and Halle 1972). The study of linguistic variation might then be considered marginal to this effort. However, the same conception of linguistics leads us to the notion of variation as the fundamental problem of linguistics from which every investigation departs. If, for example, every yes-no question of a language could be formulated by adding an interrogative particle to declarative sentences, no help from a linguist would be needed to describe this fact. It is only when some questions use one particle and other questions use another that a linguistic investigation is called for. The task of the linguist is then to discover the invariant rule that determines question formation, usually in the form of complementary distribution of the environments of the alternating particles.
The systematic study of variation begins when the search for such invariance fails. We often find near-complementary distribution instead, where no rule exists that will predict which particle a speaker uses at any given time. The study of variation therefore takes up where the search for invariance leaves off, or is abandoned. The two types of study are immediately differentiated in their methodology. The search for invariance normally relies upon access to linguistic norms through introspection and elicitation. Whatever the reliability of these methods may be for invariant behavior, they have little value for the study of variable behavior. The analysis of variation must proceed by observations of the production of language, and experiments to obtain data on the perception of language.
This leaves two major questions about the place of variation studies in linguistic description. (1) How much of language structure is invariant, and how much shows inherent variation? (2) How much of this variation can be described, and what is the benefit of doing so? To the first question, I have no present answer, but I will make an effort to answer the second.
My own original motivation for the study of variation springs from a concern with linguistic change in progress (Labov 1963, 1966). Since the causes and much of the mechanism of linguistic change remain mysterious, it seems reasonable that a closer view of the process of change, based on the fluctuations of behavior from one generation to another, will illuminate our understanding. Thus the study of variation articulates naturally with the general study of the history and evolution of languages, as introductory textbooks in historical linguistics plainly show.
The place of variation studies is not so obvious in the case of the synchronic description of languages. In all language communities studied so far, we find stable variation, transmitted in much the same form across many generations. How much effort should be committed to the description of such variation, and to what purpose? One answer often given is that the analysis of variation can help to confirm, complete or repair formal linguistic models. But for the moment I would rather focus on the goal of contributing to the permanent body of empirical knowledge about language that proceeds from observation and experiment. Given the enormous range of linguistic and dialectal diversity among speakers, and the vast amount of linguistic data produced in the course of any one speaker's daily life, the problem of locating the most productive lines of research is not a trivial one. The simple accumulation of more factual knowledge rarely leads to a permanent contribution: unless the new information can be indexed as part of an answer to a general question, it will probably be lost. A review of the history of the study of consonant cluster simplification in English shows that many lines of research were motivated by such general questions of language structure, and can be retrieved now by reference to those questions.
(a) The first quantitative studies of variation were concerned primarily with social constraints on the variables (Labov 1963, 1966; Shuy, Wolfram and Riley 1966). The first study of internal, linguistic constraints on a variable was the examination of consonant cluster simplification in the speech of black residents of South Harlem (Labov, Cohen and Robins 1965). All possible coda clusters were systematically tabulated as variable or invariant, and the definition of the variable as (t,d) deletion emerged for more detailed examination (Labov, Cohen, Robins and Lewis 1968). Other community studies (Wolfram 1969, Fasold 1972, Labov 1972) converged on the finding that in all English dialects, (t,d) deletion was favored by the sonority of the following environment, by the presence of two preceding consonants, by absence of stress, by homogeneity of voicing in the cluster, and by the absence of any grammatical function of the deleted segment.
Because the variable operates in the same way for all English speakers, it has served as a basic introduction to the analysis of inherent variation for several generations of students, who can observe these constraints operating in any recorded body of speech, including their own.
(b) The first formulation of variable rules was a description of (t,d) deletion in a modification of the rewrite rules developed in Chomsky and Halle 1968. This notation says that /t/ or /d/ is variably deleted after a consonant at the end of a word, and this process happens more often in unstressed syllables as in oldest, when a third consonant precedes as in next, and when there is no preceding grammatical boundary. The angled brackets following the dash show that the first segment of the following word favors deletion if it carries the feature [+consonantal], and secondarily the feature [-vocalic], which generates the order of declining simplification by the manner of the following segment: obstruent > liquid > glide > vowel.
(1)
This notation carried a great deal of information, but it produced considerable disagreement on whether or how quantitative information might be incorporated into a grammar. It was argued that generative rules were intended to represent possible types, and so they could not convey information on token frequencies (Kay and McDaniel 1979). It was also said that quantitative information could not be included in representations of linguistic competence since that would imply that people store numbers in their brains (Bickerton 1971).
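The quantitative machinery behind such a variable rule can be illustrated with a small sketch of a VARBRUL-style logistic model, in which each constraint contributes an independent effect in log-odds. The input probability and factor weights below are invented for illustration only; they are not estimates from any published study.

```python
import math

# Hedged sketch of a VARBRUL-style logistic model for (t,d) deletion.
# Baseline (input) probability and factor effects are hypothetical.
INPUT = 0.4  # assumed baseline probability of deletion

# Illustrative favoring (>0) and disfavoring (<0) log-odds effects.
FACTORS = {
    "following_obstruent": 1.2,
    "following_vowel": -1.0,
    "unstressed_syllable": 0.6,
    "monomorphemic": 0.8,
    "past_tense": -0.8,
}

def deletion_probability(active_factors):
    """Combine the input probability with active factor effects in log-odds."""
    logit = math.log(INPUT / (1 - INPUT))
    logit += sum(FACTORS[f] for f in active_factors)
    return 1 / (1 + math.exp(-logit))

# A monomorphemic cluster before an obstruent: strongly favoring context.
p_high = deletion_probability(["following_obstruent", "monomorphemic"])
# A past-tense cluster before a vowel: strongly disfavoring context.
p_low = deletion_probability(["following_vowel", "past_tense"])
assert p_high > p_low
```

The point of the log-odds formulation is that each factor's contribution is independent of the others, which is exactly the property that the variable-rule program tests against observed frequencies.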
(c) The first efforts to constrain the possible form of constraints were put forward by Wolfram and Fasold, who developed a model of geometric ordering, in which the strongest favoring constraint would outweigh a disfavoring setting of all others.
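The geometric-ordering model can be sketched with weights doubling by rank, so that the strongest constraint outweighs any combination of weaker ones. The constraint names and their ranking below are illustrative assumptions, not Wolfram and Fasold's published hierarchy.

```python
# Sketch of geometric constraint ordering: weights 4, 2, 1 by rank mean
# the strongest favoring constraint outweighs all weaker disfavoring ones.
ranked = ["following_obstruent", "preceding_cluster", "no_boundary"]  # strongest first
weights = {c: 2 ** (len(ranked) - 1 - i) for i, c in enumerate(ranked)}

# Strongest constraint favors deletion; every weaker one disfavors it.
score = weights["following_obstruent"] - (weights["preceding_cluster"] + weights["no_boundary"])
assert score > 0  # deletion is still favored overall
```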
(d) The first of several efforts to reduce the quantitative information to more qualitative form was the proposal by DeCamp (1971) and Bickerton (1972) to use Guttman scaling of linguistic data, in which implicational scales predicted the nature and location of variation for environments and speakers. After considerable discussion, it appeared that binary implicational scales must be replaced by n-ary scales with more quantitative information. Finally, it was shown by Sankoff and Rousseau (1974) that the number of scaling errors in an implicational scale was within the range predicted by variable rules, so that no additional information was provided by implicational scales.
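The notion of a scaling error can be made concrete with a small invented table of speakers (rows) by environments (columns, ordered from most to least favoring deletion): in a perfect Guttman scale, once a row shows non-application, no later cell in that row should show application.

```python
# Hypothetical data: 1 = deletion observed, 0 = not observed.
# Columns are ordered from most to least favoring environment.
table = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],  # the 1 after a 0 is a scaling "error"
    [1, 0, 0, 0],
]

def scaling_errors(rows):
    """Count cells violating the implicational pattern: once a row
    shows a 0, any later 1 in that row is an error."""
    errors = 0
    for row in rows:
        seen_zero = False
        for cell in row:
            if cell == 0:
                seen_zero = True
            elif seen_zero:
                errors += 1
    return errors

assert scaling_errors(table) == 1
```

Sankoff and Rousseau's point was that the observed count of such errors in real data sets is no lower than what a probabilistic variable-rule model already predicts.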
(e) A functional interpretation of the grammatical factor group was put forward by Kiparsky 1971,[1] and restated as a general "tendency for semantically relevant information to be retained in surface structure" (Kiparsky 1982:87). Guy 1991a pointed out that this would imply that participial /t/ or /d/ would be deleted more often than preterit /t/ or /d/, but this is not the case.
(f) Guy's studies of (t,d) deletion in Philadelphia (1980) established a high degree of uniformity across individual speakers, which matched the community pattern whenever the quantity of data was sufficient (approaching a total of 300 tokens).
(g) To obviate the incorporation of quantitative data into grammars of linguistic competence, it was proposed that the uniform constraints on (t,d) deletion like the sonority hierarchy, be explained by a set of universal, extra-linguistic factors (Kiparsky 1984). In the Philadelphia study, Guy had found that the effect of final pause did not show the uniformity of the other constraints, but varied across dialects. It was proposed that this difference might be the reflection of other phonetic factors, like differences in the pattern of final release.
(h) A series of studies focusing on the ambiguous or derivational class of loss, kept, told, etc. gradually showed that this factor was not uniform, but varied across individuals by age grading: the youngest speakers treated this /t,d/ as absent; older speakers as monomorphemic, and still older speakers as equivalent to preterit /t,d/ (Labov et al. 1968, Guy 1980, Guy and Boyd 1990, Labov 1989, Roberts 1995). Roberts found that children as young as 3 years old had acquired the main constraints of the Philadelphia system, and differed from their parents only in the derivational class. This indicated that the probabilities acquired by children were attached to abstract syntactic nodes, rather than to surface forms.
(i) Guy 1991b established narrower constraints on the relations within the grammatical factor group. The exponential model relates the retention of past-tense, derivational and monomorphemic clusters in the ratios x : x² : x³ (cf. also Santa Ana 1991, Bayley 19??). Guy explained this relation as a consequence of the model of lexical phonology, under which the three types of clusters are processed by the rule once, twice or three times.
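The exponential relation follows directly from the assumption of repeated passes through the same variable rule. A minimal sketch, with an assumed per-pass retention probability (the value 0.8 is illustrative, not an observed rate):

```python
# Sketch of Guy's exponential model: if the deletion rule applies once to
# past-tense clusters, twice to the derivational class, and three times to
# monomorphemic clusters, and each pass retains the cluster with
# probability x, overall retention rates stand in the ratio x : x^2 : x^3.
x = 0.8  # assumed per-pass retention probability (illustrative)

retention = {
    "past_tense": x,         # one pass through the rule
    "derivational": x ** 2,  # two passes
    "monomorphemic": x ** 3, # three passes
}

assert abs(retention["derivational"] - retention["past_tense"] ** 2) < 1e-9
assert abs(retention["monomorphemic"] - retention["past_tense"] ** 3) < 1e-9
```

The empirical test is then simply whether the observed retention rate of the derivational class equals the square of the past-tense rate, and the monomorphemic rate its cube.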
(j) An alternative account of (t,d) deletion was put forward by proponents of Optimality Theory, in which the output is dictated by the minimal violations of a set of universal constraints in a dialect-specific ordering. To produce a variable output, the theory might be modified to allow variable ordering (Reynolds 1994). Kiparsky (1994) proposed that the exponential relationship could be accounted for by an exploded optimality constraint that would have the effect of processing clusters once, twice or three times.
Reviewing these ten stages of (t,d) studies, we see that two of them (a,f) establish the extent and regularity of the basic phenomenon; three are concerned with the general questions of transmission and acquisition (e,g,h); and five are involved with accounting for (t,d) deletion by a particular formal model. The presentation to follow pursues the general direction of these five: to establish a firm working relationship between the data on variation and abstract models of linguistic structure.
A common feature of all recent efforts to relate (t,d) deletion to formal models is the mechanism of resyllabification. In these models, it is argued that all of the phonological constraints on consonant clusters can be accounted for by a discrete condition on syllable structure: the retention of a final consonant is favored when it can form part of a following onset.
Guy's exposition of the exponential model incorporates an autosegmental representation of (t,d) deletion shown in (2).
(2) Autosegmental representation of (t,d) deletion (Guy 1991:19)
In this account, the second C is delinked from the higher phonological structure without being removed from the melodic stream. The segment is therefore still available to be relinked by syllabification to a following syllable attached later in the derivation. If it is not, it is erased -- that is, receives no phonetic realization. Deletion of a consonant is equivalent to its failure to be associated with the segmental skeleton. Efforts to deal with -t,d deletion within optimality theory also adopt this mechanism of resyllabification (Reynolds 1994). (3) shows the constraints proposed by Kiparsky in 1993, in an effort to capture Guy's exponential model within an optimality framework; he resolves the coda constraint into two, *CODA and *COMPLEX, and explicitly mentions resyllabification in the definition of ALIGN-LEFT-WORD.
(3) (a) Syllable well-formedness constraints:
ONSET, *CODA, *COMPLEX
(b) Alignment constraints:
ALIGN-LEFT-WORD: No resyllabification is allowed across word boundaries.
ALIGN-RIGHT-PHRASE: Phrase-final consonants are not deleted.
Resyllabification might account for more than one of the variable rule factor groups: not only the influence of the following consonant, but also perhaps the preceding consonant, the pre-preceding consonant, the homogeneity of the cluster, and the effect of stress. This would leave only the grammatical factor group unaccounted for.
In the light of the findings to follow, it would have been tempting to call this paper "The myth of resyllabification," since it will appear that most of the evidence is negative. Yet it would be a serious error to argue that resyllabification does not take place: it is not a myth, but a reality, as we will see. The target of our analysis is the proposition that resyllabification can account for the effect of the following environment on consonant cluster simplification.
Let us begin with the main arguments for resyllabification. They rest first on the existence of the well known sonority hierarchy, dating back at least to Saussure 1949, whose 6 degrees of aperture took into account acoustic sonority as well as articulatory opening (Appendix, Chapter II). The sonority hierarchy appears to account for the ordering of the following environment in favoring deletion: stops, fricatives, liquids, glides, vowels. Resyllabification can account for the retention of clusters before vowels by the fact that the final /t/ or /d/ can readily act as the onset for a following vowel, since a single consonant onset is the most favored of syllable types. Stop plus glide and stop plus liquid follow in a slightly more marked status, but stop plus obstruent is not a possible onset of English.
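The phonotactic side of this argument can be sketched as a minimal-sonority-distance check over a rough sonority scale. The scale values and the distance threshold below are common textbook conventions adopted for illustration, not claims from the (t,d) literature.

```python
# Rough sonority scale (higher = more sonorous); values are a conventional
# textbook approximation, not taken from the paper.
SONORITY = {"stop": 1, "fricative": 2, "nasal": 3, "liquid": 4,
            "glide": 5, "vowel": 6}

MIN_DISTANCE = 2  # assumed minimal sonority distance for English onsets

def possible_onset(c1, c2):
    """Crude check: a two-consonant onset needs a sufficient sonority
    rise from C1 to C2. A stand-in for English onset phonotactics."""
    return SONORITY[c2] - SONORITY[c1] >= MIN_DISTANCE

assert possible_onset("stop", "liquid")         # e.g. /tr/ as in 'trait'
assert possible_onset("stop", "glide")          # e.g. /tw/
assert not possible_onset("stop", "stop")       # *tk- is not an onset
assert not possible_onset("stop", "fricative")  # obstruent clusters excluded
```

On this view, a final /t,d/ can resyllabify before a vowel, glide or liquid, but not before an obstruent, mirroring the ordering of the following-segment constraint.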
One of the most striking features of the resyllabification hypothesis is that it makes predictions that are contrary to assumptions of the original (t,d) analyses. In these first studies, and all those that followed, it was assumed that deletion constraints should be formulated in terms of the natural classes of glides and liquids. But the specifics of English resyllabification demand that /l/ be treated differently from /r/, and predict that the deletion rate for /l/ will be much higher. As we will see, this is strongly confirmed by the data.
The history of languages gives many examples of resyllabification. Perhaps the best known in the history of English are those shown in (4), which were the results of cliticization of the indefinite article and re-cutting. As seen in (4a) and (4b), the process operated in both directions.
(4) (a) M.E. a napron -> Mod. E. an apron, a naddre -> an addre,
(b) M.E. an ewt -> Mod. E. a newt
Besides the historical evidence, it will be helpful to know how frequently resyllabification occurs in everyday speech. One source of evidence is the extent to which resyllabification produces misunderstandings. As part of a larger study of Cross-Dialectal Comprehension at the Penn Linguistics Laboratory, a large body of examples of naturally occurring misunderstandings was collected. The collection is still ongoing: the total was 763 misunderstandings at the time of this analysis. For our present purpose, we extract from these data what was intended by speaker A, and what was heard by speaker B, as in (5):
(5) A: A knife too.
B: => An ice cube.
This is a striking parallel to the historical process of re-cutting shown in (4), and as we will see, typical of a much larger set of examples. The total number of misunderstandings that showed resyllabification in the data set is almost 5% of the total: 34 in all. But the fact that syllables are cut differently by the listener from what the speaker intended does not necessarily bear on the hypothesis that resyllabification might account for the effect of a following segment on (t,d) deletion. That hypothesis must assume that the cause of the listener's mishearing is the phonetic production of the speaker, who in order to facilitate the articulation of the two consonants, transfers the final element of the coda to the onset of the following word. But it might well have been the listener who made such a transfer in order to facilitate interpretation (and solve problems produced by other mishearings)[2]. In order to test this hypothesis, we must look for evidence that a transfer by the speaker has in fact taken place.
The evidence we are looking for would bear on the conversion of a member of the coda to the first member of the onset; e.g., a final allophone would be converted to an initial allophone. English is relatively rich in the differentiation of initial and final allophones.
The gross contrasts between initial and final allophones are shown in (6-14). (6) shows the three-way contrasts in the sequencing of /t/ and /r/ that were once used to illustrate the different types of "juncture"; today we would simply characterize them by differences in phrase structure. The first transition in nitrate shows a stressed /ay/ nucleus with centralization before the voiceless consonant, which is assigned to the onset of the second syllable with secondary stress; it shows a weak aspiration but enough to produce devoicing and fricativization of the following /r/. The case of night rate shows the same nucleus before an unreleased final /t/ in the coda of night, followed by a fully voiced /r/ as the onset of rate. The third, Nye trait ('trait of Senator Nye'), shows a vowel in syllable-final position with a lengthened low nucleus and a more developed glide, and beginning the second syllable, a fully aspirated initial /t/ with consequent devoicing and fricativization of the /r/. (7) shows a parallel set of allophonic alternations with a final consonant cluster /st/ in castrate, cast rate and Cass trait 'characteristic of a man named Cass.'
Now let us examine the allophonic changes that would signal the resyllabification in English that would explain the sonority hierarchy. Examples (8-10) show the situation when a consonant cluster is followed by a vowel in the next word. In (8), the final /t/ is re-assigned to initial position, and takes on the full aspiration of an initial /t/ in a stressed syllable, so that last hour rhymes with glass tower. In (9) the weaker degree of aspiration of initial /t/ in an unstressed syllable is shown. When the cluster terminates in a /d/, as in (10), the contrast is much weaker. Word-final /d/ is a devoiced lenis stop, while initial /d/ in a stressed syllable is longer and fully voiced, so that sailed over will rhyme with hail Dover.
Case (11) shows that a final /t/ would be absorbed into a following affricate, so that missed you and Miss Chew would be homonyms. In (12), the /t/ combines with and aspirates an initial /w/ to override the difference between first win and worse twin. In (13), a /t/ is combined with an initial /h/ to give a good replica of an onset aspirated /t/, so that matched high equals match tie, and in (14), past her and pastor become indistinguishable.
It should be immediately obvious that these transfers would override the junctural differences that maintain the distinctions of (6) and (7).
At this point, however, we are pulled up short. First, the process of resyllabification is generally considered to apply to a single consonant between two vowels. Kenstowicz 1994 is quite specific on this point. So too was Kahn in 1980, when he developed the concept that we are not dealing with a total transfer, but rather a linking between syllables creating an ambisyllabic consonant, as in (15).
(15) Rule IV from Kahn 1980:45.
(16)
Thus Kahn would object on several grounds to using resyllabification to explain consonant cluster simplification: first, that relinking does not apply to complex codas or onsets, and second, that the phonetic stigmata of syllable origins persist. This means that efforts to apply resyllabification to final clusters, far from linking variation studies with phonological theory, are efforts to extend it in a direction never predicted before.
I should note however, an interesting passage in Rubach's recent article on shortening and ambisyllabicity in English, in Phonology 13,2, which hints at the extension of this rule to consider clusters.
(17) shows Rubach's rule of Onset Ambisyllabicity.
(17) Onset Ambisyllabicity (Rubach 1996:222)
If onset ambisyllabicity can apply within words to clusters, why not across words, that is, before words beginning with /y/? This possibility, which as we will see is in fact realized productively, is frequently discussed in the fast speech rules that concern fixed collocations, like [wUdz&'] and [wødz&'] for What did you, and so on.
It may be that Kahn and Kenstowicz are wrong. Perhaps the second element of a cluster is regularly relinked to a following onset. What Kiparsky, Guy and Reynolds are proposing implies that the fast speech rules of connected speech may override the phonetic reflexes of word boundaries. Given the phonetic alternations of (6-14), the idea of resyllabification takes on a firm and attractive meaning, since in a great many cases it would be demonstrated by a considerable physical difference in the realization of segments. With attention to these facts, we should be able to find out whether in fact speakers do reassign final consonants to following onsets. Before proceeding to examine the evidence of recorded speech on this point, it will be helpful to look more closely at the natural misunderstandings recorded in the CDC data base.
The strength of the natural misunderstanding data base lies in the fact that we have recorded carefully the social setting under which the misunderstanding took place, and how it came to light. However, we do not have a phonetic record of what was said, and must reconstruct this from the misunderstanding itself. If resyllabification takes place regularly as a means of realizing consonant clusters, there should be a high percentage of final clusters in the data set, and the misunderstandings should show the consequences of the reassignments noted in (6-14). In other words, the recorded "misunderstanding" may be a faithful phonetic perception of what was said. Most importantly, we may be able to deduce from these reported "errors" in perception the changes that actually took place in production.
As noted above, 34 of the misunderstandings in the data base, almost 5% of the total, showed resyllabification. The fact that syllables are cut differently by the listener from what the speaker intended does not necessarily show that resyllabification was at work in production; the transfer may equally well have been made by the listener in the course of interpretation.
Appendix A lists the 34 misunderstandings that involved resyllabification. Section 1 of Appendix A presents 13 cases that involve the transfer of individual consonants from coda to onset or onset to coda. These are the exact cases that resyllabification as developed in Kahn 1980 was intended to deal with. Since they may be considered ambisyllabic, they are not expected to be opposed phonetically by the features of coda and onset allophones, and so they are not immediately relevant to our problem.
Section 2 of Appendix A lists the misunderstandings involving gemination and degemination: 14 in all, which again are not relevant to the issue of resyllabification. The listener is free to interpret the stream of speech as involving one or two consonants at any time, since in spontaneous speech, degemination is automatic and obligatory.
Section 3 of Appendix A shows 4 cases that involve consonant clusters with final /s/, which are again not relevant to the issue of resyllabification in -t,d deletion. It is only the final 3 cases of Section 4, headed Depalatalization, that are even indirectly relevant. They are shown here as (19-21).
(19) A: [speaking of Nixon] Didn't he go to China recently?
B: => retire recently.
(20) A: Hey Joe! (on the phone)
B: => Pedro
(21) A: Some problems just were jumping out at you.
B: => jumping attitude
These three cases show the listener inferring lower degrees of palatalization than was intended by the speaker: /č/ is misheard as a simple aspirated /t/, /j/ as /dr/, and /t+y/ as some form of fronted consonant plus vowel combination. This is a typical inverse misunderstanding. It does not mean, of course, that the listener heard speakers use such depalatalized forms, but rather that the listener drew on the knowledge that the aspirated /t/ of retire and /d/ of Pedro may be produced as palatalized affricates, and that /t/ before the highly fronted /u/ of attitude may be as much palatalized as if it had been followed by a /y/. These will relate to the process of resyllabification that we find in the study of speech production. We can infer that the listener might also misunderstand last year as las' cheer.
One can make several generalizations from this examination of the natural misunderstandings. First, the vast majority of reassignments of syllable position by the interpreter involve no inferences about changes in the allophonic form of consonants by the speaker that would correspond to resyllabification. On the contrary, the consonants involved are primarily those that show no such allophonic differences in form. This suggests but does not prove that such allophonic differences are significant aids to interpretation, and inhibit such re-assignments by the listener.
Table 1 sums up the data from the 34 natural misunderstandings involving possible changes of syllable structure. We see that the majority -- 19 -- involve /n/ and /s/. One involves final /r/, and 12 final stops. Of these, 8 concern the apical stops /t/ and /d/, but only 3 refer even indirectly to the possibility of a resyllabification of the second element of consonant clusters. These three all concern palatalization, which as we will see in studies of spontaneous speech, is the only phonetic domain where resyllabification is an active phonetic process.