Phonological and Phonetic Biases in Speech Perception

This project was supported by an SBE Doctoral Dissertation Research Improvement Grant from the National Science Foundation (Award No. 0951846): 2010-2012.

You can download the entire dissertation here.


This dissertation investigates how knowledge of phonological generalizations influences speech perception, with a particular focus on evidence that phonological processing is autonomous from (rather than interactive with) auditory processing. An Optimality Theory (OT) model of comprehension is proposed in which auditory cue constraints and markedness constraints interact to determine a surface representation, which is taken to be isomorphic to the listener’s perceptual response under some psychophysical conditions. Constraint ranking is argued to be stochastic in this model. The argument is that the probability of computing the least marked surface representation (and perceptual response) is greater when the input auditory representation is ambiguous between two alternative categories than when it strongly favors a category that completes a more marked surface representation (and perceptual response).
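The stochastic-ranking argument can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the dissertation's model: it assumes Gaussian evaluation noise added to each constraint's ranking value (one common way to make OT ranking stochastic), and the constraint names, ranking values, and noise level are all hypothetical. A cue constraint against each percept stands in for how strongly the auditory input disfavors that category.

```python
import random


def pick_winner(candidates, ranking_values, noise_sd, rng):
    """Return the optimal candidate for one noisy evaluation."""
    # Perturb each ranking value independently for this evaluation (assumed noise model).
    noisy = {name: value + rng.gauss(0.0, noise_sd)
             for name, value in ranking_values.items()}
    # The constraint with the highest noisy value is ranked highest.
    order = sorted(noisy, key=noisy.get, reverse=True)
    # Strict domination: compare violation profiles in ranking order.
    return min(candidates,
               key=lambda cand: tuple(candidates[cand].get(c, 0) for c in order))


def p_unmarked(cue_vs_unmarked, cue_vs_marked, markedness=100.0,
               noise_sd=2.0, trials=5000, seed=1):
    """Estimate how often the less marked surface form wins.

    All ranking values are hypothetical. A high cue_vs_unmarked means the
    auditory input strongly disfavors the unmarked category.
    """
    candidates = {
        "unmarked": {"CUE_AGAINST_UNMARKED": 1},
        "marked": {"CUE_AGAINST_MARKED": 1, "MARKEDNESS": 1},
    }
    ranking_values = {
        "CUE_AGAINST_UNMARKED": cue_vs_unmarked,
        "CUE_AGAINST_MARKED": cue_vs_marked,
        "MARKEDNESS": markedness,
    }
    rng = random.Random(seed)
    wins = sum(
        pick_winner(candidates, ranking_values, noise_sd, rng) == "unmarked"
        for _ in range(trials)
    )
    return wins / trials


if __name__ == "__main__":
    # Ambiguous input: both cue constraints sit at the same (hypothetical) value.
    print("ambiguous input:", p_unmarked(98.0, 98.0))
    # Clear input favoring the category that yields the more marked form.
    print("clear input:    ", p_unmarked(104.0, 92.0))
```

With these illustrative values, the unmarked form wins more often for the ambiguous input than for the clear one, which is the qualitative pattern described above. With the noise set to zero the ranking is fixed and each input always yields the same percept.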

Behavioral tasks provide a measure of the perception of three different kinds of phonological generalizations: assimilation processes, phonotactic restrictions, and allophonic variation. The diversity of effects observed with the three kinds of generalizations implicates different types of markedness constraints.

Assimilation processes in English and French

Studies of the perception of assimilated speech show that listeners effectively perceive the category that exemplifies the input to assimilation over the one that represents the output. For instance, listeners report hearing “t” more often in viable assimilations like freigh[p] bearer than in non-viable assimilations such as freigh[p] carrier. As a result, the input and output categories of assimilation are more likely to be confused when the context is viable for assimilation than when it is not. If listeners use phonological categories to tell sounds apart, [t] should be confused with [p] more often before [b] than before [k], since the latter is not an appropriate context for a /t/ to [p] assimilation. The same expectation arises with French voicing assimilation (e.g. ro[p]e sale ‘dirty dress’, la[g] gelé ‘frozen lake’; cf. *ca[b]e neuve ‘new cape’): segments that are the input and output of assimilation, such as [b] and [p], should be more confusable in assimilation-viable contexts, such as before [s], than in non-viable contexts, such as before [n].

This prediction was tested in AX and 4IAX discrimination tasks that presented listeners with viability contrasts. There is reason to believe that AX taps categorical representations of speech while 4IAX taps pre-categorical representations. This psychophysical contrast provides a test of whether phonological processing is autonomous from or interactive with auditory processing. English and French listeners performed discrimination of both place and voicing assimilation.

Allophonic variation: German dorsal fricatives

In the case of allophonic variation, I tested for effects of knowledge in a somewhat different way. True allophony means exhaustive complementary distribution; therefore, there is no analogue to the viable vs. non-viable context comparison discussed for assimilation processes.

The empirical case of interest is the distribution of the voiceless dorsal fricatives in German. The palatal fricative [ç] occurs after front vowels ([ɪç] ‘I’) and consonants ([mʏnçən] ‘Munich’), while the velar fricative [x] (or uvular fricative [χ] for some dialects/speakers) occurs after back vowels ([bux] ‘book’). I tested whether AXB categorization of a phonetic continuum from one allophone to another depends upon the disambiguating context. German and English listeners had to decide whether or not /ç/ had occurred (in the X position in the trial) by comparing it to both A and B, which were always opposite endpoints of the continuum.

Test yourself on sample stimuli! Count how many times you hear [ç] and [x] in these two continua.

Back vowel context [WAV]

Front vowel context [WAV]

Phonotactic restrictions: Knowledge about /ɹ/ in non-rhotic English dialects

The phenomena of interest are the distributional restrictions on /ɹ/ in certain non-rhotic dialects of English. /ɹ/ is banned from codas, and certain (heterosyllabic) vowel-vowel sequences cannot surface. The former restriction is obeyed by deleting /ɹ/ (e.g. park your car [pa:k jə ka:]), while the latter is obeyed by retaining an underlying /ɹ/ or epenthesizing one (e.g. the car is [ðə kaɹ ɪz], the idea is [ðə aɪdiəɹ ɪz]).

I tested the hypothesis that knowledge of these restrictions and their repairs sets up a bias against perceiving an illegal form. This was tested by presenting a [ɹ]-zero continuum in coda and intervocalic contexts in a categorical discrimination (AXB) experiment. Speakers of rhotic and non-rhotic dialects had to decide whether or not /ɹ/ had occurred (in the X position in the trial) by comparing it to both A and B, which were always opposite endpoints of the continuum.

Effectively, then, listeners were categorizing the X stimulus as either containing /ɹ/ or not. Responses to the [ɹ]-zero continuum were then compared with those obtained with [l]-zero and [j]-zero control continua. The prediction is that non-rhotic listeners should categorize more of the [ɹ]-zero continuum as the same as the /ɹ/-endpoint in the intervocalic context, because the alternative non-/ɹ/ endpoint would constitute an illicit V.V sequence. Likewise, fewer “r” judgments of the same [ɹ]-zero continuum are predicted in the coda context, since an “r” judgment would constitute perceiving an illicit coda /ɹ/.
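The step from AXB choices to “r” judgments can be sketched as follows. This is a minimal illustration, not the analysis used in the dissertation: the trial format (context, continuum step, which endpoint served as A, and the listener's choice) and the function names are assumptions made for this example.

```python
from collections import defaultdict


def category_response(a_endpoint, choice):
    """Map an AXB choice ('A' or 'B') onto the category it implies for X.

    a_endpoint says which continuum endpoint was played as A ('r' or 'zero');
    B is always the opposite endpoint.
    """
    if a_endpoint not in ("r", "zero"):
        raise ValueError(f"unknown endpoint: {a_endpoint!r}")
    if choice not in ("A", "B"):
        raise ValueError(f"unknown choice: {choice!r}")
    b_endpoint = "zero" if a_endpoint == "r" else "r"
    return a_endpoint if choice == "A" else b_endpoint


def proportion_r(trials):
    """Return the proportion of 'r' judgments per (context, step).

    Each trial is a tuple (context, step, a_endpoint, choice); this layout
    is hypothetical.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [r judgments, total trials]
    for context, step, a_endpoint, choice in trials:
        tally = counts[(context, step)]
        tally[0] += category_response(a_endpoint, choice) == "r"
        tally[1] += 1
    return {key: r / total for key, (r, total) in counts.items()}


if __name__ == "__main__":
    # Made-up trials, for illustration only; these are not experimental data.
    toy_trials = [
        ("intervocalic", 4, "r", "A"),
        ("intervocalic", 4, "zero", "B"),
        ("coda", 4, "r", "B"),
        ("coda", 4, "zero", "B"),
    ]
    print(proportion_r(toy_trials))
```

Under the prediction described above, non-rhotic listeners would show higher proportions in the intervocalic context than in the coda context at matched continuum steps.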

Test yourself on sample stimuli! Count how many times you hear [ɹ] in these two continua.

Intervocalic context [WAV]

Coda context [WAV]
