Preliminaries to a theory of speech disfluencies

Elizabeth Shriberg

Abstract

This thesis examines disfluencies (e.g., 'um', repeated words, and a variety of forms of self-repair) in the spontaneous speech of adult normal speakers of American English. Despite their prevalence, disfluencies have traditionally been viewed as irregular events and have received little attention. The goal of the thesis is to provide evidence that, on the contrary, disfluencies show remarkably regular trends in a number of dimensions. These regularities have consequences for models of human language production; they can also be exploited to improve performance in speech applications.

The method includes analysis of over 5000 hand-annotated disfluencies from a database (250,000 words) containing three different styles of spontaneous speech: task-oriented human-computer dialog, task-oriented human-human dialog, and human-human conversation on a prescribed topic. The approach is theory-neutral and strongly data-driven. The annotations correspond to observable characteristics (features) in the data, including: 1) the speech domain; 2) the speaker; 3) the sentence in which a disfluency occurs; 4) word-related characteristics of the disfluency; and 5) simple acoustic characteristics of the disfluency. A methodology is developed for representing these features in a database format, and an algorithm is provided for automatic disfluency type classification based on this representation.

Results show regular trends in disfluency rates by sentence length, by disfluency position, by presence of another disfluency in the same sentence, by disfluency type, and by combinations of these features both across and within speakers. Regularities are also found for word-related features of the disfluency, including the number of excised words, the rate of cut-off words, and the rate of editing phrases. Additional analyses describe characteristics of overlapping disfluencies and prosodic characteristics of the simplest disfluency types. Across analyses, data from the three different speech styles are compared; where relevant, simple parametric models are provided.

In sum, disfluencies show regularities in a variety of dimensions. These regularities can help guide and constrain models of spoken language production. In addition, they can be modeled in applications to improve the automatic processing of spontaneous speech.

Shriberg, E. (1994). Preliminaries to a theory of speech disfluencies. Unpublished Ph.D. thesis, University of California, Berkeley.
