A prosody-only decision-tree model for disfluency detection

Elizabeth Shriberg, Rebecca Bates, and Andreas Stolcke

Speech disfluencies (filled pauses, repetitions, repairs, and false starts) are pervasive in spontaneous speech. The ability to detect and correct disfluencies automatically is important for effective natural language understanding, as well as to improve speech models in general. Previous approaches to disfluency detection have relied heavily on lexical information, which makes them less applicable when word recognition is unreliable. We have developed a disfluency detection method using decision tree classifiers that use only local and automatically extracted prosodic features. Because the model doesn't rely on lexical information, it is widely applicable even when word recognition is unreliable. The model performed significantly better than chance at detecting four disfluency types. It also outperformed a language model in the detection of false starts, given the correct transcription. Combining the prosody model with a specialized language model improved accuracy over either model alone for the detection of false starts. Results suggest that a prosody-only model can aid the automatic detection of disfluencies in spontaneous speech.
Shriberg, E., R. Bates, & A. Stolcke. 1997. A prosody-only decision-tree model for disfluency detection. In Proceedings of Eurospeech 97, Rhodes, Greece.

Key points relevant to the study of filled pauses



This site is maintained by Ralph L. Rose
Last Revised: 99/08/26

Note! This is the original FPRC ca. 1998. It is made available for archival purposes only.