Word predictability after hesitations: A corpus-based study
Elizabeth Shriberg and Andreas Stolcke
Abstract
We ask whether lexical hesitations in spontaneous speech tend to
precede words that are difficult to predict. We define predictability in terms of both
transition probability and entropy, in the context of an N-gram language model. Results
show that transition probability is significantly lower at hesitation transitions, and
that this is attributable to both the following word and the word history. In addition,
results suggest that fluent transitions in sentences with a hesitation elsewhere are
significantly more likely than transitions in fluent sentences to contain
out-of-vocabulary words and novel word combinations. Such findings could be used to
improve statistical language modeling for spontaneous-speech applications.
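
The two predictability measures named above can be illustrated with a minimal sketch. It assumes a bigram model with unsmoothed maximum-likelihood estimates trained on a made-up toy corpus; the abstract does not state the N-gram order, smoothing method, or data used in the study, so the function names and corpus here are hypothetical. Transition probability is P(w | h) for a word w following history h, and entropy is taken over the distribution of words that can follow h.

```python
import math
from collections import Counter, defaultdict


def train_bigram(sentences):
    """Count bigrams. Returns a dict: history word -> Counter of next words."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Sentence-boundary markers are an assumption of this sketch.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def transition_probability(counts, prev, nxt):
    """Unsmoothed estimate of P(nxt | prev); 0.0 if prev was never seen."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return followers[nxt] / total if total else 0.0


def entropy(counts, prev):
    """Entropy in bits of the next-word distribution after prev."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in followers.values())


# Hypothetical toy corpus, for illustration only.
TOY_CORPUS = [
    "i want the report",
    "i want the budget",
    "i want a copy",
]

counts = train_bigram(TOY_CORPUS)
print(transition_probability(counts, "want", "the"))
print(entropy(counts, "the"))
```

In this toy model, a history followed by many equally likely words has high entropy, while a word that rarely follows its history has low transition probability. A realistic model would need smoothing and an explicit treatment of out-of-vocabulary words.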
Shriberg, E. & A. Stolcke 1996 Word predictability after hesitations: A corpus-based
study. In Proceedings of the International Conference on Spoken Language Processing
3: 1868-1871.