Automatic linguistic segmentation of conversational speech
Andreas Stolcke and Elizabeth Shriberg
Abstract
As speech recognition moves toward more unconstrained domains such as
conversational speech, we encounter a need to segment (or resegment) waveforms
and recognizer output into linguistically meaningful units, such as sentences. Toward this
end, we present a simple automatic segmenter of transcripts based on N-gram language
modeling. We also study the relevance of several word-level features for segmentation
performance. Using only word-level information, we achieve 85% recall and 70% precision on
linguistic boundary detection.
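
To make the two reported figures concrete, the following minimal sketch shows one way recall and precision for boundary detection could be computed. It assumes boundaries are represented as word positions and that a hypothesized boundary counts as correct only if it matches a reference boundary exactly. The function name and the toy data are illustrative and are not taken from the paper, whose scoring procedure may differ.

```python
def boundary_precision_recall(reference, hypothesis):
    """Return (precision, recall) for hypothesized segment boundaries.

    Both arguments are iterables of word positions after which a
    boundary occurs (an assumed representation, for illustration only).
    """
    ref = set(reference)
    hyp = set(hypothesis)
    correct = len(ref & hyp)  # boundaries placed where the reference has one
    precision = correct / len(hyp) if hyp else 0.0  # share of hypotheses that are right
    recall = correct / len(ref) if ref else 0.0     # share of reference boundaries found
    return precision, recall


# Toy data: 4 reference boundaries, 5 hypothesized, 3 in common.
reference_boundaries = [3, 7, 12, 15]
hypothesized_boundaries = [3, 7, 9, 12, 20]
precision, recall = boundary_precision_recall(
    reference_boundaries, hypothesized_boundaries
)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under this definition, a segmenter with higher recall than precision, as reported above, finds most true boundaries at the cost of some spurious ones.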
Stolcke, A. & E. Shriberg 1996 Automatic linguistic segmentation of conversational
speech. In Proceedings of the International Conference on Spoken Language
Processing 2: 1005-1008.