Stochastic n-gram language models have been successfully applied in continuous speech recognition for several years. Such language models provide many computational advantages but also require huge text corpora for parameter estimation. Moreover, the texts must exactly reflect, in a statistical sense, the user’s language. Estimating a language model on a sample that is not representative severely affects speech recognition performance. A solution to this problem is provided by the Bayesian learning framework. Beyond the classical estimates, a Bayes derived interpolation model is proposed. Empirical comparisons have been carried out on a 10,000-word radiological reporting domain. Results are provided in terms of perplexity and recognition accuracy

Bayesian Estimation Methods for N-Gram Language Model Adaptation

Federico, Marcello
1996

Abstract

Stochastic n-gram language models have been successfully applied in continuous speech recognition for several years. Such language models provide many computational advantages but also require huge text corpora for parameter estimation. Moreover, the texts must exactly reflect, in a statistical sense, the user’s language. Estimating a language model on a sample that is not representative severely affects speech recognition performance. A solution to this problem is provided by the Bayesian learning framework. Beyond the classical estimates, a Bayes derived interpolation model is proposed. Empirical comparisons have been carried out on a 10,000-word radiological reporting domain. Results are provided in terms of perplexity and recognition accuracy
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11582/1226
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
social impact