A Tutorial on Connectionist and Hybrid HMM/Connectionist Systems for Speech Recognition

Trentin, Edmondo; Gori, Marco
1999-01-01

Abstract

Although Automatic Speech Recognition (ASR) systems based on hidden Markov models (HMMs) are popular and effective under many circumstances, they suffer from limitations that restrict the applicability of ASR technology in the real world. Between the end of the Eighties and the beginning of the Nineties, several researchers began applying Artificial Neural Networks (ANNs) to ASR, with the aim of overcoming such limitations. ANNs yielded significant results on reduced-scale tasks, e.g. phoneme recognition, but they largely failed in dealing with long time-sequences of speech signals. As a consequence, 'hybrid' systems were proposed, combining HMMs and ANNs within a single architecture in order to take advantage of the properties of both. This tutorial reviews some fundamental concepts of ASR, HMMs and ANNs for ASR. It then surveys major hybrid models for ASR, summarizing a variety of different architectures, novel training algorithms and experimental results from a highly specialized and heterogeneous literature. Five classes of hybrid systems are presented: (i) ANNs that emulate HMMs; (ii) connectionist estimation of posterior probabilities in HMMs; (iii) joint HMM/ANN optimization over a single, overall training criterion; (iv) connectionist vector quantization for discrete HMMs; (v) ANNs for 'rescoring' the HMM hypotheses.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11582/1779