We would like to take this opportunity to thank all of those individuals who helped us assemble this text, including the people of Lockheed Sanders and Nestor, Inc., whose encouragement and support were greatly appreciated. In addition, we would like to thank the members of the Laboratory for Engineering Man-Machine Systems (LEMS) and the Center for Neural Science at Brown University for their frequent and helpful discussions on a number of topics discussed in this text. Although we both attended Brown from 1983 to 1985, and had offices in the same building, it is surprising that we did not meet until 1988. We also wish to thank Kluwer Academic Publishers for their professionalism and patience, and the reviewers for their constructive criticism. Thanks to John McCarthy for performing the final proof, and to John Adcock, Chip Bachmann, Deborah Farrow, Nathan Intrator, Michael Perrone, Ed Real, Lance Riek and Paul Zemany for their comments and assistance. We would also like to thank Khrisna Nathan, our most unbiased and critical reviewer, for his suggestions for improving the content and accuracy of this text. A special thanks goes to Steve Hoffman, who was instrumental in helping us perform the experiments described in Chapter 9.
Contents
1 Introduction
1.1 Motivation
1.2 A Few Words on Speech Recognition
1.3 A Few Words on Neural Networks
1.4 Contents

2 The Mammalian Auditory System
2.1 Introduction to Auditory Processing
2.2 The Anatomy and Physiology of Neurons
2.3 Neuroanatomy of the Auditory System
2.3.1 The Ear
2.3.2 The Cochlea
2.3.3 The Eighth Nerve
2.3.4 The Cochlear Nucleus
2.3.5 The Superior Olivary Complex
2.3.6 The Inferior Colliculus
2.3.7 The Medial Geniculate Nucleus
2.3.8 The Auditory Cortex
2.4 Recurrent Connectivity in the Auditory Pathway
2.5 Summary

3 An Artificial Neural Network Primer
3.1 A Neural Network Primer for Speech Scientists
3.2 Elements of Artificial Neural Networks
3.2.1 Similarity Measures and Activation Functions
3.2.2 Networks and Mappings
3.3 Learning in Neural Networks
3.4 Supervised Learning
3.4.1 The Perceptron and Gradient-Descent Learning
3.4.2 Associative Memories
3.4.3 The Hopfield Network
3.5 Multi-Layer Networks
3.5.1 The Restricted Coulomb Energy Network
3.5.2 The Backward Error Propagation Network
3.5.3 The Charge Clustering Network
3.5.4 Recurrent Back Propagation
3.6 Unsupervised Learning
3.6.1 The BCM Network
3.6.2 The Kohonen Feature Map
3.7 Summary

4 A Speech Technology Primer
4.1 A Speech Primer for Neural Scientists
4.2 Human Speech Production/Perception
4.2.1 Information in the Speech Signal
4.3 ASR Technology
4.3.1 A General Speech Recognition Model
4.4 Signal Processing and Feature Extraction
4.4.1 Linear Predictive Coding
4.4.2 Feature Extraction and Modeling
4.4.3 Vector Quantization
4.5 Time Alignment and Pattern Matching
4.5.1 Dynamic Time Warping
4.5.2 Hidden Markov Models
4.5.3 Pronunciation Network Word Models
4.6 Language Models
4.6.1 Parsers
4.6.2 Statistical Models
4.7 Summary

5 Methods in Neural Network Applications
5.1 The Allure of Neural Networks for Speech Processing
5.2 The Computational Properties of ANNs
5.2.1 Computability and Network Size
5.3 ANN Limitations: The Scaling Problem
5.3.1 The Scaling of Learning
5.3.2 The Scaling of Generalization
5.4 Structured ANN Solutions
5.4.1 Hierarchical Modules
5.4.2 Hybrid Systems
5.4.3 Multiple Neural Network Systems
5.4.4 Integrating Neural Speech Modules
5.5 Summary

6 Signal Processing and Feature Extraction
6.1 The Importance of Signal Representations
6.2 The Signal Processing Problem Domain
6.3 Biologically Motivated Signal Processing
6.3.1 Review of Speech Representation in the Auditory Nerve
6.3.2 The Silicon Cochlea and Temporal-Place Representations for ASR
6.3.3 The Role of Automatic Gain Control in Noisy Environments
6.4 ANNs for Conventional Signal Processing
6.4.1 Adaptive Filtering
6.4.2 A Noise Reduction Network
6.5 Feature Representations
6.5.1 Unsupervised Feature Extraction for Phoneme Classification
6.5.2 Feature Maps
6.6 Summary

7 Time Alignment and Pattern Matching
7.1 Modeling Spectro-Temporal Structure
7.2 Time Normalization Via Pre-Processing
7.2.1 Interpolation and Decimation Techniques
7.2.2 Feature-Set Transformations
7.3 The Dynamic Programming Neural Network
7.3.1 The DPNN Architecture
7.3.2 The Time Warping Structure
7.3.3 The DPNN Training Procedure
7.3.4 Application to Speaker-Independent Digit Recognition
7.4 HMM Motivated Networks
7.4.1 The Viterbi Network
7.4.2 The HMM Network
7.5 Recurrent Networks for Temporal Modeling
7.5.1 The Temporal Flow Model
7.5.2 Temporal Flow Experiments
7.6 The Time Delay Neural Network
7.6.1 The TDNN Temporal Architecture
7.6.2 TDNN Training
7.6.3 Application to Phoneme Classification
7.6.4 Interpreting the TDNN Spectro-Temporal Representation
7.6.5 Phoneme Classification Summary
7.6.6 TDNNs for Word Discrimination
7.7 Summary

8 Natural Language Processing
8.1 The Importance of Language Processing
8.2 Syntactic Models
8.2.1 NETgrams: An ANN Word Category Predictor
8.2.2 An ANN for Word Category Disambiguation
8.2.3 Recurrent Networks and Formal Languages
8.3 Semantic Models
8.3.1 Pronoun Reference ANNs
8.4 Knowledge Representation
8.4.1 Knowledge Representation in a Hopfield Network
8.5 Summary

9 ANN Keyword Recognition
9.1 Keyword Spotting
9.2 The Primary KWS System
9.2.1 Experimental Data
9.3 DUR Experiments
9.3.1 Selecting a Fixed-Length Feature Representation
9.3.2 Single and Multiple Networks
9.3.3 Experiments with Hybrid Systems
9.4 Secondary Processing Experiments
9.4.1 The Pattern Matching Approach
9.4.2 An Investigation of Temporal Models
9.5 Summary

10 Neural Networks and Speech Processing
10.1 Speech Processing Applications
10.1.1 Speech Synthesis
10.1.2 Speech Coding
10.1.3 Speaker Separation
10.1.4 Speech Enhancement
10.1.5 Speaker Verification/Identification
10.1.6 Language Identification
10.1.7 Keyword/Keyphrase Spotting
10.2 Summary of Efforts in ASR
10.2.1 The Past: Institutions Involved in ASR
10.2.2 The Current Status of ANNs in ASR
10.2.3 The Future: Challenges and Goals
10.3 Concluding Remarks