Computer Speech: Recognition, Compression, Synthesis (Springer Series in Information Sciences, 35)
by Manfred R. Schroeder
Table of Contents
Introduction
Speech: Natural and Artificial
Voice Coders
Voiceprints for Combat and for Fighting Crime
The Electronic Secretary
The Human Voice as a Key
Clipped Speech
Frequency Division
The First Circle of Hell: Speech in the Soviet Union
Linking Fast Trains to the Telephone Network
Digital Decapitation
Man into Woman and Back
Reading Aids for the Blind
High-Speed Recorded Books
Spectral Compression for the Hard-of-Hearing
Restoration of Helium Speech
Noise Suppression
Slow Speed for Better Comprehension
Multiband Hearing Aids and Binaural Speech Processors
Improving Public Address Systems
Raising Intelligibility in Reverberant Spaces
Conclusion
A Brief History of Speech
Animal Talk
Wolfgang Ritter von Kempelen
From Kratzenstein to Helmholtz
Helmholtz and Rayleigh
The Bells: Alexander Melville and Alexander Graham Bell
Modern Times
The Vocal Tract
Articulatory Dynamics
The Vocoder and Some of Its Progeny
Formant Vocoders
Correlation Vocoders
The Voice-Excited Vocoder
Center Clipping for Spectrum Flattening
Linear Prediction
Subjective Error Criteria
Neural Networks
Wavelets
Conclusion
Speech Recognition and Speaker Identification
Speech Recognition
Dialogue Systems
Speaker Identification
Word Spotting
Pinpointing Disasters by Speaker Identification
Speaker Identification for Forensic Purposes
Dynamic Programming
Markov Models
Shannon's Outguessing Machine--A Hidden Markov Model Analyzer
Hidden Markov Models in Speech Recognition
Neural Networks
The Perceptron
Multilayer Networks
Backward Error Propagation
Kohonen Self-Organizing Maps
Hopfield Nets and Associative Memory
Whole Word Recognition
Robust Speech Recognition
The Modulation Transfer Function
Speech Compression
Vocoders
Digital Simulation
Linear Prediction
Linear Prediction and Resonances
The Innovation Sequence
Single Pulse Excitation
Multipulse Excitation
Adaptive Predictive Coding
Masking of Quantizing Noise
Instantaneous Quantizing Versus Block Coding
Delays
Code Excited Linear Prediction (CELP)
Algebraic Codes
Efficient Coding of Parameters
Waveform Coding
Transform Coding
Audio Compression
Speech Synthesis
Model-Based Speech Synthesis
Synthesis by Concatenation
Prosody
Speech Production
Sources and Filters
The Vocal Source
The Vocal Tract
Radiation from the Lips
The Acoustic Tube Model of the Vocal Tract
Discrete Time Description
The Speech Signal
Spectral Envelope and Fine Structure
Unvoiced Sounds
The Voiced--Unvoiced Classification
The Formant Frequencies
Hearing
Historical Antecedents
Thomas Seebeck and Georg Simon Ohm
More on Monaural Phase Sensitivity
Hermann von Helmholtz and Georg von Békésy
Thresholds of Hearing
Pulsation Threshold and Continuity Effect
Anatomy and Basic Capabilities of the Ear
The Pinnae and the Outer Ear Canal
The Middle Ear
The Inner Ear
Mechanical to Neural Transduction
Some Astounding Monaural Phase Effects
Masking
Loudness
Scaling in Psychology
Pitch Perception and Uncertainty
Binaural Hearing--Listening with Both Ears
Directional Hearing
Precedence and Haas Effects
Vertical Localization
Virtual Sound Sources and Quasi-Stereophony
Binaural Release from Masking
Binaural Beats and Pitch
Direction and Pitch Confused
Pseudo-Stereophony
Virtual Sound Images
Philharmonic Hall, New York
The Proper Reproduction of Spatial Sound Fields
The Importance of Lateral Sound
How to Increase Lateral Sounds in Real Halls
Summary
Basic Signal Concepts
The Sampling Theorem and Some Notational Conventions
Fourier Transforms
The Autocorrelation Function
The Convolution Integral and the Delta Function
The Cross-Correlation Function and the Cross-Spectrum
A Bit of Number Theory
The Hilbert Transform and the Analytic Signal
Hilbert Envelope and Instantaneous Frequency
Causality and the Kramers--Kronig Relations
Anticausal Functions
Minimum-Phase Systems and Complex Frequencies
Allpass Systems
Dereverberation
Matched Filtering
Phase and Group Delay
Heisenberg Uncertainty and the Fourier Transform
Prolate Spheroidal Wave Functions and Uncertainty
Time and Frequency Windows
The Wigner--Ville Distribution
The Cepstrum: Measurement of Fundamental Frequency
Line Spectral Frequencies
A. Acoustic Theory and Modeling of the Vocal Tract
Introduction
Acoustics of a Hard-Walled, Lossless Tube
Field Equations
Time-Invariant Case
Formants as Eigenvalues
Losses and Nonrigid Walls
Discrete Modeling of a Tube
Time-Domain Modeling
Frequency-Domain Modeling, Two-Port Theory
Tube Models and Linear Prediction
Notes on the Inverse Problem
Analytic and Numerical Methods
Empirical Methods
B. Direct Relations Between Cepstrum and Predictor Coefficients
Derivation of the Main Result
Direct Computation of Predictor Coefficients from the Cepstrum
A Simple Check
Connection with Algebraic Roots and Symmetric Functions
Connection with Statistical Moments and Cumulants
Computational Complexity
An Application of Root-Power Sums to Pitch Detection
References
General Reading
Selected Journals
A Sampling of Societies and Major Meetings
Glossary of Speech and Computer Terms
Name Index
Subject Index
The Author