How do speech recognition systems work?
It is difficult to give a definitive account of how speech recognition systems operate, because the requirements of each application are so specific that they lead to tailor-made solutions. A good example, however, is the operation of a large-vocabulary speech-to-text system, which is described below.

1) The analogue speech is digitised by an analogue-to-digital (A/D) converter. This step can be performed by a generic sound card or by a software-specific card that also contains a Digital Signal Processor (DSP) chip. The advantage of a dedicated DSP chip is that it performs the computationally intensive manipulation of the data. This includes performing a Fast Fourier Transform on a centisecond (10 ms) ‘slice’ of the input waveform, converting the signal into its constituent frequency components, which then form the basis of the recognition process. The DSP can also perform functions such as adaptive filtering on the input signal in an attempt to remove steady-state background noises such as the hum from a computer fan.
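As a rough illustration of the Fourier step, the sketch below takes one centisecond slice of digitised audio and converts it into frequency components. It is not how any particular DSP chip or recognition product does it: the 16 kHz sampling rate, the zero-padding to a power-of-two length, the synthetic 1 kHz test tone and the function names fft and slice_spectrum are all assumptions made for the example, and real systems typically also apply a window function and overlap successive slices.

```python
import cmath
import math

SAMPLE_RATE = 16000    # assumed sampling rate in Hz (not stated in the text)
SLICE_SECONDS = 0.01   # one centisecond slice


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def slice_spectrum(samples, sample_rate=SAMPLE_RATE):
    """Return (frequencies_hz, magnitudes) for one slice of real samples."""
    # Zero-pad to the next power of two so the radix-2 FFT applies.
    n = 1
    while n < len(samples):
        n *= 2
    padded = list(samples) + [0.0] * (n - len(samples))
    spectrum = fft(padded)
    # For real input only the first half of the bins is distinct.
    half = n // 2
    freqs = [k * sample_rate / n for k in range(half)]
    mags = [abs(spectrum[k]) for k in range(half)]
    return freqs, mags


# Hypothetical input: a 10 ms slice containing a pure 1 kHz tone.
slice_len = int(SAMPLE_RATE * SLICE_SECONDS)   # 160 samples
tone = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(slice_len)]

freqs, mags = slice_spectrum(tone)
peak_hz = freqs[mags.index(max(mags))]   # strongest frequency component
print(f"strongest component near {peak_hz:.1f} Hz")
```

With these assumed settings the 160-sample slice is padded to 256 points, giving frequency bins 62.5 Hz apart. A recogniser would work with the whole set of magnitudes for each slice rather than just the peak.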