What is CMU Sphinx?
CMU Sphinx is a large-vocabulary, speaker-independent speech recognition codebase and suite of tools. The code is available for download and use. The main page is at http://www.speech.cs.cmu.edu/sphinx/, with CVS hosted on SourceForge.

• What is the difference between Sphinx2 and Sphinx3?
Sphinx2 is the real-time engine; Sphinx3 is slower but potentially more accurate. Sphinx2 is “semicontinuous” (it uses tied mixtures, so states share a common pool of densities), while Sphinx3 can use fully continuous observation densities (untied, so that each state has its own distribution statistics). Sphinx3 is not as fully developed for portability, but its codebase is cleaner than that of Sphinx2. Use Sphinx2 for speed, and Sphinx3 if you can afford to take longer than real time to decode. Both are available from CVS.

• What is a phoneset?
The phoneset is the list of ‘phones’, or speech sounds, that the engine can recognize. When you build acoustic models and pronunciations for words, they can be made to use any set of units, but they mus