
WHY IS SPHINX III'S PERFORMANCE POORER THAN RECOGNIZER X?


Q. Sphinx III's default acoustic and language models seem unable to handle tasks like dictation. Why? (Answered by Arthur Chan, 2004-09-10; maintained by Evandro B.)

A. The design of a speech recognizer is largely shaped by its intended task. In the case of CMU Sphinx, most of the development effort was driven by DARPA research programs in the 1990s. The broadcast-news models were trained for the so-called eval97 task, in which systems had to transcribe broadcast news. That history explains why the models do not work well for a task like dictation: the training data simply was not collected with dictation in mind (the first sketch below illustrates this domain mismatch).

Commercial speech applications also require a great deal of task-specific tuning and application engineering. For example, most commercial dictation engines train their acoustic and language models on more carefully prepared material, and they apply techniques such as speaker adaptation (see the second sketch below). Unfortunately, CMU did not have the resources to carry out this kind of engineering work.
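To make the domain mismatch concrete, here is a small self-contained Python sketch. The corpora are invented toy examples, not CMU data: a unigram language model with add-one smoothing is trained on broadcast-news-style sentences, and its perplexity is then measured on in-domain text versus dictation-style text. The out-of-domain perplexity comes out markedly higher, which is the statistical face of "the training data was not collected for dictation"; real recognizers use smoothed n-gram models, where the same mismatch shows up in the same way.

    # Toy illustration of language-model domain mismatch.
    # The "news" and "dictation" corpora below are invented examples.
    import math
    from collections import Counter

    def train_unigram(sentences):
        """Unigram model with add-one smoothing over the training tokens."""
        counts = Counter(w for s in sentences for w in s.lower().split())
        total = sum(counts.values())
        vocab = len(counts) + 1  # +1 reserves probability mass for unseen words
        return lambda w: (counts.get(w, 0) + 1) / (total + vocab)

    def perplexity(prob, sentences):
        """Per-word perplexity of the model on the given sentences."""
        words = [w for s in sentences for w in s.lower().split()]
        log_sum = sum(math.log2(prob(w)) for w in words)
        return 2.0 ** (-log_sum / len(words))

    news = ["the president met with congress today",
            "stocks fell sharply in early trading",
            "the senate passed the bill on tuesday"]
    dictation = ["please schedule a meeting with doctor smith",
                 "new paragraph the patient reports mild pain"]

    lm = train_unigram(news)
    print("in-domain perplexity:     ", perplexity(lm, news))
    print("out-of-domain perplexity: ", perplexity(lm, dictation))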
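Speaker adaptation, mentioned above, is commonly done in HMM-based recognizers with techniques such as MLLR, which estimates an affine transform that moves the speaker-independent Gaussian means toward a new speaker's data. The sketch below is a toy version only: it replaces the full maximum-likelihood (EM) estimation used in real MLLR with a least-squares fit on synthetic data under hard component assignments, and every number in it is invented for illustration.

    # Toy MLLR-style mean adaptation: learn one global affine transform
    # mu' = A @ mu + b from "adaptation data" (synthetic here).
    import numpy as np

    rng = np.random.default_rng(0)
    means = rng.normal(size=(8, 2))            # speaker-independent Gaussian means
    A_true = np.array([[1.1, 0.2], [-0.1, 0.9]])
    b_true = np.array([0.5, -0.3])

    # Simulate a new speaker: frames drawn near the transformed means.
    comps = rng.integers(0, 8, size=200)       # hard component assignment per frame
    frames = means[comps] @ A_true.T + b_true + 0.05 * rng.normal(size=(200, 2))

    # Least-squares estimate of [A | b]: solve frames ~ [mu, 1] @ W
    design = np.hstack([means[comps], np.ones((200, 1))])
    W, *_ = np.linalg.lstsq(design, frames, rcond=None)
    A_est, b_est = W[:2].T, W[2]

    adapted_means = means @ A_est.T + b_est    # adapted means for this speaker
    print("max error in A:", np.abs(A_est - A_true).max())
    print("max error in b:", np.abs(b_est - b_true).max())

Because a single transform is shared across many Gaussians, a small amount of enrollment speech from one user is enough to adapt a large model, which is part of why the technique is attractive to commercial dictation engines.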
