If you're on a limited budget, Sphinx is by far the best option. However, it also makes a huge difference what models you use, how you tune them, and how you tune your audio source. Absolutely everything has to match, otherwise it just won't work. Given the problem you described, I'd be willing to bet a substantial sum that you've got your models mixed up and your mic is not correctly calibrated. Also, if you have an accent it probably will not work. This is not an issue with the decoder but with the acoustic models: if no one with a voice or accent similar to yours was included in the training data, you'll get poor results.
That said, have you looked at their open source models page?
http://www.speech.cs.cmu.edu/sphinx/models/
Depending on what you are trying to do, you should be able to obtain about 90% accuracy on free speech with the 16kHz WSJ models and the gigaword LMs NVP. I caution, however, that ASR is a massive undertaking and hasn't yet reached commodity status.
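One quick way to rule out the most common mismatch is to check that your recorded audio actually has the format the acoustic model expects. Here is a rough sketch using only Python's standard `wave` module. The 16 kHz rate comes from the WSJ models mentioned above; the mono and 16-bit PCM expectations, and the function name `check_wav_matches_model`, are my assumptions for illustration, so check the documentation of whichever model you download.

```python
import wave

# Expected input format. 16 kHz matches the 16kHz WSJ models; mono and
# 16-bit samples are ASSUMPTIONS, verify them against your model's docs.
EXPECTED_RATE_HZ = 16000
EXPECTED_CHANNELS = 1
EXPECTED_SAMPLE_WIDTH_BYTES = 2


def check_wav_matches_model(path,
                            rate=EXPECTED_RATE_HZ,
                            channels=EXPECTED_CHANNELS,
                            sample_width=EXPECTED_SAMPLE_WIDTH_BYTES):
    """Return a list of mismatch descriptions (empty if the file matches)."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append(
                f"sample rate is {wav.getframerate()} Hz, expected {rate} Hz")
        if wav.getnchannels() != channels:
            problems.append(
                f"{wav.getnchannels()} channel(s), expected {channels}")
        if wav.getsampwidth() != sample_width:
            problems.append(
                f"{wav.getsampwidth() * 8}-bit samples, "
                f"expected {sample_width * 8}-bit")
    return problems
```

Calling `check_wav_matches_model("recording.wav")` on one of your own recordings (the filename is just a placeholder) returns an empty list when the format lines up. This only catches format mismatches; it says nothing about mic calibration or accent coverage in the training data.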