Acoustic likelihoods for words as written out by the decoder and (force-)aligner are both positive and negative, while they are exclusively negative in the lattices. How is this possible?

April 26, 2017

1 Answer

The acoustic likelihoods for each word, as seen in the decoder and aligner outputs, are scaled at each frame by the maximum score for that frame. The final (total) scaling factor is written out in the decoder MATCHSEG output as the number following the letter “S”; the number following “T” is the total score without the scaling factor. The real score is the sum of S and T. The real score for each word is written to the logfile only if you ask for the backtrace (otherwise that table is not printed). In the falign output, only the real scores are written. The real scores of words can be positive or negative, and they are large in magnitude because the logbase is very close to 1 (1.0001 is the default value for both the decoder and the aligner). In the lattices, only the scaled scores are stored, and the total scaling factor is not written out. Because each scaled score is measured relative to the best score in its frame, it cannot be positive, which is why the lattice scores are all negative. This does not affect rescoring of a single lattice, but it might affect (positively or negatively) the combination of lattices, because the scaling factors may differ from lattice to lattice.
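The relationship between the scaled score, the scaling factor, and the real score can be illustrated with a small Python sketch. This is not the decoder's code: the function names, the two-state frames, and the likelihood values are all made up for illustration, scores are kept as floats, and "scaling" is taken to mean subtracting the frame's best log score, which is what the statement that the real score is the sum of S and T suggests. The example also assumes that likelihoods are density values that may exceed 1, which is one plausible reason a real log score can be positive.

```python
import math

LOGBASE = 1.0001  # default log base quoted in the answer


def log_score(value, base=LOGBASE):
    """Log of a likelihood value in the given base (float, for illustration)."""
    return math.log(value) / math.log(base)


def scale_by_frame_max(frames):
    """Subtract the best score in each frame from every score in that frame.

    Returns the scaled frames and the total scaling factor
    (the sum of the per-frame maxima).
    """
    scaled, total_scale = [], 0.0
    for scores in frames:
        best = max(scores.values())
        total_scale += best
        scaled.append({state: s - best for state, s in scores.items()})
    return scaled, total_scale


# Hypothetical per-frame likelihoods for two states; densities may exceed 1.
likelihoods = [
    {"a": 2.5, "b": 0.3},
    {"a": 0.02, "b": 0.5},
    {"a": 4.0, "b": 1.0},
]
frames = [{st: log_score(v) for st, v in f.items()} for f in likelihoods]
path = ["a", "a", "a"]  # hypothetical alignment path through the frames

scaled, S = scale_by_frame_max(frames)
T = sum(scaled[t][st] for t, st in enumerate(path))     # scaled path score
real = sum(frames[t][st] for t, st in enumerate(path))  # unscaled path score

print(f"S={S:.1f}  T={T:.1f}  S+T={S + T:.1f}  real={real:.1f}")
```

In this sketch every scaled score is at most zero, the unscaled per-frame scores take both signs, and S + T recovers the unscaled path score. Two lattices built from different utterances would generally have different values of S, which is the reason combining them on scaled scores alone can shift their relative weights.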