To address the shortcomings of the traditional hidden Markov model (HMM), namely its poor modeling of long state durations and, when the computational load is heavy, its low speech recognition accuracy, long training time, and high training error, an HMM that explicitly models long speech-state durations is proposed. First, the diagonal elements of the state transition matrix are all set to 0, which removes the self-transition arcs, and a Gaussian distribution given by a parameterized function is added to describe state duration; every frame is then included in the calculation according to the degree of correlation between frames. Second, the re-estimation formulas are applied repeatedly to compute the transition probability assigned to each arc and the raw output probability of the visible symbol sequence until convergence, at which point the computation stops. Finally, in experiments comparing the HMM before and after the improvement, the improved model yields a smaller training error that falls faster, requires considerably less computation, and converges more easily; the ratio of the difference between a probability output and the preceding probability output to that probability output is greater than the initial value set for the HMM. Compared with the traditional HMM, the duration-based HMM reduces the number of training iterations and the training time to a certain extent, improves speech recognition accuracy, and basically fulfills the function of a speech recognition system.
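The three ingredients named in the abstract (removing self-transitions, a Gaussian duration model, and a stopping test on successive probability outputs) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the renormalization of rows after zeroing the diagonal, the truncation of durations at `max_duration`, and the tolerance `tol` are all assumptions made for illustration. The stopping test shown is the conventional one, which stops once the relative change drops below a tolerance; how the paper applies its threshold is not specified beyond the abstract.

```python
import math

def remove_self_transitions(A):
    """Zero the diagonal of a row-stochastic matrix and renormalize each row.

    Renormalizing is an assumption; the abstract only says the diagonal is 0.
    """
    out = []
    for i, row in enumerate(A):
        off = [0.0 if j == i else p for j, p in enumerate(row)]
        total = sum(off)
        if total == 0.0:
            raise ValueError(f"state {i} has no transition to another state")
        out.append([p / total for p in off])
    return out

def gaussian_duration_pmf(mean, std, max_duration):
    """Discretized Gaussian over durations 1..max_duration (in frames), summing to 1."""
    weights = [math.exp(-0.5 * ((d - mean) / std) ** 2)
               for d in range(1, max_duration + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def has_converged(prev_prob, curr_prob, tol=1e-4):
    """Relative-change test: |P_k - P_(k-1)| / |P_k| < tol (tol is hypothetical)."""
    return abs(curr_prob - prev_prob) / abs(curr_prob) < tol

# Toy 3-state transition matrix with self-loops (illustrative values only).
A = [[0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.1, 0.8]]
A_no_self = remove_self_transitions(A)

# Duration model for one state: about 5 frames on average, capped at 20 frames.
durations = gaussian_duration_pmf(mean=5.0, std=2.0, max_duration=20)
```

With the self-loops gone, how long the model stays in a state is governed by `durations` rather than by the geometric distribution that a self-transition probability implies, which is the usual motivation for explicit-duration HMMs.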
HUANG Qing, FANG Mu-yun. An Improved Speech Recognition System Based on HMM Algorithm[J]. Journal of Chongqing Technology and Business University (Natural Science Edition), 2022, 39(5): 56-61