Abstract
To address the high error rate of speech transcription text, this paper proposes a MacBERT-based model that first detects and then corrects errors in transcribed text. In the error detection stage, a MacBERT-BiLSTM-CRF model checks whether the text contains errors and locates them. In the error correction stage, a "confidence-phonetic similarity" curve, drawn over the two dimensions of confidence and phonetic similarity, determines whether a candidate character should be used as a correction. The confidence of each candidate character is computed with the MacBERT language model, and a phonetic similarity calculation method based on pinyin codes is proposed. Experiments were conducted on the public speech dataset Thchs-30, with transcripts obtained by calling the Baidu speech recognition API. Compared with existing methods, precision, recall, and F1 all improve in both the error detection and error correction stages; precision in the error correction stage reaches 83.32%, improving the correctness of the transcribed text.
Authors
邢月晗
郑岩
Xing Yuehan; Zheng Yan (School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China)
Source
《电子测量技术》
PKU Core Journal (北大核心)
2023, No. 6, pp. 57-61 (5 pages)
Electronic Measurement Technology
Funding
Supported by the Ministry of Education-China Mobile Research Fund (MCM20190701)