Abstract
In modern film and television sound production, sound effect editors still screen material largely by hand, which is time-consuming and laborious. To improve the efficiency of screening sound effect material, this paper proposes a machine-learning-based recognition and classification system for film and television sound effects. The system uses Mel-frequency cepstral coefficients (MFCC) and their differences, short-time energy, and short-time zero-crossing rate as acoustic features, and a long short-term memory (LSTM) network as the recognition and classification model, offering a new approach to the recognition and classification of film and television sound effect material.
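As a rough illustration of two of the acoustic features named in the abstract, the sketch below computes short-time energy and short-time zero-crossing rate frame by frame in plain Python. It is not the authors' implementation: the function names, the 400-sample frame length, the 160-sample hop, and the synthetic test tone are all assumptions chosen for the example, and the MFCC features and the LSTM classifier are deliberately left out.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; an incomplete tail frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Sum of squared amplitudes within one frame."""
    return sum(x * x for x in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def frame_features(samples, frame_len=400, hop=160):
    """Return one (energy, zero-crossing rate) pair per frame.

    frame_len and hop are assumed values (25 ms and 10 ms at 16 kHz),
    not parameters taken from the paper.
    """
    return [(short_time_energy(f), zero_crossing_rate(f))
            for f in frame_signal(samples, frame_len, hop)]

# Toy input (an assumption for illustration): 0.1 s of a 1 kHz sine at 16 kHz.
sr = 16000
tone = [math.sin(2 * math.pi * 1000 * n / sr) for n in range(sr // 10)]
features = frame_features(tone)
```

In a pipeline like the one the abstract describes, per-frame values such as these would be stacked with the MFCC vectors and their differences to form the input sequence for the sequence model.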
Authors
吴昊
张莹
杨嘉乐
杨元元
WU Hao; ZHANG Ying; YANG Jiale; YANG Yuanyuan (Shanghai Film Academy, Shanghai University, Shanghai 200072, China)
Source
《电声技术》
2020, No. 7, pp. 30-34 (5 pages)
Audio Engineering
Keywords
machine learning
long short-term memory (LSTM) model
sound effect recognition
Mel-frequency cepstral coefficients (MFCC)