Abstract
This paper presents a speech generation technique based on Deep Convolutional Generative Adversarial Networks (DCGAN) that learns from a large speech corpus and can then generate entirely new speech on its own. The generative adversarial network (GAN), a deep learning model that has attracted intense interest in recent years, consists of a discriminator network (Discriminator, D) and a generator network (Generator, G). TensorFlow is used as the learning framework, and the DCGAN model is trained on a large amount of speech. In the basic training process, the goal of the speech generator G is to produce speech that is as realistic and natural-sounding as possible in order to deceive the speech discriminator D, while the goal of D is to distinguish the speech generated by G from real speech as well as it can. G and D thus form a dynamic "game process" that ultimately yields natural speech signals close to the original training material, achieving automatic speech generation.
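The adversarial "game" between G and D described in the abstract can be illustrated with the loss functions commonly used for GAN training. The abstract does not state which losses the authors used, so the following is only a minimal sketch under assumptions: D is assumed to output a probability in (0, 1) that a clip is real, the discriminator loss is standard binary cross-entropy, and the generator loss is the widely used non-saturating variant. The function names `discriminator_loss` and `generator_loss` are hypothetical and are not taken from the paper or from TensorFlow.

```python
import math

EPS = 1e-12  # small constant to avoid log(0); an assumption of this sketch


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy for D (assumed form, not the authors' code).

    d_real: D's probabilities for real speech clips (target label 1).
    d_fake: D's probabilities for clips produced by G (target label 0).
    """
    real_term = -sum(math.log(p + EPS) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p + EPS) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss (assumed variant).

    G is rewarded when D assigns a high "real" probability to generated clips.
    """
    return -sum(math.log(p + EPS) for p in d_fake) / len(d_fake)


# D separates real from generated clips well: low D loss, high G loss.
confident_d = discriminator_loss([0.9, 0.9], [0.1, 0.1])
fooled_g = generator_loss([0.1, 0.1])

# D cannot tell the two apart (outputs 0.5 everywhere).
undecided_d = discriminator_loss([0.5, 0.5], [0.5, 0.5])
```

In this sketch, training would alternate between lowering `discriminator_loss` with respect to D's parameters and lowering `generator_loss` with respect to G's parameters. When D outputs 0.5 for every clip, its loss is about 2 ln 2, the point at which it can no longer distinguish generated speech from real speech.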
Authors
朱纯
王翰林
魏天远
王伟
ZHU Chun; WANG Han-lin; WEI Tian-yuan; WANG Wei (Department of Software Engineering, Southeast University, Nanjing 211189, China; Department of Biomedical Engineering, Nanjing Medical University, Nanjing 211166, China)
Source
《仪表技术》
2018, Issue 2, pp. 13-15, 20 (4 pages in total)
Instrumentation Technology
Funding
Jiangsu Province College Students' Innovation and Entrepreneurship Training Program (201610312053X)
Keywords
deep learning
artificial intelligence
generative adversarial network
speech generation