12 articles found
Voice activity detection based on deep belief networks using likelihood ratio (cited by: 3)
1
Authors: KIM Sang-Kyun, PARK Young-Jin, LEE Sangmin. Journal of Central South University (SCIE, EI, CAS, CSCD), 2016, Issue 1, pp. 145-149 (5 pages)
A novel technique is proposed to improve the performance of voice activity detection (VAD) by using deep belief networks (DBN) with a likelihood ratio (LR). The likelihood ratio is derived from the speech and noise spectral components, which are assumed to follow the Gaussian probability density function (PDF). The proposed algorithm employs DBN learning to classify voice activity, using the input signal to calculate the likelihood ratio. Experiments show that the proposed algorithm yields improved results in various noise environments compared with conventional VAD algorithms. Furthermore, the DBN-based algorithm decreases the probability of detection error by [0.7, 2.6] compared with the support vector machine based algorithm.
Keywords: voice activity detection; likelihood ratio; deep belief networks
Speech enhancement through voice activity detection using speech absence probability based on Teager energy (cited by: 2)
2
Authors: PARK Yun-sik, LEE Sang-min. Journal of Central South University (SCIE, EI, CAS), 2013, Issue 2, pp. 424-432 (9 pages)
In this work, a novel voice activity detection (VAD) algorithm that uses speech absence probability (SAP) based on Teager energy (TE) was proposed for speech enhancement. The proposed method employs local SAP (LSAP) based on the TE of noisy speech, rather than conventional LSAP, as a feature parameter for VAD in each frequency subband. Results show that the TE operator can enhance the ability to discriminate speech from noise and further suppress noise components. Therefore, TE-based LSAP provides a better representation of LSAP, resulting in improved VAD for estimating noise power in a speech enhancement algorithm. In addition, the presented method utilizes TE-based global SAP (GSAP), derived in each frame, as the weighting parameter for modifying the adopted TE operator and improving its performance. The proposed algorithm was evaluated by objective and subjective quality tests under various environments, and was shown to produce better results than the conventional method.
Keywords: speech enhancement; Teager energy; speech absence probability; voice activity detection
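The abstract above does not spell out the Teager energy operator. As a rough illustration only, the sketch below uses the standard discrete-time form, psi[n] = x[n]^2 - x[n-1] * x[n+1]; the function name `teager_energy` is hypothetical, and the paper's subband processing and its SAP-based weighting are not reproduced here.

```python
def teager_energy(x):
    """Discrete Teager energy of a sequence of samples.

    Uses the standard form psi[n] = x[n]^2 - x[n-1] * x[n+1], which is
    defined only for interior samples, so the result has len(x) - 2
    values (or none if the input has fewer than three samples).
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# A linear ramp has constant Teager energy: each term is n^2 - (n-1)(n+1) = 1.
print(teager_energy([1, 2, 3, 4, 5]))
```

Because the operator depends on both amplitude and frequency of a locally sinusoidal signal, it tends to emphasize speech-like components over broadband noise, which is consistent with the discrimination benefit the abstract reports.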
VOICE ACTIVITY DETECTION UNDER RAYLEIGH DISTRIBUTION (cited by: 1)
3
Authors: Li Yu, Chen Jianming, Tan Hongzhou. Journal of Electronics (China), 2009, Issue 4, pp. 552-556 (5 pages)
This paper presents an improved Voice Activity Detection (VAD) algorithm which uses the Signal-to-Noise Ratio (SNR) measure. We assume that the noise Power Spectral Density (PSD) in each spectral bin follows a Rayleigh distribution. The Rayleigh distribution, with its asymmetric tail, gives a better description of the noise PSD distribution than the Gaussian distribution. Under this assumption, a new threshold updating expression is derived. Since the false alarm probability has an analytical integral, the threshold updating expression can be represented without the inverse complementary error function, and low computational complexity is achieved in our system. Experimental results show that the proposed VAD outperforms, or is at least comparable with, the VAD scheme presented by Davis under several noise environments, and has a lower computational complexity.
Keywords: Statistical Voice activity detection (VAD); Threshold update; Rayleigh distribution; Computational complexity
IMPROVING VOICE ACTIVITY DETECTION VIA WEIGHTING LIKELIHOOD AND DIMENSION REDUCTION
4
Authors: Wang Huanliang, Han Jiqing, Li Haifeng, Zheng Tieran. Journal of Electronics (China), 2008, Issue 3, pp. 330-336 (7 pages)
The performance of traditional Voice Activity Detection (VAD) algorithms declines sharply in low Signal-to-Noise Ratio (SNR) environments. In this paper, a feature weighting likelihood method is proposed for noise-robust VAD. The method increases the contribution of dynamic features to the likelihood score, which in turn improves the noise robustness of VAD. A divergence-based dimension reduction method is proposed to save computation; it removes the feature dimensions with smaller divergence values at the cost of a slight degradation in performance. Experimental results on the Aurora Ⅱ database show that the proposed method remarkably improves detection performance in noisy environments when a model trained on clean data is used to detect speech endpoints. Using the weighting likelihood on the dimension-reduced features yields performance comparable to, or even better than, that of the original full-dimensional features.
Keywords: Voice activity detection (VAD); Weighting likelihood; Divergence; Dimension reduction; Noise robustness
Speech detection method based on a multi-window analysis (cited by: 1)
5
Authors: Luo Xinwei, Liu Ting, Huang Ming, Xu Xiaogang, Cao Hongli, Bai Xianghua, Xu Dayong. Journal of Southeast University (English Edition) (EI, CAS), 2021, Issue 4, pp. 343-349 (7 pages)
To address the poor performance of speech signal detection at low signal-to-noise ratio (SNR), a method is proposed to detect active speech frames based on multi-window time-frequency (T-F) diagrams. First, the T-F diagram of the signal is calculated based on a multi-window T-F analysis, and a speech test statistic is constructed based on the characteristic difference between the signal and background noise. Second, dynamic double-threshold processing is used for preliminary detection, and the global double-threshold value is then obtained using K-means clustering. Finally, the detection results are obtained by sequential decision. The experimental results show that the overall performance of the method is better than that of traditional methods under various SNR conditions and background noises. The method also has the advantages of low complexity, strong robustness, and adaptability to multiple languages.
Keywords: voice activity detection; multi-window spectral analysis; K-means clustering; threshold adjustment; sequential decision
Audio-visual voice activity detection (cited by: 1)
6
Authors: LIU Peng, WANG Zuo-ying. Frontiers of Electrical and Electronic Engineering in China (CSCD), 2006, Issue 4, pp. 425-430 (6 pages)
In speech signal processing systems, frame-energy based voice activity detection (VAD) may be disturbed by background noise and by the non-stationary characteristics of the frame energy in voice segments. The purpose of this paper is to improve the performance and robustness of VAD by introducing visual information. A data-driven linear transformation is adopted for visual feature extraction, and a general statistical VAD model is designed. Using the general model and a two-stage fusion strategy presented in this paper, a concrete multimodal VAD system is built. Experiments show that multimodal VAD achieves a 55.0% relative reduction in frame error rate and a 98.5% relative reduction in sentence-breaking error rate compared with frame-energy based audio VAD. The results show that, with the multimodal method, sentence-breaking errors are almost entirely avoided and frame-detection performance is clearly improved, which demonstrates the effectiveness of the visual modality in VAD.
Keywords: speech recognition; voice activity detection; multimodal
Fast Echo Canceller in IP Telephony Gateway
7
Authors: HUANG Yongfeng (黄永峰), LI Xing (李星). Journal of Beijing Institute of Technology (EI, CAS), 2003, Issue 1, pp. 109-112 (4 pages)
The echo path in an IP telephony system is very long. Generally, the echo canceller is implemented on the IP telephony gateway, which must perform multi-channel echo cancellation and voice compression concurrently. Hence, the key to designing the echo canceller is to greatly reduce its computational requirement. For this reason, a number of innovative features for implementing a fast echo canceller are presented. The key components of this canceller include: the separation of adaptive and cancel filters, non-real-time adaptation and real-time cancellation, sharing VAD algorithms with the speech codec, the incorporation of delay indexing with zero coefficients, and windowing the adaptive filter coefficients to reduce the cost of DSP during the cancellation. Finally, the performance of the echo canceller is summarized; the evaluation results show that the performance gains for echo cancellation are significant.
Keywords: echo cancellation; voice activity detection; adaptive filter
Research on natural language recognition algorithm based on sample entropy
8
Authors: Juan Lai. International Journal of Technology Management, 2013, Issue 2, pp. 47-49 (3 pages)
Sample entropy can reflect both the change in the level of new information in a signal sequence and the amount of that new information. Using sample entropy as the feature for speech classification, the paper first extracts the sample entropy of the mixed signal, then uses the mean and variance to calculate the sample entropy of each signal, and finally applies K-means clustering for recognition. The simulation results show that the recognition rate can be increased to 89.2% based on sample entropy.
Keywords: sample entropy; voice activity detection; speech processing
Novel DTD and VAD assisted voice detection algorithm for VoIP systems
9
Authors: Ming Meng, Wang Ke, Ji Hong. The Journal of China Universities of Posts and Telecommunications (EI, CSCD), 2016, Issue 4, pp. 9-16, 76 (9 pages)
Echo cancellation plays an important role in current Internet protocol (IP) based voice interactive systems. Voice state detection is an essential part of echo cancellation. It mainly comprises two parts: double talk detection (DTD) and voice activity detection (VAD). DTD is used to detect double talk and prevent filter divergence in the presence of near-end speech, and VAD is used to determine near-end voice activity and output a silence indicator when the near end is silent. However, DTD applied directly may mistakenly declare double talk when both ends are silent, coefficient updates under far-end silence may lead to filter divergence, and current VAD algorithms may misjudge the residual echo from the near end as far-end voice. Therefore, a voice detection algorithm combining DTD and far-end VAD is proposed. DTD is performed when the VAD declares far-end speech, while filtering and coefficient updates are halted when the VAD declares far-end silence. The far-end VAD adopted is a multi-feature VAD based on short-time energy and correlation. The new algorithm can improve the accuracy of DTD, prevent filter divergence, and exclude the case in which the far-end signal contains only residual echo from the near end. Actual test results show that the voice state decision of the new algorithm is accurate and that the performance of echo cancellation is improved.
Keywords: echo cancellation; double talk detection (DTD); voice activity detection (VAD); adaptive filter
Enhancing Parkinson's disease severity assessment through voice-based wavelet scattering, optimized model selection, and weighted majority voting (cited by: 1)
10
Authors: Farhad Abedinzadeh Torghabeh, Seyyed Abed Hosseini, Elham Ahmadi Moghadam. Medicine in Novel Technology and Devices, 2023, Issue 4, pp. 51-63 (13 pages)
Parkinson's disease (PD) is a neurodegenerative disorder characterized by motor and non-motor symptoms that significantly impact an individual's quality of life. Voice changes have shown promise as early indicators of PD, making voice analysis a valuable tool for early detection and intervention. This study aims to assess and detect the severity of PD through voice analysis using the mobile device voice recordings dataset. The dataset consisted of recordings from PD patients at different stages of the disease and from healthy control subjects. A novel approach was employed, incorporating a voice activity detection algorithm for speech segmentation and the wavelet scattering transform for feature extraction. A Bayesian optimization technique is used to fine-tune the hyperparameters of seven commonly used classifiers and optimize the performance of machine learning classifiers for PD severity detection. Among the classifiers, AdaBoost and K-nearest neighbor consistently demonstrated superior performance across various evaluation metrics. Furthermore, a weighted majority voting (WMV) technique is implemented, leveraging the predictions of multiple models to achieve a near-perfect accuracy of 98.62%, improving classification accuracy. The results highlight the promising potential of voice analysis in PD diagnosis and monitoring. Integrating advanced signal processing techniques and machine learning models provides reliable and accessible tools for PD assessment, facilitating early intervention and improving patient outcomes. This study contributes to the field by demonstrating the effectiveness of the proposed methodology and the significant role of WMV in enhancing classification accuracy for PD severity detection.
Keywords: Parkinson's disease; Speech impairment; Voice activity detection; Model selection; Bayesian optimization; Weighted majority voting
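For readers unfamiliar with weighted majority voting, the following is a minimal sketch of the general idea, not the authors' implementation: each classifier's predicted label is counted with a per-classifier weight, and the label with the largest total wins. The function name `weighted_majority_vote`, the labels, and the weights are all hypothetical; the abstract does not say how the study derives its weights.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label whose supporting classifiers have the largest total weight.

    predictions: one predicted label per classifier.
    weights: one non-negative weight per classifier (for example, a
             validation accuracy; this choice is an assumption).
    """
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # Ties are broken in favour of the label that was seen first.
    return max(totals, key=totals.get)


# One strong classifier can outvote two weak ones.
print(weighted_majority_vote(["mild", "severe", "severe"], [0.9, 0.3, 0.4]))
```

With equal weights the rule reduces to plain majority voting, so the weights are what let stronger models dominate the ensemble decision.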
Real-Time Implementation of an Efficient Speech Enhancement Algorithm for Digital Hearing Aids
11
Authors: GAO Jie (高杰), ZHANG Hui (张辉), HU Guangshu (胡广书). Tsinghua Science and Technology (SCIE, EI, CAS), 2006, Issue 4, pp. 475-480 (6 pages)
In order to remove background noise and improve the quality of speech for digital hearing aids, a single-channel speech enhancement algorithm is proposed. The algorithm is implemented and assessed on a digital hearing aid platform based on the TI DSP TMS320VC5502 chip. Assuming that background noise is stationary or varies slowly, an energy-based voice activity detection algorithm is adopted that adaptively tracks the minima and maxima of the power envelope in noisy speech. The target speech is then enhanced by using a Wiener filter, on the basis of a short-term power spectral estimation. To deal with the distracting musical noise of the processed speech, phase randomization, along with adjacent spectral averaging, is adopted. Objective measures and an informal hearing test both show improved performance as well as obvious attenuation of residual noise. The low power consumption and high efficiency make the whole algorithm well suited for use in digital hearing aids.
Keywords: voice activity detection; power envelope; Wiener filter; speech enhancement
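The abstract above describes an energy-based VAD that tracks the minima and maxima of the power envelope. The sketch below is one plausible reading of that idea under stated assumptions, not the published algorithm: the function name `energy_vad`, the tracker constants `rise` and `decay`, and the midpoint threshold `ratio` are all hypothetical choices.

```python
def energy_vad(frame_powers, rise=1.05, decay=0.95, ratio=0.5):
    """Label each frame as speech (True) or non-speech (False).

    frame_powers: non-negative per-frame power values.
    The floor tracker drops to a new minimum at once and creeps upward
    slowly; the ceiling tracker jumps to a new maximum at once and decays
    slowly. A frame counts as speech when its power exceeds a threshold
    placed between the two trackers.
    """
    if not frame_powers:
        return []
    floor = ceiling = frame_powers[0]
    decisions = []
    for power in frame_powers:
        floor = power if power < floor else min(floor * rise, power)
        ceiling = power if power > ceiling else max(ceiling * decay, power)
        threshold = floor + ratio * (ceiling - floor)
        # With no dynamic range yet (ceiling == floor), report non-speech.
        decisions.append(ceiling > floor and power > threshold)
    return decisions


# Two quiet frames, two loud frames, then quiet again.
print(energy_vad([1.0, 1.0, 10.0, 10.0, 1.0]))
```

The slow upward drift of the floor is what lets the detector follow a noise level that changes gradually, matching the abstract's assumption that background noise is stationary or slowly varying.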
Speech enhancement with a GSC-like structure employing sparse coding
12
Authors: Li-chun YANG, Yun-tao QIAN. Journal of Zhejiang University-Science C (Computers and Electronics) (SCIE, EI), 2014, Issue 12, pp. 1154-1163 (10 pages)
Speech communication is often influenced by various types of interfering signals. To improve the quality of the desired signal, the generalized sidelobe canceller (GSC), which uses a reference signal to estimate the interfering signal, is attracting the attention of researchers. However, the interference suppression of GSC is limited because a small amount of residual desired signal leaks into the reference signal. To overcome this problem, we use sparse coding to suppress the residual desired signal while preserving the reference signal. Sparse coding with a learned dictionary is usually used to reconstruct the desired signal. As the training samples of a desired signal for dictionary learning are not observable in a real environment, the reconstructed desired signal may contain a large amount of residual interfering signal. In contrast, the training samples of the interfering signal for interferer dictionary learning can be obtained during the absence of the desired signal through voice activity detection (VAD). Since the reference signal of an interfering signal is coherent with the interferer dictionary, it can be well reconstructed by sparse coding, while the residual desired signal is removed. The performance of GSC is improved because the estimate of the interfering signal obtained with the proposed reference signal is more accurate. Simulations and experiments in a real acoustic environment show that our proposed method is effective in suppressing interfering signals.
Keywords: Generalized sidelobe canceller; Speech enhancement; Voice activity detection; Dictionary learning; Sparse coding