Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computational cost make them challenging to deploy on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 uses a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To balance the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, the depthwise dilated convolution in the DDSC layer effectively expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses a parallel multi-resolution branch architecture to process the input feature map and extract multi-scale feature information from the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining network performance compared to the MobileNetV1 baseline.
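The parameter savings of depthwise separable convolution and the wider field of view of dilated filters can be illustrated with simple arithmetic. The sketch below is not the DDSC layer from the paper; it uses the standard textbook formulas, and the layer sizes (64 input channels, 128 output channels, 3x3 kernel) are hypothetical values chosen only for illustration.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """One k x k filter per input channel, then a 1 x 1 pointwise mix."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


def effective_kernel_size(k, dilation):
    """Span covered by a k-tap kernel whose taps sit `dilation` apart."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    c_in, c_out, k = 64, 128, 3  # hypothetical layer sizes
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std} weights, separable: {sep} weights")
    print(f"reduction: {1 - sep / std:.1%}")
    for d in (1, 2, 4):
        span = effective_kernel_size(k, d)
        print(f"dilation {d}: 3x3 kernel spans {span} pixels per side")
```

Dilation enlarges the span without adding weights, which is why a dilated depthwise filter can take in more context at the same parameter cost.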
The demand for adopting neural networks in resource-constrained embedded devices is continuously increasing. Quantization is one of the most promising solutions to reduce computational cost and memory storage on embedded devices. To reduce the complexity and overhead of deploying neural networks on integer-only hardware, most current quantization methods use a symmetric quantization mapping strategy to quantize a floating-point neural network into an integer network. However, although symmetric quantization is easier to implement, it is sub-optimal when the value range is skewed rather than symmetric, and this often comes at the cost of lower accuracy. This paper proposes an activation redistribution-based hybrid asymmetric quantization method for neural networks. The proposed method takes the data distribution into consideration and can resolve the contradiction between quantization accuracy and ease of implementation, balance the trade-off between clipping range and quantization resolution, and thus improve the accuracy of the quantized neural network. The experimental results indicate that the accuracy of the proposed method is 2.02% and 5.52% higher than the traditional symmetric quantization method for classification and detection tasks, respectively. The proposed method paves the way for computationally intensive neural network models to be deployed on devices with limited computing resources. Code will be available at https://github.com/ycjcy/Hybrid-Asymmetric-Quantization.
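Why symmetric quantization is sub-optimal for a skewed range can be seen in a small experiment. The sketch below is not the hybrid method from the paper; it compares generic textbook symmetric and asymmetric (zero-point) 8-bit quantization on hypothetical non-negative activations in [0, 6], where the symmetric mapping leaves the negative half of its integer range unused.

```python
def quantize_symmetric(values, num_bits=8):
    """Signed symmetric mapping: real 0 maps to integer 0, range is +/- max|x|."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return [qi * scale for qi in q]  # dequantized values


def quantize_asymmetric(values, num_bits=8):
    """Unsigned mapping with a zero point: range follows [min(x), max(x)]."""
    qmax = 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    q = [max(0, min(qmax, round(v / scale) + zero_point)) for v in values]
    return [(qi - zero_point) * scale for qi in q]  # dequantized values


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    # Hypothetical skewed activations: all non-negative, as after a ReLU.
    activations = [6.0 * i / 999 for i in range(1000)]
    err_sym = mse(activations, quantize_symmetric(activations))
    err_asym = mse(activations, quantize_asymmetric(activations))
    print(f"symmetric  MSE: {err_sym:.3e}")
    print(f"asymmetric MSE: {err_asym:.3e}")
```

The asymmetric mapping spends all 256 levels on the occupied range, so its step size is roughly half that of the symmetric mapping for this data, at the price of carrying a zero point through the integer arithmetic.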
The amount of oxygen blown into the converter is one of the key parameters for controlling the converter blowing process, and it directly affects the tap-to-tap time of the converter. In this study, a hybrid model based on the oxygen balance mechanism (OBM) and a deep neural network (DNN) was established for predicting the oxygen blowing time in the converter. A three-step method was used in the hybrid model. First, the oxygen consumption volume was predicted by the OBM model and the DNN model separately. Second, a more accurate oxygen consumption volume was obtained by integrating the OBM model and the DNN model. Finally, the converter oxygen blowing time was calculated from the oxygen consumption volume and the oxygen supply intensity of each heat. The proposed hybrid model was verified using actual data collected from an integrated steel plant in China, and compared with a multiple linear regression model, the OBM model, and neural network models including an extreme learning machine, a back propagation neural network, and a DNN. The test results indicate that the hybrid model with a network structure of 3 hidden layers, 32-16-8 neurons per hidden layer, and a learning rate of 0.1 has the best prediction accuracy and stronger generalization ability compared with the other models. The predicted hit ratio of oxygen consumption volume within an error of ±300 m^3 is 96.67%; the determination coefficient (R^2) and root mean square error (RMSE) are 0.6984 and 150.03 m^3, respectively. The oxygen blowing time prediction hit ratio within an error of ±0.6 min is 89.50%; R^2 and RMSE are 0.9486 and 0.3592 min, respectively. As a result, the proposed model can effectively predict the oxygen consumption volume and oxygen blowing time in the converter.
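The final step and the hit-ratio metric can be sketched in a few lines. This is a minimal illustration, not the authors' model: it assumes the oxygen supply is given as a volumetric flow rate in m^3/min (the abstract does not state the unit of supply intensity), and all numbers below are hypothetical.

```python
def blowing_time_min(oxygen_volume_m3, supply_rate_m3_per_min):
    """Blowing time as volume divided by an assumed volumetric flow rate."""
    if supply_rate_m3_per_min <= 0:
        raise ValueError("supply rate must be positive")
    return oxygen_volume_m3 / supply_rate_m3_per_min


def hit_ratio(predicted, actual, tolerance):
    """Fraction of predictions whose absolute error is within the tolerance."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return hits / len(actual)


if __name__ == "__main__":
    # Hypothetical heats: predicted vs. measured oxygen volume in m^3.
    predicted_volume = [6000.0, 6400.0, 5900.0]
    measured_volume = [6100.0, 6000.0, 5950.0]
    print(f"volume hit ratio (+/-300 m^3): "
          f"{hit_ratio(predicted_volume, measured_volume, 300.0):.2%}")
    print(f"blowing time: {blowing_time_min(6000.0, 500.0):.1f} min")
```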
We design a new hybrid quantum-classical convolutional neural network (HQCCNN) model based on parameterized quantum circuits (PQCs). In this model, we use PQCs to redesign the convolutional layer in classical convolutional neural networks, forming a new quantum convolutional layer that achieves unitary transformation of quantum states and enables the model to extract hidden information from images more accurately. At the same time, we combine the classical fully connected layer with PQCs to form a new hybrid quantum-classical fully connected layer to further improve classification accuracy. We use the MNIST dataset to test the potential of the HQCCNN. The results indicate that the HQCCNN performs well on classification problems. In binary classification tasks, the classification accuracy for digits 5 and 7 is as high as 99.71%. In multi-class classification, the accuracy also reaches 98.51%. Finally, we compare the performance of the HQCCNN with other models and find that the HQCCNN has better classification performance and convergence speed.
Rockburst is a phenomenon in which free surfaces are formed during excavation, which subsequently causes the sudden release of energy in the construction of mines and tunnels. Light rockburst only peels off rock slices without ejection, while severe rockburst causes casualties and property loss. The frequency and degree of rockburst damage increase with the excavation depth. Moreover, rockburst is the leading engineering geological hazard in the excavation process, so the prediction of its intensity grade is of great significance to the development of geotechnical engineering and is a problem that needs to be solved urgently. By comprehensively considering the occurrence mechanism of rockburst, this paper selects the stress index (σ_θ/σ_c), brittleness index (σ_c/σ_t), and rock elastic energy index (W_et) as the rockburst evaluation indexes through the Spearman coefficient method. This overcomes the low accuracy of prediction methods based on a single evaluation index. Following this, the BGD-MSR-DNN rockburst intensity grade prediction model, based on batch gradient descent and a multi-scale residual deep neural network, is proposed. The batch gradient descent (BGD) module is used to replace the gradient descent algorithm, which effectively improves the efficiency of the network and reduces the model training time. Moreover, the multi-scale residual (MSR) module solves the problem of network degradation when the deep neural network (DNN) has too many hidden layers, thus improving the model prediction accuracy. The experimental results show that the BGD-MSR-DNN model reaches an accuracy of 97.1%, outperforming other comparable models. Finally, when applied to actual projects such as the Qinling Tunnel and the Daxiangling Tunnel, the model reached an accuracy of 100%. The model can be applied in mine and tunnel engineering to realize accurate and rapid prediction of rockburst intensity grade.
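The Spearman coefficient used for index selection is the Pearson correlation computed on ranks. The sketch below shows that computation in plain Python with average ranks for ties; how the paper turns the coefficients into a selection rule is not stated in the abstract, and the sample values are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical samples: a candidate index vs. an intensity grade (1 to 4).
    stress_index = [0.21, 0.35, 0.48, 0.62, 0.80, 0.55]
    grade = [1, 2, 2, 3, 4, 3]
    print(f"Spearman rho = {spearman(stress_index, grade):.3f}")
```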
With a limited number of labeled samples, hyperspectral image (HSI) classification is a difficult problem in current research. The graph neural network (GNN) has emerged as an approach to semi-supervised classification, and the application of GNNs to hyperspectral images has attracted much attention. However, existing GNN-based methods mainly use a single graph neural network or graph filter to extract HSI features, which does not take full advantage of the variety of graph neural networks (graph filters). Moreover, traditional GNNs suffer from oversmoothing. To alleviate these shortcomings, we introduce a deep hybrid multi-graph neural network (DHMG), in which two different graph filters, i.e., the spectral filter and the autoregressive moving average (ARMA) filter, are used in two branches. The former extracts the spectral features of the nodes well, and the latter suppresses graph noise effectively. The network realizes information interaction between the two branches and takes good advantage of the different graph filters. In addition, to address the problem of oversmoothing, a dense network is proposed in which the local graph features are preserved. The dense structure satisfies the needs of different classification targets that present different features. Finally, we introduce a GraphSAGE-based network to refine the graph features produced by the deep hybrid network. Extensive experiments on three public HSI datasets demonstrate that the DHMG substantially outperforms state-of-the-art models.
Many scholars have focused on applying machine learning models to bottom hole pressure (BHP) prediction. However, the complex and uncertain conditions in deep wells make it difficult to capture the spatial and temporal correlations of measurement while drilling (MWD) data with traditional intelligent models. In this work, we develop a novel hybrid neural network that integrates the convolutional neural network (CNN) and the gated recurrent unit (GRU) to predict BHP fluctuations more accurately. The CNN structure is used to analyze spatial local dependency patterns, and the GRU structure is used to discover depth variation trends in MWD data. To further improve the prediction accuracy, we explore two types of GRU-based structure, skip-GRU and attention-GRU, which can capture more long-term potential periodic correlations in drilling data. Then, the different model structures tuned by the Bayesian optimization (BO) algorithm are compared and analyzed. Results indicate that the hybrid models can extract spatial-temporal information from the data effectively and predict more accurately than random forests, extreme gradient boosting, a back propagation neural network, CNN, and GRU. The CNN-attention-GRU model with the BO algorithm shows the best prediction accuracy and robustness owing to the hybrid network structure and attention mechanism, with the lowest mean absolute percentage error of 0.025%. This study provides a reference for extracting spatial and temporal characteristics and guidance for managed pressure drilling in complex formations.
We propose new hybrid Lagrange neural networks, called LaNets, to predict the numerical solutions of partial differential equations. That is, we embed Lagrange interpolation and small sample learning into deep neural network frameworks. Concretely, we first perform Lagrange interpolation ahead of the deep feedforward neural network. The Lagrange basis function has a neat structure and strong expressive ability, which makes it suitable as a preprocessing tool for pre-fitting and feature extraction. Second, we introduce small sample learning into training, which helps guide the model to be corrected quickly. Taking advantage of the theoretical support of traditional numerical methods and the efficient allocation of modern machine learning, LaNets achieve higher predictive accuracy than state-of-the-art methods. The stability and accuracy of the proposed algorithm are demonstrated through a series of classical numerical examples, including the one-dimensional Burgers equation, one-dimensional carburizing diffusion equations, the two-dimensional Helmholtz equation, and the two-dimensional Burgers equation. Experimental results validate the robustness, effectiveness, and flexibility of the proposed algorithm.
Memristor-based neuromorphic computing shows great potential for high-speed and high-throughput signal processing applications, such as electroencephalogram (EEG) signal processing. Nonetheless, the size of one-transistor one-resistor (1T1R) memristor arrays is limited by the non-ideality of the devices, which prevents the hardware implementation of large and complex networks. In this work, we propose the depthwise separable convolution and bidirectional gated recurrent unit (DSC-BiGRU) network, a lightweight and highly robust hybrid neural network based on 1T1R arrays that enables efficient processing of EEG signals in the temporal, frequency, and spatial domains by hybridizing DSC and BiGRU blocks. The network size is reduced and the network robustness is improved while the classification accuracy is maintained. In the simulation, the measured non-idealities of the 1T1R array are brought into the network through statistical analysis. Compared with traditional convolutional networks, the network parameters are reduced by 95% and the network classification accuracy is improved by 21% at a 95% array yield rate and 5% tolerable error. This work demonstrates that lightweight and highly robust networks based on memristor arrays hold great promise for applications that rely on low consumption and high efficiency.
Background The use of remote photoplethysmography (rPPG) to estimate blood volume pulse in a noncontact manner has been an active research topic in recent years. Existing methods are primarily based on a single-scale region of interest (ROI). However, some noise signals that are not easily separated in a single-scale space can be easily separated in a multi-scale space. Also, existing spatiotemporal networks mainly focus on local spatiotemporal information and do not emphasize temporal information, which is crucial in pulse extraction problems, resulting in insufficient spatiotemporal feature modelling. Methods Here, we propose a multi-scale facial video pulse extraction network based on separable spatiotemporal convolution (SSTC) and dimension separable attention (DSAT). First, to solve the problem of a single-scale ROI, we constructed a multi-scale feature space for initial signal separation. Second, SSTC and DSAT were designed for efficient spatiotemporal correlation modelling, which increased the information interaction between the long-span time and space dimensions and placed more emphasis on temporal features. Results The signal-to-noise ratio (SNR) of the proposed network reached 9.58 dB on the PURE dataset and 6.77 dB on the UBFC-rPPG dataset, outperforming state-of-the-art algorithms. Conclusions The results showed that fusing multi-scale signals yielded better results than methods based on only single-scale signals. The proposed SSTC and DSAT mechanisms will contribute to more accurate pulse signal extraction.
Whole-brain functional connectivity (FC) patterns obtained from resting-state functional magnetic resonance imaging (rs-fMRI) have been widely used in the diagnosis of brain disorders such as autism spectrum disorder (ASD). Recently, an increasing number of studies have focused on employing deep learning techniques to analyze FC patterns for brain disease classification. However, the high dimensionality of the FC features and the interpretation of deep learning results are issues that need to be addressed in FC-based brain disease classification. In this paper, we propose a multi-scale attention-based deep neural network (MSA-DNN) model to classify FC patterns for ASD diagnosis. The model is implemented by adding a flexible multi-scale attention (MSA) module to the auto-encoder-based backbone DNN, which can extract multi-scale features of the FC patterns and change the level of attention for different FCs through continuous learning. Our model reinforces the weights of important FC features while suppressing the unimportant FCs, which ensures the sparsity of the model weights and enhances model interpretability. We performed systematic experiments on a large multi-site ASD dataset with both ten-fold and leave-one-site-out cross-validation. Results showed that our model outperformed classical methods in brain disease classification and achieved robust inter-site prediction performance. We also localized important FC features and brain regions associated with ASD classification. Overall, our study further promotes biomarker detection and computer-aided classification for ASD diagnosis, and the proposed MSA module is flexible and easy to implement in other classification networks.
Pedestrian attribute classification from a pedestrian image captured in surveillance scenarios is challenging due to diverse clothing appearances, varied poses, and different camera views. A multi-scale and multi-label convolutional neural network (MSMLCNN) is proposed to predict multiple pedestrian attributes simultaneously. The pedestrian attribute classification problem is first transformed into a multi-label problem comprising multiple binary attributes that need to be classified. Then, the multi-label problem is solved by fully connecting all binary attributes to multi-scale features with logistic regression functions. The multi-scale features are obtained by concatenating the feature maps produced by multiple pooling layers of the MSMLCNN at different scales. Extensive experimental results show that the proposed MSMLCNN outperforms state-of-the-art pedestrian attribute classification methods by a large margin.
The tradeoff between efficiency and model size of the convolutional neural network (CNN) is an essential issue for applying CNN-based algorithms to diverse real-world tasks. Although deep learning-based methods have achieved significant improvements in image super-resolution (SR), current CNN-based techniques mostly involve massive numbers of parameters and high computational complexity, limiting their practical applications. In this paper, we present a fast and lightweight framework, named weighted multi-scale residual network (WMRN), for a better tradeoff between SR performance and computational efficiency. With the modified residual structure, depthwise separable convolutions (DS Convs) are employed to improve the efficiency of convolutional operations. Furthermore, several weighted multi-scale residual blocks (WMRBs) are stacked to enhance the multi-scale representation capability. In the reconstruction subnetwork, a group of Conv layers is introduced to filter feature maps and reconstruct the final high-quality image. Extensive experiments were conducted to evaluate the proposed model, and the comparative results against several state-of-the-art algorithms demonstrate the effectiveness of WMRN.
Recently, speech enhancement methods based on Generative Adversarial Networks have achieved good performance on time-domain noisy signals. However, the training of Generative Adversarial Networks suffers from problems such as convergence difficulty and mode collapse. In this work, an end-to-end speech enhancement model based on Wasserstein Generative Adversarial Networks is proposed, with several improvements made to obtain faster convergence and better generated speech quality. Specifically, in the encoding part of the generator, each convolution layer adopts different convolution kernel sizes to obtain speech coding information at multiple scales; a gated linear unit is introduced to alleviate the vanishing gradient problem as network depth increases; the gradient penalty of the discriminator is replaced with spectral normalization to accelerate the convergence of the model; and a hybrid penalty term composed of L1 regularization and a scale-invariant signal-to-distortion ratio is introduced into the loss function of the generator to improve the quality of the generated speech. The experimental results on both the TIMIT corpus and a Tibetan corpus show that the proposed model improves speech quality significantly and accelerates the convergence of the model.
Due to the lack of long-range association and spatial location information, existing deep learning-based methods cannot always obtain fine details and accurate boundaries in complex clothing images. This paper presents a convolutional structure with multi-scale fusion to optimize the clothing feature extraction step and a self-attention module to capture long-range association information. The structure enables the self-attention mechanism to participate directly in the process of information exchange through the down-scaling projection operation of the multi-scale framework. In addition, the improved self-attention module introduces the extraction of 2-dimensional relative position information to compensate for its limited ability to extract spatial position features from clothing images. The experimental results on the colorful fashion parsing dataset (CFPD) show that the proposed network structure achieves 53.68% mean intersection over union (mIoU) and performs better on the clothing parsing task.
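For readers unfamiliar with the mIoU metric reported above, a minimal sketch follows. It is a generic definition on flattened label maps, not the evaluation code of the paper; classes absent from both prediction and ground truth are skipped here, which is one common convention among several, and the labels are hypothetical.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union across classes present in pred or target."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes that never occur (assumed convention)
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class occurs in pred or target")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    # Hypothetical flattened label maps with three classes.
    predicted = [0, 0, 1, 1, 2, 2]
    ground_truth = [0, 1, 1, 1, 2, 0]
    print(f"mIoU = {mean_iou(predicted, ground_truth, 3):.4f}")
```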
As an integrated application of modern information technologies and artificial intelligence, Prognostics and Health Management (PHM) is important for machine health monitoring. Prediction of tool wear is one of the representative applications of PHM technology in modern manufacturing systems and industry. In this paper, a multi-scale Convolutional Gated Recurrent Unit network (MCGRU) is proposed to process raw sensory data for tool wear prediction. At the bottom of the MCGRU, six parallel and independent branches with different kernel sizes are designed to form a multi-scale convolutional neural network, which improves adaptability to features at different time scales. These features of different scales extracted from raw data are then fed into a Deep Gated Recurrent Unit network to capture long-term dependencies and learn significant representations. At the top of the MCGRU, a fully connected layer and a regression layer are built for cutting tool wear prediction. Two case studies are performed to verify the capability and effectiveness of the proposed MCGRU network, and the results show that MCGRU outperforms several state-of-the-art baseline models.
A hybrid neural network model was established in which the RH process (theoretical) model is combined with a neural network (NN) and case-based reasoning (CBR). The CBR method was used to select the operation mode and the RH operational guide parameters for different steel grades according to the initial conditions of the molten steel, and a three-layer BP neural network was adopted to handle nonlinear factors, compensating for the limitations of the technological model in RH process control and end-point prediction. It was verified that the hybrid neural network is effective in improving the precision and calculation efficiency of the model.
This paper presents a novel adaptive scheme for energy management in stand-alone hybrid power systems. The proposed management system uses artificial neural network (ANN) and fuzzy logic controllers to manage the power flow between the hybrid power system and the energy storage elements in order to satisfy the load requirements. The neural network controller is employed to achieve the maximum power point (MPP) for different types of photovoltaic (PV) panels. The advanced fuzzy logic controller is developed to distribute the power among the hybrid system and to manage the charge and discharge current flow for performance optimization. The performance of the developed management system was assessed using a hybrid system comprising PV panels, a wind turbine (WT), battery storage, and a proton exchange membrane fuel cell (PEMFC). To improve the generating performance of the PEMFC and prolong its life, the stack temperature is controlled by a fuzzy logic controller. The dynamic behavior of the proposed model is examined under different operating conditions. Real-time measured parameters are used as inputs for the developed system. The proposed model and its control strategy offer a suitable tool for optimizing hybrid power system performance, such as that used in smart-house applications.
El Niño-Southern Oscillation (ENSO) can currently be predicted reasonably well six months or longer in advance, but large biases and uncertainties remain in its real-time prediction. Various approaches have been taken to improve the understanding of ENSO processes, and different models for ENSO prediction have been developed, including linear statistical models based on principal oscillation pattern (POP) analyses, convolutional neural networks (CNNs), and so on. Here, we develop a novel hybrid model, named POP-Net, by combining the POP analysis procedure with a CNN-long short-term memory (LSTM) algorithm to predict the Niño-3.4 sea surface temperature (SST) index. ENSO predictions from three corresponding models, the POP model, the CNN-LSTM model, and POP-Net, are compared with each other. The POP-based pre-processing enhances the ENSO-related signals of interest while filtering out unrelated noise. Consequently, POP-Net achieves improved prediction relative to the others. POP-Net shows high correlation skill for 17-month lead time prediction (correlation coefficients exceeding 0.5) during the 1994-2020 validation period. POP-Net also alleviates the spring predictability barrier (SPB). It is concluded that value-added artificial neural networks for improved ENSO predictions are possible by including process-oriented analyses to enhance signal representations.
To solve the problem of trajectory tracking for a class of novel serial-parallel hybrid humanoid arm (HHA), which is subject to parameter uncertainty, friction, disturbance, abrasion, and pulse forces derived from the motors, a multistep dynamics modeling strategy is proposed and a robust controller based on a neural network (NN)-adaptive algorithm is designed. In the first step of dynamics modeling, the dynamics model of the reduced HHA is established by the Lagrange method. In the second step, the uncertain parameter part, which results mainly from the idealization of the HHA, is learned by an adaptive algorithm. In the trajectory tracking controller, a radial basis function (RBF) NN, whose optimal weights are learned online by the adaptive algorithm, is used to learn the upper bound function of the total uncertainties, including friction, disturbances, abrasion, and pulse forces. The conservatism of this robust trajectory tracking controller is reduced to a great extent, and with this controller the HHA can imitate most human actions. The proof and simulation results confirm the validity of the adaptive strategy for parameter learning and of the neural network-adaptive strategy for trajectory tracking control.
Funding: The Qian Xuesen Youth Innovation Foundation from China Aerospace Science and Technology Corporation (Grant Number 2022JY51).
Abstract: The demand for adopting neural networks in resource-constrained embedded devices is continuously increasing. Quantization is one of the most promising solutions to reduce computational cost and memory storage on embedded devices. In order to reduce the complexity and overhead of deploying neural networks on integer-only hardware, most current quantization methods use a symmetric quantization mapping strategy to quantize a floating-point neural network into an integer network. However, although symmetric quantization has the advantage of easier implementation, it is sub-optimal for cases where the range could be skewed and not symmetric. This often comes at the cost of lower accuracy. This paper proposes an activation redistribution-based hybrid asymmetric quantization method for neural networks. The proposed method takes data distribution into consideration and can resolve the contradiction between the quantization accuracy and the ease of implementation, balance the trade-off between clipping range and quantization resolution, and thus improve the accuracy of the quantized neural network. The experimental results indicate that the accuracy of the proposed method is 2.02% and 5.52% higher than the traditional symmetric quantization method for classification and detection tasks, respectively. The proposed method paves the way for computationally intensive neural network models to be deployed on devices with limited computing resources. Codes will be available on https://github.com/ycjcy/Hybrid-Asymmetric-Quantization.
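The claim that symmetric quantization is sub-optimal for skewed ranges can be illustrated with a small sketch. It uses the generic textbook 8-bit symmetric and asymmetric (zero-point) mappings, not the activation-redistribution hybrid method proposed in the paper; the function names and the sample data (non-negative activations in [0, 6]) are assumptions made for illustration.

```python
def quantize_symmetric(values, num_bits=8):
    """Quantize to a signed grid centred on zero, then dequantize."""
    qmax = 2 ** (num_bits - 1) - 1                 # 127 for 8 bits
    scale = max(abs(v) for v in values) / qmax     # assumes a non-zero range
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return [qi * scale for qi in q]


def quantize_asymmetric(values, num_bits=8):
    """Quantize to an unsigned grid with a zero point, then dequantize."""
    qmax = 2 ** num_bits - 1                       # 255 for 8 bits
    lo = min(min(values), 0.0)                     # keep zero representable
    hi = max(max(values), 0.0)
    scale = (hi - lo) / qmax                       # assumes a non-zero range
    zero_point = round(-lo / scale)
    q = [max(0, min(qmax, round(v / scale) + zero_point)) for v in values]
    return [(qi - zero_point) * scale for qi in q]


def mean_abs_error(original, restored):
    return sum(abs(a - b) for a, b in zip(original, restored)) / len(original)


if __name__ == "__main__":
    # Hypothetical skewed data: non-negative activations between 0 and 6.
    activations = [6.0 * i / 1000 for i in range(1001)]
    err_sym = mean_abs_error(activations, quantize_symmetric(activations))
    err_asym = mean_abs_error(activations, quantize_asymmetric(activations))
    print("symmetric error:", err_sym)
    print("asymmetric error:", err_asym)
```

With one-sided data, the symmetric grid spends half of its levels on negative values that never occur, so its step size is roughly twice that of the asymmetric grid.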
Funding: Financially supported by the National Natural Science Foundation of China (Nos. 51974023 and 52374321) and the funding of State Key Laboratory of Advanced Metallurgy, University of Science and Technology Beijing, China (No. 41620007).
Abstract: The amount of oxygen blown into the converter is one of the key parameters for the control of the converter blowing process, which directly affects the tap-to-tap time of the converter. In this study, a hybrid model based on the oxygen balance mechanism (OBM) and a deep neural network (DNN) was established for predicting oxygen blowing time in the converter. A three-step method was utilized in the hybrid model. First, the oxygen consumption volume was predicted by the OBM model and the DNN model, respectively. Second, a more accurate oxygen consumption volume was obtained by integrating the OBM model and the DNN model. Finally, the converter oxygen blowing time was calculated according to the oxygen consumption volume and the oxygen supply intensity of each heat. The proposed hybrid model was verified using actual data collected from an integrated steel plant in China, and compared with a multiple linear regression model, the OBM model, and neural network models including extreme learning machine, back propagation neural network, and DNN. The test results indicate that the hybrid model with a network structure of 3 hidden layers, 32-16-8 neurons across the hidden layers, and a learning rate of 0.1 has the best prediction accuracy and stronger generalization ability compared with other models. The predicted hit ratio of oxygen consumption volume within the error ±300 m^3 is 96.67%; the determination coefficient (R^2) and root mean square error (RMSE) are 0.6984 and 150.03 m^3, respectively. The oxygen blowing time prediction hit ratio within the error ±0.6 min is 89.50%; R^2 and RMSE are 0.9486 and 0.3592 min, respectively. As a result, the proposed model can effectively predict the oxygen consumption volume and oxygen blowing time in the converter.
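The two evaluation measures quoted above, the hit ratio within an error band and the RMSE, can be sketched as follows. The definitions are the usual ones (share of predictions whose absolute error is within the tolerance, and root of the mean squared error); the blowing-time values are made up for illustration and are not data from the study.

```python
import math


def hit_ratio(actual, predicted, tolerance):
    """Fraction of predictions whose absolute error is within the tolerance."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tolerance)
    return hits / len(actual)


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


if __name__ == "__main__":
    # Hypothetical oxygen blowing times in minutes for four heats.
    actual = [14.2, 15.0, 13.8, 16.1]
    predicted = [14.5, 15.9, 13.6, 16.0]
    print("hit ratio within 0.6 min:", hit_ratio(actual, predicted, 0.6))
    print("RMSE (min):", round(rmse(actual, predicted), 4))
```

The hit ratio depends on the chosen tolerance, so two models should only be compared at the same error band.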
Funding: Project supported by the Natural Science Foundation of Shandong Province, China (Grant No. ZR2021MF049) and the Joint Fund of Natural Science Foundation of Shandong Province (Grant Nos. ZR2022LLZ012 and ZR2021LLZ001).
Abstract: We design a new hybrid quantum-classical convolutional neural network (HQCCNN) model based on parameterized quantum circuits. In this model, we use parameterized quantum circuits (PQCs) to redesign the convolutional layer in classical convolutional neural networks, forming a new quantum convolutional layer to achieve unitary transformation of quantum states, enabling the model to more accurately extract hidden information from images. At the same time, we combine the classical fully connected layer with PQCs to form a new hybrid quantum-classical fully connected layer to further improve the accuracy of classification. Finally, we use the MNIST dataset to test the potential of the HQCCNN. The results indicate that the HQCCNN has good performance in solving classification problems. In binary classification tasks, the classification accuracy for the digits 5 and 7 is as high as 99.71%. In multi-class classification, the accuracy rate also reaches 98.51%. Finally, we compare the performance of the HQCCNN with other models and find that the HQCCNN has better classification performance and convergence speed.
Funding: Funded by the State Key Laboratory for GeoMechanics and Deep Underground Engineering & Institute for Deep Underground Science and Engineering (Grant Number XD2021021) and the BUCEA Post Graduate Innovation Project (Grant Number PG2023092).
Abstract: Rockburst is a phenomenon in which free surfaces are formed during excavation, which subsequently causes the sudden release of energy in the construction of mines and tunnels. Light rockburst only peels off rock slices without ejection, while severe rockburst causes casualties and property loss. The frequency and degree of rockburst damage increase with the excavation depth. Moreover, rockburst is the leading engineering geological hazard in the excavation process, and thus the prediction of its intensity grade is of great significance to the development of geotechnical engineering. Therefore, the prediction of rockburst intensity grade is a problem that needs to be solved urgently. By comprehensively considering the occurrence mechanism of rockburst, this paper selects the stress index (σ_θ/σ_c), brittleness index (σ_c/σ_t), and rock elastic energy index (W_et) as the rockburst evaluation indexes through the Spearman coefficient method. This overcomes the low accuracy of prediction methods based on a single evaluation index. Following this, the BGD-MSR-DNN rockburst intensity grade prediction model based on batch gradient descent and a multi-scale residual deep neural network is proposed. The batch gradient descent (BGD) module is used to replace the gradient descent algorithm, which effectively improves the efficiency of the network and reduces the model training time. Moreover, the multi-scale residual (MSR) module solves the problem of network degradation when there are too many hidden layers in the deep neural network (DNN), thus improving the model prediction accuracy. The experimental results show that the BGD-MSR-DNN model reaches an accuracy of 97.1%, outperforming other comparable models. Finally, when applied to actual projects such as the Qinling Tunnel and the Daxiangling Tunnel, the model reached an accuracy of 100%. The model can be applied in mine and tunnel engineering to realize the accurate and rapid prediction of rockburst intensity grade.
Abstract: With a limited number of labeled samples, hyperspectral image (HSI) classification is a difficult problem in current research. The graph neural network (GNN) has emerged as an approach to semi-supervised classification, and the application of GNNs to hyperspectral images has attracted much attention. However, the existing GNN-based methods mainly use a single graph neural network or graph filter to extract HSI features, which does not take full advantage of the various graph neural networks (graph filters). Moreover, traditional GNNs suffer from oversmoothing. To alleviate these shortcomings, we introduce a deep hybrid multi-graph neural network (DHMG), where two different graph filters, i.e., the spectral filter and the autoregressive moving average (ARMA) filter, are utilized in two branches. The former extracts the spectral features of the nodes well, and the latter has a good suppression effect on graph noise. The network realizes information interaction between the two branches and takes good advantage of the different graph filters. In addition, to address the problem of oversmoothing, a dense network is proposed, where the local graph features are preserved. The dense structure satisfies the needs of different classification targets presenting different features. Finally, we introduce a GraphSAGE-based network to refine the graph features produced by the deep hybrid network. Extensive experiments on three public HSI datasets demonstrate that the DHMG substantially outperforms the state-of-the-art models.
Funding: The authors express their appreciation to the National Key Research and Development Project "Key Scientific Issues of Revolutionary Technology" (2019YFA0708300), the Strategic Cooperation Technology Projects of CNPC and CUPB (ZLZX2020-03), the Distinguished Young Foundation of National Natural Science Foundation of China (52125401), and the Science Foundation of China University of Petroleum, Beijing (2462022SZBH002).
Abstract: Many scholars have focused on applying machine learning models in bottom hole pressure (BHP) prediction. However, the complex and uncertain conditions in deep wells make it difficult to capture spatial and temporal correlations of measurement while drilling (MWD) data with traditional intelligent models. In this work, we develop a novel hybrid neural network, which integrates the Convolutional Neural Network (CNN) and the Gated Recurrent Unit (GRU) for predicting BHP fluctuations more accurately. The CNN structure is used to analyze spatial local dependency patterns and the GRU structure is used to discover depth variation trends of MWD data. To further improve the prediction accuracy, we explore two types of GRU-based structure, skip-GRU and attention-GRU, which can capture more long-term potential periodic correlation in drilling data. Then, the different model structures tuned by the Bayesian optimization (BO) algorithm are compared and analyzed. Results indicate that the hybrid models can extract spatial-temporal information of data effectively and predict more accurately than random forests, extreme gradient boosting, back propagation neural network, CNN and GRU. The CNN-attention-GRU model with the BO algorithm shows the best prediction accuracy and robustness due to the hybrid network structure and attention mechanism, having the lowest mean absolute percentage error of 0.025%. This study provides a reference for solving the problem of extracting spatial and temporal characteristics and guidance for managed pressure drilling in complex formations.
Funding: Supported by NSFC (No. 11971296) and the National Key Research and Development Program of China (No. 2021YFA1003004).
Abstract: We propose new hybrid Lagrange neural networks called LaNets to predict the numerical solutions of partial differential equations. That is, we embed Lagrange interpolation and small sample learning into deep neural network frameworks. Concretely, we first perform Lagrange interpolation in front of the deep feedforward neural network. The Lagrange basis function has a neat structure and a strong expression ability, which makes it suitable as a preprocessing tool for pre-fitting and feature extraction. Second, we introduce small sample learning into training, which helps guide the model to be corrected quickly. Taking advantage of the theoretical support of traditional numerical methods and the efficient allocation of modern machine learning, LaNets achieve higher predictive accuracy compared to state-of-the-art work. The stability and accuracy of the proposed algorithm are demonstrated through a series of classical numerical examples, including the one-dimensional Burgers equation, one-dimensional carburizing diffusion equations, the two-dimensional Helmholtz equation and the two-dimensional Burgers equation. Experimental results validate the robustness, effectiveness and flexibility of the proposed algorithm.
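As background for the Lagrange basis mentioned above, here is a minimal sketch of classical Lagrange interpolation in plain Python. It shows only the textbook basis functions; how LaNets place the interpolation in front of the feedforward network is not specified in the abstract and is not reproduced. The nodes and sample function are hypothetical.

```python
def lagrange_basis(nodes, j, x):
    """Evaluate the j-th Lagrange basis polynomial for the given nodes at x."""
    value = 1.0
    for m, x_m in enumerate(nodes):
        if m != j:
            value *= (x - x_m) / (nodes[j] - x_m)
    return value


def lagrange_interpolate(nodes, values, x):
    """Evaluate the interpolating polynomial through (nodes, values) at x."""
    return sum(values[j] * lagrange_basis(nodes, j, x)
               for j in range(len(nodes)))


if __name__ == "__main__":
    # Hypothetical samples of f(x) = x^2 + 1 at three distinct nodes.
    nodes = [0.0, 1.0, 2.0]
    values = [x * x + 1.0 for x in nodes]
    # Three nodes reproduce a quadratic exactly, so this recovers f(1.5).
    print(lagrange_interpolate(nodes, values, 1.5))
```

Each basis polynomial equals 1 at its own node and 0 at every other node, which is what makes the weighted sum pass through all sample points.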
Funding: Project supported by the National Key Research and Development Program of China (Grant No. 2019YFB2205102) and the National Natural Science Foundation of China (Grant Nos. 61974164, 62074166, 61804181, 62004219, 62004220, and 62104256).
Abstract: Memristor-based neuromorphic computing shows great potential for high-speed and high-throughput signal processing applications, such as electroencephalogram (EEG) signal processing. Nonetheless, the size of one-transistor one-resistor (1T1R) memristor arrays is limited by the non-ideality of the devices, which prevents the hardware implementation of large and complex networks. In this work, we propose the depthwise separable convolution and bidirectional gated recurrent unit (DSC-BiGRU) network, a lightweight and highly robust hybrid neural network based on 1T1R arrays that enables efficient processing of EEG signals in the temporal, frequency and spatial domains by hybridizing DSC and BiGRU blocks. The network size is reduced and the network robustness is improved while the network classification accuracy is maintained. In the simulation, the measured non-idealities of the 1T1R array are brought into the network through statistical analysis. Compared with traditional convolutional networks, the network parameters are reduced by 95% and the network classification accuracy is improved by 21% at a 95% array yield rate and 5% tolerable error. This work demonstrates that lightweight and highly robust networks based on memristor arrays hold great promise for applications that rely on low consumption and high efficiency.
Funding: Supported by the National Natural Science Foundation of China (61903336, 61976190) and the Natural Science Foundation of Zhejiang Province (LY21F030015).
Abstract: Background: The use of remote photoplethysmography (rPPG) to estimate blood volume pulse in a noncontact manner has been an active research topic in recent years. Existing methods are primarily based on a single-scale region of interest (ROI). However, some noise signals that are not easily separated in a single-scale space can be easily separated in a multi-scale space. Also, existing spatiotemporal networks mainly focus on local spatiotemporal information and do not emphasize temporal information, which is crucial in pulse extraction problems, resulting in insufficient spatiotemporal feature modelling. Methods: Here, we propose a multi-scale facial video pulse extraction network based on separable spatiotemporal convolution (SSTC) and dimension separable attention (DSAT). First, to solve the problem of a single-scale ROI, we constructed a multi-scale feature space for initial signal separation. Second, SSTC and DSAT were designed for efficient spatiotemporal correlation modeling, which increased the information interaction between the long-span time and space dimensions; this placed more emphasis on temporal features. Results: The signal-to-noise ratio (SNR) of the proposed network reached 9.58 dB on the PURE dataset and 6.77 dB on the UBFC-rPPG dataset, outperforming state-of-the-art algorithms. Conclusions: The results showed that fusing multi-scale signals yielded better results than methods based on only single-scale signals. The proposed SSTC and dimension-separable attention mechanism will contribute to more accurate pulse signal extraction.
Funding: This work was supported by the National Natural Science Foundation of China (No. 61906006).
Abstract: Whole brain functional connectivity (FC) patterns obtained from resting-state functional magnetic resonance imaging (rs-fMRI) have been widely used in the diagnosis of brain disorders such as autism spectrum disorder (ASD). Recently, an increasing number of studies have focused on employing deep learning techniques to analyze FC patterns for brain disease classification. However, the high dimensionality of the FC features and the interpretation of deep learning results are issues that need to be addressed in FC-based brain disease classification. In this paper, we propose a multi-scale attention-based deep neural network (MSA-DNN) model to classify FC patterns for ASD diagnosis. The model was implemented by adding a flexible multi-scale attention (MSA) module to the auto-encoder based backbone DNN, which can extract multi-scale features of the FC patterns and change the level of attention for different FCs by continuous learning. Our model reinforces the weights of important FC features while suppressing the unimportant FCs to ensure the sparsity of the model weights and enhance the model interpretability. We performed systematic experiments on a large multi-site ASD dataset with both ten-fold and leave-one-site-out cross-validations. Results showed that our model outperformed classical methods in brain disease classification and revealed robust inter-site prediction performance. We also localized important FC features and brain regions associated with ASD classification. Overall, our study further promotes biomarker detection and computer-aided classification for ASD diagnosis, and the proposed MSA module is flexible and easy to implement in other classification networks.
Funding: Supported by the National Natural Science Foundation of China (No. 61602191, 61672521, 61375037, 61473291, 61572501, 61572536, 61502491, 61372107, 61401167), the Natural Science Foundation of Fujian Province (No. 2016J01308), the Scientific and Technology Funds of Quanzhou (No. 2015Z114), the Scientific and Technology Funds of Xiamen (No. 3502Z20173045), the Promotion Program for Young and Middle aged Teacher in Science and Technology Research of Huaqiao University (No. ZQN-PY418, ZQN-YX403), and the Scientific Research Funds of Huaqiao University (No. 16BS108).
Abstract: Pedestrian attribute classification from a pedestrian image captured in surveillance scenarios is challenging due to diverse clothing appearances, varied poses and different camera views. A multi-scale and multi-label convolutional neural network (MSMLCNN) is proposed to predict multiple pedestrian attributes simultaneously. The pedestrian attribute classification problem is first transformed into a multi-label problem comprising multiple binary attributes that need to be classified. Then, the multi-label problem is solved by fully connecting all binary attributes to multi-scale features with logistic regression functions. Moreover, the multi-scale features are obtained by concatenating the feature maps produced by multiple pooling layers of the MSMLCNN at different scales. Extensive experimental results show that the proposed MSMLCNN outperforms state-of-the-art pedestrian attribute classification methods by a large margin.
Funding: The National Natural Science Foundation of China (61772149, 61866009, 61762028, U1701267, 61702169), Guangxi Science and Technology Project (2019GXNSFFA245014, ZY20198016, AD18281079, AD18216004), the Natural Science Foundation of Hunan Province (2020JJ3014), and Guangxi Colleges and Universities Key Laboratory of Intelligent Processing of Computer Images and Graphics (GIIP202001).
Abstract: The tradeoff between efficiency and model size of the convolutional neural network (CNN) is an essential issue for applications of CNN-based algorithms to diverse real-world tasks. Although deep learning-based methods have achieved significant improvements in image super-resolution (SR), current CNN-based techniques mostly involve massive numbers of parameters and high computational complexity, limiting their practical applications. In this paper, we present a fast and lightweight framework, named weighted multi-scale residual network (WMRN), for a better tradeoff between SR performance and computational efficiency. With the modified residual structure, depthwise separable convolutions (DS Convs) are employed to improve the efficiency of convolutional operations. Furthermore, several weighted multi-scale residual blocks (WMRBs) are stacked to enhance the multi-scale representation capability. In the reconstruction subnetwork, a group of Conv layers is introduced to filter feature maps to reconstruct the final high-quality image. Extensive experiments were conducted to evaluate the proposed model, and the comparative results with several state-of-the-art algorithms demonstrate the effectiveness of WMRN.
Funding: Supported by the National Science Foundation under Grant No. 62066039.
Abstract: Recently, speech enhancement methods based on Generative Adversarial Networks have achieved good performance on time-domain noisy signals. However, the training of Generative Adversarial Networks suffers from problems such as convergence difficulty and mode collapse. In this work, an end-to-end speech enhancement model based on Wasserstein Generative Adversarial Networks is proposed, and some improvements have been made in order to obtain faster convergence and better generated speech quality. Specifically, in the generator coding part, each convolution layer adopts different convolution kernel sizes to conduct convolution operations for obtaining speech coding information at multiple scales; a gated linear unit is introduced to alleviate the vanishing gradient problem as network depth increases; the gradient penalty of the discriminator is replaced with spectral normalization to accelerate the convergence rate of the model; a hybrid penalty term composed of L1 regularization and a scale-invariant signal-to-distortion ratio is introduced into the loss function of the generator to improve the quality of generated speech. The experimental results on both the TIMIT corpus and a Tibetan corpus show that the proposed model improves the speech quality significantly and accelerates the convergence speed of the model.
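The scale-invariant signal-to-distortion ratio (SI-SDR) named above can be sketched with its commonly used definition: project the estimate onto the reference, then compare the energy of the projection with the energy of the residual. This is an illustration of the metric only; how the paper weights it against the L1 term in the generator loss is not given in the abstract. The signals below are hypothetical, and zero-mean normalization, which is often applied first, is omitted.

```python
import math


def si_sdr(estimate, reference):
    """Scale-invariant signal-to-distortion ratio in decibels."""
    dot = sum(e * r for e, r in zip(estimate, reference))
    ref_energy = sum(r * r for r in reference)
    alpha = dot / ref_energy                      # optimal scaling of the reference
    target = [alpha * r for r in reference]       # part of the estimate along the reference
    noise = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise)      # assumed non-zero (imperfect estimate)
    return 10.0 * math.log10(target_energy / noise_energy)


if __name__ == "__main__":
    # Hypothetical clean signal and a slightly distorted estimate.
    reference = [1.0, -1.0, 1.0, -1.0]
    estimate = [1.1, -0.9, 0.9, -1.1]
    print("SI-SDR (dB):", si_sdr(estimate, reference))
    # Rescaling the estimate leaves the score unchanged up to rounding.
    louder = [2.0 * e for e in estimate]
    print("SI-SDR after rescaling (dB):", si_sdr(louder, reference))
```

Because the score ignores overall gain, a loss built on it penalizes distortion of the waveform shape rather than a mismatch in loudness.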
Abstract: Due to the lack of long-range association and spatial location information, fine details and accurate boundaries of complex clothing images cannot always be obtained by using the existing deep learning-based methods. This paper presents a convolutional structure with multi-scale fusion to optimize the step of clothing feature extraction and a self-attention module to capture long-range association information. The structure enables the self-attention mechanism to directly participate in the process of information exchange through the down-scaling projection operation of the multi-scale framework. In addition, the improved self-attention module introduces the extraction of 2-dimensional relative position information to make up for its lack of ability to extract spatial position features from clothing images. The experimental results based on the colorful fashion parsing dataset (CFPD) show that the proposed network structure achieves 53.68% mean intersection over union (mIoU) and has better performance on the clothing parsing task.
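For readers unfamiliar with the mIoU figure quoted above, the following minimal sketch computes mean intersection over union on flattened label maps. It assumes one common convention, averaging only over classes that appear in the prediction or the ground truth; the abstract does not say which convention the paper uses, and the tiny label lists are made up.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union for two flattened label maps."""
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:                 # skip classes absent from both maps
            ious.append(intersection / union)
    return sum(ious) / len(ious)


if __name__ == "__main__":
    # Hypothetical six-pixel label maps with three classes.
    target = [0, 0, 1, 1, 2, 2]
    pred = [0, 1, 1, 1, 2, 0]
    # Per-class IoU is 1/3, 2/3 and 1/2 for classes 0, 1 and 2.
    print(mean_iou(pred, target, num_classes=3))
```

Averaging per class rather than per pixel means small garment classes count as much as large background regions, which is why mIoU is usually lower than pixel accuracy.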
Funding: Supported in part by Natural Science Foundation of China (Grant Nos. 51835009, 51705398), Shaanxi Province 2020 Natural Science Basic Research Plan (Grant No. 2020JQ-042), and Aeronautical Science Foundation (Grant No. 2019ZB070001).
Abstract: As an integrated application of modern information technologies and artificial intelligence, Prognostic and Health Management (PHM) is important for machine health monitoring. Prediction of tool wear is one of the representative applications of PHM technology in modern manufacturing systems and industry. In this paper, a multi-scale Convolutional Gated Recurrent Unit network (MCGRU) is proposed to process raw sensory data for tool wear prediction. At the bottom of the MCGRU, six parallel and independent branches with different kernel sizes are designed to form a multi-scale convolutional neural network, which improves adaptability to features at different time scales. These features of different scales extracted from raw data are then fed into a Deep Gated Recurrent Unit network to capture long-term dependencies and learn significant representations. At the top of the MCGRU, a fully connected layer and a regression layer are built for cutting tool wear prediction. Two case studies are performed to verify the capability and effectiveness of the proposed MCGRU network, and the results show that MCGRU outperforms several state-of-the-art baseline models.
Funding: Item sponsored by the National Natural Science Foundation of China (50074026).
Abstract: A hybrid neural network model, in which the RH process (theoretical) model is combined with a neural network (NN) and case-based reasoning (CBR), was established. The CBR method was used to select the operation mode and the RH operational guide parameters for different steel grades according to the initial conditions of the molten steel, and a three-layer BP neural network was adopted to deal with nonlinear factors, compensating for the limitations of the technological model in RH process control and end-point prediction. It was verified that the hybrid neural network is effective for improving the precision and calculation efficiency of the model.
Abstract: This paper presents a novel adaptive scheme for energy management in stand-alone hybrid power systems. The proposed management system is designed to manage the power flow between the hybrid power system and energy storage elements in order to satisfy the load requirements based on artificial neural network (ANN) and fuzzy logic controllers. The neural network controller is employed to achieve the maximum power point (MPP) for different types of photovoltaic (PV) panels. The advanced fuzzy logic controller is developed to distribute the power among the hybrid system and to manage the charge and discharge current flow for performance optimization. The performance of the developed management system was assessed using a hybrid system comprising PV panels, a wind turbine (WT), battery storage, and a proton exchange membrane fuel cell (PEMFC). To improve the generating performance of the PEMFC and prolong its life, the stack temperature is controlled by a fuzzy logic controller. The dynamic behavior of the proposed model is examined under different operating conditions. Real-time measured parameters are used as inputs for the developed system. The proposed model and its control strategy offer a proper tool for optimizing hybrid power system performance, such as that used in smart-house applications.
Funding: Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDA19060102) and the National Natural Science Foundation of China [NSFC; Grant Nos. 41690122 (41690120) and 42030410].
Abstract: The El Niño-Southern Oscillation (ENSO) can currently be predicted reasonably well six months or more in advance, but large biases and uncertainties remain in its real-time prediction. Various approaches have been taken to improve understanding of ENSO processes, and different models for ENSO predictions have been developed, including linear statistical models based on principal oscillation pattern (POP) analyses, convolutional neural networks (CNNs), and so on. Here, we develop a novel hybrid model, named POP-Net, by combining the POP analysis procedure with a CNN-long short-term memory (LSTM) algorithm to predict the Niño-3.4 sea surface temperature (SST) index. ENSO predictions from three corresponding models, the POP model, the CNN-LSTM model, and POP-Net, are compared with each other. The POP-based pre-processing acts to enhance ENSO-related signals of interest while filtering unrelated noise. Consequently, an improved prediction is achieved in the POP-Net relative to the others. The POP-Net shows a high correlation skill for 17-month lead time prediction (correlation coefficients exceeding 0.5) during the 1994-2020 validation period. The POP-Net also alleviates the spring predictability barrier (SPB). It is concluded that value-added artificial neural networks for improved ENSO predictions are possible by including process-oriented analyses to enhance signal representations.
Funding: Supported by the National Basic Research and Development Program of China (973 Program, Grant No. 2006CB705402).
Abstract: To solve the trajectory tracking problem for a class of novel serial-parallel hybrid humanoid arm (HHA), which is subject to parameter uncertainty, friction, disturbance, abrasion and pulse forces derived from motors, a multistep dynamics modeling strategy is proposed and a robust controller based on a neural network (NN)-adaptive algorithm is designed. In the first step of dynamics modeling, the dynamics model of the reduced HHA is established by the Lagrange method. In the second step of dynamics modeling, the uncertain parameter part resulting mainly from the idealization of the HHA is learned by an adaptive algorithm. In the trajectory tracking controller, the radial basis function (RBF) NN, whose optimal weights are learned online by an adaptive algorithm, is used to learn the upper limit function of the total uncertainties including friction, disturbances, abrasion and pulse forces. The conservatism of this robust trajectory tracking controller is greatly reduced, and with this controller the HHA can imitate most human actions. The proof and simulation results verify the validity of the adaptive strategy for parameter learning and the neural network-adaptive strategy for trajectory tracking control.