A novel multicomponent high-Cr CoNi-based superalloy with superior comprehensive performance was prepared, and its high-temperature microstructural stability, oxidation resistance, and mechanical properties were evaluated, mainly using the cast polycrystalline alloy. The results showed that the morphology of the γ′ phase remained stable and that the coarsening rate was slow during long-term aging at 900–1000 ℃. The activation energy for γ′ precipitate coarsening of alloy 9CoNi-Cr was (402 ± 51) kJ/mol, which is higher than those of CMSX-4 and some other Ni-based and Co-based superalloys. Importantly, there was no indication of the formation of topologically close-packed phases during this process. All these factors demonstrate the superior microstructural stability of the alloy. The mass gain of alloy 9CoNi-Cr was 0.6 mg/cm² after oxidation at 1000 ℃ for 100 h, and its oxidation resistance was comparable to that of the advanced Ni-based superalloy CMSX-4, which can be attributed to the formation of a continuous protective Al₂O₃ layer. Moreover, the compressive yield strength of this cast polycrystalline alloy at high temperatures is clearly higher than that of the conventional Ni-based cast superalloy, and its compressive minimum creep rate at 950 ℃ is comparable to that of the conventional Ni-based cast superalloy, demonstrating the alloy's good mechanical properties at high temperature. This is partially because a high Cr content is beneficial for improving the γ and γ′ phase strengths of alloy 9CoNi-Cr.
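To make the reported activation energy concrete, the following minimal sketch shows how such a value could be recovered from coarsening rate constants measured at two aging temperatures. It assumes a plain Arrhenius form K = K0·exp(−Q/(RT)); the abstract does not state which rate law the authors fitted, and the function names and the unit prefactor are hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol*K)


def arrhenius_rate(prefactor, activation_energy, temp_k):
    """Rate constant K = K0 * exp(-Q / (R * T)); assumed form, not taken from the paper."""
    return prefactor * math.exp(-activation_energy / (GAS_CONSTANT * temp_k))


def activation_energy_from_two_rates(rate_1, temp_1_k, rate_2, temp_2_k):
    """Solve ln(K2 / K1) = (Q / R) * (1 / T1 - 1 / T2) for Q, in J/mol."""
    return GAS_CONSTANT * math.log(rate_2 / rate_1) / (1.0 / temp_1_k - 1.0 / temp_2_k)


# Synthetic check: generate rates at 900 and 1000 degrees C (in kelvin) from
# Q = 402 kJ/mol with a hypothetical prefactor, then recover Q from them.
t_low, t_high = 1173.15, 1273.15
k_low = arrhenius_rate(1.0, 402e3, t_low)
k_high = arrhenius_rate(1.0, 402e3, t_high)
recovered_q = activation_energy_from_two_rates(k_low, t_low, k_high, t_high)
```

With more than two temperatures, the same quantity is usually obtained from the slope of ln K against 1/T, which also yields an uncertainty comparable to the ±51 kJ/mol quoted above.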
Objective: Plant-based diets have multiple health benefits with respect to cancer; however, little is known about the association between plant-based dietary patterns and esophageal cancer (EC). This study investigates the prospective associations between three predefined indices of plant-based dietary patterns and the risk of EC. Methods: We performed endoscopic screening for 15,709 participants aged 40-69 years from two high-risk areas of China from January 2005 to December 2009 and followed the cohort until December 31, 2022. The overall plant-based diet index (PDI), healthful plant-based diet index (hPDI), and unhealthful plant-based diet index (uPDI) were calculated from survey responses to assess dietary patterns. We applied Cox proportional hazards regression to estimate the multivariable hazard ratios (HRs) and 95% confidence intervals (95% CIs) of EC across the three plant-based diet indices and further stratified the analysis by subgroups. Results: The final study sample included 15,184 participants. During a follow-up of 219,365 person-years, 176 patients with EC were identified. When the highest quartile was compared with the lowest quartile, the pooled multivariable-adjusted HR of EC was 0.50 (95% CI, 0.32-0.77) for hPDI. In addition, the HR per 10-point increase in the hPDI score was 0.42 (95% CI, 0.27-0.66) for EC. Conversely, uPDI was positively associated with the risk of EC, with an HR of 1.80 (95% CI, 1.16-2.82). The HR per 10-point increase in the uPDI score was 1.90 (95% CI, 1.26-2.88) for EC. The associations between these scores and the risk of EC were consistent in most subgroups. These results remained robust in sensitivity analyses. Conclusions: A healthy plant-based dietary pattern was associated with a reduced risk of EC. Emphasizing the healthiness and quality of plant-based diets may be important for preventing the development of EC.
Single-view 3D reconstruction aims to reconstruct the entire 3D shape of an object from one perspective. When existing methods reconstruct the mesh surface of complex objects, they face three problems. First, surface details are difficult to predict and the visual quality of the reconstruction is poor, because the mesh representation is not easily integrated into a deep learning framework. Second, the 3D topology is constrained by predefined templates and is inflexible, so unnecessary mesh self-intersections and connections are generated when reconstructing complex topology, which destroys the surface details. Third, training of the reconstruction network is limited by the large amount of information attached to the mesh vertices, so the training time is too long. In this paper, we propose a method for fast mesh reconstruction from a single view based on a Graph Convolutional Network (GCN) and topology modification. We use the GCN to ensure the generation of high-quality mesh surfaces and use topology modification to improve the flexibility of the topology. Meanwhile, a feature fusion method is proposed to make full use of the image features of each stage hierarchically. We use the open 3D dataset ShapeNet to train our network and add a new weight parameter to speed up the training process. Extensive experiments demonstrate that our method can not only reconstruct object meshes with complex topological surfaces but also achieves better qualitative and quantitative results.
Robust watermarking requires finding features that are invariant under multiple attacks to ensure correct extraction. Deep learning is extremely powerful at extracting features, and watermarking algorithms based on deep learning have attracted widespread attention. Most existing methods use small 3×3 kernel convolutions to extract image features and embed the watermark. However, the effective receptive fields of small-kernel convolutions are extremely confined, so the pixels that each watermark bit can affect are restricted, which limits the performance of the watermarking. To address these problems, we propose a watermarking network based on large-kernel convolution and adaptive weight assignment for loss functions. It uses large-kernel depth-wise convolution to extract features for learning large-scale image information and subsequently projects the watermark into a high-dimensional space by 1×1 convolution to achieve adaptability in the channel dimension. As a result, the modification of the cover image by the embedded watermark is extended to more pixels. Because the magnitudes and convergence rates of the loss functions differ, an adaptive loss weight assignment strategy is proposed in which the weights participate in network training and are adjusted dynamically. Further, a high-frequency wavelet loss is proposed, by which the watermark is restricted to only the low-frequency wavelet sub-bands, thereby enhancing the robustness of the watermarking against image compression. The experimental results show that the peak signal-to-noise ratio (PSNR) of the encoded image reaches 40.12, the structural similarity (SSIM) reaches 0.9721, and the watermarking has good robustness against various types of noise.
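The PSNR figure quoted above can be read against the standard definition, PSNR = 10·log10(peak² / MSE). The sketch below computes it for two small grayscale images stored as nested lists; the 8-bit peak value of 255 and the function name are assumptions, since the abstract does not state the image bit depth or the evaluation code used.

```python
import math


def psnr(cover, encoded, peak=255.0):
    """Peak signal-to-noise ratio (in dB) between two equally sized grayscale images."""
    cover_pixels = [p for row in cover for p in row]
    encoded_pixels = [p for row in encoded for p in row]
    if len(cover_pixels) != len(encoded_pixels):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(cover_pixels, encoded_pixels)) / len(cover_pixels)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


cover_image = [[10, 20], [30, 40]]
encoded_image = [[11, 20], [30, 40]]  # one pixel changed by 1, so MSE = 0.25
example_psnr = psnr(cover_image, encoded_image)
```

A higher PSNR means the encoded image deviates less from the cover image, which is why it is used as an imperceptibility measure for watermarking.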
Falls are closely related to high mortality in the elderly, so fall detection has become an important and urgent research area. However, existing fall detection methods are difficult to apply in daily life because of their large amount of computation and poor detection accuracy. To solve these problems, this paper proposes a dense spatial-temporal graph convolutional network based on lightweight OpenPose. Lightweight OpenPose uses MobileNet as the feature extraction network, and the prediction layer uses a bottleneck-asymmetric structure, making the network lighter. The bottleneck-asymmetric structure compresses the number of input channels of the feature maps by 1×1 convolution and replaces the 7×7 convolution structure with an asymmetric structure of 1×7 convolution, 7×1 convolution, and 7×7 convolution in parallel. The spatial-temporal graph convolutional network divides the multi-layer convolution into dense blocks, and the convolutional layers in each dense block are connected, which improves feature propagation, enhances the network's ability to extract features, and thus improves detection accuracy. Two representative datasets, the Multiple Cameras Fall dataset (MCF) and the Nanyang Technological University Red Green Blue + Depth Action Recognition dataset (NTU RGB+D), are selected for our experiments; NTU RGB+D has two evaluation benchmarks. The results show that the proposed model is superior to current fall detection models. The accuracy of this network on the MCF dataset is 96.3%, and the accuracies on the two evaluation benchmarks of the NTU RGB+D dataset are 85.6% and 93.5%, respectively.
The image emotion classification task aims to use a model to automatically predict people's emotional response when they see an image. Studies have shown that certain local regions are more likely to inspire an emotional response than the whole image. However, existing methods perform poorly in predicting the details of emotional regions and are prone to overfitting during training because of the small size of the datasets. Therefore, this study proposes an image emotion classification network based on multilayer attentional interaction and adaptive feature aggregation. To predict emotional regions more accurately, this study designs a multilayer attentional interaction module. The module calculates spatial attention maps for higher-layer semantic features and fusion features through a multilayer shuffle attention module. Through layer-by-layer up-sampling and gating operations, the higher-layer features guide the learning of the lower-layer features, eventually achieving emotional region prediction at the optimal scale. To recover the important information lost by layer-by-layer fusion, this study not only adds an intra-layer fusion to the multilayer attentional interaction module but also designs an adaptive feature aggregation module. The module uses global average pooling to compress spatial information and concatenates channel information from all layers. Then, the module adaptively generates a set of aggregation weights through two fully connected layers to augment the original features of each layer. Finally, the semantics and details of the different layers are aggregated through gating operations and residual connections to recover the lost information. To reduce overfitting on small datasets, the network is pre-trained on the FI dataset, and the weights are then fine-tuned on the small dataset. The experimental results on the FI, Twitter I, and Emotion ROI (Region of Interest) datasets show that the proposed network outperforms existing image emotion classification methods, with accuracies of 90.27%, 84.66%, and 84.96%, respectively.
Data in a blockchain cannot be tampered with and its users are anonymous, which makes the blockchain a natural carrier for covert communication. However, existing methods of covert communication in blockchain suffer from a predefined channel structure, a low capacity per transaction, and fixed transaction behaviors that lower the concealment of the communication channel. Therefore, this paper proposes a derivation matrix-based covert communication method in blockchain. It uses a dual key to derive two types of blockchain addresses and then constructs an address matrix by dividing addresses into multiple layers to make full use of the redundancy of addresses. Subsequently, to solve the lack of concealment caused by fixed transaction behaviors, the rectangular matrix is divided into square blocks with overlapping regions, and different blocks are encrypted sequentially so that the transaction behaviors of the channel addresses better match those of real addresses. Further, a linear congruential algorithm is used to generate a random sequence, which provides a random order for block encryption and thus enhances the security of the encryption algorithm. Experimental results show that this method can effectively reduce the abnormal transaction behaviors of addresses while ensuring the transmission efficiency of the channel.
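The idea of deriving a block encryption order from a linear congruential sequence can be sketched as follows. This is only an illustration under stated assumptions: the multiplier, increment, and modulus are commonly used textbook constants rather than the paper's parameters, and the Fisher-Yates shuffle used to turn the sequence into an ordering is a choice made here, not something the abstract describes.

```python
def linear_congruential(seed, multiplier=1103515245, increment=12345, modulus=2 ** 31):
    """Yield the sequence x_{n+1} = (a * x_n + c) mod m (illustrative constants)."""
    state = seed % modulus
    while True:
        state = (multiplier * state + increment) % modulus
        yield state


def block_order(num_blocks, seed):
    """Return a permutation of block indices driven by the generator above."""
    generator = linear_congruential(seed)
    order = list(range(num_blocks))
    # Fisher-Yates shuffle; the high bits are used because the low bits of a
    # power-of-two-modulus generator have short periods.
    for i in range(num_blocks - 1, 0, -1):
        j = (next(generator) >> 16) % (i + 1)
        order[i], order[j] = order[j], order[i]
    return order


sender_order = block_order(10, seed=42)
receiver_order = block_order(10, seed=42)  # same seed gives the same order
```

Because the sequence is fully determined by the seed, a sender and a receiver who share the seed can agree on the block order without transmitting it.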
Gesture recognition technology enables machines to read human gestures and has significant application prospects in human-computer interaction and sign language translation. Existing studies usually use convolutional neural networks to extract features directly from raw gesture data for gesture recognition, but the networks are affected by the large amount of interfering information in the input data and thus fit some unimportant features. In this paper, we propose a novel method for encoding spatio-temporal information, which enhances the key features required for gesture recognition, such as the shape, structure, contour, position, and hand motion of gestures, thereby improving recognition accuracy. This encoding method can encode an arbitrary number of frames of gesture data into a single-frame spatio-temporal feature map, which is used as the input to the neural network. This guides the model to fit important features while avoiding the use of complex recurrent network structures to extract temporal features. In addition, we design two sub-networks and train the model using a sub-network pre-training strategy that trains the sub-networks first and then the entire network, so that the sub-networks neither focus too much on the information of a single category of features nor are overly influenced by each other's features. Experimental results on two public gesture datasets show that the proposed spatio-temporal information encoding method achieves competitive accuracy.
Convolutional Neural Networks (CNNs) can quickly diagnose COVID-19 patients by analyzing computed tomography (CT) images of the lung, thereby helping to prevent the spread of COVID-19. However, existing CNN-based COVID-19 diagnosis models do not consider the problem that the lung images of COVID-19 patients in the early stage and incubation period are extremely similar to those of the non-COVID-19 population. This reduces the models' classification sensitivity, resulting in a higher probability of misdiagnosing COVID-19 patients as non-COVID-19 people. To solve this problem, this paper makes a first attempt to apply triplet loss and center loss to COVID-19 image classification, combining them with softmax loss to design a jointly supervised metric loss function, COVID Triplet-Center Loss (COVID-TCL). Triplet loss can increase inter-class separation, and center loss can improve intra-class compactness. Therefore, COVID-TCL can help a CNN-based model extract more discriminative features and strengthen its capacity to diagnose COVID-19 patients in the early stage and incubation period. Meanwhile, we use extreme gradient boosting (XGBoost) as the classifier to design a COVID-19 image classification model with a CNN-XGBoost architecture, which further improves the classification performance and operating efficiency of the CNN-based model. The experiments show that the classification accuracy of the proposed model is 97.41% and its sensitivity is 97.61%, both higher than those of the other 7 reference models. COVID-TCL effectively improves the classification sensitivity of the CNN-based model, and the CNN-XGBoost architecture further improves its classification performance.
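The three ingredients named above can be illustrated with their textbook definitions. The sketch below is not the authors' COVID-TCL: the squared-distance form of the triplet loss, the margin, and the weighting factors `triplet_weight` and `center_weight` are assumptions introduced for illustration.

```python
import math


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Pushes the anchor closer to a same-class sample than to an other-class sample."""
    return max(0.0, squared_distance(anchor, positive) - squared_distance(anchor, negative) + margin)


def center_loss(features, labels, centers):
    """Mean of half the squared distance between each feature and its class center."""
    total = sum(squared_distance(f, centers[y]) for f, y in zip(features, labels))
    return 0.5 * total / len(features)


def softmax_cross_entropy(logits, label):
    """Numerically stable softmax loss for one sample."""
    top = max(logits)
    log_sum_exp = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum_exp - logits[label]


def joint_loss(softmax_term, triplet_term, center_term, triplet_weight=1.0, center_weight=0.01):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return softmax_term + triplet_weight * triplet_term + center_weight * center_term
```

The triplet term acts on distances between classes and the center term on distances within a class, which matches the complementary roles described in the abstract.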
LIDAR point cloud-based 3D object detection aims to sense the surrounding environment by anchoring objects with bounding boxes (BBoxes). However, in the three-dimensional space of autonomous driving scenes, previous object detection methods pre-process the original LIDAR point cloud into voxels or pillars; as a result, they lose the coordinate information of the original point cloud, detect slowly, and position bounding boxes inaccurately. To address these issues, this study proposes a new two-stage network structure that extracts point cloud features directly with PointNet++, which effectively preserves the coordinate information of the original point cloud. To improve detection accuracy, a shell-based modeling method is proposed. It first roughly determines which spherical shell the coordinates belong to. The results are then refined toward the ground truth, thereby narrowing the localization range and improving detection accuracy. To improve the recall of 3D object detection with bounding boxes, this paper designs a self-attention module with a skip connection structure. Some features are highlighted by weighting them along the feature dimension. After training, the weights of features that are favorable for object detection become larger, so the extracted features are better adapted to the object detection task. Extensive comparison and ablation experiments conducted on the KITTI dataset verify the effectiveness of the proposed method in improving recall and precision.
With ORB-SLAM2, which is based on the constant velocity model, it is difficult to determine the search window for the reprojection of map points when objects move at variable velocity, which leads to false matching and, in turn, to inaccurate pose estimation or failed tracking. To address this challenge, this paper proposes a new feature point matching method that combines a variable velocity model with the reverse optical flow method. First, the constant velocity model is extended to a new variable velocity model, which is used to provide the initial pixel shift for the reverse optical flow method. Then the search range of feature points is accurately determined according to the results of the reverse optical flow method, thereby improving the accuracy and reliability of feature matching and strengthening inter-frame tracking. Finally, we tested the method on the TUM data set with an RGB-D camera. Experimental results show that this method can reduce the probability of tracking failure and improve the localization accuracy of SLAM (Simultaneous Localization and Mapping) systems. Compared with the traditional ORB-SLAM2, the test error of this method on each sequence in the TUM data set is significantly reduced, and the root mean square error is only 63.8% of that of the original system under the optimal condition.
Signature verification, a method to determine the authenticity of signature images, is a biometric verification technique that can effectively reduce the risk of forged signatures in financial, legal, and other business environments. However, compared with ordinary images, signature images have the following characteristics. First, the strokes are thin, i.e., there is less effective information. Second, the signature changes slightly with the time, place, and mood of the signer, i.e., it has high intra-class differences. These challenges lead to the low accuracy of existing methods based on convolutional neural networks (CNNs). This study proposes an end-to-end multi-path attention inverse discrimination network that focuses on the signature stroke parts to extract features by reversing the foreground and background of signature images, which effectively addresses the problem of little effective information. To address the high intra-class variability of signature images, we add multi-path attention modules between the discriminative streams and inverse streams to enhance the discriminative features of signature images. Moreover, a multi-path discrimination loss function is proposed, which does not require the feature representations of samples with the same class label to be infinitely close, as long as the gap between the inter-class distance and the intra-class distance is larger than the set classification threshold; this fundamentally addresses the problem of high intra-class differences in signature images. In addition, this loss can also spur the network to explore detailed information on the stroke parts, such as the crossing, thickness, and connection of strokes. We tested the method on the CEDAR, BHSig-Bengali, BHSig-Hindi, and GPDS Synthetic datasets and obtained accuracies of 100%, 96.24%, 93.86%, and 83.72%, respectively, which are higher than those of existing signature verification methods. This benefits signature authentication tasks in justice and finance.
The leakage of medical audio data in telemedicine seriously violates the privacy of patients. To avoid the leakage of patient information in telemedicine, a two-stage reversible robust audio watermarking algorithm is proposed to protect medical audio data. The scheme decomposes the medical audio into two independent embedding domains and embeds the robust watermark and the reversible watermark into the two domains, respectively. To ensure audio quality, the Hurst exponent is used to find a suitable position for watermark embedding. Because the two embedding domains are independent, the embedding of the second-stage reversible watermark does not affect the first-stage watermark, so the robustness of the first-stage watermark is well maintained. In the second stage, the correlation between the sampling points in the medical audio is used to modify the hidden bits of the histogram, which reduces the modification of the medical audio and the distortion caused by reversible embedding. Simulation experiments show that this scheme has strong robustness against signal processing operations such as MP3 compression of 48 dB, additive white Gaussian noise (AWGN) of 20 dB, low-pass filtering, resampling, re-quantization, and other attacks, and has good imperceptibility.
In a telemedicine diagnosis system, the emergence of 3D imaging enables doctors to make clearer judgments, and its accuracy directly affects doctors' diagnosis of disease. To ensure the safe transmission and storage of medical data, a 3D medical watermarking algorithm based on the wavelet transform is proposed in this paper. The proposed algorithm employs the principal component analysis (PCA) transform to reduce the data dimension, which minimizes the error between the extracted components and the original data in the mean square sense. In particular, the algorithm builds a bacterial foraging model based on particle swarm optimization (BF-PSO), by which the optimal wavelet coefficient is found for embedding and is used as the absolute feature of watermark embedding, thereby achieving an optimal balance between embedding capacity and imperceptibility. A series of experimental results obtained with MATLAB on the standard MRI brain volume dataset demonstrate that the proposed algorithm has strong robustness and causes only small deformation of the 3D model after the watermark is embedded.
Vehicle re-identification (ReID) aims to retrieve a target vehicle from an extensive image gallery through its appearance from various views in a cross-camera scenario. It has gradually become a core technology of intelligent transportation systems. Most existing vehicle re-identification models adopt joint learning of global and local features. However, they directly use the extracted global features, resulting in insufficient feature expression. Moreover, local features are primarily obtained through advanced annotation and complex attention mechanisms, which incur additional costs. To solve these issues, a multi-feature learning model with enhanced local attention for vehicle re-identification (MFELA) is proposed in this paper. The model consists of global and local branches. The global branch utilizes both middle-level and high-level semantic features of ResNet50 to enhance the global representation capability. In addition, multi-scale pooling operations are used to obtain multi-scale information. The local branch utilizes the proposed Region Batch Dropblock (RBD), which encourages the model to learn discriminative features for different local regions and randomly drops the same corresponding areas across a batch during training to enhance attention to local regions. Features from both branches are then combined to provide a more comprehensive and distinctive feature representation. Extensive experiments on the VeRi-776 and VehicleID datasets show that our method achieves excellent performance.
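The batch-level dropping described above can be pictured with a small sketch in which one randomly placed rectangle is zeroed at the same position in every feature map of a batch. This shows only the generic batch-drop idea; it is not the proposed Region Batch Dropblock, whose region partitioning is not specified in the abstract, and the function name and arguments are hypothetical.

```python
import random


def batch_drop(batch, drop_height, drop_width, rng=None):
    """Zero the same randomly placed rectangle in every 2D feature map of a batch.

    `batch` is a list of equally sized 2D lists. Returns the new batch and the
    (top, left) corner of the dropped rectangle.
    """
    rng = rng or random.Random()
    height, width = len(batch[0]), len(batch[0][0])
    # One position is sampled per batch, not per sample.
    top = rng.randint(0, height - drop_height)
    left = rng.randint(0, width - drop_width)
    dropped = []
    for feature_map in batch:
        new_map = [row[:] for row in feature_map]  # copy so the input is untouched
        for r in range(top, top + drop_height):
            for c in range(left, left + drop_width):
                new_map[r][c] = 0.0
        dropped.append(new_map)
    return dropped, (top, left)


feature_batch = [[[1.0] * 6 for _ in range(4)] for _ in range(3)]
dropped_batch, corner = batch_drop(feature_batch, 2, 3, rng=random.Random(0))
```

Because every sample in the batch loses the same area, the network cannot rely on that area for any of them and has to draw on the remaining regions.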
With the increasing application of surveillance cameras, vehicle re-identification (Re-ID) has attracted more attention in the field of public security. Vehicle Re-ID is challenging because of the large intra-class differences caused by different views of vehicles while traveling and the obvious inter-class similarities caused by similar appearances. Many existing methods focus on local attributes by marking local locations. However, these methods require additional annotations, resulting in complex algorithms and excessive computation time. To cope with these challenges, this paper proposes a vehicle Re-ID model based on an optimized DenseNet121 with a joint loss. This model applies the SE block to automatically obtain the importance of each channel feature and assign it a corresponding weight; features are then transferred to the deep layers according to these weights, which reduces the transmission of redundant information during feature reuse in DenseNet121. At the same time, the proposed model leverages the complementary expressive advantages of the middle-level features of the CNN to enhance its feature expression ability. Additionally, a joint loss combining focal loss and triplet loss is proposed for vehicle Re-ID to enhance the model's ability to discriminate difficult-to-separate samples by enlarging their weight during training. Experimental results on the VeRi-776 dataset show that mAP and Rank-1 reach 75.5% and 94.8%, respectively. In addition, Rank-1 on the small, medium, and large subsets of the VehicleID dataset reaches 81.3%, 78.9%, and 76.5%, respectively, which surpasses most existing vehicle Re-ID methods.
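How focal loss enlarges the relative weight of difficult samples can be seen from its standard form, FL(p) = −α(1 − p)^γ·log p, where p is the predicted probability of the true class. The sketch below pairs it with a plain Euclidean triplet term; the balancing factor `triplet_weight`, the margin, and the default γ = 2 are assumptions, not values reported in the abstract.

```python
import math


def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Standard focal loss for one sample; gamma = 0 recovers cross-entropy."""
    p = min(max(p_true, 1e-12), 1.0)  # guard against log(0)
    return -alpha * (1.0 - p) ** gamma * math.log(p)


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge on the gap between anchor-positive and anchor-negative distances."""
    return max(0.0, math.dist(anchor, positive) - math.dist(anchor, negative) + margin)


def joint_loss(p_true, anchor, positive, negative, triplet_weight=1.0):
    """Focal term plus a weighted triplet term (the weight is hypothetical)."""
    return focal_loss(p_true) + triplet_weight * triplet_loss(anchor, positive, negative)


# Relative emphasis on a hard sample (p = 0.1) versus an easy one (p = 0.9).
cross_entropy_ratio = focal_loss(0.1, gamma=0.0) / focal_loss(0.9, gamma=0.0)
focal_ratio = focal_loss(0.1, gamma=2.0) / focal_loss(0.9, gamma=2.0)
```

With γ = 2 the hard sample outweighs the easy one far more strongly than under plain cross-entropy, which is the mechanism the abstract refers to as enlarging the weight of difficult-to-separate samples.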
The key to preventing the spread of COVID-19 is to diagnose patients quickly and accurately. Studies have shown that using Convolutional Neural Networks (CNNs) to analyze chest Computed Tomography (CT) images is helpful for timely COVID-19 diagnosis. However, owing to personal privacy issues, public chest CT data sets are relatively few, which has limited the application of CNNs to COVID-19 diagnosis. Also, many CNNs have complex structures and a massive number of parameters. Even when a dedicated Graphics Processing Unit (GPU) is used for acceleration, diagnosis still takes a long time, which is not conducive to widespread application. To solve these problems, this paper proposes a lightweight CNN classification model based on transfer learning. The lightweight CNN MobileNetV2 is used as the backbone of the model to cope with the shortage of hardware resources and computing power. To alleviate the model overfitting caused by insufficient data, transfer learning is used to train the model. The study first uses the weight parameters trained on the ImageNet database to initialize the MobileNetV2 network and then retrains the model on the CT image data set provided by Kaggle. Experimental results on a computer equipped only with a Central Processing Unit (CPU) show that it takes only 1.06 s on average to diagnose a chest CT image. Compared with other lightweight models, the proposed model has higher classification accuracy and reliability while having a lightweight architecture and few parameters, so it can be easily applied on computers without GPU acceleration. Code: github.com/ZhouJie-520/paper-codes.
Vehicle type recognition (VTR) is an important research topic because of its significance in intelligent transportation systems. However, recognizing vehicle types in real-world images is challenging because of illumination changes and partial occlusion in real traffic environments. These difficulties limit the performance of current state-of-the-art methods, which are typically based on single-stage classification without considering feature availability. To address these difficulties, this paper proposes a two-stage vehicle type recognition method that uses the most effective Gabor features. The first stage leverages edge features to classify vehicles by size as big or small via a similarity k-nearest neighbor classifier (SKNNC). The second stage then recognizes the more specific vehicle type, such as bus, truck, sedan, or van; it leverages the most effective Gabor features, extracted by a set of Gabor wavelet kernels on the partitioned key patches, via a kernel sparse representation-based classifier (KSRC). A verification and correction step based on minimum residual analysis is proposed to enhance the reliability of the VTR. To improve VTR efficiency, the most effective Gabor features are selected through gray relational analysis, which leverages the correlation between the Gabor feature image and the original image. Experimental results demonstrate that the proposed method not only improves the accuracy of VTR but also enhances the robustness of recognition to illumination changes and partial occlusion.
In the current dire situation of the coronavirus disease COVID-19, remote consultations have been proposed to avoid cross-infection and to offset regional differences in medical resources. However, the safety of digital medical imaging in remote consultations has attracted increasing attention from the medical industry. To ensure the integrity and security of medical images, this paper proposes a robust watermarking algorithm, based on regions of interest (ROI) and the integer wavelet transform (IWT), to authenticate distorted medical images and recover them. First, the medical image is divided into two parts: regions of interest and regions of non-interest. Then the integrity of the ROI is verified using a hash algorithm, and the recovery data of the ROI is calculated at the same time. In addition, binary images carrying the patient's basic information are encrypted with a logistic chaotic map, and the synthetic watermark is then embedded in the medical carrier image using the IWT. The performance of the proposed algorithm is tested by simulation experiments in MATLAB on CT images of the lungs. Experimental results show that the algorithm can precisely locate the distorted areas of an image and recover the original ROI on the basis of verifying image reliability. A maximum peak signal-to-noise ratio (PSNR) of 51.24 is achieved, which shows that the watermark is invisible, and the watermark is strongly robust against noise, compression, and filtering attacks.
Funding: Supported by the National Natural Science Foundation of China (Nos. 52331005, 52201100, 52171095, and 92060113); the China Postdoctoral Science Foundation (No. 2022M710346); the Science and Technology on Advanced High Temperature Structural Materials Laboratory, China (No. 6142903210207); the Fundamental Research Funds for the Central Universities, China (No. FRF-GF-20-30B); and the National Key Research and Development Program of China (No. 2017YFB0702902).
Abstract: A novel multicomponent high-Cr CoNi-based superalloy with superior comprehensive performance was prepared, and its high-temperature microstructural stability, oxidation resistance, and mechanical properties were evaluated, mainly using the cast polycrystalline alloy. The results showed that the morphology of the γ′ phase remained stable and that the coarsening rate was slow during long-term aging at 900–1000℃. The activation energy for γ′ precipitate coarsening of alloy 9CoNi-Cr was (402±51) kJ/mol, which is higher than those of CMSX-4 and some other Ni-based and Co-based superalloys. Importantly, there was no indication of the formation of topologically close-packed phases during this process. All these factors demonstrate the superior microstructural stability of the alloy. The mass gain of alloy 9CoNi-Cr was 0.6 mg/cm^(2) after oxidation at 1000℃ for 100 h, and its oxidation resistance was comparable to that of the advanced Ni-based superalloy CMSX-4, which can be attributed to the formation of a continuous protective Al_(2)O_(3) layer. Moreover, the compressive yield strength of this cast polycrystalline alloy at high temperatures is clearly higher than that of the conventional Ni-based cast superalloy, and its compressive minimum creep rate at 950℃ is comparable to that of the conventional Ni-based cast superalloy, demonstrating the alloy's good mechanical properties at high temperature. This is partially because high Cr content is beneficial in improving the strengths of the γ and γ′ phases of alloy 9CoNi-Cr.
Funding: Supported by grants from the Beijing Nova Program (No. Z201100006820069); the CAMS Innovation Fund for Medical Sciences (CIFMS, Nos. 2021-I2M-1-023 and 2021-I2M-1-010); and the Talent Incentive Program of Cancer Hospital, Chinese Academy of Medical Sciences (Hope Star).
Abstract: Objective: Plant-based diets have multiple health benefits with respect to cancers; however, little is known about the association between plant-based dietary patterns and esophageal cancer (EC). This study investigates the prospective associations between three predefined indices of plant-based dietary patterns and the risk of EC. Methods: We performed endoscopic screening for 15,709 participants aged 40-69 years from two high-risk areas of China from January 2005 to December 2009 and followed the cohort until December 31, 2022. The overall plant-based diet index (PDI), healthful plant-based diet index (hPDI), and unhealthful plant-based diet index (uPDI) were calculated from survey responses to assess dietary patterns. We applied Cox proportional hazards regression to estimate the multivariable hazard ratios (HRs) and 95% confidence intervals (95% CIs) of EC across the 3 plant-based diet indices and further stratified the analysis by subgroups. Results: The final study sample included 15,184 participants in the cohort. During a follow-up of 219,365 person-years, 176 patients with EC were identified. When the highest quartile was compared with the lowest quartile, the pooled multivariable-adjusted HR of EC was 0.50 (95% CI, 0.32-0.77) for hPDI. In addition, the HR per 10-point increase in the hPDI score was 0.42 (95% CI, 0.27-0.66) for EC. Conversely, uPDI was positively associated with the risk of EC, and the HR was 1.80 (95% CI, 1.16-2.82). The HR per 10-point increase in the uPDI score was 1.90 (95% CI, 1.26-2.88) for EC. The associations between these scores and the risk of EC were consistent in most subgroups. These results remained robust in sensitivity analyses. Conclusions: A healthy plant-based dietary pattern was associated with a reduced risk of EC. Emphasizing the healthiness and quality of plant-based diets may be important for preventing the development of EC.
Funding: This work was supported, in part, by the Natural Science Foundation of Jiangsu Province under Grant Numbers BK20201136 and BK20191401; in part, by the National Natural Science Foundation of China under Grant Numbers 61502240, 61502096, 61304205, and 61773219; and in part, by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund.
Abstract: Single-view 3D reconstruction aims to reconstruct the entire 3D shape of an object from one perspective. Existing methods face three problems when reconstructing the mesh surface of complex objects. First, surface details are difficult to predict and the visual quality of the reconstruction is poor, because the mesh representation is not easily integrated into a deep learning framework. Second, the 3D topology is limited by predefined templates and is inflexible, so unnecessary mesh self-intersections and connections are generated when reconstructing complex topology, which destroys surface details. Third, training of the reconstruction network is limited by the large amount of information attached to the mesh vertices, so the training time is too long. In this paper, we propose a method for fast mesh reconstruction from a single view based on a Graph Convolutional Network (GCN) and topology modification. We use the GCN to ensure the generation of high-quality mesh surfaces and use topology modification to improve the flexibility of the topology. In addition, a feature fusion method is proposed to make full, hierarchical use of the image features from each stage. We train our network on the open 3D dataset ShapeNet and add a new weight parameter to speed up the training process. Extensive experiments demonstrate that our method can not only reconstruct object meshes with complex topological surfaces but also achieve better qualitative and quantitative results.
Funding: Supported, in part, by the National Natural Science Foundation of China under grant number 62272236; in part, by the Natural Science Foundation of Jiangsu Province under grant numbers BK20201136 and BK20191401; and in part, by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund.
Abstract: Robust watermarking requires finding features that are invariant under multiple attacks to ensure correct extraction. Deep learning is extremely powerful at extracting features, and watermarking algorithms based on deep learning have attracted widespread attention. Most existing methods use small 3×3 kernel convolutions to extract image features and embed the watermark. However, the effective receptive field of a small-kernel convolution is extremely confined, so the pixels that each watermark bit can affect are restricted, which limits the performance of the watermarking. To address these problems, we propose a watermarking network based on large-kernel convolution and adaptive weight assignment for the loss functions. It uses large-kernel depth-wise convolution to extract features, so as to learn large-scale image information, and then projects the watermark into a high-dimensional space by 1×1 convolution to achieve adaptability in the channel dimension. As a result, the modification that the embedded watermark makes to the cover image is spread over more pixels. Because the magnitudes and convergence rates of the loss functions differ, an adaptive loss weight assignment strategy is proposed in which the weights take part in network training and are adjusted dynamically. Further, a high-frequency wavelet loss is proposed, by which the watermark is restricted to only the low-frequency wavelet sub-bands, thereby enhancing the robustness of the watermark against image compression. The experimental results show that the peak signal-to-noise ratio (PSNR) of the encoded image reaches 40.12, the structural similarity (SSIM) reaches 0.9721, and the watermark has good robustness against various types of noise.
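The PSNR figure quoted above follows the standard definition, PSNR = 10·log10(MAX² / MSE), where MSE is the mean squared error between the cover image and the encoded image. A minimal sketch, assuming 8-bit grayscale images held as nested lists (the function name `psnr` and the `max_value` default are illustrative, not taken from the paper):

```python
import math


def psnr(cover, encoded, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are nested lists of pixel intensities; max_value=255 assumes
    8-bit data (an assumption, not stated in the abstract).
    """
    flat_cover = [p for row in cover for p in row]
    flat_encoded = [p for row in encoded for p in row]
    if len(flat_cover) != len(flat_encoded) or not flat_cover:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_cover, flat_encoded)) / len(flat_cover)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the embedded watermark perturbs the cover image less; a value near 40 dB corresponds to an MSE of roughly 6.5 on the 0-255 scale.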
Funding: Supported, in part, by the National Natural Science Foundation of China under Grant Numbers 62272236 and 62376128; and in part, by the Natural Science Foundation of Jiangsu Province under Grant Numbers BK20201136 and BK20191401.
Abstract: Falls are closely related to high mortality in the elderly, so fall detection has become an important and urgent research area. However, existing fall detection methods are difficult to apply in daily life because of their heavy computation and poor detection accuracy. To solve these problems, this paper proposes a dense spatial-temporal graph convolutional network based on lightweight OpenPose. Lightweight OpenPose uses MobileNet as the feature extraction network, and its prediction layer uses a bottleneck-asymmetric structure, thus reducing the computational cost of the network. The bottleneck-asymmetric structure compresses the number of input channels of the feature maps by 1×1 convolution and replaces the 7×7 convolution structure with an asymmetric structure of 1×7 convolution, 7×1 convolution, and 7×7 convolution in parallel. The spatial-temporal graph convolutional network divides the multi-layer convolution into dense blocks, and the convolutional layers in each dense block are connected to one another, which improves feature propagation, enhances the network's ability to extract features, and thus improves detection accuracy. Two representative datasets, the Multiple Cameras Fall dataset (MCF) and the Nanyang Technological University Red Green Blue+Depth Action Recognition dataset (NTU RGB+D), are selected for our experiments; NTU RGB+D has two evaluation benchmarks. The results show that the proposed model is superior to current fall detection models. The accuracy of this network on the MCF dataset is 96.3%, and the accuracies on the two evaluation benchmarks of the NTU RGB+D dataset are 85.6% and 93.5%, respectively.
Funding: This study was supported, in part, by the National Natural Science Foundation of China under Grant 62272236; and in part, by the Natural Science Foundation of Jiangsu Province under Grants BK20201136 and BK20191401.
Abstract: The image emotion classification task aims to use a model to automatically predict people's emotional response when they see an image. Studies have shown that certain local regions are more likely to evoke an emotional response than the whole image. However, existing methods perform poorly in predicting the details of emotional regions and are prone to overfitting during training because of the small size of the datasets. Therefore, this study proposes an image emotion classification network based on multilayer attentional interaction and adaptive feature aggregation. To predict emotional regions more accurately, this study designs a multilayer attentional interaction module. The module calculates spatial attention maps for higher-layer semantic features and fusion features through a multilayer shuffle attention module. Through layer-by-layer up-sampling and gating operations, the higher-layer features guide the learning of the lower-layer features, eventually achieving emotional region prediction at the optimal scale. To recover the important information lost by layer-by-layer fusion, this study not only adds an intra-layer fusion to the multilayer attentional interaction module but also designs an adaptive feature aggregation module. The module uses global average pooling to compress spatial information and concatenates the channel information from all layers. Then, the module adaptively generates a set of aggregation weights through two fully connected layers to augment the original features of each layer. Finally, the semantics and details of the different layers are aggregated through gating operations and residual connections to recover the lost information. To reduce overfitting on small datasets, the network is pre-trained on the FI dataset, and its weights are then fine-tuned on the small dataset. The experimental results on the FI, Twitter I, and Emotion ROI (Region of Interest) datasets show that the proposed network outperforms existing image emotion classification methods, with accuracies of 90.27%, 84.66%, and 84.96%, respectively.
Funding: This work was supported, in part, by the National Natural Science Foundation of China under grant number 62272236; in part, by the Natural Science Foundation of Jiangsu Province under grant numbers BK20201136 and BK20191401; and in part, by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund.
Abstract: Data in a blockchain cannot be tampered with and its users are anonymous, which makes the blockchain a natural carrier for covert communication. However, existing methods of covert communication in blockchain suffer from a predefined channel structure, the capacity of a single transaction is low, and fixed transaction behaviors reduce the concealment of the communication channel. Therefore, this paper proposes a derivation matrix-based covert communication method for blockchain. It uses dual keys to derive two types of blockchain addresses and then constructs an address matrix by dividing the addresses into multiple layers, making full use of the redundancy of the addresses. Subsequently, to solve the lack of concealment caused by fixed transaction behaviors, the rectangular matrix is divided into square blocks with overlapping regions, and the blocks are encrypted sequentially so that the transaction behaviors of the channel addresses better match those of real addresses. Further, a linear congruence algorithm is used to generate a random sequence, which provides a random order for block encryption and thus enhances the security of the encryption algorithm. Experimental results show that this method can effectively reduce the abnormal transaction behaviors of addresses while maintaining the transmission efficiency of the channel.
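The linear congruence step mentioned above can be illustrated with a textbook linear congruential generator, x_{n+1} = (a·x_n + c) mod m. The sketch below is only an assumption about how such a sequence could yield a keyed block order: the constants `a`, `c`, and `m`, the shared `seed`, and the sort-by-key mapping in `block_order` are hypothetical choices, since the abstract does not give the paper's parameters or mapping.

```python
def lcg_sequence(seed, count, a=1103515245, c=12345, m=2 ** 31):
    """Return `count` values of the recurrence x_{n+1} = (a * x_n + c) mod m.

    The default constants are commonly used illustrative values,
    not the ones used in the paper.
    """
    x = seed % m
    values = []
    for _ in range(count):
        x = (a * x + c) % m
        values.append(x)
    return values


def block_order(num_blocks, seed):
    """Derive a permutation of block indices from the LCG output.

    Each block index is ranked by its LCG value (ties broken by index),
    so sender and receiver sharing `seed` obtain the same order.
    """
    keys = lcg_sequence(seed, num_blocks)
    return sorted(range(num_blocks), key=lambda i: (keys[i], i))
```

Because the generator is deterministic, two parties that share the seed reproduce the same encryption order without transmitting it.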
Funding: This work was supported, in part, by the National Natural Science Foundation of China under grant number 62272236; in part, by the Natural Science Foundation of Jiangsu Province under grant numbers BK20201136 and BK20191401; and in part, by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund.
Abstract: Gesture recognition technology enables machines to read human gestures and has significant application prospects in human-computer interaction and sign language translation. Existing studies usually use convolutional neural networks to extract features directly from raw gesture data for gesture recognition, but the networks are affected by the large amount of interfering information in the input data and thus fit some unimportant features. In this paper, we propose a novel method for encoding spatio-temporal information, which enhances the key features required for gesture recognition, such as the shape, structure, contour, position, and hand motion of gestures, thereby improving recognition accuracy. This encoding method can encode an arbitrary number of frames of gesture data into a single spatio-temporal feature map, which is used as the input to the neural network. This guides the model to fit important features while avoiding the use of complex recurrent network structures to extract temporal features. In addition, we design two sub-networks and train the model using a sub-network pre-training strategy that trains the sub-networks first and then the entire network, so as to prevent the sub-networks from focusing too much on the information of a single feature category and from being overly influenced by each other's features. Experimental results on two public gesture datasets show that the proposed spatio-temporal information encoding method achieves advanced accuracy.
Funding: This work was supported, in part, by the Natural Science Foundation of Jiangsu Province under Grant Numbers BK20201136 and BK20191401; in part, by the National Natural Science Foundation of China under Grant Numbers 62272236, 61502096, 61304205, 61773219, and 61502240; and in part, by the Public Welfare Fund Project of Zhejiang Province under Grant Number LGG20E050001.
Abstract: Convolutional Neural Networks (CNN) can quickly diagnose COVID-19 patients by analyzing computed tomography (CT) images of the lung, thereby effectively preventing the spread of COVID-19. However, existing CNN-based COVID-19 diagnosis models do not consider the problem that the lung images of COVID-19 patients in the early stage and incubation period are extremely similar to those of the non-COVID-19 population. This reduces the model's classification sensitivity, resulting in a higher probability of misdiagnosing COVID-19 patients as non-COVID-19 people. To solve the problem, this paper makes a first attempt to apply triplet loss and center loss to COVID-19 image classification, combining them with softmax loss to design a jointly supervised metric loss function, COVID Triplet-Center Loss (COVID-TCL). Triplet loss increases inter-class separation, and center loss improves intra-class compactness. Therefore, COVID-TCL helps the CNN-based model extract more discriminative features and strengthens its capacity to diagnose COVID-19 patients in the early stage and incubation period. In addition, we use extreme gradient boosting (XGBoost) as the classifier to design a COVID-19 image classification model with a CNN-XGBoost architecture, which further improves the classification performance and operating efficiency of the CNN-based model. The experiments show that the classification accuracy of the proposed model is 97.41% and its sensitivity is 97.61%, both higher than those of the 7 reference models. COVID-TCL effectively improves the classification sensitivity of the CNN-based model, and the CNN-XGBoost architecture further improves its classification performance.
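The three ingredients named above have standard textbook forms, which the following sketch combines for a single triplet of feature vectors. It is an illustration under stated assumptions, not the paper's COVID-TCL definition: the weights `lambda_triplet` and `lambda_center`, the `margin`, and the function `joint_loss` are hypothetical, and the abstract does not say how the terms are actually weighted.

```python
import math


def squared_distance(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin.

    Pushes samples of different classes apart (inter-class separation).
    """
    return max(0.0, squared_distance(anchor, positive)
               - squared_distance(anchor, negative) + margin)


def center_loss(features, labels, centers):
    """Half the mean squared distance of each feature to its class center.

    Pulls samples of one class together (intra-class compactness).
    """
    total = sum(squared_distance(f, centers[y]) for f, y in zip(features, labels))
    return 0.5 * total / len(features)


def softmax_cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one sample."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]


def joint_loss(logits, label, anchor, positive, negative, centers,
               lambda_triplet=1.0, lambda_center=0.01):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return (softmax_cross_entropy(logits, label)
            + lambda_triplet * triplet_loss(anchor, positive, negative)
            + lambda_center * center_loss([anchor], [label], centers))
```

In this form the softmax term drives classification, while the two metric terms shape the feature space so that hard-to-distinguish samples end up closer to their own class center than to samples of the other class.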
Funding: This work was supported, in part, by the National Natural Science Foundation of China under grant number 62272236; in part, by the Natural Science Foundation of Jiangsu Province under grant numbers BK20201136 and BK20191401; and in part, by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund.
Abstract: LIDAR point cloud-based 3D object detection aims to sense the surrounding environment by localizing objects with bounding boxes (BBox). However, in the three-dimensional space of autonomous driving scenes, previous object detection methods pre-process the original LIDAR point cloud into voxels or pillars; as a result, they lose the coordinate information of the original point cloud, detect slowly, and position bounding boxes inaccurately. To address these issues, this study proposes a new two-stage network structure that extracts point cloud features directly with PointNet++, which effectively preserves the coordinate information of the original point cloud. To improve detection accuracy, a shell-based modeling method is proposed. It first roughly determines which spherical shell the coordinates belong to and then refines the results toward the ground truth, thereby narrowing the localization range and improving detection accuracy. To improve the recall of 3D object detection with bounding boxes, this paper designs a self-attention module with a skip connection structure. The module highlights some features by weighting them along the feature dimension; after training, the weights of features that are favorable for object detection become larger, so the extracted features are better adapted to the object detection task. Extensive comparison and ablation experiments conducted on the KITTI dataset verify the effectiveness of the proposed method in improving recall and precision.
Funding: This work was supported by the National Natural Science Foundation of China under Grant Nos. 61304205 and 61502240; the Natural Science Foundation of Jiangsu Province under Grant Nos. BK20191401 and BK20201136; and the Postgraduate Research & Practice Innovation Program of Jiangsu Province under Grant Nos. SJCX21_0364 and SJCX21_0363.
Abstract: ORB-SLAM2, which is based on a constant velocity model, has difficulty determining the search window for the reprojection of map points when objects move at variable velocity, which leads to false matches and, in turn, to inaccurate pose estimation or tracking failure. To address this challenge, this paper proposes a new feature point matching method that combines a variable velocity model with the reverse optical flow method. First, the constant velocity model is extended to a new variable velocity model, which is used to provide the initial pixel shift for the reverse optical flow method. Then the search range of feature points is accurately determined according to the results of the reverse optical flow method, thereby improving the accuracy and reliability of feature matching and strengthening inter-frame tracking. Finally, we test the method on the TUM data set with an RGB-D camera. Experimental results show that this method can reduce the probability of tracking failure and improve the localization accuracy of SLAM (Simultaneous Localization and Mapping) systems. Compared with the traditional ORB-SLAM2, the test error of this method on each sequence in the TUM data set is significantly reduced, and the root mean square error is only 63.8% of that of the original system under the optimal condition.
Funding: This work was supported, in part, by the National Natural Science Foundation of China under grant number 62272236; in part, by the Natural Science Foundation of Jiangsu Province under grant numbers BK20201136 and BK20191401; and in part, by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund.
Abstract: Signature verification, a method for determining the authenticity of signature images, is a biometric verification technique that can effectively reduce the risk of forged signatures in financial, legal, and other business environments. However, compared with ordinary images, signature images have the following characteristics. First, the strokes are thin, i.e., there is less effective information. Second, the signature changes slightly with the time, place, and mood of the signer, i.e., it has high intra-class differences. These challenges lead to the low accuracy of existing methods based on convolutional neural networks (CNN). This study proposes an end-to-end multi-path attention inverse discrimination network that focuses on the signature stroke parts to extract features by reversing the foreground and background of signature images, which effectively addresses the problem of scarce effective information. To address the high intra-class variability of signature images, we add multi-path attention modules between the discriminative streams and inverse streams to enhance the discriminative features of signature images. Moreover, a multi-path discrimination loss function is proposed. It does not require the feature representations of samples with the same class label to be infinitely close; it only requires the gap between the inter-class distance and the intra-class distance to be larger than the set classification threshold, which addresses the problem of high intra-class differences at its root. In addition, this loss also encourages the network to explore detailed information in the stroke parts, such as the crossing, thickness, and connection of strokes. We tested the method on the CEDAR, BHSig-Bengali, BHSig-Hindi, and GPDS Synthetic datasets and obtained accuracies of 100%, 96.24%, 93.86%, and 83.72%, respectively, which are higher than those of existing signature verification methods. This benefits signature authentication tasks in justice and finance.
Funding: This work was supported, in part, by the Natural Science Foundation of Jiangsu Province under Grant Numbers BK20201136 and BK20191401; in part, by the National Natural Science Foundation of China under Grant Numbers 61502240, 61502096, 61304205, and 61773219; and in part, by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund.
Abstract: The leakage of medical audio data in telemedicine seriously violates the privacy of patients. To avoid the leakage of patient information in telemedicine, a two-stage reversible robust audio watermarking algorithm is proposed to protect medical audio data. The scheme decomposes the medical audio into two independent embedding domains and embeds the robust watermark and the reversible watermark into the two domains, respectively. To preserve audio quality, the Hurst exponent is used to find a suitable position for watermark embedding. Because the two embedding domains are independent, embedding the second-stage reversible watermark does not affect the first-stage watermark, so the robustness of the first-stage watermark is well maintained. In the second stage, the correlation between the sampling points in the medical audio is used to modify the hidden bits of the histogram, which reduces the modification of the medical audio and the distortion caused by reversible embedding. Simulation experiments show that this scheme is strongly robust against signal processing operations such as MP3 compression of 48 dB, additive white Gaussian noise (AWGN) of 20 dB, low-pass filtering, resampling, re-quantization, and other attacks, and that it has good imperceptibility.
Funding: Supported, in part, by the Natural Science Foundation of Jiangsu Province under grant numbers BK20201136 and BK20191401; and in part, by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund.
Abstract: In a telemedicine diagnosis system, the emergence of 3D imaging enables doctors to make clearer judgments, and its accuracy directly affects doctors' diagnosis of disease. To ensure the safe transmission and storage of medical data, a 3D medical watermarking algorithm based on the wavelet transform is proposed in this paper. The proposed algorithm employs the principal component analysis (PCA) transform to reduce the data dimension, which minimizes the error between the extracted components and the original data in the mean square sense. In particular, the algorithm builds a bacterial foraging model based on particle swarm optimization (BF-PSO), by which the optimal wavelet coefficient is found for embedding and is used as the absolute feature of watermark embedding, thereby achieving the optimal balance between embedding capacity and imperceptibility. A series of experiments run in MATLAB on the standard MRI brain volume dataset demonstrates that the proposed algorithm is strongly robust and causes only small deformation of the 3D model after the watermark is embedded.
Funding: This work was supported, in part, by the National Natural Science Foundation of China under Grant Numbers 61502240, 61502096, 61304205, and 61773219; in part, by the Natural Science Foundation of Jiangsu Province under Grant Numbers BK20201136 and BK20191401; in part, by the Postgraduate Research & Practice Innovation Program of Jiangsu Province under Grant Number SJCX21_0363; and in part, by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund.
Abstract: Vehicle re-identification (ReID) aims to retrieve a target vehicle from an extensive image gallery through its appearance in various views in a cross-camera scenario. It has gradually become a core technology of intelligent transportation systems. Most existing vehicle re-identification models adopt joint learning of global and local features. However, they use the extracted global features directly, resulting in insufficient feature expression. Moreover, local features are primarily obtained through additional annotation and complex attention mechanisms, which incur extra costs. To solve these issues, a multi-feature learning model with enhanced local attention for vehicle re-identification (MFELA) is proposed in this paper. The model consists of a global branch and a local branch. The global branch utilizes both the middle-level and high-level semantic features of ResNet50 to enhance the global representation capability. In addition, multi-scale pooling operations are used to obtain multi-scale information. The local branch utilizes the proposed Region Batch Dropblock (RBD), which encourages the model to learn discriminative features for different local regions and, during training, randomly drops the same corresponding areas across a batch to enhance attention to local regions. The features from both branches are then combined to provide a more comprehensive and distinctive feature representation. Extensive experiments on the VeRi-776 and VehicleID datasets show that our method achieves excellent performance.
Funding: Supported, in part, by the National Natural Science Foundation of China under Grant Numbers 61502240, 61502096, 61304205, and 61773219; in part, by the Natural Science Foundation of Jiangsu Province under Grant Numbers BK20201136 and BK20191401; and in part, by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund.
Abstract: With the increasing application of surveillance cameras, vehicle re-identification (Re-ID) has attracted more attention in the field of public security. Vehicle Re-ID is challenging because of the large intra-class differences caused by different views of vehicles while traveling and the obvious inter-class similarities caused by similar appearances. Many existing methods focus on local attributes by marking local locations. However, these methods require additional annotations, resulting in complex algorithms and excessive computation time. To cope with these challenges, this paper proposes a vehicle Re-ID model based on an optimized DenseNet121 with a joint loss. The model applies the SE block to automatically obtain the importance of each channel feature and assign it a corresponding weight; features are then passed to the deeper layers after being adjusted by these weights, which reduces the transmission of redundant information during feature reuse in DenseNet121. At the same time, the proposed model leverages the complementary expressive advantages of the middle-layer features of the CNN to enhance its feature expression ability. Additionally, a joint loss combining focal loss and triplet loss is proposed for vehicle Re-ID to enhance the model's ability to discriminate difficult-to-separate samples by enlarging their weight during training. Experimental results on the VeRi-776 dataset show that mAP and Rank-1 reach 75.5% and 94.8%, respectively. In addition, Rank-1 on the small, medium, and large subsets of the VehicleID dataset reaches 81.3%, 78.9%, and 76.5%, respectively, which surpasses most existing vehicle Re-ID methods.
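The way focal loss enlarges the relative weight of difficult-to-separate samples can be seen from its standard form, FL(p) = -(1 - p)^γ · log(p), where p is the probability the model assigns to the true class. The sketch below is a minimal illustration of that formula only; the focusing parameter `gamma=2.0` is a commonly used value and an assumption here, and the abstract does not describe how the paper combines this term with triplet loss.

```python
import math


def cross_entropy(p_true):
    """Plain cross-entropy for the probability assigned to the true class."""
    return -math.log(p_true)


def focal_loss(p_true, gamma=2.0):
    """Focal loss: cross-entropy scaled by (1 - p_true) ** gamma.

    Well-classified samples (p_true near 1) are down-weighted strongly,
    so hard samples dominate the total loss. gamma=2.0 is an assumed value.
    """
    return ((1.0 - p_true) ** gamma) * cross_entropy(p_true)


# Scaling factors for an easy and a hard sample with gamma = 2:
# easy (p_true = 0.9) keeps 0.1 ** 2 = 0.01 of its cross-entropy,
# hard (p_true = 0.1) keeps 0.9 ** 2 = 0.81 of its cross-entropy.
easy = focal_loss(0.9)
hard = focal_loss(0.1)
```

With γ = 0 the expression reduces to ordinary cross-entropy, which makes the focusing parameter a convenient knob for how aggressively easy samples are discounted.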
Funding: This work was supported in part by the Natural Science Foundation of Jiangsu Province under Grant Numbers BK20201136 and BK20191401; in part by the National Natural Science Foundation of China under Grant Numbers 61502240, 61502096, 61304205, and 61773219; and in part by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund.
Abstract: The key to preventing COVID-19 is to diagnose patients quickly and accurately. Studies have shown that using Convolutional Neural Networks (CNNs) to analyze chest Computed Tomography (CT) images is helpful for timely COVID-19 diagnosis. However, owing to personal privacy issues, public chest CT data sets are relatively few, which has limited the application of CNNs to COVID-19 diagnosis. In addition, many CNNs have complex structures and massive numbers of parameters. Even when a dedicated Graphics Processing Unit (GPU) is used for acceleration, inference still takes a long time, which is not conducive to widespread application. To solve the above problems, this paper proposes a lightweight CNN classification model based on transfer learning. The lightweight CNN MobileNetV2 is used as the backbone of the model to cope with limited hardware resources and computing power. To alleviate the model overfitting caused by insufficient data, transfer learning is used to train the model. The study first uses the weight parameters trained on the ImageNet database to initialize the MobileNetV2 network and then retrains the model on the CT image data set provided by Kaggle. Experimental results on a computer equipped only with a Central Processing Unit (CPU) show that the model takes only 1.06 s on average to diagnose a chest CT image. Compared with other lightweight models, the proposed model achieves higher classification accuracy and reliability while retaining a lightweight architecture and few parameters, so it can easily be applied on computers without GPU acceleration. Code: github.com/ZhouJie-520/paper-codes.
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 61304205 and 61502240), the Natural Science Foundation of Jiangsu Province (BK20191401), and the Innovation and Entrepreneurship Training Project of College Students (202010300290, 202010300211, 202010300116E).
Abstract: Vehicle type recognition (VTR) is an important research topic because of its significance in intelligent transportation systems. However, recognizing vehicle type in real-world images is challenging because of illumination change and partial occlusion in real traffic environments. These difficulties limit the performance of current state-of-the-art methods, which are typically based on single-stage classification and do not consider feature availability. To address these difficulties, this paper proposes a two-stage vehicle type recognition method that combines the most effective Gabor features. The first stage leverages edge features to classify vehicles by size as big or small using a similarity k-nearest neighbor classifier (SKNNC). The second stage then recognizes the more specific vehicle type, such as bus, truck, sedan, or van, using a kernel sparse representation-based classifier (KSRC) applied to the most effective Gabor features, which are extracted by a set of Gabor wavelet kernels on the partitioned key patches. A verification and correction step based on minimum residual analysis is proposed to enhance the reliability of the VTR. To improve VTR efficiency, the most effective Gabor features are selected through gray relational analysis, which leverages the correlation between each Gabor feature image and the original image. Experimental results demonstrate that the proposed method not only improves the accuracy of VTR but also enhances the robustness of recognition to illumination change and partial occlusion.
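The feature-selection step can be sketched with the standard form of gray relational analysis, in which each candidate sequence is scored against a reference sequence. This is a minimal sketch, not the paper's procedure: it assumes the original image and each Gabor feature image have already been flattened and normalized to comparable ranges, and the function names and the distinguishing coefficient `rho = 0.5` are illustrative assumptions.

```python
def gray_relational_grades(reference, candidates, rho=0.5):
    """Return one gray relational grade per candidate sequence.

    reference  : reference sequence (e.g. normalized original-image values)
    candidates : list of sequences of the same length (e.g. Gabor feature images)
    rho        : distinguishing coefficient in (0, 1]; 0.5 is a common default
    """
    # Absolute differences between the reference and every candidate.
    deltas = [[abs(r - c) for r, c in zip(reference, cand)] for cand in candidates]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0.0:
        # Every candidate is identical to the reference.
        return [1.0] * len(candidates)
    # Grade = mean of the gray relational coefficients along each sequence.
    return [
        sum((d_min + rho * d_max) / (d + rho * d_max) for d in row) / len(row)
        for row in deltas
    ]


def select_top_k(reference, candidates, k, rho=0.5):
    """Indices of the k candidates with the highest relational grade."""
    grades = gray_relational_grades(reference, candidates, rho)
    return sorted(range(len(candidates)), key=lambda i: grades[i], reverse=True)[:k]
```

Under this scoring, a feature image that tracks the reference closely receives a grade near 1, so keeping only the top-ranked candidates shrinks the feature set passed to the second-stage classifier.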
Funding: This work was supported in part by the National Natural Science Foundation of China under grant numbers 61502240, 61502096, 61304205, and 61773219; in part by the Natural Science Foundation of Jiangsu Province under grant number BK20191401; and in part by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund.
Abstract: In the current dire situation of the coronavirus disease COVID-19, remote consultations have been proposed to avoid cross-infection and to offset regional differences in medical resources. However, the safety of digital medical imaging in remote consultations has also attracted increasing attention from the medical industry. To ensure the integrity and security of medical images, this paper proposes a robust watermarking algorithm, based on regions of interest (ROI) and the integer wavelet transform (IWT), to authenticate distorted medical images and recover them. First, the medical image is divided into two parts: regions of interest and regions of non-interest. The integrity of the ROI is then verified using a hash algorithm, and the recovery data for the ROI are calculated at the same time. In addition, binary images containing basic patient information are encrypted with a logistic chaotic map, and the synthetic watermark is then embedded in the medical carrier image using the IWT. The performance of the proposed algorithm is tested through MATLAB simulation experiments on lung CT images. Experimental results show that the algorithm can precisely locate the distorted areas of an image and recover the original ROI after verifying image reliability. A maximum peak signal-to-noise ratio (PSNR) of 51.24 is achieved, which indicates that the watermark is invisible, and the algorithm shows strong robustness against noise, compression, and filtering attacks.
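The logistic chaotic map encryption of the binary patient-information image can be illustrated with a short sketch. It assumes the common construction in which the logistic map x_{n+1} = r * x_n * (1 - x_n) is thresholded into a keystream and XORed with the watermark bits; the abstract does not specify the actual scheme, so the key values `x0` and `r`, the burn-in length, and the function names are hypothetical.

```python
def logistic_keystream(n, x0=0.3141, r=3.99, burn_in=100):
    """Generate n key bits by thresholding the logistic map at 0.5.

    x0 and r act as the secret key (assumed values); r close to 4 keeps the
    map in its chaotic regime. The first burn_in iterates are discarded.
    """
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    bits = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        bits.append(1 if x >= 0.5 else 0)
    return bits


def logistic_xor(bits, x0=0.3141, r=3.99):
    """Encrypt or decrypt a flat list of 0/1 watermark bits.

    XOR with the same keystream is its own inverse, so calling this twice
    with the same key restores the original bits.
    """
    key = logistic_keystream(len(bits), x0, r)
    return [b ^ k for b, k in zip(bits, key)]


# Usage: a tiny 2x4 binary watermark flattened row by row.
watermark = [1, 0, 1, 1, 0, 0, 1, 0]
cipher = logistic_xor(watermark)
restored = logistic_xor(cipher)
```

Because decryption reuses the same routine, the receiver needs only the key pair (`x0`, `r`) to recover the patient information after the watermark is extracted.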