Global images of auroras obtained by cameras on spacecraft are a key tool for studying the near-Earth environment. However, the cameras are sensitive not only to auroral emissions produced by precipitating particles, but also to dayglow emissions produced by photoelectrons induced by sunlight. Nightglow emissions and scattered sunlight can contribute to the background signal. To fully utilize such images in space science, background contamination must be removed to isolate the auroral signal. Here we outline a data-driven approach to modeling the background intensity in multiple images by formulating linear inverse problems based on B-splines and spherical harmonics. The approach is robust, flexible, and iteratively deselects outliers, such as auroral emissions. The final model is smooth across the terminator and accounts for slow temporal variations and large-scale asymmetries in the dayglow. We demonstrate the model by using the three far ultraviolet cameras on the Imager for Magnetopause-to-Aurora Global Exploration (IMAGE) mission. The method can be applied to historical missions and is relevant for upcoming missions, such as the Solar wind Magnetosphere Ionosphere Link Explorer (SMILE) mission.
Road traffic monitoring is an imperative topic widely discussed among researchers. Systems used to monitor traffic frequently rely on cameras mounted on bridges or roadsides. However, aerial images provide the flexibility to use mobile platforms to detect the location and motion of vehicles over a larger area. To this end, different models have shown the ability to recognize and track vehicles. However, these methods are not mature enough to produce accurate results in complex road scenes. Therefore, this paper presents an algorithm that combines state-of-the-art techniques for identifying and tracking vehicles in conjunction with image bursts. The extracted frames were converted to grayscale, followed by the application of a georeferencing algorithm to embed coordinate information into the images. A masking technique eliminated irrelevant data and reduced the computational cost of the overall monitoring system. Next, Sobel edge detection combined with Canny edge detection and the Hough line transform was applied for noise reduction. After preprocessing, a blob detection algorithm was used to detect the vehicles. Vehicles of varying sizes were detected by implementing a dynamic thresholding scheme. Detection was performed on the first image of every burst. Then, to track vehicles, a model of each vehicle was matched against the succeeding images using a template matching algorithm. To further improve the tracking accuracy by incorporating motion information, Scale Invariant Feature Transform (SIFT) features were used to find the best possible match among multiple matches. An accuracy of 87% for detection and 80% for tracking was achieved on the A1 Motorway Netherland dataset. For the Vehicle Aerial Imaging from Drone (VAID) dataset, an accuracy of 86% for detection and 78% for tracking was achieved.
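The template matching step described above can be illustrated with a small sketch. This is a minimal pure-Python example, assuming a sum-of-squared-differences score and grayscale images stored as nested lists; the abstract does not state which similarity measure was used, and the function name `match_template` is hypothetical.

```python
def match_template(image, template):
    """Slide `template` over `image` and return the best (row, col) and its score.

    Assumption: similarity is measured by the sum of squared differences (SSD),
    where a lower score means a better match. The source abstract does not
    specify the measure it used.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_pos, best_score = None, float("inf")
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            # SSD between the template and the image window at (r, c)
            score = sum(
                (image[r + i][c + j] - template[i][j]) ** 2
                for i in range(th)
                for j in range(tw)
            )
            if score < best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score


# Toy frame with a 2x2 "vehicle" patch whose top-left corner is at row 1, col 2.
frame = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 6, 0],
    [0, 0, 0, 0, 0],
]
vehicle = [[9, 8], [7, 6]]
position, score = match_template(frame, vehicle)
```

In a tracking setting, the patch found in the first image of a burst would be searched for in each succeeding image; when several windows score similarly, an additional cue (SIFT features in the abstract) is needed to pick one.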
Transformer-based models have facilitated significant advances in object detection. However, their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unmanned aerial vehicle (UAV) imagery. Addressing these limitations, we propose a hybrid transformer-based detector, H-DETR, and enhance it for dense small objects, leading to an accurate and efficient model. Firstly, we introduce a hybrid transformer encoder, which integrates a convolutional neural network-based cross-scale fusion module with the original encoder to handle multi-scale feature sequences more efficiently. Furthermore, we propose two novel strategies to enhance detection performance without incurring additional inference computation. Query filter is designed to cope with the dense clustering inherent in drone-captured images by counteracting similar queries with a training-aware non-maximum suppression. Adversarial denoising learning is a novel enhancement method inspired by adversarial learning, which improves the detection of numerous small targets by counteracting the effects of artificial spatial and semantic noise. Extensive experiments on the VisDrone and UAVDT datasets substantiate the effectiveness of our approach, achieving a significant improvement in accuracy with a reduction in computational complexity. Our method achieves 31.9% and 21.1% AP on the VisDrone and UAVDT datasets, respectively, and has a faster inference speed, making it a competitive model in UAV image object detection.
The pancreas is neither part of the five Zang organs (五脏) nor the six Fu organs (六腑). Thus, it has received little attention in Chinese medical literature. In the late 19th century, medical missionaries in China started translating and introducing anatomical and physiological knowledge about the pancreas. As for the word pancreas, an early and influential translation was “sweet meat” (甜肉), proposed by Benjamin Hobson (合信). The translation “sweet meat” is not faithful to the original meaning of “pancreas”, but is a term coined by Hobson based on his personal habits, and the word “sweet” appeared by chance. However, in the decades since the term “sweet meat” became popular, Chinese medicine practitioners, such as Tang Zonghai (唐宗海), reinterpreted it by drawing new medical illustrations for “sweet meat” and giving new connotations to the word “sweet”. This discussion and interpretation of “sweet meat” in modern China, particularly among Chinese medicine professionals, is not only a dissemination and interpretation of the knowledge of “pancreas”, but also a construction of knowledge around the term “sweet meat”.
Artificial Intelligence (AI) is being increasingly used for diagnosing Vision-Threatening Diabetic Retinopathy (VTDR), which is a leading cause of visual impairment and blindness worldwide. However, previous automated VTDR detection methods have mainly relied on manual feature extraction and classification, leading to errors. This paper proposes a novel VTDR detection and classification model that combines different models through majority voting. Our proposed methodology involves preprocessing, data augmentation, feature extraction, and classification stages. We use a hybrid convolutional neural network-singular value decomposition (CNN-SVD) model for feature extraction and selection and an improved SVM-RBF with a Decision Tree (DT) and K-Nearest Neighbor (KNN) for classification. We tested our model on the IDRiD dataset and achieved an accuracy of 98.06%, a sensitivity of 83.67%, and a specificity of 100% for DR detection and evaluation tests, respectively. Our proposed approach outperforms baseline techniques and provides a more robust and accurate method for VTDR detection.
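The majority-voting step that combines the classifiers can be sketched as follows. This is a minimal illustration, assuming each classifier has already produced one label per sample; the tie-breaking rule (prefer the earliest-listed classifier) and the function name `majority_vote` are assumptions, since the abstract does not describe how ties are resolved.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one voted label per sample.

    `predictions` is a list of equal-length label lists, one per classifier.
    Assumption: ties go to the earliest-listed classifier whose label has
    the top count; the source abstract does not specify a tie rule.
    """
    voted = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        voted.append(next(label for label in labels if counts[label] == top))
    return voted


# Hypothetical outputs of three classifiers (e.g. SVM-RBF, DT, KNN) on 4 samples.
svm_pred = [1, 0, 1, 1]
dt_pred = [1, 1, 0, 1]
knn_pred = [0, 0, 1, 1]
final = majority_vote([svm_pred, dt_pred, knn_pred])
```

With an odd number of binary classifiers a tie cannot occur, so the tie rule only matters for multi-class labels or an even number of voters.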
Recent advances in spectral sensing techniques and machine learning (ML) methods have enabled the estimation of plant physiochemical traits. Nitrogen (N) is a primary limiting factor for terrestrial forest growth, but traditional methods for N determination are labor-intensive, time-consuming, and destructive. In this study, we present a rapid, non-destructive method to predict leaf N concentration (LNC) in Metasequoia glyptostroboides plantations under N and phosphorus (P) fertilization using ML techniques and unmanned aerial vehicle (UAV)-based RGB (red, green, blue) images. Nine spectral vegetation indices (VIs) were extracted from the RGB images. The spectral reflectance and VIs were used as input features to construct models for estimating LNC based on support vector machine, random forest (RF), multiple linear regression, gradient boosting regression, and classification and regression trees (CART). The results show that RF is the best fitting model for estimating LNC, with a coefficient of determination (R²) of 0.73. Using this model, we evaluated the effects of N and P treatments on LNC and found a significant increase with N and a decrease with P. Height, diameter at breast height (DBH), and crown width of all M. glyptostroboides were analyzed by Pearson correlation with the predicted LNC. DBH was significantly correlated with LNC under N treatment. Our results highlight the potential of combining UAV RGB images with an ML algorithm as an efficient, scalable, and cost-effective method for LNC quantification. Future research can extend this approach to different tree species and different plant traits, paving the way for large-scale, time-efficient plant growth monitoring.
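To make the idea of an RGB vegetation index concrete, here is a small sketch of one widely used index, Excess Green (ExG = 2g − r − b on chromatic coordinates). The abstract does not name its nine indices, so the choice of ExG is an assumption made purely for illustration, as is the function name `excess_green`.

```python
def excess_green(red, green, blue):
    """Excess Green index (ExG = 2g - r - b) for one pixel or band mean.

    r, g, b are chromatic coordinates, i.e. each band divided by the band sum.
    Assumption: ExG is used only as an example of an RGB vegetation index;
    the source abstract does not say which nine indices it extracted.
    """
    total = red + green + blue
    if total == 0:
        return 0.0  # convention chosen here for a black pixel
    r, g, b = red / total, green / total, blue / total
    return 2 * g - r - b


# A greenish canopy pixel versus a neutral grey pixel.
canopy = excess_green(50, 100, 50)
grey = excess_green(100, 100, 100)
```

Values like these, computed per plot, would form part of the feature vector passed to a regression model such as a random forest.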
The intuitive fuzzy set has found important application in decision-making and machine learning. To enrich and utilize the intuitive fuzzy set, this study designed and developed a deep neural network-based glaucoma eye detection using fuzzy difference equations in the domain where the retinal images converge. Retinal image detections are categorized as normal eye recognition, suspected glaucomatous eye recognition, and glaucomatous eye recognition. Fuzzy degrees associated with weighted values are calculated to determine the level of concentration between the fuzzy partition and the retinal images. The proposed model was used to diagnose glaucoma using retinal images and involved utilizing the Convolutional Neural Network (CNN) and deep learning to identify the fuzzy weighted regularization between images. This methodology was used to clarify the input images and make them adequate for the process of glaucoma detection. The objective of this study was to propose a novel approach to the early diagnosis of glaucoma using the Fuzzy Expert System (FES) and Fuzzy differential equation (FDE). The intensities of the different regions in the images and their respective peak levels were determined. Once the peak regions were identified, the recurrence relationships among those peaks were then measured. Image partitioning was done due to varying degrees of similar and dissimilar concentrations in the image. Similar and dissimilar concentration levels and spatial frequency generated a threshold image from the combined fuzzy matrix and FDE. This distinguished between a normal and abnormal eye condition, thus detecting patients with glaucomatous eyes.
In the intelligent medical diagnosis area, Artificial Intelligence (AI)’s trustworthiness, reliability, and interpretability are critical, especially in cancer diagnosis. Traditional neural networks, while excellent at processing natural images, often lack interpretability and adaptability when processing high-resolution digital pathological images. This limitation is particularly evident in pathological diagnosis, which is the gold standard of cancer diagnosis and relies on a pathologist’s careful examination and analysis of digital pathological slides to identify the features and progression of the disease. Therefore, the integration of interpretable AI into smart medical diagnosis is not only an inevitable technological trend but also a key to improving diagnostic accuracy and reliability. In this paper, we introduce an innovative Multi-Scale Multi-Branch Feature Encoder (MSBE) and present the design of the CrossLinkNet Framework. The MSBE enhances the network’s capability for feature extraction by allowing the adjustment of hyperparameters to configure the number of branches and modules. The CrossLinkNet Framework, serving as a versatile image segmentation network architecture, employs cross-layer encoder-decoder connections for multi-level feature fusion, thereby enhancing feature integration and segmentation accuracy. Comprehensive quantitative and qualitative experiments on two datasets demonstrate that CrossLinkNet, equipped with the MSBE encoder, not only achieves accurate segmentation results but is also adaptable to various tumor segmentation tasks and scenarios by replacing different feature encoders. Crucially, CrossLinkNet emphasizes the interpretability of the AI model, a crucial aspect for medical professionals, providing an in-depth understanding of the model’s decisions and thereby enhancing trust and reliability in AI-assisted diagnostics.
This paper explores a double quantum images representation (DNEQR) model that allows for simultaneous storage of two digital images in a quantum superposition state. Additionally, a new type of two-dimensional hyperchaotic system based on sine and logistic maps is investigated, offering a wider parameter space and better chaotic behavior compared to the sine and logistic maps. Based on the DNEQR model and the hyperchaotic system, a double quantum images encryption algorithm is proposed. Firstly, two classical plaintext images are transformed into quantum states using the DNEQR model. Then, the proposed hyperchaotic system is employed to iteratively generate pseudo-random sequences. These chaotic sequences are utilized to perform pixel value and position operations on the quantum image, resulting in changes to both pixel values and positions. Finally, the ciphertext image can be obtained by qubit-level diffusion using two XOR operations between the position-permutated image and the pseudo-random sequences. The corresponding quantum circuits are also given. Experimental results demonstrate that the proposed scheme ensures the security of the images during transmission, improves the encryption efficiency, and enhances anti-interference and anti-attack capabilities.
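The XOR diffusion idea can be illustrated classically. The sketch below is not the paper's scheme: it substitutes the ordinary one-dimensional logistic map for the proposed two-dimensional hyperchaotic system (whose equations the abstract does not give), works on classical byte values rather than quantum states, and uses hypothetical names and parameters (`logistic_keystream`, `xor_diffuse`, `x0`, `r`, `burn_in`).

```python
def logistic_keystream(x0, r, n, burn_in=100):
    """Generate n pseudo-random bytes from the logistic map x -> r*x*(1-x).

    Assumption: the plain logistic map stands in for the paper's 2D
    hyperchaotic system, which is not specified in the abstract.
    """
    x = x0
    for _ in range(burn_in):  # discard transient iterations
        x = r * x * (1.0 - x)
    stream = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        stream.append(int(x * 256) % 256)
    return stream


def xor_diffuse(pixels, x0=0.3, r=3.99):
    """XOR each pixel with the keystream; applying it twice restores the input."""
    key = logistic_keystream(x0, r, len(pixels))
    return [p ^ k for p, k in zip(pixels, key)]


plain = [0, 17, 128, 255, 42]
cipher = xor_diffuse(plain)
recovered = xor_diffuse(cipher)
```

Because XOR is its own inverse, decryption is the same call with the same key parameters (`x0`, `r`), which is why the initial conditions of the chaotic system act as the secret key in schemes of this kind.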
Rapid and accurate acquisition of soil organic matter (SOM) information in cultivated land is important for sustainable agricultural development and carbon balance management. This study proposed a novel approach to predict SOM with high accuracy using multiyear synthetic remote sensing variables on a monthly scale. We obtained 12 monthly synthetic Sentinel-2 images covering the study area from 2016 to 2021 through the Google Earth Engine (GEE) platform, and reflectance bands and vegetation indices were extracted from these composite images. Then the random forest (RF), support vector machine (SVM), and gradient boosting regression tree (GBRT) models were tested to investigate the difference in SOM prediction accuracy under different combinations of monthly synthetic variables. Results showed that firstly, all monthly synthetic spectral bands of Sentinel-2 showed a significant correlation with SOM (P<0.05) for the months of January, March, April, October, and November. Secondly, in terms of single-monthly composite variables, the prediction accuracy was relatively poor, with the highest R² value of 0.36 being observed in January. When monthly synthetic environmental variables were grouped in accordance with the four quarters of the year, the first quarter and the fourth quarter showed good performance, and any combination of three quarters was similar in estimation accuracy. The overall best performance was observed when all monthly synthetic variables were incorporated into the models. Thirdly, among the three models compared, the RF model was consistently more accurate than the SVM and GBRT models, achieving an R² value of 0.56. Except for band 12 in December, the importance of the remaining bands did not exhibit significant differences. This research offers a new attempt to map SOM with high accuracy and fine spatial resolution based on monthly synthetic Sentinel-2 images.
Objective: The volume of medical images has increased rapidly in the digital medicine era, presenting an opportunity for the intervention of artificial intelligence (AI). To explore the value of convolutional neural network (CNN) algorithms in endoscopic images, we developed an AI-assisted comprehensive analysis system for endoscopic images and explored its performance in real clinical scenarios. Methods: A total of 6,270 white light endoscopic images from 516 cases were used to train 14 different CNN models. The images were divided into a training set, a validation set, and a test set at a ratio of 7:1:2 to explore the possibility of discriminating gastric cancer (GC) from benign lesions (nGC), gastric ulcer (GU) from ulcerated cancer (UCa), early gastric cancer (EGC) from nGC, infection with Helicobacter pylori (Hp) from no infection (noHp), and metastasis from no metastasis at perigastric lymph nodes. Results: Among the 14 CNN models, EfficientNetB7 showed the best performance on the two-category tasks of GC and nGC [accuracy: 96.40% and area under the curve (AUC)=0.9959], GU and UCa (accuracy: 90.84% and AUC=0.8155), EGC and nGC (accuracy: 97.88% and AUC=0.9943), and Hp and noHp (accuracy: 83.33% and AUC=0.9096). In contrast, the InceptionV3 model showed better performance in predicting metastasis and no metastasis of perigastric lymph nodes for EGC (accuracy: 79.44% and AUC=0.7181). In addition, an integrated analysis of endoscopic images and gross images of gastrectomy specimens was performed on 95 cases by EfficientNetB7 and the RFB-SSD object detection model, resulting in 100% predictive accuracy in EGC. Conclusions: Taken together, this study integrated image sources from endoscopic examination and gastrectomy of gastric tumors and incorporated the advantages of different CNN models. The AI-assisted diagnostic system will play an important role in the therapeutic decision-making of EGC.
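The 7:1:2 division into training, validation, and test sets can be sketched as below. This is a minimal illustration, assuming a shuffled split at the image level with a fixed seed; the abstract does not say whether the split was made per image or per case, and the function name `split_7_1_2` is hypothetical.

```python
import random


def split_7_1_2(items, seed=0):
    """Shuffle `items` and split them into train/validation/test at 7:1:2.

    Assumption: items are split individually (image level). Splitting by
    patient case instead would avoid images of one case landing in both
    the training and test sets; the abstract does not state which was done.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 7 // 10
    n_val = n // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder, about 20%
    return train, val, test


# 6,270 image indices, matching the image count reported in the abstract.
train_ids, val_ids, test_ids = split_7_1_2(range(6270))
```

For 6,270 images this yields 4,389 training, 627 validation, and 1,254 test images.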
In the present research, we describe a computer-aided detection (CAD) method for automatic fetal head circumference (HC) measurement in 2D ultrasound images during all trimesters of pregnancy. The HC can be used to determine gestational age and to track fetal development. This automated approach is particularly valuable in low-resource settings where access to trained sonographers is limited. The CAD system is divided into two steps. First, Haar-like features were extracted from the ultrasound images to train a random forest classifier to locate the fetal skull. Second, we identified the HC using dynamic programming, an elliptical fit, and a Hough transform. The CAD system was trained on 999 images (HC18 challenge data source) and then verified on 335 images from all trimesters in an independent test set. A skilled sonographer and a medical expert manually annotated the test set. We used the crown-rump length (CRL) measurement to calculate the reference gestational age (GA). In the first, second, and third trimesters, the median difference between the reference GA and the GA calculated by the skilled sonographer was 0.7±2.7, 0.0±4.5, and 2.0±12.0 days, respectively. The corresponding difference between the reference GA and the health investigator’s GA was 1.5±3.0, 1.9±5.0, and 4.0±14 days. The mean difference between the reference GA and the CAD system’s GA was between 0.5 and 5.0, with an additional variation of 2.9 to 12.5 days. The outcomes reveal that the CAD system outperforms an expert sonographer. When compared with the approaches reported in the literature, the provided system achieves results that are comparable or even better. We have assessed this computerized approach for HC evaluation, which includes information from all trimesters of gestation.
Object detection in unmanned aerial vehicle (UAV) aerial images has become increasingly important in military and civil applications. General object detection models are not robust enough against interclass similarity and intraclass variability of small objects, and UAV-specific nuisances such as uncontrolled weather conditions. Unlike previous approaches focusing on high-level semantic information, we report the importance of underlying features to improve detection accuracy and robustness from the information-theoretic perspective. Specifically, we propose a robust and discriminative feature learning approach through mutual information maximization (RD-MIM), which can be integrated into numerous object detection methods for aerial images. Firstly, we present the rank sample mining method to reduce underlying feature differences between the natural image domain and the aerial image domain. Then, we design a momentum contrast learning strategy to make object features similar to the same category and dissimilar to different categories. Finally, we construct a transformer-based global attention mechanism to boost object location semantics by leveraging the high interrelation of different receptive fields. We conduct extensive experiments on the VisDrone and Unmanned Aerial Vehicle Benchmark Object Detection and Tracking (UAVDT) datasets to prove the effectiveness of the proposed method. The experimental results show that our approach brings considerable robustness gains to basic detectors and advanced detection methods, achieving relative growth rates of 51.0% and 39.4% in corruption robustness, respectively. Our code is available at https://github.com/cq100/RD-MIM (accessed on 2 August 2024).
Industrial activities, through the human-induced release of greenhouse gas (GHG) emissions, have been identified as the primary cause of global warming. Accurate and quantitative monitoring of these emissions is essential for a comprehensive understanding of their impact on the Earth’s climate and for effectively enforcing emission regulations at a large scale. This work examines the feasibility of detecting and quantifying industrial smoke plumes using freely accessible geo-satellite imagery. Existing systems have limitations in accuracy, robustness, and efficiency, and these limitations hinder their effectiveness in supporting a timely response to industrial fires. In this work, grayscale images are used instead of traditional color images for smoke plume detection. A ResNet-50 model was trained on the dataset for classification and a U-Net model for segmentation. The dataset consists of images gathered by the European Space Agency’s Sentinel-2 satellite constellation from a selection of industrial sites. The acquired images predominantly capture scenes of industrial locations, some of which exhibit active smoke plume emissions. The performance of the above-mentioned techniques and models is represented by their accuracy and IoU (Intersection-over-Union) metric. The models are first trained on the basic RGB images, where classification using the ResNet-50 model results in an accuracy of 94.4%, and segmentation using the U-Net model yields an IoU of 0.5 and an accuracy of 94%, which leads to the detection of the exact patches where the smoke plume has occurred. This work then trained the classification model on grayscale images, increasing the accuracy to 96.4%.
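The Intersection-over-Union metric used to score segmentation can be computed as in the following sketch. It assumes binary masks stored as nested lists of 0/1 values; the function name `iou` and the convention of returning 1.0 when both masks are empty are choices made here, not details taken from the abstract.

```python
def iou(pred_mask, true_mask):
    """Intersection-over-Union of two binary masks of equal shape.

    Assumption: when both masks are empty the score is defined as 1.0
    (perfect agreement); other conventions exist.
    """
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    return intersection / union if union else 1.0


# Predicted plume covers the top row; the true plume covers the left column.
predicted = [[1, 1], [0, 0]]
actual = [[1, 0], [1, 0]]
score = iou(predicted, actual)
```

Here one pixel overlaps out of three pixels marked in either mask, so the score is 1/3. An IoU of 0.5, as reported in the abstract, means the overlap covers half of the combined predicted and true area.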
In minimally invasive surgery, endoscopes or laparoscopes equipped with miniature cameras and tools are used to enter the human body for therapeutic purposes through small incisions or natural cavities. However, in clinical operating environments, endoscopic images often suffer from challenges such as low texture, uneven illumination, and non-rigid structures, which affect feature observation and extraction. This can severely impact surgical navigation or clinical diagnosis due to missing feature points in endoscopic images, leading to treatment and postoperative recovery issues for patients. To address these challenges, this paper introduces, for the first time, a Cross-Channel Multi-Modal Adaptive Spatial Feature Fusion (ASFF) module based on the lightweight architecture of EfficientViT. Additionally, a novel lightweight feature extraction and matching network based on attention mechanism is proposed. This network dynamically adjusts attention weights for cross-modal information from grayscale images and optical flow images through a dual-branch Siamese network. It extracts static and dynamic information features ranging from low-level to high-level, and from local to global, ensuring robust feature extraction across different widths, noise levels, and blur scenarios. Global and local matching are performed through a multi-level cascaded attention mechanism, with cross-channel attention introduced to simultaneously extract low-level and high-level features. Extensive ablation experiments and comparative studies are conducted on the HyperKvasir, EAD, M2caiSeg, CVC-ClinicDB, and UCL synthetic datasets. Experimental results demonstrate that the proposed network improves upon the baseline EfficientViT-B3 model by 75.4% in accuracy (Acc), while also enhancing runtime performance and storage efficiency. When compared with the complex DenseDescriptor feature extraction network, the difference in Acc is less than 7.22%, and IoU calculation results on specific datasets outperform complex dense models. Furthermore, this method increases the F1 score by 33.2% and accelerates runtime by 70.2%. It is noteworthy that the speed of CMMCAN surpasses that of comparative lightweight models, with feature extraction and matching performance comparable to existing complex models but with faster speed and higher cost-effectiveness.
Recovering high-quality inscription images from unknown and complex noisy inscription images is a challenging research issue. Unlike natural images, character images depend more heavily on stroke information. However, existing models mainly consider pixel-level information while ignoring structural information of the character, such as its edge and glyph, resulting in reconstructed images with mottled local structure and character damage. To solve these problems, we propose a novel generative adversarial network (GAN) framework based on an edge-guided generator and a discriminator constructed by a dual-domain U-Net framework, i.e., EDU-GAN. Unlike existing frameworks, the generator introduces an edge extraction module, guiding it into the denoising process through the attention mechanism, which maintains the edge detail of the restored inscription image. Moreover, a dual-domain U-Net-based discriminator is proposed to learn the global and local discrepancy between the denoised and the label images in both image and morphological domains, which is helpful for blind denoising tasks. The proposed dual-domain discriminator and generator for adversarial training can reduce local artifacts and keep the denoised character structure intact. Because real-inscription images were lacking, we built a real-inscription dataset to provide an effective benchmark for studying inscription image denoising. The experimental results show the superiority of our method on both the synthetic and real-inscription datasets.
Multiplicative noise removal problems have attracted much attention in recent years. Unlike additive noise, multiplicative noise destroys almost all information of the original image, especially for texture images. Motivated by the TV-Stokes model, we propose a new two-step variational model, with a good geometric explanation, to denoise texture images corrupted by multiplicative noise. In the first step, we convert the multiplicative denoising problem into an additive one by the logarithm transform and propagate the isophote directions in the tangential field smoothing. Once the isophote directions are constructed, an image is restored to fit the constructed directions in the second step. The existence and uniqueness of the solution to the variational problems are proved. In these two steps, we use the gradient descent method and construct finite difference schemes to solve the problems. In particular, the augmented Lagrangian method and the fast Fourier transform are adopted to accelerate the calculation. Experimental results show that the proposed model can remove the multiplicative noise efficiently and protect the texture well.
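The logarithm transform mentioned in the first step can be written out explicitly. The symbols below are assumptions introduced for illustration, since the abstract defines none: $f$ is the observed image, $u$ the clean image, and $\eta$ the multiplicative noise, all strictly positive.

```latex
\begin{align}
f &= u\,\eta, \qquad u > 0,\ \eta > 0, \\
\log f &= \log u + \log \eta .
\end{align}
```

Writing $z = \log f$, $w = \log u$, and $n = \log \eta$ gives the additive model $z = w + n$, so additive denoising machinery can be applied to $z$, and an estimate of the clean image is recovered afterwards as $u = e^{w}$.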
Detecting brain tumours is complex due to the natural variation in their location, shape, and intensity in images. Although accurate detection and segmentation of brain tumours would be beneficial, current methods have not yet solved this problem despite the numerous available approaches. Precise analysis of Magnetic Resonance Imaging (MRI) is crucial for detecting, segmenting, and classifying brain tumours in medical diagnostics. MRI is a vital component in medical diagnosis, and it requires precise, efficient, careful, and reliable image analysis techniques. The authors developed a Deep Learning (DL) fusion model to classify brain tumours reliably. DL models require large amounts of training data to achieve good results, so the researchers utilised data augmentation techniques to increase the dataset size for training models. VGG16, ResNet50, and convolutional deep belief networks were used to extract deep features from MRI images. Softmax was used as the classifier, and the training set was supplemented with intentionally created MRI images of brain tumours in addition to the genuine ones. The features of two DL models were combined in the proposed model to generate a fusion model, which significantly increased classification accuracy. An openly accessible dataset from the internet was used to test the model's performance, and the experimental results showed that the proposed fusion model achieved a classification accuracy of 98.98%. Finally, the results were compared with existing methods, and the proposed model outperformed them significantly.
Mobile technology is developing significantly. Mobile phone technologies have been integrated into the healthcare industry to help medical practitioners. Typically, computer vision models focus on image detection and classification issues. MobileNetV2 is a computer vision model that performs well on mobile devices, but it requires cloud services to process biometric image information and provide predictions to users, which increases latency. Processing biometric image datasets on mobile devices would make prediction faster, but mobile devices are resource-restricted in terms of storage, power, and computational speed. Hence, a model that is small, efficient, and has good prediction quality for biometric image classification problems is required. Quantizing a pre-trained CNN (PCNN) MobileNetV2 architecture combined with a Support Vector Machine (SVM) compacts the model representation and reduces the computational cost and memory requirement. The proposed approach combines the quantized pre-trained CNN (PCNN) MobileNetV2 architecture with an SVM to represent models efficiently with low computational cost and memory. Our contributions include evaluating three CNN models for ocular disease identification in transfer learning and deep feature plus SVM approaches, showing the superiority of deep features from MobileNetV2 combined with SVM classification models, comparing traditional methods, exploring classification of six ocular diseases and the normal class with 20,111 images after data augmentation, and reducing the number of trainable models. The model is trained on retinal fundus image datasets of ocular disorders covering six conditions, namely age-related macular degeneration (AMD), one of the most common eye illnesses, cataract, diabetes, glaucoma, hypertension, and myopia, plus one Normal class. The experimental outcomes show that the size of the suggested MobileNetV2-SVM model is compressed. The testing accuracies for MobileNetV2-SVM, InceptionV3, and MobileNetV2 are 90.11%, 86.88%, and 89.76%, respectively, while the accuracies of MobileNetV2-SVM, InceptionV3, and MobileNetV2 are observed to be 92.59%, 83.38%, and 90.16%, respectively. The proposed technique can be used to classify all biometric medical image datasets on mobile devices.
This paper presents an accurate method for identifying spine fractures in X-ray images, with an emphasis on faster digital processing time. The study focuses on efficiency by combining several methods, including image segmentation, feature reduction, and image classification. Two elements are investigated to reduce the classification time: feature reduction software and the capabilities of sophisticated digital processing hardware. The researchers use different algorithms for image enhancement, including the Wiener and Kalman filters, and they examine two background correction techniques. The article presents a technique for extracting textural features and evaluates three image segmentation algorithms and three fractured-spine detection algorithms using the transform domain, Power Density Spectrum (PDS), and Higher-Order Statistics (HOS) for feature extraction. With its emphasis on reducing digital processing time, this comprehensive method helps to create a simplified system for classifying spine fractures. Feature reduction program code has been built to improve the processing speed of image classification. Overall, the proposed approach shows great potential for significantly reducing classification time in clinical settings where time is critical. In comparison with other transform domains, the discrete cosine transform (DCT) of the texture features yielded an exceptional classification rate, and extracting features from the transform domain took less time. More capable hardware can also result in quicker execution times for the feature extraction algorithms.
Funding: Supported by the Research Council of Norway under contracts 223252/F50 and 300844/F50 and by the Trond Mohn Foundation.
Abstract: Global images of auroras obtained by cameras on spacecraft are a key tool for studying the near-Earth environment. However, the cameras are sensitive not only to auroral emissions produced by precipitating particles, but also to dayglow emissions produced by photoelectrons induced by sunlight. Nightglow emissions and scattered sunlight can also contribute to the background signal. To fully utilize such images in space science, background contamination must be removed to isolate the auroral signal. Here we outline a data-driven approach to modeling the background intensity in multiple images by formulating linear inverse problems based on B-splines and spherical harmonics. The approach is robust and flexible, and it iteratively deselects outliers, such as auroral emissions. The final model is smooth across the terminator and accounts for slow temporal variations and large-scale asymmetries in the dayglow. We demonstrate the model by using the three far ultraviolet cameras on the Imager for Magnetopause-to-Aurora Global Exploration (IMAGE) mission. The method can be applied to historical missions and is relevant for upcoming missions, such as the Solar wind Magnetosphere Ionosphere Link Explorer (SMILE) mission.
Funding: Supported by a grant from the Basic Science Research Program through the National Research Foundation (NRF) (2021R1F1A1063634), funded by the Ministry of Science and ICT (MSIT), Republic of Korea. The authors are thankful to the Deanship of Scientific Research at Najran University for funding this work under the Research Group Funding Program Grant Code (NU/RG/SERC/13/40). The authors are also thankful to Prince Satam bin Abdulaziz University for supporting this study via funding from Prince Satam bin Abdulaziz University project number (PSAU/2024/R/1445). This work was also supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2023R54), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Road traffic monitoring is an important topic widely discussed among researchers. Systems used to monitor traffic frequently rely on cameras mounted on bridges or roadsides. However, aerial images provide the flexibility to use mobile platforms to detect the location and motion of vehicles over a larger area. To this end, different models have shown the ability to recognize and track vehicles. However, these methods are not mature enough to produce accurate results in complex road scenes. Therefore, this paper presents an algorithm that combines state-of-the-art techniques for identifying and tracking vehicles in conjunction with image bursts. The extracted frames were converted to grayscale, followed by the application of a georeferencing algorithm to embed coordinate information into the images. The masking technique eliminated irrelevant data and reduced the computational cost of the overall monitoring system. Next, Sobel edge detection combined with Canny edge detection and the Hough line transform was applied for noise reduction. After preprocessing, the blob detection algorithm helped detect the vehicles. Vehicles of varying sizes were detected by implementing a dynamic thresholding scheme. Detection was done on the first image of every burst. Then, to track vehicles, a model of each vehicle was built and matched in the succeeding images using the template matching algorithm. To further improve the tracking accuracy by incorporating motion information, Scale Invariant Feature Transform (SIFT) features were used to find the best possible match among multiple matches. An accuracy of 87% for detection and 80% for tracking was achieved on the A1 Motorway Netherland dataset. For the Vehicle Aerial Imaging from Drone (VAID) dataset, an accuracy of 86% for detection and 78% for tracking was achieved.
Funding: This research was funded by the Natural Science Foundation of Hebei Province (F2021506004).
Abstract: Transformer-based models have facilitated significant advances in object detection. However, their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unmanned aerial vehicle (UAV) imagery. To address these limitations, we propose a hybrid transformer-based detector, H-DETR, and enhance it for dense small objects, leading to an accurate and efficient model. Firstly, we introduce a hybrid transformer encoder, which integrates a convolutional neural network-based cross-scale fusion module with the original encoder to handle multi-scale feature sequences more efficiently. Furthermore, we propose two novel strategies to enhance detection performance without incurring additional inference computation. Query filter is designed to cope with the dense clustering inherent in drone-captured images by counteracting similar queries with a training-aware non-maximum suppression. Adversarial denoising learning is a novel enhancement method inspired by adversarial learning, which improves the detection of numerous small targets by counteracting the effects of artificial spatial and semantic noise. Extensive experiments on the VisDrone and UAVDT datasets substantiate the effectiveness of our approach, achieving a significant improvement in accuracy with a reduction in computational complexity. Our method achieves 31.9% and 21.1% AP on the VisDrone and UAVDT datasets, respectively, and has a faster inference speed, making it a competitive model in UAV image object detection.
Funding: Financed by a grant from the Youth Fund for Humanities and Social Sciences Research of the Ministry of Education (No. 19YJCZH040).
Abstract: The pancreas is part of neither the five Zang organs (五脏) nor the six Fu organs (六腑). Thus, it has received little attention in Chinese medical literature. In the late 19th century, medical missionaries in China started translating and introducing anatomical and physiological knowledge about the pancreas. For the word pancreas, an early and influential translation was “sweet meat” (甜肉), proposed by Benjamin Hobson (合信). The translation “sweet meat” is not faithful to the original meaning of “pancreas”; it is a term coined by Hobson based on his personal habits, and the word “sweet” appeared by chance. However, in the decades after the term “sweet meat” became popular, Chinese medicine practitioners, such as Tang Zonghai (唐宗海), reinterpreted it by drawing new medical illustrations for “sweet meat” and giving new connotations to the word “sweet”. This discussion and interpretation of “sweet meat” in modern China, particularly among Chinese medicine professionals, is not only a dissemination and interpretation of the knowledge of “pancreas”, but also a construction of knowledge around the term “sweet meat”.
Funding: This research was funded by the National Natural Science Foundation of China (Nos. 71762010, 62262019, 62162025, 61966013, 12162012), the Hainan Provincial Natural Science Foundation of China (Nos. 823RC488, 623RC481, 620RC603, 621QN241, 620RC602, 121RC536), the Haikou Science and Technology Plan Project of China (No. 2022-016), and the Project supported by the Education Department of Hainan Province (No. Hnky2021-23).
Abstract: Artificial Intelligence (AI) is being increasingly used for diagnosing Vision-Threatening Diabetic Retinopathy (VTDR), which is a leading cause of visual impairment and blindness worldwide. However, previous automated VTDR detection methods have mainly relied on manual feature extraction and classification, leading to errors. This paper proposes a novel VTDR detection and classification model that combines different models through majority voting. Our proposed methodology involves preprocessing, data augmentation, feature extraction, and classification stages. We use a hybrid convolutional neural network-singular value decomposition (CNN-SVD) model for feature extraction and selection and an improved SVM-RBF with a Decision Tree (DT) and K-Nearest Neighbor (KNN) for classification. We tested our model on the IDRiD dataset and achieved an accuracy of 98.06%, a sensitivity of 83.67%, and a specificity of 100% for DR detection and evaluation tests, respectively. Our proposed approach outperforms baseline techniques and provides a more robust and accurate method for VTDR detection.
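The majority-voting step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each of the three classifiers (SVM-RBF, DT, KNN) emits one hard label per sample; the function names, the label strings, and the example predictions are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label that occurs most often among one sample's votes."""
    label, _count = Counter(votes).most_common(1)[0]
    return label


def ensemble_predict(per_model_predictions):
    """Combine per-model label lists (one list per model) sample by sample."""
    return [majority_vote(votes) for votes in zip(*per_model_predictions)]


# Hypothetical hard predictions from three classifiers on three samples.
svm_rbf = ["DR", "no_DR", "DR"]
decision_tree = ["DR", "DR", "no_DR"]
knn = ["no_DR", "no_DR", "DR"]

combined = ensemble_predict([svm_rbf, decision_tree, knn])
```

With three voters and two classes a tie cannot occur; with more classes a tie-breaking rule would have to be chosen, which the abstract does not specify.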
Funding: Supported by the “Pioneer” and “Leading Goose” R&D Program of Zhejiang (2022C02053) and the National Natural Science Foundation of China (NSFC) (32201632).
Abstract: Recent advances in spectral sensing techniques and machine learning (ML) methods have enabled the estimation of plant physiochemical traits. Nitrogen (N) is a primary limiting factor for terrestrial forest growth, but traditional methods for N determination are labor-intensive, time-consuming, and destructive. In this study, we present a rapid, non-destructive method to predict leaf N concentration (LNC) in Metasequoia glyptostroboides plantations under N and phosphorus (P) fertilization using ML techniques and unmanned aerial vehicle (UAV)-based RGB (red, green, blue) images. Nine spectral vegetation indices (VIs) were extracted from the RGB images. The spectral reflectance and VIs were used as input features to construct models for estimating LNC based on support vector machine, random forest (RF), multiple linear regression, gradient boosting regression, and classification and regression trees (CART). The results show that RF is the best-fitting model for estimating LNC, with a coefficient of determination (R²) of 0.73. Using this model, we evaluated the effects of N and P treatments on LNC and found a significant increase with N and a decrease with P. Height, diameter at breast height (DBH), and crown width of all M. glyptostroboides were analyzed by Pearson correlation with the predicted LNC. DBH was significantly correlated with LNC under N treatment. Our results highlight the potential of combining UAV RGB images with an ML algorithm as an efficient, scalable, and cost-effective method for LNC quantification. Future research can extend this approach to different tree species and different plant traits, paving the way for large-scale, time-efficient plant growth monitoring.
Funding: Publication of this research was funded through the Researchers Supporting Program (RSPD2023R809), King Saud University, Riyadh, Saudi Arabia.
Abstract: The intuitive fuzzy set has found important applications in decision-making and machine learning. To enrich and utilize the intuitive fuzzy set, this study designed and developed a deep neural network-based glaucoma eye detection method using fuzzy difference equations in the domain where the retinal images converge. Retinal image detections are categorized as normal eye recognition, suspected glaucomatous eye recognition, and glaucomatous eye recognition. Fuzzy degrees associated with weighted values are calculated to determine the level of concentration between the fuzzy partition and the retinal images. The proposed model was used to diagnose glaucoma from retinal images and used a Convolutional Neural Network (CNN) and deep learning to identify the fuzzy weighted regularization between images. This methodology was used to clarify the input images and make them adequate for the process of glaucoma detection. The objective of this study was to propose a novel approach to the early diagnosis of glaucoma using the Fuzzy Expert System (FES) and Fuzzy differential equation (FDE). The intensities of the different regions in the images and their respective peak levels were determined. Once the peak regions were identified, the recurrence relationships among those peaks were measured. Image partitioning was done because of the varying degrees of similar and dissimilar concentrations in the image. Similar and dissimilar concentration levels and spatial frequency generated a threshold image from the combined fuzzy matrix and FDE. This distinguished between normal and abnormal eye conditions, thus detecting patients with glaucomatous eyes.
Funding: Supported by the National Natural Science Foundation of China (Grant Numbers: 62372083, 62072074, 62076054, 62027827, 62002047), the Sichuan Provincial Science and Technology Innovation Platform and Talent Program (Grant Number: 2022JDJQ0039), the Sichuan Provincial Science and Technology Support Program (Grant Numbers: 2022YFQ0045, 2022YFS0220, 2021YFG0131, 2023YFS0020, 2023YFS0197, 2023YFG0148), and the CCF-Baidu Open Fund (Grant Number: 202312).
Abstract: In intelligent medical diagnosis, the trustworthiness, reliability, and interpretability of Artificial Intelligence (AI) are critical, especially in cancer diagnosis. Traditional neural networks, while excellent at processing natural images, often lack interpretability and adaptability when processing high-resolution digital pathological images. This limitation is particularly evident in pathological diagnosis, which is the gold standard of cancer diagnosis and relies on a pathologist’s careful examination and analysis of digital pathological slides to identify the features and progression of the disease. Therefore, the integration of interpretable AI into smart medical diagnosis is not only an inevitable technological trend but also a key to improving diagnostic accuracy and reliability. In this paper, we introduce an innovative Multi-Scale Multi-Branch Feature Encoder (MSBE) and present the design of the CrossLinkNet Framework. The MSBE enhances the network’s capability for feature extraction by allowing hyperparameters to be adjusted to configure the number of branches and modules. The CrossLinkNet Framework, serving as a versatile image segmentation network architecture, employs cross-layer encoder-decoder connections for multi-level feature fusion, thereby enhancing feature integration and segmentation accuracy. Comprehensive quantitative and qualitative experiments on two datasets demonstrate that CrossLinkNet, equipped with the MSBE encoder, not only achieves accurate segmentation results but is also adaptable to various tumor segmentation tasks and scenarios by swapping in different feature encoders. Crucially, CrossLinkNet emphasizes the interpretability of the AI model, an essential aspect for medical professionals, providing an in-depth understanding of the model’s decisions and thereby enhancing trust and reliability in AI-assisted diagnostics.
Funding: Project supported by the Open Fund of Anhui Key Laboratory of Mine Intelligent Equipment and Technology (Grant No. ZKSYS202204), the Talent Introduction Fund of Anhui University of Science and Technology (Grant No. 2021yjrc34), and the Scientific Research Fund of Anhui Provincial Education Department (Grant No. KJ2020A0301).
Abstract: This paper explores a double quantum images representation (DNEQR) model that allows two digital images to be stored simultaneously in a quantum superposition state. Additionally, a new type of two-dimensional hyperchaotic system based on sine and logistic maps is investigated, offering a wider parameter space and better chaotic behavior than the sine and logistic maps alone. Based on the DNEQR model and the hyperchaotic system, a double quantum images encryption algorithm is proposed. Firstly, two classical plaintext images are transformed into quantum states using the DNEQR model. Then, the proposed hyperchaotic system is employed to iteratively generate pseudo-random sequences. These chaotic sequences are used to perform pixel value and position operations on the quantum image, changing both pixel values and positions. Finally, the ciphertext image is obtained by qubit-level diffusion using two XOR operations between the position-permutated image and the pseudo-random sequences. The corresponding quantum circuits are also given. Experimental results demonstrate that the proposed scheme ensures the security of the images during transmission, improves the encryption efficiency, and enhances anti-interference and anti-attack capabilities.
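As a rough classical analogue of the XOR diffusion described in this abstract, the sketch below derives a byte keystream from a chaotic map and XORs it with pixel values. It is only an illustration under strong assumptions: it uses the plain one-dimensional logistic map rather than the paper's two-dimensional hyperchaotic system (which the abstract does not specify), it works on classical bytes rather than quantum states, and the function names and parameter values are hypothetical. It is not a secure cipher.

```python
def logistic_keystream(x0, r, n):
    """Generate n pseudo-random bytes by iterating x -> r * x * (1 - x)."""
    x = x0
    stream = []
    for _ in range(n):
        x = r * x * (1.0 - x)              # stays in (0, 1) for 0 < r < 4
        stream.append(int(x * 256) % 256)  # quantize the state to one byte
    return stream


def xor_diffuse(pixels, keystream):
    """XOR each pixel with the matching keystream byte."""
    return [p ^ k for p, k in zip(pixels, keystream)]


# Hypothetical 8-bit pixel values and map parameters.
pixels = [12, 200, 255, 0, 97]
key = logistic_keystream(x0=0.3141, r=3.99, n=len(pixels))

cipher = xor_diffuse(pixels, key)
recovered = xor_diffuse(cipher, key)  # XOR with the same key undoes the step
```

Because XOR is its own inverse, applying the same keystream a second time restores the original pixels, which is why the receiver only needs the initial value and parameter of the map.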
Funding: National Key Research and Development Program of China (2022YFB3903302 and 2021YFC1809104).
Abstract: Rapid and accurate acquisition of soil organic matter (SOM) information in cultivated land is important for sustainable agricultural development and carbon balance management. This study proposes a novel approach to predict SOM with high accuracy using multiyear synthetic remote sensing variables on a monthly scale. We obtained 12 monthly synthetic Sentinel-2 images covering the study area from 2016 to 2021 through the Google Earth Engine (GEE) platform, and reflectance bands and vegetation indices were extracted from these composite images. Then the random forest (RF), support vector machine (SVM), and gradient boosting regression tree (GBRT) models were tested to investigate the difference in SOM prediction accuracy under different combinations of monthly synthetic variables. The results showed that, firstly, all monthly synthetic spectral bands of Sentinel-2 were significantly correlated with SOM (P<0.05) for the months of January, March, April, October, and November. Secondly, with single-month composite variables, the prediction accuracy was relatively poor, with the highest R² value of 0.36 observed in January. When monthly synthetic environmental variables were grouped by the four quarters of the year, the first quarter and the fourth quarter showed good performance, and any combination of three quarters was similar in estimation accuracy. The overall best performance was observed when all monthly synthetic variables were incorporated into the models. Thirdly, among the three models compared, the RF model was consistently more accurate than the SVM and GBRT models, achieving an R² value of 0.56. Except for band 12 in December, the importance of the remaining bands did not differ significantly. This research offers a new attempt to map SOM with high accuracy and fine spatial resolution based on monthly synthetic Sentinel-2 images.
Funding: Supported by the National Natural Science Foundation of China (No. 82473013, 82072602, 82270575 and 82070558), the Shanghai Science and Technology Committee (No. 20DZ2201900), the Innovation Foundation of Translational Medicine of Shanghai Jiao Tong University School of Medicine (No. TM202001), and the Collaborative Innovation Center for Clinical and Translational Science by Chinese Ministry of Education & Shanghai Municipal Government (No. CCTS-2022202 and CCTS-202302).
Abstract: Objective: The volume of medical images has increased rapidly in the era of digital medicine, presenting an opportunity for the intervention of artificial intelligence (AI). To explore the value of convolutional neural network (CNN) algorithms in endoscopic images, we developed an AI-assisted comprehensive analysis system for endoscopic images and explored its performance in real clinical scenarios. Methods: A total of 6,270 white light endoscopic images from 516 cases were used to train 14 different CNN models. The images were divided into a training set, a validation set, and a test set in a 7:1:2 ratio to explore the possibility of discriminating gastric cancer (GC) from benign lesions (nGC), gastric ulcer (GU) from ulcerated cancer (UCa), early gastric cancer (EGC) from nGC, infection with Helicobacter pylori (Hp) from no infection (noHp), and metastasis from no metastasis at perigastric lymph nodes. Results: Among the 14 CNN models, EfficientNetB7 showed the best performance on the two-category tasks of GC and nGC [accuracy: 96.40% and area under the curve (AUC)=0.9959], GU and UCa (accuracy: 90.84% and AUC=0.8155), EGC and nGC (accuracy: 97.88% and AUC=0.9943), and Hp and noHp (accuracy: 83.33% and AUC=0.9096). In contrast, the InceptionV3 model showed better performance on predicting metastasis and no metastasis of perigastric lymph nodes for EGC (accuracy: 79.44% and AUC=0.7181). In addition, an integrated analysis of endoscopic images and gross images of gastrectomy specimens was performed on 95 cases by EfficientNetB7 and the RFB-SSD object detection model, resulting in 100% predictive accuracy in EGC. Conclusions: Taken together, this study integrated image sources from endoscopic examination and gastrectomy of gastric tumors and incorporated the advantages of different CNN models. The AI-assisted diagnostic system will play an important role in the therapeutic decision-making of EGC.
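The 7:1:2 division of the 6,270 images can be made concrete with a small sketch. It assumes a simple shuffled split at the image level; the abstract does not say whether the split was done per image or per case, and the function name, seed, and integer image identifiers are hypothetical.

```python
import random


def split_dataset(items, ratios=(7, 1, 2), seed=0):
    """Shuffle items and cut them into train/validation/test by ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded for a reproducible split
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]      # remainder goes to the test set
    return train, val, test


# Integer IDs stand in for the 6,270 endoscopic images.
train, val, test = split_dataset(range(6270))
```

Under these assumptions the three subsets contain 4,389, 627, and 1,254 images. Splitting by case instead of by image would avoid placing images of the same patient in both the training and test sets.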
Abstract: In the present research, we describe a computer-aided detection (CAD) method for automatic fetal head circumference (HC) measurement in 2D ultrasound images during all trimesters of pregnancy. The HC can be used to determine gestational age and track fetal development. This automated approach is particularly valuable in low-resource settings where access to trained sonographers is limited. The CAD system is divided into two steps. First, Haar-like features were extracted from the ultrasound images to train a random forest classifier to locate the fetal skull. Second, the HC was identified using dynamic programming, an elliptical fit, and a Hough transform. The CAD system was trained on 999 images (HC18 challenge data source) and then verified on 335 images from all trimesters in an independent test set. A skilled sonographer and a medical expert manually annotated the test set. We used the crown-rump length (CRL) measurement to calculate the reference gestational age (GA). In the first, second, and third trimesters, the median difference between the reference GA and the GA calculated by the skilled sonographer was 0.7±2.7, 0.0±4.5, and 2.0±12.0 days, respectively. The corresponding difference between the reference GA and the medical expert’s GA was 1.5±3.0, 1.9±5.0, and 4.0±14 days. The difference between the reference GA and the CAD system’s GA ranged from 0.5 to 5.0 days, with a variation of 2.9 to 12.5 days. The outcomes suggest that the CAD system outperforms an expert sonographer. Compared with the systems reported in the literature, the presented system achieves comparable or even better results. We have evaluated this automated approach for HC assessment on data from all trimesters of gestation.
Funding: Supported by the National Natural Science Foundation of China under Grant 61671219.
Abstract: Object detection in unmanned aerial vehicle (UAV) aerial images has become increasingly important in military and civil applications. General object detection models are not robust enough against the interclass similarity and intraclass variability of small objects, or against UAV-specific nuisances such as uncontrolled weather conditions. Unlike previous approaches focusing on high-level semantic information, we report the importance of underlying features for improving detection accuracy and robustness from the information-theoretic perspective. Specifically, we propose a robust and discriminative feature learning approach through mutual information maximization (RD-MIM), which can be integrated into numerous object detection methods for aerial images. Firstly, we present the rank sample mining method to reduce underlying feature differences between the natural image domain and the aerial image domain. Then, we design a momentum contrast learning strategy to make object features similar within the same category and dissimilar across different categories. Finally, we construct a transformer-based global attention mechanism to boost object location semantics by leveraging the high interrelation of different receptive fields. We conduct extensive experiments on the VisDrone and Unmanned Aerial Vehicle Benchmark Object Detection and Tracking (UAVDT) datasets to demonstrate the effectiveness of the proposed method. The experimental results show that our approach brings considerable robustness gains to basic detectors and advanced detection methods, achieving relative growth rates of 51.0% and 39.4% in corruption robustness, respectively. Our code is available at https://github.com/cq100/RD-MIM (accessed on 2 August 2024).
Abstract: Industrial activities, through the human-induced release of Green House Gas (GHG) emissions, have been identified as the primary cause of global warming. Accurate and quantitative monitoring of these emissions is essential for a comprehensive understanding of their impact on the Earth’s climate and for effectively enforcing emission regulations at a large scale. This work examines the feasibility of detecting and quantifying industrial smoke plumes using freely accessible geo-satellite imagery. Existing systems suffer from limitations in accuracy, robustness, and efficiency, and these factors hinder their effectiveness in supporting a timely response to industrial fires. In this work, grayscale images are used instead of traditional color images for smoke plume detection. A ResNet-50 model was trained on the dataset for classification and a U-Net model for segmentation. The dataset consists of images gathered by the European Space Agency’s Sentinel-2 satellite constellation from a selection of industrial sites. The acquired images predominantly capture scenes of industrial locations, some of which exhibit active smoke plume emissions. The performance of the above-mentioned techniques and models is represented by their accuracy and IOU (Intersection-over-Union) metric. The models are first trained on the basic RGB images, where classification using the ResNet-50 model results in an accuracy of 94.4% and segmentation using the U-Net model yields an IOU of 0.5 and an accuracy of 94%, which leads to the detection of the exact patches where the smoke plume has occurred. Training the classification model on grayscale images increased the accuracy to 96.4%.
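For readers unfamiliar with the Intersection-over-Union metric quoted in this abstract, the following minimal sketch computes it for two binary segmentation masks. The masks are toy examples, the function name is hypothetical, and returning 1.0 when both masks are empty is a convention assumed here, not something stated in the abstract.

```python
def iou(pred_mask, true_mask):
    """Intersection-over-Union of two equally sized binary 2D masks."""
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            intersection += 1 if (p and t) else 0  # pixel set in both masks
            union += 1 if (p or t) else 0          # pixel set in either mask
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Toy 2x3 masks: 1 marks a smoke-plume pixel, 0 marks background.
pred = [[0, 1, 1],
        [0, 1, 0]]
true = [[0, 1, 0],
        [0, 1, 1]]

score = iou(pred, true)  # 2 shared pixels out of 4 marked in total -> 0.5
```

An IoU of 0.5, as in this toy case, means that half of the area covered by either mask is covered by both.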
Funding: This work was supported by the Science and Technology Cooperation Special Project of Shijiazhuang (SJZZXA23005).
Abstract: In minimally invasive surgery, endoscopes or laparoscopes equipped with miniature cameras and tools enter the human body for therapeutic purposes through small incisions or natural cavities. However, in clinical operating environments, endoscopic images often suffer from low texture, uneven illumination, and non-rigid structures, which affect feature observation and extraction. Missing feature points in endoscopic images can severely impact surgical navigation or clinical diagnosis, leading to treatment and postoperative recovery issues for patients. To address these challenges, this paper introduces, for the first time, a Cross-Channel Multi-Modal Adaptive Spatial Feature Fusion (ASFF) module based on the lightweight architecture of EfficientViT. Additionally, a novel lightweight feature extraction and matching network based on the attention mechanism is proposed. This network dynamically adjusts attention weights for cross-modal information from grayscale images and optical flow images through a dual-branch Siamese network. It extracts static and dynamic information features ranging from low-level to high-level and from local to global, ensuring robust feature extraction across different widths, noise levels, and blur scenarios. Global and local matching are performed through a multi-level cascaded attention mechanism, with cross-channel attention introduced to simultaneously extract low-level and high-level features. Extensive ablation experiments and comparative studies are conducted on the HyperKvasir, EAD, M2caiSeg, CVC-ClinicDB, and UCL synthetic datasets. Experimental results demonstrate that the proposed network improves upon the baseline EfficientViT-B3 model by 75.4% in accuracy (Acc), while also enhancing runtime performance and storage efficiency. Compared with the complex DenseDescriptor feature extraction network, the difference in Acc is less than 7.22%, and IoU results on specific datasets outperform complex dense models. Furthermore, this method increases the F1 score by 33.2% and accelerates runtime by 70.2%. Notably, the speed of CMMCAN surpasses that of comparable lightweight models, and its feature extraction and matching performance is comparable to that of existing complex models at higher speed and cost-effectiveness.
Funding: Supported by the Key R&D Program of Shaanxi Province, China (Grant Nos. 2022GY-274, 2023-YBSF-505) and the National Natural Science Foundation of China (Grant No. 62273273).
Abstract: Recovering high-quality inscription images from unknown and complex noisy inscription images is a challenging research issue. Unlike natural images, character images place more emphasis on stroke information. However, existing models mainly consider pixel-level information while ignoring structural information of the character, such as its edge and glyph, resulting in reconstructed images with mottled local structure and character damage. To solve these problems, we propose a novel generative adversarial network (GAN) framework based on an edge-guided generator and a discriminator constructed from a dual-domain U-Net framework, i.e., EDU-GAN. Unlike existing frameworks, the generator introduces an edge extraction module and guides it into the denoising process through the attention mechanism, which maintains the edge detail of the restored inscription image. Moreover, a dual-domain U-Net-based discriminator is proposed to learn the global and local discrepancy between the denoised and the label images in both the image and morphological domains, which is helpful for blind denoising tasks. Adversarial training with the proposed dual-domain discriminator and generator can reduce local artifacts and keep the denoised character structure intact. Due to the lack of real inscription images, we built a real-inscription dataset to provide an effective benchmark for studying inscription image denoising. The experimental results show the superiority of our method on both the synthetic and real-inscription datasets.
Abstract: Multiplicative noise removal problems have attracted much attention in recent years. Unlike additive noise, multiplicative noise destroys almost all information of the original image, especially for texture images. Motivated by the TV-Stokes model, we propose a new two-step variational model, with a clear geometric interpretation, to denoise texture images corrupted by multiplicative noise. In the first step, we convert the multiplicative denoising problem into an additive one by the logarithm transform and propagate the isophote directions in the tangential field smoothing. Once the isophote directions are constructed, an image is restored to fit the constructed directions in the second step. The existence and uniqueness of the solution to the variational problems are proved. In these two steps, we use the gradient descent method and construct finite difference schemes to solve the problems. In particular, the augmented Lagrangian method and the fast Fourier transform are adopted to accelerate the calculation. Experimental results show that the proposed model can remove the multiplicative noise efficiently and protect the texture well.
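The logarithm transform mentioned in the first step can be illustrated with a minimal sketch. It is not the paper's variational model: it only shows that an image `f = u * n` becomes `log f = log u + log n`, so the noise term is additive in the log domain. The function names, the gamma noise model, and the `eps` floor are assumptions made for illustration.

```python
import numpy as np

def to_additive(noisy, eps=1e-6):
    """Map a multiplicative model f = u * n to log f = log u + log n.

    eps is a hypothetical floor that keeps the logarithm finite at zero pixels.
    """
    return np.log(np.maximum(noisy, eps))

def from_additive(log_img):
    """Invert the log transform after denoising in the additive domain."""
    return np.exp(log_img)

rng = np.random.default_rng(0)
clean = rng.uniform(0.2, 1.0, size=(64, 64))                # synthetic positive image
noise = rng.gamma(shape=10.0, scale=0.1, size=clean.shape)  # mean-1 gamma noise (assumed model)
noisy = clean * noise

log_noisy = to_additive(noisy)
# In the log domain the noise is an additive term that does not depend on the image.
residual = log_noisy - np.log(clean)
```

Any additive denoiser could then be applied to `log_noisy` before mapping back with `from_additive`; the abstract's tangential field smoothing and image fitting steps are not reproduced here.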
Funding: Ministry of Education, Youth and Sports of the Czech Republic, Grant/Award Numbers: SP2023/039, SP2023/042; the European Union under the REFRESH project, Grant/Award Number: CZ.10.03.01/00/22_003/0000048.
Abstract: Detecting brain tumours is complex due to the natural variation in their location, shape, and intensity in images. Although accurate detection and segmentation of brain tumours would be beneficial, current methods have yet to solve this problem despite the numerous available approaches. Precise analysis of Magnetic Resonance Imaging (MRI) is crucial for detecting, segmenting, and classifying brain tumours in medical diagnostics. MRI is a vital component in medical diagnosis, and it requires precise, careful, efficient, and reliable image analysis techniques. The authors developed a Deep Learning (DL) fusion model to classify brain tumours reliably. DL models require large amounts of training data to achieve good results, so the researchers utilised data augmentation techniques to increase the dataset size for training models. VGG16, ResNet50, and convolutional deep belief networks extracted deep features from MRI images. Softmax was used as the classifier, and the training set was supplemented with intentionally created MRI images of brain tumours in addition to the genuine ones. The features of two DL models were combined in the proposed model to generate a fusion model, which significantly increased classification accuracy. An openly accessible dataset from the internet was used to test the model's performance, and the experimental results showed that the proposed fusion model achieved a classification accuracy of 98.98%. Finally, the results were compared with existing methods, and the proposed model outperformed them significantly.
Abstract: Mobile technology is developing significantly. Mobile phone technologies have been integrated into the healthcare industry to help medical practitioners. Typically, computer vision models focus on image detection and classification issues. MobileNetV2 is a computer vision model that performs well on mobile devices, but it requires cloud services to process biometric image information and provide predictions to users. This leads to increased latency. Processing biometric image datasets on mobile devices will make the prediction faster, but mobiles are resource-restricted devices in terms of storage, power, and computational speed. Hence, a model that is small in size, efficient, and has good prediction quality for biometric image classification problems is required. Quantizing a pre-trained CNN (PCNN) MobileNetV2 architecture combined with a Support Vector Machine (SVM) compacts the model representation and reduces the computational cost and memory requirement. This proposed novel approach combines a quantized pre-trained CNN (PCNN) MobileNetV2 architecture with a Support Vector Machine (SVM) to represent models efficiently with low computational cost and memory. Our contributions include evaluating three CNN models for ocular disease identification in transfer learning and deep feature plus SVM approaches, showing the superiority of deep features from MobileNetV2 and SVM classification models, comparing traditional methods, exploring six ocular diseases and normal classification with 20,111 images after data augmentation, and reducing the number of trainable models. The model is trained on ocular disorder retinal fundus image datasets according to the severity of six conditions: age-related macular degeneration (AMD), one of the most common eye illnesses, Cataract, Diabetes, Glaucoma, Hypertension, and Myopia, with one Normal class. From the experiment outcomes, it is observed that the suggested MobileNetV2-SVM model size is compressed. The testing accuracy for MobileNetV2-SVM, InceptionV3, and MobileNetV2 is 90.11%, 86.88%, and 89.76%, respectively, while the accuracy of MobileNetV2-SVM, InceptionV3, and MobileNetV2 is observed to be 92.59%, 83.38%, and 90.16%, respectively. The proposed novel technique can be used to classify all biometric medical image datasets on mobile devices.
Funding: With appreciation to the Deanship of Postgraduate Studies and Scientific Research at Majmaah University for funding this research work through Project Number R-2024-922.
Abstract: This paper emphasizes a faster digital processing time while presenting an accurate method for identifying spine fractures in X-ray pictures. The study focuses on efficiency by utilizing many methods that include picture segmentation, feature reduction, and image classification. Two important elements are investigated to reduce the classification time: using feature reduction software and leveraging the capabilities of sophisticated digital processing hardware. The researchers use different algorithms for picture enhancement, including the Wiener and Kalman filters, and they look into two background correction techniques. The article presents a technique for extracting textural features and evaluates three picture segmentation algorithms and three fractured spine detection algorithms using transform domain, Power Density Spectrum (PDS), and Higher-Order Statistics (HOS) for feature extraction. With an emphasis on reducing digital processing time, this all-encompassing method helps to create a simplified system for classifying spine fractures. A feature reduction program code has been built to improve the processing speed for picture classification. Overall, the proposed approach shows great potential for significantly reducing classification time in clinical settings where time is critical. In comparison to other transform domains, the discrete cosine transform (DCT) of the texture features yielded an exceptional classification rate, and the process of extracting features from the transform domain took less time. More capable hardware can also result in quicker execution times for the feature extraction algorithms.
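As a rough illustration of transform-domain feature extraction with feature reduction, the sketch below computes an orthonormal 2D DCT-II of an image block and keeps only its low-frequency corner as a compact feature vector. It is a minimal sketch under assumptions, not the authors' program: the function names, the 8x8 block size, and the `keep` parameter are hypothetical, and the abstract does not specify how its features were reduced.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    i = np.arange(n)[None, :]   # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return m

def dct2(block):
    """2D DCT-II of a block, applied along rows and then columns."""
    rows, cols = block.shape
    return dct_matrix(rows) @ block @ dct_matrix(cols).T

def low_freq_features(block, keep=4):
    """Hypothetical feature reduction: keep the top-left keep x keep coefficients."""
    return dct2(block)[:keep, :keep].ravel()

block = np.full((8, 8), 0.5)        # constant test patch
features = low_freq_features(block)  # 16 values instead of 64 pixels
```

For a constant patch all of the energy lands in the first (DC) coefficient, which is why truncating to low frequencies can shrink the feature vector without losing smooth texture content.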