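To address the removal of noise points from point cloud data, a multi-scale point cloud denoising method based on an improved DBSCAN (density-based spatial clustering of applications with noise) algorithm is proposed. Statistical filtering is first applied to pre-screen isolated outliers and remove large-scale noise from the point cloud. The DBSCAN algorithm is then optimized to reduce its time complexity and to adjust its parameters adaptively, so that the point cloud is divided into normal, suspected, and abnormal clusters, and the abnormal clusters are removed immediately. A distance consensus evaluation method then refines the judgment of the suspected clusters: the distance between each suspected point and a surface fitted to its nearest normal points is computed to decide whether the point is abnormal, which preserves the key features of the data and the sensitivity of the model. The method was used to denoise the point clouds of two hull blocks and was compared with other denoising algorithms. The results show that the method has advantages in denoising efficiency and feature preservation and accurately retains the geometric characteristics of the point cloud data.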
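The statistical pre-filtering step in the abstract above can be illustrated with a minimal sketch. It assumes the common formulation in which a point is dropped when its mean distance to its `k` nearest neighbours exceeds the global mean plus `alpha` standard deviations; the abstract does not give the paper's exact rule, and the function name, `k`, `alpha`, and the toy cloud are all hypothetical.

```python
import math
import statistics


def statistical_outlier_filter(points, k=3, alpha=1.0):
    """Keep points whose mean k-nearest-neighbour distance is at most
    (global mean + alpha * global standard deviation).

    Assumed formulation, not taken from the paper. Brute force, O(n^2).
    """
    mean_knn = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        mean_knn.append(sum(dists[:k]) / k)
    threshold = statistics.fmean(mean_knn) + alpha * statistics.pstdev(mean_knn)
    return [p for p, d in zip(points, mean_knn) if d <= threshold]


# Hypothetical toy cloud: a flat 3 x 3 grid plus one isolated far-away point.
cloud = [(float(x), float(y), 0.0) for x in range(3) for y in range(3)]
cloud.append((50.0, 50.0, 50.0))
filtered = statistical_outlier_filter(cloud, k=3, alpha=1.0)
```

On this toy cloud only the isolated point is discarded. A filter of this kind targets large-scale, isolated noise, which is why the abstract follows it with a clustering stage for finer decisions.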
There are significant differences between urban and rural bed-and-breakfasts (B&Bs) in terms of customer positioning, economic strength and spatial carrier. Accurately identifying the differences in spatial characteristics and influencing factors of each type is essential for creating urban and rural B&B agglomeration areas. This study used density-based spatial clustering of applications with noise (DBSCAN) and the multi-scale geographically weighted regression (MGWR) model to explore similarities and differences in the spatial distribution patterns and influencing factors of urban and rural B&Bs on the Jiaodong Peninsula of China from 2010 to 2022. The results showed that: 1) Both urban and rural B&Bs on the Jiaodong Peninsula went through three stages: a slow start from 2010 to 2015, rapid development from 2015 to 2019, and hindered development from 2019 to 2022. However, urban B&Bs demonstrated a higher development speed and agglomeration intensity, leading to an increasingly evident trend of uneven development between the two sectors. 2) The clustering scale of both urban and rural B&Bs continued to expand in terms of quantity and volume. Urban B&B clusters were characterized by a limited number but a higher likelihood of transitioning from low-level to high-level clusters. While the number of rural B&B clusters steadily increased over time, their clustering scale was lower than that of urban B&Bs, and they lacked high-level clusters. 3) In terms of development direction, urban B&B clusters exhibited a relatively stable pattern and evolved into high-level clustering centers within the main urban areas. Conversely, rural B&Bs exhibited a more pronounced spatial diffusion effect, with clusters showing a trend of multi-center development along the coastline. 4) Transport emerged as a common influencing factor for both urban and rural B&Bs, with road network density having the strongest explanatory power for their spatial distribution. In terms of differences, population agglomeration had a positive impact on the distribution of urban B&Bs and a negative effect on the distribution of rural B&Bs. Rural B&B clustering was more strongly influenced by tourism resources than urban B&B clustering, but increasing tourist stay duration remains an urgent issue to be addressed. The findings of this study could provide a more precise basis for government planning and management of urban and rural B&B agglomeration areas.
Gobi spans a large area of China, surpassing the combined expanse of mobile dunes and semi-fixed dunes. Its presence significantly influences the movement of sand and dust. However, the complex origins and diverse materials constituting the Gobi result in notable differences in saltation processes across various Gobi surfaces, and it is challenging to describe these processes according to a uniform morphology. It is therefore imperative to describe surface characteristics through parameters such as the three-dimensional (3D) size and shape of gravel. Collecting morphology information for Gobi gravels is essential for studying Gobi genesis and sand saltation. To enhance the efficiency and information yield of gravel parameter measurements, this study conducted field experiments in the Gobi region across Dunhuang City, Guazhou County, and Yumen City (administered by Jiuquan City), Gansu Province, China, in March 2023. A research framework and methodology for measuring 3D parameters of gravel using point clouds were developed, alongside improved calculation formulas for 3D parameters including gravel grain size, volume, flatness, roundness, sphericity, and equivalent grain size. Multi-view geometry technology for 3D reconstruction was used to establish an optimal data acquisition scheme characterized by high point cloud reconstruction efficiency and clear quality. The proposed methodology also incorporated point cloud clustering, segmentation, and filtering techniques to isolate individual gravel point clouds. Point cloud algorithms, including the Oriented Bounding Box (OBB), the point cloud slicing method, and point cloud triangulation, were then deployed to calculate the 3D parameters of individual gravels. These systematic processes allow precise and detailed characterization of individual gravels. For gravel grain size and volume, the correlation coefficients between point cloud and manual measurements all exceeded 0.9000, confirming the feasibility of the proposed methodology for measuring 3D parameters of individual gravels. The proposed workflow yields accurate calculations of relevant parameters for Gobi gravels, providing essential data support for subsequent studies on Gobi environments.
This paper studies an existing 13.8 kV distribution network that serves an oil production field spread over an area of approximately 60 kilometers square, in order to locate any fault that may occur anywhere in the network using fuzzy c-means classification techniques. In addition, Sections 5 and 6 introduce two different methods for normalizing data and selecting the optimum number of clusters in order to classify data. Results and conclusions are given to show the feasibility of the suggested fault location method. Suggestions for future related research are provided in Section 8.
Performing cluster analysis on molecular conformations is an important way to find representative conformations in molecular dynamics trajectories. It is usually a critical step for interpreting complex conformational changes or interaction mechanisms. As one of the density-based clustering algorithms, find density peaks (FDP) is an accurate and reasonable candidate for molecular conformation clustering. However, as simulation lengths grow rapidly with increasing computing power, the low computing efficiency of FDP limits its application potential. Here we propose a marginal extension to FDP named K-means find density peaks (KFDP) to solve this heavy resource consumption problem. In KFDP, the points are initially clustered by a high-efficiency clustering algorithm such as K-means. Cluster centers are defined as typical points, each with a weight that represents its cluster size. The weighted typical points are then clustered again by FDP and refined into core, boundary, and redefined halo points. In this way, KFDP has accuracy comparable to FDP, but its computational complexity is reduced from O(n^2) to O(n). We apply and test the KFDP method on trajectory data of multiple small proteins in terms of torsion angle, secondary structure or contact map. Comparisons with K-means and density-based spatial clustering of applications with noise (DBSCAN) validate the proposed KFDP.
Clustering data with varying densities and complicated structures is important, yet many existing clustering algorithms have difficulty with this problem. The reason is that varying densities and complicated structures make a single algorithm perform badly on different parts of the data. On the assumption that denser parts probably carry more information, an algorithm that clusters starting from the high-density parts is proposed. It begins from a tiny distance to find the highest density-connected partition and form the corresponding super cores; the distance is then iteratively increased by a global heuristic method to cluster parts with different densities. The mean silhouette coefficient indicates the clustering performance. A denoising function is implemented to eliminate the influence of noise and outliers. Many challenging experiments indicate that the algorithm performs well on data with widely varying densities and extremely complex structures. It decides the optimal number of clusters automatically, needs no background knowledge, is easy to tune, and is robust against noise and outliers.
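The abstract above uses the mean silhouette coefficient as its performance indicator. The following is a minimal sketch of that standard measure, not the authors' code: for each point, `a` is the mean distance to the other members of its own cluster, `b` is the smallest mean distance to any other cluster, and the score is `(b - a) / max(a, b)`. The function name and the toy data are hypothetical, and the sketch assumes at least two clusters and no duplicate points.

```python
import math


def mean_silhouette(points, labels):
    """Mean silhouette coefficient over all points (brute force, O(n^2)).

    Assumes at least two distinct labels. A point that is alone in its
    cluster gets the conventional score 0.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # Mean distance to the other members of the same cluster
        # (the distance from p to itself contributes 0 to the sum).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # Mean distance to the nearest other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical toy data: two well-separated pairs of points.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = mean_silhouette(pts, [0, 0, 1, 1])  # labels follow the real groups
bad = mean_silhouette(pts, [0, 1, 0, 1])   # labels cut across the groups
```

A labeling that follows the two groups scores close to 1, while one that cuts across them scores below 0, which is the behavior that lets the mean silhouette rank candidate partitions.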
As cultural facilities, physical bookstores are an important part of urban infrastructure. Influenced by socioeconomic development and the internet, physical bookstores have also become a combination of cultural space and tourism experience. It is therefore necessary to explore the spatial characteristics and influencing factors of physical bookstores. This study uses Density-Based Spatial Clustering of Applications with Noise (DBSCAN), spatial analysis and geographical detectors to characterize the spatial distribution pattern of physical bookstores, and the factors influencing it, in the national central cities/municipality (hereafter, cities) of western China. Based on spatial data, population density, road density and other data, this study constructed a dataset of the influencing factors of physical bookstores, consisting of 11 factors along 6 dimensions for 3 national central cities in western China. The results are as follows. First, the spatial distribution pattern of physical bookstores in Xi'an, Chengdu, and Chongqing is unbalanced. The distribution of physical bookstores in Xi'an and Chongqing runs from southwest to northeast and is relatively clustered, while that in Chengdu is relatively dispersed. Second, the spatial distribution pattern of physical bookstores has formed under the influence of different factors, whose intensity and significance differ among the case cities. In general, however, the social factor, the business factor, the density of research facilities, the tourism factor and road density are the main driving factors in the three cities. There is a synergistic relationship between public libraries and physical bookstores. Third, explanatory power becomes stronger when factors interact. In Xi'an and Chengdu, the density of communities and the density of research facilities have stronger explanatory power for the dependent variable after interacting with other factors, whereas in Chongqing the traffic factors do. The results could provide a practical reference for the sustainable development of physical bookstores and encourage a love of reading among the public.
Electric vehicle (EV) charging load is strongly affected by traffic factors such as road congestion. Accurate ultra short-term load forecasting (STLF) of regional EV charging load is important for scheduling regional charging load, from which the optimal vehicle-to-grid benefit can be realized. In this paper, a regional-level EV ultra STLF method is proposed and discussed. First, the usage degree of each charging pile is defined from its usage frequency and computed from EV charging transaction data collected in the field. Second, these usage degrees are combined with historical charging load values to form the input matrix for the deep-learning-based load prediction model. Finally, a long short-term memory (LSTM) neural network is used to construct the EV charging load forecasting model, which is trained on this input matrix. A comparison experiment shows that the proposed method achieves higher prediction accuracy than traditional methods. In addition, load characteristic indices for the fluctuation of adjacent-day load and adjacent-week load are proposed, and these fluctuation factors are used, together with the mean absolute percentage error (MAPE), to assess the prediction accuracy of the EV charging load.
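The accuracy metric named in the abstract above, MAPE, is the mean of the absolute relative errors expressed as a percentage. The sketch below uses that standard definition; the function name and the load values are hypothetical and are not data from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero, since each error is
    divided by the corresponding actual value.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical charging-load values (illustrative only).
actual_load = [100.0, 200.0, 400.0]
predicted_load = [110.0, 180.0, 400.0]
error_percent = mape(actual_load, predicted_load)
```

Because each error is scaled by the actual load, MAPE becomes unstable when the actual load approaches zero, which is one reason a forecasting study may pair it with additional indices such as the fluctuation factors mentioned in the abstract.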
For imbalanced datasets, the focus of classification is to identify samples of the minority class. Current data mining algorithms do not perform well enough on imbalanced datasets. The synthetic minority over-sampling technique (SMOTE) is specifically designed for learning from imbalanced datasets: it generates synthetic minority class examples by interpolating between nearby minority class examples. However, SMOTE suffers from the overgeneralization problem. Density-based spatial clustering of applications with noise (DBSCAN) is not rigorous when dealing with samples near the borderline, so we optimize the DBSCAN algorithm to make the clustering more reasonable. This paper integrates the optimized DBSCAN with SMOTE and proposes a density-based synthetic minority over-sampling technique (DSMOTE). First, the optimized DBSCAN divides the minority class samples into three groups: core samples, borderline samples and noise samples. The noise samples of the minority class are then removed so that more effective samples can be synthesized. To make full use of the information in core and borderline samples, different strategies are used to over-sample each group. Experiments show that DSMOTE achieves better results than SMOTE and Borderline-SMOTE in terms of precision, recall and F-value.
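The interpolation step that the abstract above attributes to SMOTE can be sketched as follows. This is plain SMOTE-style interpolation only, not the DSMOTE procedure, whose per-group strategies the abstract does not specify; the function name, the parameters `k`, `n_new` and `seed`, and the toy samples are hypothetical.

```python
import math
import random


def smote_like_samples(minority, k=2, n_new=4, seed=0):
    """Create n_new synthetic points, each on the line segment between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours. Assumes the minority samples are distinct tuples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x, excluding x itself.
        neighbours = sorted(
            (q for q in minority if q is not x),
            key=lambda q: math.dist(x, q),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(xi + gap * (ni - xi) for xi, ni in zip(x, nb)))
    return synthetic


# Hypothetical minority-class samples at the corners of a unit square.
minority_samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_samples = smote_like_samples(minority_samples, k=2, n_new=4, seed=0)
```

Every synthetic point lies between two existing minority samples. If one of those samples is noise, the new point lands in a region that belongs to the other class, which is why the abstract removes noise samples before over-sampling.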
Clustering is an unsupervised learning problem: a procedure that partitions data objects into groups. Many algorithms cannot simultaneously overcome the problems of cluster morphology, overlapping clusters and a large number of clusters. Many scientific communities approach clustering from the perspective of density, which is one of the best approaches to clustering. This study proposes automatic fuzzy-DBSCAN (AFD), a density-based spatial clustering of applications with noise (DBSCAN) algorithm that operates on selected high-density areas and works with the initialization of two parameters. Using fuzzy and DBSCAN features, AFD is modeled on the selection of high-density areas and automatically generates two parameters for merging and separating. The two generated parameters provide a set of sub-cluster rules in the Cartesian coordinate system for the dataset. The model simultaneously overcomes clustering problems such as morphology, overlapping, and the number of clusters in a dataset. In the experiments, all algorithms are run 30 times on each of eight datasets. Three are real datasets with overlapping clusters, and the rest are morphological synthetic datasets. The results demonstrate that the AFD algorithm outperforms other recently developed clustering algorithms.
To diagnose the common faults of railway switch control circuits, a fault diagnosis method based on density-based spatial clustering of applications with noise (DBSCAN) and the self-organizing feature map (SOM) is proposed. First, the three-phase current curve of the switch machine recorded by the microcomputer monitoring system is segmented, and the feature parameters of the three-phase current are calculated according to the action principle of the switch machine. Because the initial features are high-dimensional, the DBSCAN algorithm is used to separate out the features that are sensitive for fault diagnosis and to construct the diagnostic sensitive feature set. Then, the particle swarm optimization (PSO) algorithm is used to adjust the weights of the SOM network, modifying the update rules to avoid "dead neurons". Finally, the PSO-SOM network fault classifier is designed to classify and diagnose the samples to be tested. The experimental results show that this method can identify the fault mode of the switch control circuit with fewer training samples, and that its fault diagnosis accuracy is higher than that of the traditional SOM network.
The year 2013 is considered the first year of the smart city in China. With the development of informatization and urbanization in China, urban problems (traffic jams, medical problems and unbalanced education) are becoming more and more apparent, and the smart city is seen as the key to solving them. This paper presents the overall state of smart city development in China in terms of market scale and development stages, technology standards, and industry layout. It discusses the issues and challenges facing smart city development in China and proposes making policies to support that development.
As the demand for bike-sharing has increased, an oversupply problem has emerged, leading to wasted resources and disturbance of the urban environment. To regulate the supply volume of bike-sharing reasonably, an estimation model is proposed to quantify the urban carrying capacity (UCC) for bike-sharing from demand data. In this way, the maximum supply volume of bike-sharing that a city can accommodate can be obtained. The UCC for bike-sharing is reflected in the road network carrying capacity (RNCC) and the parking facilities' carrying capacity (PFCC). The space-time consumption method and the density-based spatial clustering of applications with noise (DBSCAN) algorithm were used to explore the RNCC and PFCC for bike-sharing. Combined with users' demand, the urban load ratio for bike-sharing can be evaluated to judge whether the UCC can meet that demand, so that the supply volume of bike-sharing and the distribution of related facilities can be adjusted accordingly. The model was applied to estimate the UCC and load ratio of each traffic analysis zone in Nanjing, China. The effect of the proposed algorithm was verified by comparison with field survey data.
Because of environmental clutter, radar false alarm plots are unavoidable. Suppressing false alarm points has always been a key issue in radar plot processing. In this paper, a radar false alarm plot elimination method based on multi-feature extraction and classification is proposed to eliminate false alarm plots effectively. First, the density-based spatial clustering of applications with noise (DBSCAN) algorithm is used to cluster the radar echo data processed by constant false-alarm rate (CFAR) detection, and multiple features, including scale features, time domain features and transform domain features, are extracted. Second, a feature evaluation method combining the Pearson correlation coefficient (PCC) and the entropy weight method (EWM) is proposed to evaluate the interrelation among features, and effective feature combination sets are selected as inputs to the classifier. Finally, false alarm plots classified as clutter are eliminated. The experimental results show that the proposed method can eliminate about 90% of false alarm plots with a lower target loss rate.
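The two ingredients of the feature evaluation step named above, the Pearson correlation coefficient and the entropy weight method, can be sketched in their textbook forms. How the paper combines them is not stated in the abstract, so the sketch below only computes each quantity separately; the function names and the feature columns are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.
    Assumes neither sequence is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def entropy_weights(columns):
    """Entropy weight method for non-negative feature columns.

    Each column is normalized to proportions and its entropy is scaled
    to [0, 1]. A column with lower entropy (more variation across the
    samples) receives a larger weight. Assumes at least one column is
    not constant.
    """
    n = len(columns[0])
    divergences = []
    for col in columns:
        total = sum(col)
        probs = [v / total for v in col]
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    norm = sum(divergences)
    return [d / norm for d in divergences]


# Hypothetical feature columns measured on four plots.
f1 = [1.0, 2.0, 3.0, 4.0]
f2 = [2.0, 4.0, 6.0, 8.0]   # a scaled copy of f1: fully redundant
f3 = [5.0, 5.0, 5.0, 5.0]   # constant: carries no information
r12 = pearson(f1, f2)
weights = entropy_weights([f1, f2, f3])
```

In this toy case the correlation flags `f1` and `f2` as redundant, while the entropy weights assign essentially nothing to the constant column `f3`. The two measures answer different questions, which is presumably why the abstract uses both.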
Hardware Trojans (HTs) have drawn increasing attention in both academia and industry because of the significant threat they pose. In this paper, we propose HTDet, a novel HT detection method using information entropy-based clustering. To remain well concealed, HTs are usually inserted in regions with low controllability and low observability, so Trojan logic exhibits extremely few transitions during simulation. This implies that regions with few transitions provide more abundant and more important information for HT detection. HTDet applies information theory and a density-based clustering algorithm, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), to detect all suspicious Trojan logic in the circuit under detection. Because DBSCAN is an unsupervised learning algorithm, it improves the applicability of HTDet. In addition, we develop a heuristic test pattern generation method that uses mutual information to increase the transitions of suspicious Trojan logic. Experiments on circuit benchmarks demonstrate the effectiveness of HTDet.
This paper deals with the identification of piecewise auto-regressive systems with exogenous input (PWARX) models based on a clustering solution. The problem involves estimating both the parameters of the affine sub-models and the hyperplanes defining the partitions of the state-input regression. Existing identification methods have three main drawbacks that limit their effectiveness. First, most of them may converge to local minima under poor initialization because they rely on optimizing nonlinear criteria. Second, they use simple and ineffective techniques to remove outliers. Third, most of them assume that the number of sub-models is known a priori. To overcome these drawbacks, we suggest using the density-based spatial clustering of applications with noise (DBSCAN) algorithm. The results presented in this paper illustrate the performance of our method in comparison with the existing approaches. An application of the developed approach to an olive oil esterification reactor is also presented to validate the simulation results.
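To address the "last mile" problem of instant delivery, this work combines order pickup and drop-off points, the historical spatiotemporal trajectories of instant-delivery riders, and the spatial extents and gate locations of areas of interest (AOIs) to accurately estimate the time, distance, and path from each point of interest (POI) inside an AOI to the corresponding passable gate. On this basis, a matching invocation and selection strategy is designed to obtain the optimal last-leg guidance scheme, effectively improving the quality of instant-delivery routes and the accuracy of time and distance estimates.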
The density-based notion of clustering is widely used because it is easy to implement and can detect arbitrarily shaped clusters in the presence of noisy data points without prior knowledge of the number of clusters to be identified. Density-based spatial clustering of applications with noise (DBSCAN) was the first algorithm proposed in the literature to use the density-based notion for cluster detection. Because the feature space of most real datasets today contains adjacent nested clusters, DBSCAN is clearly not suitable for detecting adjacent clusters of varying density, owing to its use of global density parameters: the neighborhood radius and the minimum number of points in a neighborhood. The effectiveness of DBSCAN therefore depends on these initial parameter settings. For DBSCAN to work properly, the neighborhood radius must be less than the distance between two clusters; otherwise the algorithm merges the two clusters and detects them as a single cluster. In this paper: 1) We propose an improved version of the DBSCAN algorithm that detects adjacent clusters of varying density by using the concept of neighborhood difference together with the density-based notion, without adding much computational complexity to the original DBSCAN algorithm. 2) We validate our experimental results using the space density indexing (SDI) internal cluster measure, recently proposed by one of our authors, to demonstrate the quality of the proposed clustering method. Our experimental results also suggest that the proposed method is effective in detecting adjacent nested clusters of variable density.
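The parameter sensitivity described above can be reproduced with a minimal sketch of the original DBSCAN procedure. It implements plain DBSCAN, not the improved variant the abstract proposes; the names `eps` and `min_pts` follow common usage rather than the paper's notation, and the one-dimensional toy data are hypothetical.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Plain DBSCAN with a brute-force neighbour search, O(n^2).

    Returns one label per point: 0, 1, ... for clusters, -1 for noise.
    A point counts itself among its eps-neighbours.
    """
    labels = [None] * len(points)

    def region(i):
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region(i)
        if len(neighbours) < min_pts:
            labels[i] = NOISE          # may be relabeled as a border point later
            continue
        labels[i] = cluster            # i is a core point: start a new cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # former noise becomes a border point
                continue
            if labels[j] is not None:
                continue               # already assigned to a cluster
            labels[j] = cluster
            j_neighbours = region(j)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
        cluster += 1
    return labels


# Hypothetical 1-D data: two groups separated by a gap of 3, plus one outlier.
pts = [(0.0,), (1.0,), (2.0,), (5.0,), (6.0,), (7.0,), (20.0,)]
tight = dbscan(pts, eps=1.5, min_pts=2)   # radius smaller than the gap
loose = dbscan(pts, eps=3.5, min_pts=2)   # radius larger than the gap
```

With a radius below the gap the two groups stay separate, and with a radius above it they merge into a single cluster. No single global radius can handle adjacent clusters of different density, which is the limitation the abstract sets out to address.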
Background: At present, the basic data characteristics of correlated X-ray scattering are insufficiently understood, and mastering the nature of the data is a great challenge, so it is difficult to use and analyze the experimental data effectively. In addition, experimental artifacts have many causes, such as whether the shutter is on or off, whether the beam is present, the swaying of the nozzle and the shadow of the detector, so analyzing the scattering patterns is rather challenging. Purpose: The purpose of this paper was to develop a method to filter out invalid scattering data and to provide theoretical and experimental foundations for further study of the X-ray scattering data of complex biological samples. Methods: Helium molecules were scattered by the X-ray free-electron laser at SPring-8 in Japan, and millions of scattering patterns were obtained from the experiment. Through analysis of the scattering data, the sum, mean, median and variance of the scattering intensity were obtained. Different clusters were then obtained with the density-based spatial clustering of applications with noise (DBSCAN) algorithm. Results: Based on DBSCAN, scattering patterns with strong artifacts were removed and different clusters were identified, so the experimental scattering data could be analyzed more effectively. Conclusion: Theoretical and experimental foundations for comprehensively studying the X-ray scattering data of complex biological samples were provided. After the data filtering, the angular autocorrelation of the different clusters can be computed and analyzed effectively with Kam's method.
Funding (listed with the urban and rural B&B study): Under the auspices of the National Social Science Foundation of China (No. 21BJY202).
Funding (listed with the Gobi gravel study): Funded by the National Natural Science Foundation of China (No. 42071014).
文摘Gobi spans a large area of China, surpassing the combined expanse of mobile dunes and semi-fixed dunes. Its presence significantly influences the movement of sand and dust. However, the complex origins and diverse materials constituting the Gobi result in notable differences in saltation processes across various Gobi surfaces. It is challenging to describe these processes according to a uniform morphology. Therefore, it becomes imperative to articulate surface characteristics through parameters such as the three-dimensional (3D) size and shape of gravel. Collecting morphology information for Gobi gravels is essential for studying its genesis and sand saltation. To enhance the efficiency and information yield of gravel parameter measurements, this study conducted field experiments in the Gobi region across Dunhuang City, Guazhou County, and Yumen City (administrated by Jiuquan City), Gansu Province, China in March 2023. A research framework and methodology for measuring 3D parameters of gravel using point cloud were developed, alongside improved calculation formulas for 3D parameters including gravel grain size, volume, flatness, roundness, sphericity, and equivalent grain size. Leveraging multi-view geometry technology for 3D reconstruction allowed for establishing an optimal data acquisition scheme characterized by high point cloud reconstruction efficiency and clear quality. Additionally, the proposed methodology incorporated point cloud clustering, segmentation, and filtering techniques to isolate individual gravel point clouds. Advanced point cloud algorithms, including the Oriented Bounding Box (OBB), point cloud slicing method, and point cloud triangulation, were then deployed to calculate the 3D parameters of individual gravels. These systematic processes allow precise and detailed characterization of individual gravels. For gravel grain size and volume, the correlation coefficients between point cloud and manual measurements all exceeded 0.9000, confirming the feasibility of the proposed methodology for measuring 3D parameters of individual gravels. The proposed workflow yields accurate calculations of relevant parameters for Gobi gravels, providing essential data support for subsequent studies on Gobi environments.
文摘This paper studies an existing 13.8 kilovolt distribution network, which serves an oil production field spread over an area of approximately 60 square kilometers, in order to locate any fault that may occur anywhere in the network using fuzzy c-means classification techniques. In addition, Sections 5 and 6 introduce two different methods for normalizing data and selecting the optimum number of clusters in order to classify data. Results and conclusions are given to show the feasibility of the suggested fault location method. Suggestions for future related research are provided in Section 8.
基金Professor Hong Yu at Intelligent Fishery Innovative Team (No.C202109) in School of Information Engineering of Dalian Ocean University for her support of this work; funded by the National Natural Science Foundation of China (No.31800615 and No.21933010).
文摘Performing cluster analysis on molecular conformations is an important way to find the representative conformations in molecular dynamics trajectories. It is usually a critical step for interpreting complex conformational changes or interaction mechanisms. As one of the density-based clustering algorithms, find density peaks (FDP) is an accurate and reasonable candidate for molecular conformation clustering. However, as simulation lengths grow rapidly with increasing computing power, the low computing efficiency of FDP limits its application potential. Here we propose a marginal extension to FDP named K-means find density peaks (KFDP) to solve this problem of heavy resource consumption. In KFDP, the points are initially clustered by a high-efficiency clustering algorithm, such as K-means. Cluster centers are defined as typical points, each with a weight that represents the cluster size. The weighted typical points are then clustered again by FDP and refined as core, boundary, and redefined halo points. In this way, KFDP has accuracy comparable to FDP, but its computational complexity is reduced from O(n^(2)) to O(n). We apply and test our KFDP method on trajectory data of multiple small proteins in terms of torsion angle, secondary structure or contact map. Comparisons with K-means and density-based spatial clustering of applications with noise validate the proposed KFDP.
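The two-stage idea in the abstract above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the deterministic K-means seeding, the cutoff-kernel density, and the parameters `d_c` and `n_clusters` are all hypothetical choices, and the refinement into core, boundary and halo points is omitted.

```python
import math


def kmeans(points, k, iters=20):
    """Plain Lloyd's K-means. Returns (centers, sizes of the final groups)."""
    # Deterministic seeding with evenly spaced input points (an assumption made
    # for reproducibility; real K-means usually uses random or k-means++ seeding).
    step = max(1, len(points) // k)
    centers = points[::step][:k]
    groups = []
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers, [len(g) for g in groups]


def weighted_density_peaks(centers, weights, d_c, n_clusters):
    """Density-peak clustering of weighted typical points. Returns one label per center."""
    n = len(centers)
    # Weighted local density: a typical point counts as `weight` original points.
    rho = [
        sum(weights[j] for j in range(n) if math.dist(centers[i], centers[j]) < d_c)
        for i in range(n)
    ]
    order = sorted(range(n), key=lambda i: -rho[i])  # densest first
    rank = {i: r for r, i in enumerate(order)}
    delta = [0.0] * n   # distance to the nearest point that precedes i in density order
    parent = [-1] * n   # index of that point
    for r, i in enumerate(order):
        if r == 0:
            delta[i] = max(math.dist(centers[i], centers[j]) for j in range(n))
        else:
            parent[i] = min(order[:r], key=lambda j: math.dist(centers[i], centers[j]))
            delta[i] = math.dist(centers[i], centers[parent[i]])
    # Peaks have both high density and a large distance to any denser point.
    peaks = sorted(range(n), key=lambda i: (-rho[i] * delta[i], rank[i]))[:n_clusters]
    labels = [-1] * n
    for c, i in enumerate(peaks):
        labels[i] = c
    for i in order:  # every non-peak inherits the label of its denser parent
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels
```

The pairwise distance work is done only on the k typical points, so its cost is O(k^2) rather than O(n^2); with k fixed, the K-means pass over the n original points dominates, which is consistent with the O(n) complexity stated in the abstract.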
基金Supported by the National Key Research and Development Program of China (No.2016YFB0201305), National Science and Technology Major Project (No.2013ZX0102-8001-001-001), and National Natural Science Foundation of China (No.91430218, 31327901, 61472395, 61272134, 61432018)
文摘Clustering data with varying densities and complicated structures is important, yet many existing clustering algorithms struggle with this problem because varying densities and complicated structures make a single algorithm perform badly on different parts of the data. On the assumption that denser parts probably carry more information, an algorithm that clusters starting from the high-density parts is proposed. It begins from a tiny distance to find the highest density-connected partition and form the corresponding super cores; the distance is then iteratively increased by a global heuristic method to cluster parts with different densities. The mean silhouette coefficient indicates the clustering performance. A denoising function is implemented to eliminate the influence of noise and outliers. Many challenging experiments indicate that the algorithm performs well on data with widely varying densities and extremely complex structures. It decides the optimal number of clusters automatically. Background knowledge is not needed and parameter tuning is easy. It is robust against noise and outliers.
基金Under the auspices of National Natural Science Foundation of China(No.41271179)。
文摘As cultural facilities, physical bookstores are an important part of urban infrastructure. Influenced by the development of the social economy and the internet, physical bookstores have also become a combination of cultural space and tourism experience. It is therefore necessary to explore the spatial characteristics and influencing factors of physical bookstores. This study uses Density-Based Spatial Clustering of Applications with Noise (DBSCAN), spatial analysis and geographical detectors to identify the spatial distribution pattern of, and the factors influencing, physical bookstores in national central cities/municipalities (hereafter, cities) in western China. Based on spatial data, population density, road density and other data, this study constructed a data set of the influencing factors of physical bookstores, consisting of 11 factors along 6 dimensions for 3 national central cities in western China. The results are as follows. First, the spatial distribution pattern of physical bookstores in Xi’an, Chengdu, and Chongqing is unbalanced. Physical bookstores in Xi’an and Chongqing are distributed from southwest to northeast and are relatively clustered, while those in Chengdu are relatively discrete. Second, the spatial distribution pattern of physical bookstores has been formed under the influence of different factors. The intensity and significance of the influencing factors differ among the case cities. In general, however, the social factor, the business factor, the density of research facilities, the tourism factor and road density are the main driving factors in the three cities. There is a synergistic relationship between public libraries and physical bookstores. Third, the explanatory power becomes stronger after the interaction between various factors. In Xi’an and Chengdu, the density of communities and the density of research facilities have stronger explanatory power for the dependent variable after interacting with other factors. In Chongqing, however, the traffic factors have stronger explanatory power for the dependent variable after interacting with other factors. The results could provide a practical reference for the sustainable development of physical bookstores and encourage a love of reading among the public.
基金supported by National Key R&D Program of China(No.2021YFB2601602).
文摘Electric vehicle (EV) charging load is greatly affected by many traffic factors, such as road congestion. Accurate ultra short-term load forecasting (STLF) results for regional EV charging load are important to the scheduling plan of regional charging load, from which the optimal vehicle-to-grid benefit can be derived. In this paper, a regional-level EV ultra STLF method is proposed and discussed. First, the usage degree of all charging piles is defined based on the usage frequency of charging piles, and then constructed from EV charging transaction data collected in the field. Second, these usage degrees are combined with historical charging load values to form the input matrix for the deep learning based load prediction model. Finally, a long short-term memory (LSTM) neural network is used to construct the EV charging load forecasting model, which is trained on the formed input matrix. The comparison experiment shows that the proposed method has higher prediction accuracy than traditional methods. In addition, load characteristic indices for the fluctuation of adjacent-day load and adjacent-week load are proposed, and these fluctuation factors are used to assess the prediction accuracy of the EV charging load, together with the mean absolute percentage error (MAPE).
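For reference, the MAPE mentioned in the abstract above is commonly computed as the average of |actual − predicted| / |actual|, expressed in percent. The helper below is a minimal sketch of that standard definition; the function name `mape` is an illustrative choice, and the abstract does not define the adjacent-day and adjacent-week fluctuation indices, so they are not reproduced here.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Each term is the absolute error relative to the true load value,
    # so every actual value must be non-zero.
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)
```

Because each error is divided by the true value, MAPE is undefined for time steps with zero load; such steps would need to be excluded or handled separately.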
基金supported by the National Key Research and Development Program of China (2018YFB1003700), the Scientific and Technological Support Project (Society) of Jiangsu Province (BE2016776), the “333” project of Jiangsu Province (BRA2017228, BRA2017401), and the Talent Project in Six Fields of Jiangsu Province (2015-JNHB-012)
文摘For imbalanced datasets, the focus of classification is to identify samples of the minority class. The performance of current data mining algorithms is not good enough for processing imbalanced datasets. The synthetic minority over-sampling technique (SMOTE) is specifically designed for learning from imbalanced datasets, generating synthetic minority class examples by interpolating between nearby minority class examples. However, SMOTE encounters the overgeneralization problem. The density-based spatial clustering of applications with noise (DBSCAN) is not rigorous when dealing with samples near the borderline. We optimize the DBSCAN algorithm for this problem to make clustering more reasonable. This paper integrates the optimized DBSCAN and SMOTE, and proposes a density-based synthetic minority over-sampling technique (DSMOTE). First, the optimized DBSCAN is used to divide the samples of the minority class into three groups (core samples, borderline samples and noise samples), and then the noise samples of the minority class are removed so that more effective samples can be synthesized. In order to make full use of the information in core samples and borderline samples, different strategies are used to over-sample them. Experiments show that DSMOTE achieves better results than SMOTE and Borderline-SMOTE in terms of precision, recall and F-value.
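The grouping and interpolation steps described in the abstract above can be sketched as follows. This is a simplified illustration, not the DSMOTE algorithm itself: it uses the classic DBSCAN core/border/noise definitions rather than the authors' optimized variant, and it applies a single nearest-neighbour interpolation rule to all kept samples, whereas the paper uses separate strategies for core and borderline samples. All function and parameter names are hypothetical.

```python
import math
import random


def dbscan_roles(points, eps, min_pts):
    """Label each point 'core', 'border' or 'noise' (classic DBSCAN definitions)."""
    # A point's eps-neighbourhood includes the point itself.
    neighbours = [
        [j for j, q in enumerate(points) if math.dist(p, q) <= eps] for p in points
    ]
    core = {i for i, nb in enumerate(neighbours) if len(nb) >= min_pts}
    roles = []
    for i, nb in enumerate(neighbours):
        if i in core:
            roles.append("core")
        elif any(j in core for j in nb):
            roles.append("border")  # not dense itself, but next to a core point
        else:
            roles.append("noise")
    return roles


def interpolate(p, q, rng):
    """SMOTE-style synthetic sample at a random position on the segment p-q."""
    t = rng.random()
    return tuple(a + t * (b - a) for a, b in zip(p, q))


def oversample_minority(minority, eps, min_pts, n_new, seed=0):
    """Drop noise samples, then synthesize n_new samples from the remaining ones."""
    rng = random.Random(seed)
    roles = dbscan_roles(minority, eps, min_pts)
    kept = [p for p, r in zip(minority, roles) if r != "noise"]
    if len(kept) < 2:
        raise ValueError("need at least two non-noise minority samples")
    synthetic = []
    for _ in range(n_new):
        p = rng.choice(kept)
        # Assumption: interpolate towards the nearest other kept sample.
        q = min((s for s in kept if s != p), key=lambda s: math.dist(p, s))
        synthetic.append(interpolate(p, q, rng))
    return roles, synthetic
```

Removing noise samples before interpolation keeps isolated minority points from seeding synthetic samples inside the majority region, which is the overgeneralization problem the abstract refers to.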
文摘Clustering is one of the unsupervised learning problems. It is a procedure which partitions data objects into groups. Many algorithms cannot overcome the problems of morphology, overlapping and a large number of clusters at the same time. Many scientific communities have approached clustering from the perspective of density, which is one of the best methods in clustering. This study proposes a density-based spatial clustering of applications with noise (DBSCAN) algorithm based on high-density areas selected by automatic fuzzy-DBSCAN (AFD), which works with the initialization of two parameters. AFD, by using fuzzy and DBSCAN features, is modeled by the selection of high-density areas and automatically generates two parameters for merging and separating. The two generated parameters provide a state of sub-cluster rules in the Cartesian coordinate system for the dataset. The model overcomes clustering problems such as morphology, overlapping, and the number of clusters in a dataset simultaneously. In the experiments, all algorithms are run 30 times on each of eight data sets. Three of them are real overlapping datasets and the rest are morphologic and synthetic datasets. It is demonstrated that the AFD algorithm outperforms other recently developed clustering algorithms.
基金High Education Research Project Funding (No.2018C-11), Natural Science Fund of Gansu Province (Nos.18JR3RA107, 1610RJYA034), Key Research and Development Program of Gansu Province (No.17YF1WA 158).
文摘In order to diagnose the common faults of the railway switch control circuit, a fault diagnosis method based on density-based spatial clustering of applications with noise (DBSCAN) and the self-organizing feature map (SOM) is proposed. First, the three-phase current curve of the switch machine recorded by the micro-computer monitoring system is processed in segments, and the feature parameters of the three-phase current are calculated according to the action principle of the switch machine. Because the initial features are high-dimensional, the DBSCAN algorithm is used to separate the features that are sensitive for fault diagnosis and to construct the diagnostic sensitive feature set. Then, the particle swarm optimization (PSO) algorithm is used to adjust the weights of the SOM network and modify the rules to avoid “dead neurons”. Finally, the PSO-SOM network fault classifier is designed to complete the classification and diagnosis of the samples to be tested. The experimental results show that this method can judge the fault mode of the switch control circuit with fewer training samples, and that the accuracy of fault diagnosis is higher than that of the traditional SOM network.
文摘The year 2013 is considered the first year of the smart city in China. With the development of informatization and urbanization in China, city diseases (traffic jams, medical problems and unbalanced education) are becoming more and more apparent. The smart city is the key to solving these diseases. This paper presents the overall development of smart cities in China in terms of market scale and development stages, technology standards, and industry layout. The paper identifies the issues and challenges facing smart city development in China and proposes making policies to support smart city development.
基金Project (2018YFE0120100) supported by the National Key R&D Program of China; Project (YBPY2040) supported by the Scientific Research Foundation of Graduate School of Southeast University, China.
文摘As the demand for bike-sharing has been increasing, an oversupply problem has occurred, which leads to the waste of resources and disturbance of the urban environment. In order to regulate the supply volume of bike-sharing reasonably, an estimating model was proposed to quantify the urban carrying capacity (UCC) for bike-sharing through the demand data. In this way, the maximum supply volume of bike-sharing that a city can accommodate can be obtained. The UCC for bike-sharing is reflected in the road network carrying capacity (RNCC) and parking facilities’ carrying capacity (PFCC). The space-time consumption method and the density-based spatial clustering of applications with noise (DBSCAN) algorithm were used to explore the RNCC and PFCC for bike-sharing. Combined with the users’ demand, the urban load ratio of bike-sharing can be evaluated to judge whether the UCC can meet users’ demand, so that the supply volume of bike-sharing and the distribution of the related facilities can be adjusted accordingly. The model was applied by estimating the UCC and load ratio of each traffic analysis zone in Nanjing, China. The effect of the proposed algorithm was verified by comparison with field survey data.
文摘Because of environmental clutter, radar false alarm plots are unavoidable. Suppressing false alarm points has always been a key issue in radar plot processing. In this paper, a radar false alarm plot elimination method based on multi-feature extraction and classification is proposed to effectively eliminate false alarm plots. First, the density-based spatial clustering of applications with noise (DBSCAN) algorithm is used to cluster the radar echo data processed by constant false-alarm rate (CFAR) detection. Multiple features, including scale features, time domain features and transform domain features, are extracted. Second, a feature evaluation method combining the Pearson correlation coefficient (PCC) and the entropy weight method (EWM) is proposed to evaluate the interrelation among features, and effective feature combination sets are selected as inputs of the classifier. Finally, false alarm plots classified as clutter are eliminated. The experimental results show that the proposed method can eliminate about 90% of false alarm plots with a lower target loss rate.
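The two feature-evaluation ingredients named in the abstract above have standard textbook definitions, sketched below. How the paper combines them into a single evaluation score is not stated in the abstract, so the sketch keeps them as two independent helpers; the function names and the input layout (one list per feature column) are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)  # undefined if either input is constant


def entropy_weights(columns):
    """Entropy weight method: one weight per feature column.

    Assumes non-negative values with a positive sum in every column.
    """
    n = len(columns[0])
    divergences = []
    for col in columns:
        total = sum(col)
        probs = [v / total for v in col]
        # Normalised Shannon entropy in [0, 1]; 1 means the feature is uniform.
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    total_divergence = sum(divergences)
    return [d / total_divergence for d in divergences]
```

A feature that is nearly uniform across samples has entropy close to 1 and therefore receives a weight close to 0, while a highly correlated pair of features (|PCC| near 1) is largely redundant; both signals can guide which feature combinations to pass to the classifier.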
文摘Hardware Trojans (HTs) have drawn increasing attention in both academia and industry because of their significant potential threat. In this paper, we propose HTDet, a novel HT detection method using information entropy-based clustering. To remain well concealed, HTs are usually inserted in regions with low controllability and low observability, so Trojan logic exhibits extremely few transitions during simulation. This implies that regions with few transitions provide more abundant and more important information for HT detection. HTDet applies information theory and a density-based clustering algorithm called Density-Based Spatial Clustering of Applications with Noise (DBSCAN) to detect all suspicious Trojan logic in the circuit under detection. DBSCAN is an unsupervised learning algorithm, which improves the applicability of HTDet. In addition, we develop a heuristic test pattern generation method using mutual information to increase the transitions of suspicious Trojan logic. Experiments on circuit benchmarks demonstrate the effectiveness of HTDet.
文摘This paper deals with the problem of identifying piecewise autoregressive systems with exogenous input (PWARX) models based on a clustering solution. This problem involves estimating both the parameters of the affine sub-models and the hyperplanes defining the partitions of the state-input regression. The existing identification methods present three main drawbacks which limit their effectiveness. First, most of them may converge to local minima in the case of poor initializations because they are based on optimization using nonlinear criteria. Second, they use simple and ineffective techniques to remove outliers. Third, most of them assume that the number of sub-models is known a priori. To overcome these drawbacks, we suggest the use of the density-based spatial clustering of applications with noise (DBSCAN) algorithm. The results presented in this paper illustrate the performance of our method in comparison with the existing approaches. An application of the developed approach to an olive oil esterification reactor is also proposed in order to validate the simulation results.
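文摘To address the “last mile” problem in instant delivery, this work combines order pickup and drop-off points, historical spatiotemporal trajectories of instant-delivery riders, the spatial extents of areas of interest (AOIs), and access gate locations to accurately estimate the time, distance, and route from each point of interest (POI) inside an AOI to its corresponding passable access gate. On this basis, a matching invocation and selection strategy is designed to obtain the optimal last-leg guidance scheme, effectively improving the route quality of instant delivery and the accuracy of time and distance estimates.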
文摘The density-based notion of clustering is widely used because it is easy to implement and can detect arbitrarily shaped clusters in the presence of noisy data points without requiring prior knowledge of the number of clusters. Density-based spatial clustering of applications with noise (DBSCAN) is the first algorithm proposed in the literature that uses the density-based notion for cluster detection. Since most real data sets today contain feature spaces with adjacent nested clusters, DBSCAN is not suitable for detecting adjacent clusters of varying density, because it uses global density parameters: the neighborhood radius and the minimum number of points in the neighborhood. The efficiency of DBSCAN therefore depends on these initial parameter settings. For DBSCAN to work properly, the neighborhood radius must be less than the distance between two clusters; otherwise the algorithm merges the two clusters and detects them as a single cluster. In this paper: 1) We propose an improved version of the DBSCAN algorithm that detects adjacent clusters of varying density by using the concept of neighborhood difference and the density-based notion, without introducing much additional computational complexity over the original DBSCAN algorithm. 2) We validate our experimental results using the space density indexing (SDI) internal cluster measure, recently proposed by one of our authors, to demonstrate the quality of the proposed clustering method. Our experimental results also suggest that the proposed method is effective in detecting adjacent nested clusters of variable density.
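The sensitivity to the global neighborhood radius described in the abstract above can be reproduced with a small, unoptimized version of the original DBSCAN algorithm. The sketch below is for illustration only and is not the improved algorithm the abstract proposes; the names `dbscan`, `eps` (neighborhood radius) and `min_pts` (minimum number of points in the neighborhood) are conventional labels assumed here.

```python
import math
from collections import deque


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    n = len(points)
    # Brute-force eps-neighbourhoods (each includes the point itself).
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # provisionally noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbours[i])
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # former noise reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # only core points expand the cluster
    return labels


# Two groups of points separated by a gap of 3 along the x axis.
points = [(0, 0), (1, 0), (2, 0), (5, 0), (6, 0), (7, 0)]
separate = dbscan(points, eps=1.5, min_pts=2)  # radius smaller than the gap
merged = dbscan(points, eps=3.5, min_pts=2)    # radius larger than the gap
```

With the radius below the gap, the two groups receive different labels; once the radius exceeds the gap, the point at x = 2 reaches the point at x = 5 and the groups collapse into one cluster, which is the merging behaviour the abstract describes.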
基金National Institutes of Health research Grant 251 R01-GM097463; Stanford NIH Biotechnology Training Grant No.5T32GM008412-20; US Department of Energy Office of Science under Contract No.DE-AC02-05CH11231; National Nature Science Foundation of China for theoretical physics Grant No.11547238.
文摘Background At present, the basic data characteristics of correlated X-ray scattering are insufficiently understood, and mastering the nature of the data remains a great challenge. It is therefore difficult to use and analyze the experimental data effectively. In addition, experimental artifacts have many causes, such as whether the shutter is on or off, whether the beam is present, the swaying of the nozzle and the shadow of the detector, so analyzing the scattering patterns is rather challenging. Purpose The purpose of this paper was to develop a method to filter out invalid scattering data and to provide the theoretical and experimental fundamentals for further study of the X-ray scattering data of complex biological samples. Methods Helium molecules were scattered by the X-ray free-electron laser at SPring-8 in Japan, and millions of scattering patterns were obtained from the experiment. Through analysis of the scattering data, the sum, mean, median and variance of the scattering intensity were obtained. Different clusters were then obtained with the density-based spatial clustering of applications with noise (DBSCAN) algorithm. Results Based on DBSCAN, some of the scattering patterns with strong artifacts were removed and different clusters were identified, so the experimental scattering data could be analyzed more effectively. Conclusion The theoretical and experimental fundamentals for comprehensively studying the X-ray scattering data of complex biological samples were provided. After the data filtering, the angular autocorrelation of different clusters will be computed and analyzed effectively with Kam’s method.