In allusion to the disadvantage of having to obtain the number of clusters of data sets in advance and the sensitivity to selecting initial clustering centers in the k-means algorithm, an improved k-means clustering a...In allusion to the disadvantage of having to obtain the number of clusters of data sets in advance and the sensitivity to selecting initial clustering centers in the k-means algorithm, an improved k-means clustering algorithm is proposed. First, the concept of a silhouette coefficient is introduced, and the optimal clustering number Kopt of a data set with unknown class information is confirmed by calculating the silhouette coefficient of objects in clusters under different K values. Then the distribution of the data set is obtained through hierarchical clustering and the initial clustering-centers are confirmed. Finally, the clustering is completed by the traditional k-means clustering. By the theoretical analysis, it is proved that the improved k-means clustering algorithm has proper computational complexity. The experimental results of IRIS testing data set show that the algorithm can distinguish different clusters reasonably and recognize the outliers efficiently, and the entropy generated by the algorithm is lower.展开更多
Reliable Cluster Head(CH)selectionbased routing protocols are necessary for increasing the packet transmission efficiency with optimal path discovery that never introduces degradation over the transmission reliability...Reliable Cluster Head(CH)selectionbased routing protocols are necessary for increasing the packet transmission efficiency with optimal path discovery that never introduces degradation over the transmission reliability.In this paper,Hybrid Golden Jackal,and Improved Whale Optimization Algorithm(HGJIWOA)is proposed as an effective and optimal routing protocol that guarantees efficient routing of data packets in the established between the CHs and the movable sink.This HGJIWOA included the phases of Dynamic Lens-Imaging Learning Strategy and Novel Update Rules for determining the reliable route essential for data packets broadcasting attained through fitness measure estimation-based CH selection.The process of CH selection achieved using Golden Jackal Optimization Algorithm(GJOA)completely depends on the factors of maintainability,consistency,trust,delay,and energy.The adopted GJOA algorithm play a dominant role in determining the optimal path of routing depending on the parameter of reduced delay and minimal distance.It further utilized Improved Whale Optimisation Algorithm(IWOA)for forwarding the data from chosen CHs to the BS via optimized route depending on the parameters of energy and distance.It also included a reliable route maintenance process that aids in deciding the selected route through which data need to be transmitted or re-routed.The simulation outcomes of the proposed HGJIWOA mechanism with different sensor nodes confirmed an improved mean throughput of 18.21%,sustained residual energy of 19.64%with minimized end-to-end delay of 21.82%,better than the competitive CH selection approaches.展开更多
Several pests feed on leaves,stems,bases,and the entire plant,causing plant illnesses.As a result,it is vital to identify and eliminate the disease before causing any damage to plants.Manually detecting plant disease ...Several pests feed on leaves,stems,bases,and the entire plant,causing plant illnesses.As a result,it is vital to identify and eliminate the disease before causing any damage to plants.Manually detecting plant disease and treating it is pretty challenging in this period.Image processing is employed to detect plant disease since it requires much effort and an extended processing period.The main goal of this study is to discover the disease that affects the plants by creating an image processing system that can recognize and classify four different forms of plant diseases,including Phytophthora infestans,Fusarium graminearum,Puccinia graminis,tomato yellow leaf curl.Therefore,this work uses the Support vector machine(SVM)classifier to detect and classify the plant disease using various steps like image acquisition,Pre-processing,Segmentation,feature extraction,and classification.The gray level co-occurrence matrix(GLCM)and the local binary pattern features(LBP)are used to identify the disease-affected portion of the plant leaf.According to experimental data,the proposed technology can correctly detect and diagnose plant sickness with a 97.2 percent accuracy.展开更多
The K-means algorithm is widely known for its simplicity and fastness in text clustering.However,the selection of the initial clus?tering center with the traditional K-means algorithm is some random,and therefore,the ...The K-means algorithm is widely known for its simplicity and fastness in text clustering.However,the selection of the initial clus?tering center with the traditional K-means algorithm is some random,and therefore,the fluctuations and instability of the clustering results are strongly affected by the initial clustering center.This paper proposed an algorithm to select the initial clustering center to eliminate the uncertainty of central point selection.The experiment results show that the improved K-means clustering algorithm is superior to the traditional algorithm.展开更多
Offboard active decoys(OADs)can effectively jam monopulse radars.However,for missiles approaching from a particular direction and distance,the OAD should be placed at a specific location,posing high requirements for t...Offboard active decoys(OADs)can effectively jam monopulse radars.However,for missiles approaching from a particular direction and distance,the OAD should be placed at a specific location,posing high requirements for timing and deployment.To improve the response speed and jamming effect,a cluster of OADs based on an unmanned surface vehicle(USV)is proposed.The formation of the cluster determines the effectiveness of jamming.First,based on the mechanism of OAD jamming,critical conditions are identified,and a method for assessing the jamming effect is proposed.Then,for the optimization of the cluster formation,a mathematical model is built,and a multi-tribe adaptive particle swarm optimization algorithm based on mutation strategy and Metropolis criterion(3M-APSO)is designed.Finally,the formation optimization problem is solved and analyzed using the 3M-APSO algorithm under specific scenarios.The results show that the improved algorithm has a faster convergence rate and superior performance as compared to the standard Adaptive-PSO algorithm.Compared with a single OAD,the optimal formation of USV-OAD cluster effectively fills the blind area and maximizes the use of jamming resources.展开更多
In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared dista...In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. In this paper, we present a simple and efficient clustering algorithm based on the k-means algorithm, which we call enhanced k-means algorithm. This algorithm is easy to implement, requiring a simple data structure to keep some information in each iteration to be used in the next iteration. Our experimental results demonstrated that our scheme can improve the computational speed of the k-means algorithm by the magnitude in the total number of distance calculations and the overall time of computation.展开更多
A convective and stratiform cloud classification method for weather radar is proposed based on the density-based spatial clustering of applications with noise(DBSCAN)algorithm.To identify convective and stratiform clo...A convective and stratiform cloud classification method for weather radar is proposed based on the density-based spatial clustering of applications with noise(DBSCAN)algorithm.To identify convective and stratiform clouds in different developmental phases,two-dimensional(2D)and three-dimensional(3D)models are proposed by applying reflectivity factors at 0.5°and at 0.5°,1.5°,and 2.4°elevation angles,respectively.According to the thresholds of the algorithm,which include echo intensity,the echo top height of 35 dBZ(ET),density threshold,andεneighborhood,cloud clusters can be marked into four types:deep-convective cloud(DCC),shallow-convective cloud(SCC),hybrid convective-stratiform cloud(HCS),and stratiform cloud(SFC)types.Each cloud cluster type is further identified as a core area and boundary area,which can provide more abundant cloud structure information.The algorithm is verified using the volume scan data observed with new-generation S-band weather radars in Nanjing,Xuzhou,and Qingdao.The results show that cloud clusters can be intuitively identified as core and boundary points,which change in area continuously during the process of convective evolution,by the improved DBSCAN algorithm.Therefore,the occurrence and disappearance of convective weather can be estimated in advance by observing the changes of the classification.Because density thresholds are different and multiple elevations are utilized in the 3D model,the identified echo types and areas are dissimilar between the 2D and 3D models.The 3D model identifies larger convective and stratiform clouds than the 2D model.However,the developing convective clouds of small areas at lower heights cannot be identified with the 3D model because they are covered by thick stratiform clouds.In addition,the 3D model can avoid the influence of the melting layer and better suggest convective clouds in the developmental stage.展开更多
With the increasing variety of application software of meteorological satellite ground system, how to provide reasonable hardware resources and improve the efficiency of software is paid more and more attention. In th...With the increasing variety of application software of meteorological satellite ground system, how to provide reasonable hardware resources and improve the efficiency of software is paid more and more attention. In this paper, a set of software classification method based on software operating characteristics is proposed. The method uses software run-time resource consumption to describe the software running characteristics. Firstly, principal component analysis (PCA) is used to reduce the dimension of software running feature data and to interpret software characteristic information. Then the modified K-means algorithm was used to classify the meteorological data processing software. Finally, it combined with the results of principal component analysis to explain the significance of various types of integrated software operating characteristics. And it is used as the basis for optimizing the allocation of software hardware resources and improving the efficiency of software operation.展开更多
Classification systems such as Slope Mass Rating(SMR) are currently being used to undertake slope stability analysis. In SMR classification system, data is allocated to certain classes based on linguistic and experien...Classification systems such as Slope Mass Rating(SMR) are currently being used to undertake slope stability analysis. In SMR classification system, data is allocated to certain classes based on linguistic and experience-based criteria. In order to eliminate linguistic criteria resulted from experience-based judgments and account for uncertainties in determining class boundaries developed by SMR system,the system classification results were corrected using two clustering algorithms, namely K-means and fuzzy c-means(FCM), for the ratings obtained via continuous and discrete functions. By applying clustering algorithms in SMR classification system, no in-advance experience-based judgment was made on the number of extracted classes in this system, and it was only after all steps of the clustering algorithms were accomplished that new classification scheme was proposed for SMR system under different failure modes based on the ratings obtained via continuous and discrete functions. The results of this study showed that, engineers can achieve more reliable and objective evaluations over slope stability by using SMR system based on the ratings calculated via continuous and discrete functions.展开更多
The K-means method is one of the most widely used clustering methods and has been implemented in many fields of science and technology. One of the major problems of the k-means algorithm is that it may produce empty c...The K-means method is one of the most widely used clustering methods and has been implemented in many fields of science and technology. One of the major problems of the k-means algorithm is that it may produce empty clusters depending on initial center vectors. Genetic Algorithms (GAs) are adaptive heuristic search algorithm based on the evolutionary principles of natural selection and genetics. This paper presents a hybrid version of the k-means algorithm with GAs that efficiently eliminates this empty cluster problem. Results of simulation experiments using several data sets prove our claim.展开更多
K-means algorithm is one of the most widely used algorithms in the clustering analysis. To deal with the problem caused by the random selection of initial center points in the traditional al- gorithm, this paper propo...K-means algorithm is one of the most widely used algorithms in the clustering analysis. To deal with the problem caused by the random selection of initial center points in the traditional al- gorithm, this paper proposes an improved K-means algorithm based on the similarity matrix. The im- proved algorithm can effectively avoid the random selection of initial center points, therefore it can provide effective initial points for clustering process, and reduce the fluctuation of clustering results which are resulted from initial points selections, thus a better clustering quality can be obtained. The experimental results also show that the F-measure of the improved K-means algorithm has been greatly improved and the clustering results are more stable.展开更多
Cluster analysis is one of the major data analysis methods widely used for many practical applications in emerging areas of data mining. A good clustering method will produce high quality clusters with high intra-clus...Cluster analysis is one of the major data analysis methods widely used for many practical applications in emerging areas of data mining. A good clustering method will produce high quality clusters with high intra-cluster similarity and low inter-cluster similarity. Clustering techniques are applied in different domains to predict future trends of available data and its uses for the real world. This research work is carried out to find the performance of two of the most delegated, partition based clustering algorithms namely k-Means and k-Medoids. A state of art analysis of these two algorithms is implemented and performance is analyzed based on their clustering result quality by means of its execution time and other components. Telecommunication data is the source data for this analysis. The connection oriented broadband data is given as input to find the clustering quality of the algorithms. Distance between the server locations and their connection is considered for clustering. Execution time for each algorithm is analyzed and the results are compared with one another. Results found in comparison study are satisfactory for the chosen application.展开更多
For the kernel K-mean cluster method is run in an implicit feature space, the initial and iterative cluster centers cannot be defined explicitly. Against the deficiency of the initial cluster centers selected in the o...For the kernel K-mean cluster method is run in an implicit feature space, the initial and iterative cluster centers cannot be defined explicitly. Against the deficiency of the initial cluster centers selected in the original space discretionarily in the existing methods, this paper proposes a new method for ensuring the clustering center that virtual clustering centers are defined in the feature space by the original classification as the initial cluster centers and the iteration clustering centers are ensured by the further virtual classification. The improved method is used for fault diagnosis of roller bearing that achieves a good cluster and diagnosis result, which demonstrates the effectiveness of the proposed method.展开更多
针对电池储能系统(battery energy storage system,BESS)进行光伏波动平抑时寿命损耗高及荷电状态(state of charge,SOC)一致性差的问题,提出了光伏波动平抑下改进K-means的BESS动态分组控制策略。首先,采用最小最大调度方法获取光伏并...针对电池储能系统(battery energy storage system,BESS)进行光伏波动平抑时寿命损耗高及荷电状态(state of charge,SOC)一致性差的问题,提出了光伏波动平抑下改进K-means的BESS动态分组控制策略。首先,采用最小最大调度方法获取光伏并网指令。其次,设计了改进侏儒猫鼬优化算法(improved dwarf mongoose optimizer,IDMO),并利用它对传统K-means聚类算法进行改进,加快了聚类速度。接着,制定了电池单元动态分组原则,并根据电池单元SOC利用改进K-means将其分为3个电池组。然后,设计了基于充放电函数的电池单元SOC一致性功率分配方法,并据此提出BESS双层功率分配策略,上层确定电池组充放电顺序及指令,下层计算电池单元充放电指令。对所提策略进行仿真验证,结果表明,所设计的IDMO具有更高的寻优精度及更快的寻优速度。所提BESS平抑光伏波动策略在有效平抑波动的同时,降低了BESS运行寿命损耗并提高了电池单元SOC的均衡性。展开更多
受限于自然条件,光伏出力具有很强的随机性。为准确评估轨道交通基础设施分布式光伏发电的光伏出力特性,提出一种基于改进K-means聚类算法的轨道交通基础设施分布式光伏发电典型场景生成方法,并基于此进行光伏出力特性分析。首先,基于...受限于自然条件,光伏出力具有很强的随机性。为准确评估轨道交通基础设施分布式光伏发电的光伏出力特性,提出一种基于改进K-means聚类算法的轨道交通基础设施分布式光伏发电典型场景生成方法,并基于此进行光伏出力特性分析。首先,基于分布式光伏发电设施以及气象数据,利用PVsyst软件模拟光伏发电出力数据。然后,针对基本K-means聚类算法聚类参数和初始聚类中心盲目性高的问题,结合聚类有效性指标(Density based index,DBI)和层次聚类对其进行改进并利用改进K-means聚类算法生成光伏典型日出力场景。最后,基于华中地区某地轨道交通基础设施分布式光伏系统对所提方法的有效性和优越性进行验证,并通过定性和定量分析各典型场景的出力特性揭示轨道交通基础设施分布式光伏出力的规律和特点。展开更多
As the cash register system gradually prevailed in shopping malls, detecting the abnormal status of the cash register system has gradually become a hotspot issue. This paper analyzes the transaction data of a shopping...As the cash register system gradually prevailed in shopping malls, detecting the abnormal status of the cash register system has gradually become a hotspot issue. This paper analyzes the transaction data of a shopping mall. When calculating the degree of data difference, the coefficient of variation is used as the attribute weight;the weighted Euclidean distance is used to calculate the degree of difference;and k-means clustering is used to classify different time periods. It applies the LOF algorithm to detect the outlier degree of transaction data at each time period, sets the initial threshold to detect outliers, deletes the outliers, and then performs SAX detection on the data set. If it does not pass the test, then it will gradually expand the outlying domain and repeat the above process to optimize the outlier threshold to improve the sensitivity of detection algorithm and reduce false positives.展开更多
Most clustering algorithms need to describe the similarity of objects by a predefined distance function. Three distance functions which are widely used in two traditional clustering algorithms k-means and hierarchical...Most clustering algorithms need to describe the similarity of objects by a predefined distance function. Three distance functions which are widely used in two traditional clustering algorithms k-means and hierarchical clustering were investigated. Both theoretical analysis and detailed experimental results were given. It is shown that a distance function greatly affects clustering results and can be used to detect the outlier of a cluster by the comparison of such different results and give the shape information of clusters. In practice situation, it is suggested to use different distance function separately, compare the clustering results and pick out the 搒wing points? And such points may leak out more information for data analysts.展开更多
基金The National Natural Science Foundation of China(No50674086)Specialized Research Fund for the Doctoral Program of Higher Education (No20060290508)the Youth Scientific Research Foundation of China University of Mining and Technology (No2006A047)
文摘In allusion to the disadvantage of having to obtain the number of clusters of data sets in advance and the sensitivity to selecting initial clustering centers in the k-means algorithm, an improved k-means clustering algorithm is proposed. First, the concept of a silhouette coefficient is introduced, and the optimal clustering number Kopt of a data set with unknown class information is confirmed by calculating the silhouette coefficient of objects in clusters under different K values. Then the distribution of the data set is obtained through hierarchical clustering and the initial clustering-centers are confirmed. Finally, the clustering is completed by the traditional k-means clustering. By the theoretical analysis, it is proved that the improved k-means clustering algorithm has proper computational complexity. The experimental results of IRIS testing data set show that the algorithm can distinguish different clusters reasonably and recognize the outliers efficiently, and the entropy generated by the algorithm is lower.
文摘Reliable Cluster Head(CH)selectionbased routing protocols are necessary for increasing the packet transmission efficiency with optimal path discovery that never introduces degradation over the transmission reliability.In this paper,Hybrid Golden Jackal,and Improved Whale Optimization Algorithm(HGJIWOA)is proposed as an effective and optimal routing protocol that guarantees efficient routing of data packets in the established between the CHs and the movable sink.This HGJIWOA included the phases of Dynamic Lens-Imaging Learning Strategy and Novel Update Rules for determining the reliable route essential for data packets broadcasting attained through fitness measure estimation-based CH selection.The process of CH selection achieved using Golden Jackal Optimization Algorithm(GJOA)completely depends on the factors of maintainability,consistency,trust,delay,and energy.The adopted GJOA algorithm play a dominant role in determining the optimal path of routing depending on the parameter of reduced delay and minimal distance.It further utilized Improved Whale Optimisation Algorithm(IWOA)for forwarding the data from chosen CHs to the BS via optimized route depending on the parameters of energy and distance.It also included a reliable route maintenance process that aids in deciding the selected route through which data need to be transmitted or re-routed.The simulation outcomes of the proposed HGJIWOA mechanism with different sensor nodes confirmed an improved mean throughput of 18.21%,sustained residual energy of 19.64%with minimized end-to-end delay of 21.82%,better than the competitive CH selection approaches.
基金supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project number(PNURSP2023R104)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.
文摘Several pests feed on leaves,stems,bases,and the entire plant,causing plant illnesses.As a result,it is vital to identify and eliminate the disease before causing any damage to plants.Manually detecting plant disease and treating it is pretty challenging in this period.Image processing is employed to detect plant disease since it requires much effort and an extended processing period.The main goal of this study is to discover the disease that affects the plants by creating an image processing system that can recognize and classify four different forms of plant diseases,including Phytophthora infestans,Fusarium graminearum,Puccinia graminis,tomato yellow leaf curl.Therefore,this work uses the Support vector machine(SVM)classifier to detect and classify the plant disease using various steps like image acquisition,Pre-processing,Segmentation,feature extraction,and classification.The gray level co-occurrence matrix(GLCM)and the local binary pattern features(LBP)are used to identify the disease-affected portion of the plant leaf.According to experimental data,the proposed technology can correctly detect and diagnose plant sickness with a 97.2 percent accuracy.
文摘The K-means algorithm is widely known for its simplicity and fastness in text clustering.However,the selection of the initial clus?tering center with the traditional K-means algorithm is some random,and therefore,the fluctuations and instability of the clustering results are strongly affected by the initial clustering center.This paper proposed an algorithm to select the initial clustering center to eliminate the uncertainty of central point selection.The experiment results show that the improved K-means clustering algorithm is superior to the traditional algorithm.
基金the National Natural Science Foundation of China(Grant No.62101579).
文摘Offboard active decoys(OADs)can effectively jam monopulse radars.However,for missiles approaching from a particular direction and distance,the OAD should be placed at a specific location,posing high requirements for timing and deployment.To improve the response speed and jamming effect,a cluster of OADs based on an unmanned surface vehicle(USV)is proposed.The formation of the cluster determines the effectiveness of jamming.First,based on the mechanism of OAD jamming,critical conditions are identified,and a method for assessing the jamming effect is proposed.Then,for the optimization of the cluster formation,a mathematical model is built,and a multi-tribe adaptive particle swarm optimization algorithm based on mutation strategy and Metropolis criterion(3M-APSO)is designed.Finally,the formation optimization problem is solved and analyzed using the 3M-APSO algorithm under specific scenarios.The results show that the improved algorithm has a faster convergence rate and superior performance as compared to the standard Adaptive-PSO algorithm.Compared with a single OAD,the optimal formation of USV-OAD cluster effectively fills the blind area and maximizes the use of jamming resources.
文摘In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. In this paper, we present a simple and efficient clustering algorithm based on the k-means algorithm, which we call enhanced k-means algorithm. This algorithm is easy to implement, requiring a simple data structure to keep some information in each iteration to be used in the next iteration. Our experimental results demonstrated that our scheme can improve the computational speed of the k-means algorithm by the magnitude in the total number of distance calculations and the overall time of computation.
基金funded by the Key-Area Research and Development Program of Guangdong Province(Grant No.2020B1111200001)the Key project of monitoring,early warning and prevention of major natural disasters of China(Grant No.2019YFC1510304)+1 种基金the S&T Program of Hebei(Grant No.19275408D)the Scientific Research Projects of Weather Modification in Northwest China(Grant No.RYSY201905).
文摘A convective and stratiform cloud classification method for weather radar is proposed based on the density-based spatial clustering of applications with noise(DBSCAN)algorithm.To identify convective and stratiform clouds in different developmental phases,two-dimensional(2D)and three-dimensional(3D)models are proposed by applying reflectivity factors at 0.5°and at 0.5°,1.5°,and 2.4°elevation angles,respectively.According to the thresholds of the algorithm,which include echo intensity,the echo top height of 35 dBZ(ET),density threshold,andεneighborhood,cloud clusters can be marked into four types:deep-convective cloud(DCC),shallow-convective cloud(SCC),hybrid convective-stratiform cloud(HCS),and stratiform cloud(SFC)types.Each cloud cluster type is further identified as a core area and boundary area,which can provide more abundant cloud structure information.The algorithm is verified using the volume scan data observed with new-generation S-band weather radars in Nanjing,Xuzhou,and Qingdao.The results show that cloud clusters can be intuitively identified as core and boundary points,which change in area continuously during the process of convective evolution,by the improved DBSCAN algorithm.Therefore,the occurrence and disappearance of convective weather can be estimated in advance by observing the changes of the classification.Because density thresholds are different and multiple elevations are utilized in the 3D model,the identified echo types and areas are dissimilar between the 2D and 3D models.The 3D model identifies larger convective and stratiform clouds than the 2D model.However,the developing convective clouds of small areas at lower heights cannot be identified with the 3D model because they are covered by thick stratiform clouds.In addition,the 3D model can avoid the influence of the melting layer and better suggest convective clouds in the developmental stage.
文摘With the increasing variety of application software of meteorological satellite ground system, how to provide reasonable hardware resources and improve the efficiency of software is paid more and more attention. In this paper, a set of software classification method based on software operating characteristics is proposed. The method uses software run-time resource consumption to describe the software running characteristics. Firstly, principal component analysis (PCA) is used to reduce the dimension of software running feature data and to interpret software characteristic information. Then the modified K-means algorithm was used to classify the meteorological data processing software. Finally, it combined with the results of principal component analysis to explain the significance of various types of integrated software operating characteristics. And it is used as the basis for optimizing the allocation of software hardware resources and improving the efficiency of software operation.
文摘Classification systems such as Slope Mass Rating(SMR) are currently being used to undertake slope stability analysis. In SMR classification system, data is allocated to certain classes based on linguistic and experience-based criteria. In order to eliminate linguistic criteria resulted from experience-based judgments and account for uncertainties in determining class boundaries developed by SMR system,the system classification results were corrected using two clustering algorithms, namely K-means and fuzzy c-means(FCM), for the ratings obtained via continuous and discrete functions. By applying clustering algorithms in SMR classification system, no in-advance experience-based judgment was made on the number of extracted classes in this system, and it was only after all steps of the clustering algorithms were accomplished that new classification scheme was proposed for SMR system under different failure modes based on the ratings obtained via continuous and discrete functions. The results of this study showed that, engineers can achieve more reliable and objective evaluations over slope stability by using SMR system based on the ratings calculated via continuous and discrete functions.
文摘The K-means method is one of the most widely used clustering methods and has been implemented in many fields of science and technology. One of the major problems of the k-means algorithm is that it may produce empty clusters depending on initial center vectors. Genetic Algorithms (GAs) are adaptive heuristic search algorithm based on the evolutionary principles of natural selection and genetics. This paper presents a hybrid version of the k-means algorithm with GAs that efficiently eliminates this empty cluster problem. Results of simulation experiments using several data sets prove our claim.
文摘K-means algorithm is one of the most widely used algorithms in the clustering analysis. To deal with the problem caused by the random selection of initial center points in the traditional al- gorithm, this paper proposes an improved K-means algorithm based on the similarity matrix. The im- proved algorithm can effectively avoid the random selection of initial center points, therefore it can provide effective initial points for clustering process, and reduce the fluctuation of clustering results which are resulted from initial points selections, thus a better clustering quality can be obtained. The experimental results also show that the F-measure of the improved K-means algorithm has been greatly improved and the clustering results are more stable.
文摘Cluster analysis is one of the major data analysis methods widely used for many practical applications in emerging areas of data mining. A good clustering method will produce high quality clusters with high intra-cluster similarity and low inter-cluster similarity. Clustering techniques are applied in different domains to predict future trends of available data and its uses for the real world. This research work is carried out to find the performance of two of the most delegated, partition based clustering algorithms namely k-Means and k-Medoids. A state of art analysis of these two algorithms is implemented and performance is analyzed based on their clustering result quality by means of its execution time and other components. Telecommunication data is the source data for this analysis. The connection oriented broadband data is given as input to find the clustering quality of the algorithms. Distance between the server locations and their connection is considered for clustering. Execution time for each algorithm is analyzed and the results are compared with one another. Results found in comparison study are satisfactory for the chosen application.
文摘For the kernel K-mean cluster method is run in an implicit feature space, the initial and iterative cluster centers cannot be defined explicitly. Against the deficiency of the initial cluster centers selected in the original space discretionarily in the existing methods, this paper proposes a new method for ensuring the clustering center that virtual clustering centers are defined in the feature space by the original classification as the initial cluster centers and the iteration clustering centers are ensured by the further virtual classification. The improved method is used for fault diagnosis of roller bearing that achieves a good cluster and diagnosis result, which demonstrates the effectiveness of the proposed method.
文摘针对电池储能系统(battery energy storage system,BESS)进行光伏波动平抑时寿命损耗高及荷电状态(state of charge,SOC)一致性差的问题,提出了光伏波动平抑下改进K-means的BESS动态分组控制策略。首先,采用最小最大调度方法获取光伏并网指令。其次,设计了改进侏儒猫鼬优化算法(improved dwarf mongoose optimizer,IDMO),并利用它对传统K-means聚类算法进行改进,加快了聚类速度。接着,制定了电池单元动态分组原则,并根据电池单元SOC利用改进K-means将其分为3个电池组。然后,设计了基于充放电函数的电池单元SOC一致性功率分配方法,并据此提出BESS双层功率分配策略,上层确定电池组充放电顺序及指令,下层计算电池单元充放电指令。对所提策略进行仿真验证,结果表明,所设计的IDMO具有更高的寻优精度及更快的寻优速度。所提BESS平抑光伏波动策略在有效平抑波动的同时,降低了BESS运行寿命损耗并提高了电池单元SOC的均衡性。
文摘受限于自然条件,光伏出力具有很强的随机性。为准确评估轨道交通基础设施分布式光伏发电的光伏出力特性,提出一种基于改进K-means聚类算法的轨道交通基础设施分布式光伏发电典型场景生成方法,并基于此进行光伏出力特性分析。首先,基于分布式光伏发电设施以及气象数据,利用PVsyst软件模拟光伏发电出力数据。然后,针对基本K-means聚类算法聚类参数和初始聚类中心盲目性高的问题,结合聚类有效性指标(Density based index,DBI)和层次聚类对其进行改进并利用改进K-means聚类算法生成光伏典型日出力场景。最后,基于华中地区某地轨道交通基础设施分布式光伏系统对所提方法的有效性和优越性进行验证,并通过定性和定量分析各典型场景的出力特性揭示轨道交通基础设施分布式光伏出力的规律和特点。
文摘As the cash register system gradually prevailed in shopping malls, detecting the abnormal status of the cash register system has gradually become a hotspot issue. This paper analyzes the transaction data of a shopping mall. When calculating the degree of data difference, the coefficient of variation is used as the attribute weight;the weighted Euclidean distance is used to calculate the degree of difference;and k-means clustering is used to classify different time periods. It applies the LOF algorithm to detect the outlier degree of transaction data at each time period, sets the initial threshold to detect outliers, deletes the outliers, and then performs SAX detection on the data set. If it does not pass the test, then it will gradually expand the outlying domain and repeat the above process to optimize the outlier threshold to improve the sensitivity of detection algorithm and reduce false positives.
文摘Most clustering algorithms need to describe the similarity of objects by a predefined distance function. Three distance functions which are widely used in two traditional clustering algorithms k-means and hierarchical clustering were investigated. Both theoretical analysis and detailed experimental results were given. It is shown that a distance function greatly affects clustering results and can be used to detect the outlier of a cluster by the comparison of such different results and give the shape information of clusters. In practice situation, it is suggested to use different distance function separately, compare the clustering results and pick out the 搒wing points? And such points may leak out more information for data analysts.