Funding: Dr. Steve Jones, Scientific Advisor of the Canon Foundation for Scientific Research (7200 The Quorum, Oxford Business Park, Oxford OX4 2JZ, England). The Canon Foundation for Scientific Research funded the corresponding author's UPC 2013 tuition fees while she was writing this article.
Abstract: In computational physics, proton transfer phenomena can be viewed as pattern classification problems in which a set of input features is used to classify the proton motion into two categories: transfer 'occurred' and transfer 'not occurred'. The goal of this paper is to evaluate the use of artificial neural networks in the classification of proton transfer events, using a feed-forward back-propagation neural network as a classifier to distinguish between the two transfer cases. We use a newly developed data mining and pattern recognition tool for automating, controlling, and charting the output data of an existing Empirical Valence Bond code. The study analyzes the need for pattern recognition in aqueous proton transfer processes and how the error back-propagation learning approach (multilayer perceptron algorithms) can be satisfactorily employed in the present case. We present a tool for pattern recognition and validate the code on a real physical case study. The results of applying the artificial neural network methodology to crowd patterns based upon selected physical properties (e.g., temperature, density) show the ability of the network to learn proton transfer patterns corresponding to properties of the aqueous environments, which in turn proves to be fully compatible with previous proton transfer studies.
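As a rough illustration of the feed-forward back-propagation classifier this abstract describes, the sketch below trains a tiny multilayer perceptron on two normalized features and outputs the probability of the 'occurred' class. It is not the authors' tool: the network size, learning rate, the cross-entropy loss, the function names `train_mlp` and `predict`, and the toy data are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w_hid, w_out, x):
    """Forward pass: returns hidden activations and the output probability."""
    h = [sigmoid(sum(w * xi for w, xi in zip(wh[:-1], x)) + wh[-1]) for wh in w_hid]
    out = sigmoid(sum(w * hi for w, hi in zip(w_out[:-1], h)) + w_out[-1])
    return h, out


def train_mlp(samples, labels, n_hidden=3, epochs=2000, lr=0.5, seed=0):
    """Train a one-hidden-layer perceptron with per-sample back-propagation."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    # Each weight vector stores its bias as the last element.
    w_hid = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            h, out = forward(w_hid, w_out, x)
            # Sigmoid output with cross-entropy loss gives this output error.
            d_out = out - y
            # Propagate the error back through the hidden sigmoid units.
            d_hid = [d_out * w_out[j] * h[j] * (1.0 - h[j]) for j in range(n_hidden)]
            for j in range(n_hidden):
                w_out[j] -= lr * d_out * h[j]
                for i in range(n_in):
                    w_hid[j][i] -= lr * d_hid[j] * x[i]
                w_hid[j][-1] -= lr * d_hid[j]
            w_out[-1] -= lr * d_out
    return w_hid, w_out


def predict(model, x):
    """Probability that the sample belongs to the 'occurred' class."""
    w_hid, w_out = model
    return forward(w_hid, w_out, x)[1]


# Hypothetical normalized (temperature, density) features; 1 = transfer occurred.
X = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
y = [0, 0, 1, 1]
model = train_mlp(X, y)
print([round(predict(model, x), 3) for x in X])
```

A real study would use features extracted from the simulation output and hold out part of the data to check generalization.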
Funding: Supported by the National Grand Fundamental Research "973" Program of China (2004CB318109), the National High-Technology Research and Development Plan of China (2006AA01Z452), and the National Information Security "242" Program of China (2005C39).
Abstract: Anomaly detection has been an active research topic in the field of network intrusion detection for many years. A novel method is presented for anomaly detection based on system calls into the kernels of Unix or Linux systems. The method uses a data mining technique to model the normal behavior of a privileged program and a variable-length pattern matching algorithm to compare the current behavior with the historic normal behavior, which is more suitable for this problem than the fixed-length pattern matching algorithm proposed by Forrest et al. At the detection stage, the particularity of the audit data is taken into account, and two alternative schemes can be used to distinguish between normal behavior and intrusions. The method attends to both computational efficiency and detection accuracy and is especially applicable to on-line detection. The performance of the method is evaluated using a typical testing data set, and the results show that it is significantly better than the anomaly detection method based on hidden Markov models proposed by Yan et al. and the method based on fixed-length patterns proposed by Forrest and Hofmeyr. The novel method has been applied to practical host-based intrusion detection systems and has achieved high detection performance.
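The idea of matching a system-call trace against variable-length normal patterns can be sketched as follows. This is a simplified illustration, not the algorithm from the paper: the library construction (all subsequences of length 2 to 4 seen at least twice), the greedy longest-match scan, and the names `build_pattern_library` and `mismatch_rate` are assumptions chosen for the example.

```python
from collections import Counter


def build_pattern_library(traces, min_len=2, max_len=4, min_count=2):
    """Collect contiguous call sequences of several lengths that recur in normal traces."""
    counts = Counter()
    for trace in traces:
        for n in range(min_len, max_len + 1):
            for i in range(len(trace) - n + 1):
                counts[tuple(trace[i:i + n])] += 1
    return {pattern for pattern, c in counts.items() if c >= min_count}


def mismatch_rate(trace, library, min_len=2, max_len=4):
    """Greedily cover the trace with the longest known pattern at each position.

    A position where no library pattern starts counts as one mismatch.
    """
    i = 0
    steps = 0
    mismatches = 0
    while i < len(trace):
        steps += 1
        for n in range(max_len, min_len - 1, -1):
            if tuple(trace[i:i + n]) in library:
                i += n  # matched a normal pattern of length n
                break
        else:
            mismatches += 1  # no normal pattern starts here
            i += 1
    return mismatches / steps if steps else 0.0


# Hypothetical traces of a privileged program.
normal_traces = [["open", "read", "write", "close"]] * 3
library = build_pattern_library(normal_traces)
print(mismatch_rate(["open", "read", "write", "close"], library))
print(mismatch_rate(["open", "exec", "socket", "close"], library))
```

A detector built this way would flag a trace when its mismatch rate exceeds a threshold tuned on normal data.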
Funding: Supported by Projects of International Cooperation and Exchange of the National Natural Science Foundation of China (51561135003), the Key Project of the National Natural Science Foundation of China (51338003), and the Scientific Research Foundation of the Graduate School of Southeast University (YBJJ1842).
Abstract: To explore the travel characteristics and space-time distribution of different groups of bikeshare users, an online analytical processing (OLAP) tool called a data cube was used for processing and displaying multi-dimensional data. We extended the traditional three-dimensional data cube to four dimensions (space, date, time, and user), each with a user-specified hierarchy, and took transaction numbers and travel time as two quantitative measures. The results suggest that there are two obvious transaction peaks during the morning and afternoon rush hours on weekdays, while the volume at weekends is approximately evenly distributed. Bad weather significantly restricts bikeshare usage. In addition, seamless smartcard users generally take longer trips than exclusive smartcard users, and non-native users ride faster than native users. These findings not only support the applicability and efficiency of the data cube for visualizing massive smartcard data, but also raise equity concerns among bikeshare users with different demographic backgrounds.
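The four-dimensional cube with two measures can be illustrated with a small roll-up over in-memory records. This is a minimal sketch, not the tool used in the study: the record fields, the `roll_up` function, and the sample trips are hypothetical, and dimension hierarchies are left out.

```python
from collections import defaultdict

# The four dimensions named in the abstract.
DIMENSIONS = ("space", "date", "time", "user")


def roll_up(records, keep):
    """Aggregate the two measures over every dimension not listed in `keep`."""
    unknown = set(keep) - set(DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    cube = defaultdict(lambda: {"transactions": 0, "travel_minutes": 0.0})
    for rec in records:
        cell = cube[tuple(rec[d] for d in keep)]
        cell["transactions"] += 1
        cell["travel_minutes"] += rec["travel_minutes"]
    return dict(cube)


# Hypothetical smartcard transactions.
trips = [
    {"space": "Station A", "date": "weekday", "time": "08:00", "user": "seamless", "travel_minutes": 14.0},
    {"space": "Station A", "date": "weekday", "time": "08:00", "user": "exclusive", "travel_minutes": 9.0},
    {"space": "Station B", "date": "weekend", "time": "15:00", "user": "seamless", "travel_minutes": 22.0},
]

by_user = roll_up(trips, keep=("user",))
for key, cell in sorted(by_user.items()):
    mean = cell["travel_minutes"] / cell["transactions"]
    print(key, cell["transactions"], round(mean, 1))
```

Passing a different `keep` tuple, such as `("date", "time")`, gives the slice needed to look at peak-hour volumes.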
Funding: Supported by the Fundamental Research Funds for the Central Universities (No. 2018XD004).
Abstract: The distance-based outlier detection method detects implied outliers by calculating the distances between points in the dataset, but its computational complexity is particularly high when processing multidimensional datasets. In addition, the traditional outlier detection method does not consider how frequently subsets occur, so the detected outliers do not fit the definition of outliers (i.e., rarely appearing). Pattern mining-based outlier detection approaches have solved this problem, but they do not take the importance of each pattern into account, so the detected outliers may not reflect the actual situation. To address these problems, a two-phase outlier detection approach based on minimal weighted rare pattern mining, called MWRPM-Outlier, is proposed to effectively detect outliers in weighted data streams. In particular, a method called MWRPM is proposed in the pattern mining phase to quickly mine the minimal weighted rare patterns, and two deviation factors are then defined in the outlier detection phase to measure the degree of abnormality of each transaction in the weighted data stream. Experimental results show that the proposed MWRPM-Outlier approach performs very well in outlier detection and that the MWRPM approach performs better in weighted rare pattern mining.
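To make the notion of a minimal weighted rare pattern concrete, the brute-force sketch below enumerates small itemsets and keeps those that are rare while none of their proper subsets is. The abstract does not give the paper's definitions, so everything here is an assumption: weighted support is taken as support multiplied by the mean item weight, a pattern is rare when that value falls below a threshold, and the function names are hypothetical. It does not reproduce the MWRPM algorithm or its stream handling.

```python
from itertools import combinations


def weighted_support(itemset, transactions, weights):
    """Assumed definition: relative support times the mean weight of the items."""
    support = sum(1 for t in transactions if itemset <= t) / len(transactions)
    mean_weight = sum(weights[i] for i in itemset) / len(itemset)
    return support * mean_weight


def minimal_weighted_rare_patterns(transactions, weights, threshold, max_size=3):
    """Return rare itemsets (up to max_size) that have no rare proper subset."""
    items = sorted(set().union(*transactions))
    minimal = []
    for size in range(1, max_size + 1):  # smaller itemsets are examined first
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            # Any rare proper subset is, or contains, an already found minimal pattern.
            if any(m < candidate for m in minimal):
                continue
            if weighted_support(candidate, transactions, weights) < threshold:
                minimal.append(candidate)
    return minimal


# Hypothetical window of a weighted data stream.
window = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"a", "b"}]
item_weights = {"a": 1.0, "b": 0.9, "c": 0.8}
print(minimal_weighted_rare_patterns(window, item_weights, threshold=0.5))
```

A transaction containing many such patterns would then receive a high deviation score in the detection phase.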
Abstract: Based on night light data, urban area data, and economic data for the Wuhan Urban Agglomeration from 2009 to 2015, we use the spatial correlation dimension, spatial autocorrelation analysis, and the weighted standard deviation ellipse to identify the general characteristics and dynamic evolution of the urban spatial pattern and the economic disparity pattern. The results show that between 2009 and 2013 the Wuhan Urban Agglomeration expanded gradually from northwest to southeast and evolved “along the river and the road”. The spatial structure is clear, forming a “core-periphery” pattern. The development of the Wuhan Urban Agglomeration is clearly unbalanced in economic geographic space, following the tendency of “one prominent, stronger in the west and weaker in the east”. The contrast within the Wuhan Urban Agglomeration is gradually decreasing. Wuhan city and its surrounding areas, as well as the cities along the Yangtze River, show stronger economic growth. However, the relative development rate of the Wuhan city area is still far higher than that of other cities and counties.
Funding: Supported by the National Key Program for Developing Basic Sciences under Grant 2012CB417202; the National Natural Science Foundation of China under Grants No. 41175038, No. 41305088, and No. 41075023; the Meteorological Special Project "Radar network observation technology and QC"; the CMA Key Project "Radar Operational Software Engineering"; the Chinese Academy of Meteorological Sciences Basic Scientific Operational Projects "Observation and retrieval methods of micro-physics and dynamic parameters of cloud and precipitation with multi-wavelength Remote Sensing"; and the Project of the State Key Laboratory of Severe Weather under Grant 2012LASW-B04.
Abstract: A variety of faulty radar echoes may cause serious problems in radar data applications, especially radar data assimilation and quantitative precipitation estimation. In this study, the "test pattern" caused by test signals or radar hardware failures in CINRAD (China New Generation Weather Radar) SA and SB operational observations is investigated. To distinguish the test pattern from other types of radar echoes, such as precipitation, clear air, and other non-meteorological echoes, five feature parameters are proposed: the effective reflectivity data percentage (Rz), the velocity RF (range folding) data percentage (RRF), the missing velocity data percentage (RM), the averaged along-azimuth reflectivity fluctuation (RNr,z), and the averaged along-beam reflectivity fluctuation (RNa,z). Based on the fuzzy logic method, a test pattern identification algorithm is developed, and its performance is assessed with statistics over all the different kinds of radar echoes. Two typical cases with heavy precipitation echoes located inside the test pattern are also analyzed. The statistical results show that the identification algorithm performs well, since the test pattern is recognized in most cases. In addition, the algorithm can effectively remove the test pattern signal while retaining strong precipitation echoes in heavy rainfall events.
Abstract: Cognitive computing and artificial intelligence (AI) have changed how organizations analyze and use data for decision-making. Cognitive computing solutions can translate enormous amounts of data into valuable insights by using cutting-edge algorithms and machine learning, empowering enterprises to make sound decisions quickly and efficiently. This article explores the idea of cognitive computing and AI in decision-making, emphasizing its function in converting unvalued data into valuable knowledge. It details the advantages of these technologies, such as greater productivity, accuracy, and efficiency. By understanding the capabilities and possibilities of cognitive computing and AI, businesses may use them to obtain a competitive edge in today’s data-driven world [1].
Abstract: In crime science, understanding the dynamics of and interactions between crime events is crucial for comprehending the underlying factors that drive their occurrence. Nonetheless, access to detailed spatiotemporal crime records from law enforcement is severely restricted by confidentiality concerns. In response, this paper introduces an analytical tool named “stppSim,” designed to synthesize fine-grained spatiotemporal point records while safeguarding the privacy of individual locations. Because it is built on the open-source R platform, the tool is easily accessible to researchers, facilitating download, re-use, and potential advancements in research domains beyond crime science.
Abstract: Maximum frequent pattern generation from a large database of transactions and items for association rule mining is an important research topic in data mining. Association rule mining aims to discover interesting correlations, frequent patterns, associations, or causal structures between items hidden in a large database. By exploiting quantum computing, we propose an efficient quantum search algorithm design to discover the maximum frequent patterns. We modify Grover’s search algorithm so that a subspace of arbitrary symmetric states is used instead of the whole search space. We present a novel quantum oracle design that employs a quantum counter to count the maximum frequent items and a quantum comparator to check the count against a minimum support threshold. The proposed algorithm increases the rate of correct solutions, since the search is confined to a subspace. Furthermore, our algorithm significantly reduces the number of qubits required by the design, which directly improves performance. The proposed design can accommodate more transactions and items and still perform well with a small number of qubits.
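For orientation, the sketch below simulates the standard, unmodified Grover iteration on a classical state vector: an oracle flips the sign of the marked amplitudes, and the diffusion step reflects all amplitudes about their mean. It does not implement the subspace restriction, the quantum counter, or the comparator oracle proposed in the paper. The function name `grover_probabilities` and the choice of marked index are assumptions for the example.

```python
import math


def grover_probabilities(n_qubits, marked, iterations=None):
    """Simulate standard Grover search and return the measurement probabilities."""
    n_states = 2 ** n_qubits
    # Start in the uniform superposition over the whole search space.
    amp = [1.0 / math.sqrt(n_states)] * n_states
    if iterations is None:
        # Usual choice: about (pi / 4) * sqrt(N / M) iterations for M marked states.
        iterations = int(math.pi / 4 * math.sqrt(n_states / len(marked)))
    for _ in range(iterations):
        # Oracle: flip the sign of every marked amplitude.
        for m in marked:
            amp[m] = -amp[m]
        # Diffusion: reflect each amplitude about the mean amplitude.
        mean = sum(amp) / n_states
        amp = [2.0 * mean - a for a in amp]
    return [a * a for a in amp]


# Hypothetical search over 8 candidate itemsets with one solution at index 5.
probs = grover_probabilities(3, marked={5})
print(round(probs[5], 4))  # → 0.9453
```

Restricting the initial superposition to a smaller subspace, as the abstract describes, shrinks the effective search space and therefore the number of iterations needed.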