This study addressed the issues related to the collection and management of basic data for railway green performance. A railway green performance basic database has been constructed based on metadata and data exchange...This study addressed the issues related to the collection and management of basic data for railway green performance. A railway green performance basic database has been constructed based on metadata and data exchange schemas. A data classification system has been established from the perspectives of businesses, processes,and entities. A BIM(Building Information Modelling) model data extraction scheme is proposed based on field similarity matching and a document content extraction scheme is proposed based on image recognition. A railway green performance basic data collection system has been developed, achieving efficient collection and integrated management of railway green performance basic data. This system can provide data support for applications such as railway carbon emissions accounting, green cost-benefit analysis, and evaluation of green design solutions.展开更多
A book entitled BASIC DATA OF CHINA'S POPULATION edited by the Data User Service(DUS),a UNFPA-funded project implemented by the China Population Information and Research Centre(CPIRC),has recently been published b...A book entitled BASIC DATA OF CHINA'S POPULATION edited by the Data User Service(DUS),a UNFPA-funded project implemented by the China Population Information and Research Centre(CPIRC),has recently been published by the China Population Publishing House.展开更多
In the era of big data, data application based on data governance has become an inevitable trend in the construction of smart campus in higher education. In this paper, a set of data governance system framework coveri...In the era of big data, data application based on data governance has become an inevitable trend in the construction of smart campus in higher education. In this paper, a set of data governance system framework covering the whole life cycle of data suitable for higher education is proposed, and based on this, the ideas and methods of data governance are applied to the construction of data management system for the basic development status of faculties by combining the practice of data governance of Donghua University.It forms a closed-loop management of data in all aspects, such as collection, information feedback, and statistical analysis of the basic development status data of the college. While optimizing the management business of higher education, the system provides a scientific and reliable basis for precise decision-making and strategic development of higher education.展开更多
With the rapid development of information technology,IoT devices play a huge role in physiological health data detection.The exponential growth of medical data requires us to reasonably allocate storage space for clou...With the rapid development of information technology,IoT devices play a huge role in physiological health data detection.The exponential growth of medical data requires us to reasonably allocate storage space for cloud servers and edge nodes.The storage capacity of edge nodes close to users is limited.We should store hotspot data in edge nodes as much as possible,so as to ensure response timeliness and access hit rate;However,the current scheme cannot guarantee that every sub-message in a complete data stored by the edge node meets the requirements of hot data;How to complete the detection and deletion of redundant data in edge nodes under the premise of protecting user privacy and data dynamic integrity has become a challenging problem.Our paper proposes a redundant data detection method that meets the privacy protection requirements.By scanning the cipher text,it is determined whether each sub-message of the data in the edge node meets the requirements of the hot data.It has the same effect as zero-knowledge proof,and it will not reveal the privacy of users.In addition,for redundant sub-data that does not meet the requirements of hot data,our paper proposes a redundant data deletion scheme that meets the dynamic integrity of the data.We use Content Extraction Signature(CES)to generate the remaining hot data signature after the redundant data is deleted.The feasibility of the scheme is proved through safety analysis and efficiency analysis.展开更多
Missing value is one of the main factors that cause dirty data.Without high-quality data,there will be no reliable analysis results and precise decision-making.Therefore,the data warehouse needs to integrate high-qual...Missing value is one of the main factors that cause dirty data.Without high-quality data,there will be no reliable analysis results and precise decision-making.Therefore,the data warehouse needs to integrate high-quality data consistently.In the power system,the electricity consumption data of some large users cannot be normally collected resulting in missing data,which affects the calculation of power supply and eventually leads to a large error in the daily power line loss rate.For the problem of missing electricity consumption data,this study proposes a group method of data handling(GMDH)based data interpolation method in distribution power networks and applies it in the analysis of actually collected electricity data.First,the dependent and independent variables are defined from the original data,and the upper and lower limits of missing values are determined according to prior knowledge or existing data information.All missing data are randomly interpolated within the upper and lower limits.Then,the GMDH network is established to obtain the optimal complexity model,which is used to predict the missing data to replace the last imputed electricity consumption data.At last,this process is implemented iteratively until the missing values do not change.Under a relatively small noise level(α=0.25),the proposed approach achieves a maximum error of no more than 0.605%.Experimental findings demonstrate the efficacy and feasibility of the proposed approach,which realizes the transformation from incomplete data to complete data.Also,this proposed data interpolation approach provides a strong basis for the electricity theft diagnosis and metering fault analysis of electricity enterprises.展开更多
Time-series data provide important information in many fields,and their processing and analysis have been the focus of much research.However,detecting anomalies is very difficult due to data imbalance,temporal depende...Time-series data provide important information in many fields,and their processing and analysis have been the focus of much research.However,detecting anomalies is very difficult due to data imbalance,temporal dependence,and noise.Therefore,methodologies for data augmentation and conversion of time series data into images for analysis have been studied.This paper proposes a fault detection model that uses time series data augmentation and transformation to address the problems of data imbalance,temporal dependence,and robustness to noise.The method of data augmentation is set as the addition of noise.It involves adding Gaussian noise,with the noise level set to 0.002,to maximize the generalization performance of the model.In addition,we use the Markov Transition Field(MTF)method to effectively visualize the dynamic transitions of the data while converting the time series data into images.It enables the identification of patterns in time series data and assists in capturing the sequential dependencies of the data.For anomaly detection,the PatchCore model is applied to show excellent performance,and the detected anomaly areas are represented as heat maps.It allows for the detection of anomalies,and by applying an anomaly map to the original image,it is possible to capture the areas where anomalies occur.The performance evaluation shows that both F1-score and Accuracy are high when time series data is converted to images.Additionally,when processed as images rather than as time series data,there was a significant reduction in both the size of the data and the training time.The proposed method can provide an important springboard for research in the field of anomaly detection using time series data.Besides,it helps solve problems such as analyzing complex patterns in data lightweight.展开更多
Large-scale wireless sensor networks(WSNs)play a critical role in monitoring dangerous scenarios and responding to medical emergencies.However,the inherent instability and error-prone nature of wireless links present ...Large-scale wireless sensor networks(WSNs)play a critical role in monitoring dangerous scenarios and responding to medical emergencies.However,the inherent instability and error-prone nature of wireless links present significant challenges,necessitating efficient data collection and reliable transmission services.This paper addresses the limitations of existing data transmission and recovery protocols by proposing a systematic end-to-end design tailored for medical event-driven cluster-based large-scale WSNs.The primary goal is to enhance the reliability of data collection and transmission services,ensuring a comprehensive and practical approach.Our approach focuses on refining the hop-count-based routing scheme to achieve fairness in forwarding reliability.Additionally,it emphasizes reliable data collection within clusters and establishes robust data transmission over multiple hops.These systematic improvements are designed to optimize the overall performance of the WSN in real-world scenarios.Simulation results of the proposed protocol validate its exceptional performance compared to other prominent data transmission schemes.The evaluation spans varying sensor densities,wireless channel conditions,and packet transmission rates,showcasing the protocol’s superiority in ensuring reliable and efficient data transfer.Our systematic end-to-end design successfully addresses the challenges posed by the instability of wireless links in large-scaleWSNs.By prioritizing fairness,reliability,and efficiency,the proposed protocol demonstrates its efficacy in enhancing data collection and transmission services,thereby offering a valuable contribution to the field of medical event-drivenWSNs.展开更多
In the field of pediatric nursing, in the perinatal period, numerous ethical issues arise alongside the advancement of medical technology. However, sufficient education on bioethics is not provided in the pediatric nu...In the field of pediatric nursing, in the perinatal period, numerous ethical issues arise alongside the advancement of medical technology. However, sufficient education on bioethics is not provided in the pediatric nursing domain of basic nursing education. The purpose of this research is to examine the current status of bioethics education in the pediatric nursing domain of basic nursing education and explore the challenges perceived by the pediatric nursing faculty regarding bioethics education. The research method was a questionnaire survey on 100 randomly selected pediatric nursing faculty members from nursing universities across Japan. The results revealed that although bioethics issues were considered important, the emphasis remained primarily on addressing bioethics as part of nursing that respects children’s rights. Furthermore, respondents expressed difficulties regarding teaching methods and content related to bioethics.展开更多
Guest Editors Prof.Ling Tian Prof.Jian-Hua Tao University of Electronic Science and Technology of China Tsinghua University lingtian@uestc.edu.cn jhtao@tsinghua.edu.cn Dr.Bin Zhou National University of Defense Techno...Guest Editors Prof.Ling Tian Prof.Jian-Hua Tao University of Electronic Science and Technology of China Tsinghua University lingtian@uestc.edu.cn jhtao@tsinghua.edu.cn Dr.Bin Zhou National University of Defense Technology binzhou@nudt.edu.cn, Since the concept of “Big Data” was first introduced in Nature in 2008, it has been widely applied in fields, such as business, healthcare, national defense, education, transportation, and security. With the maturity of artificial intelligence technology, big data analysis techniques tailored to various fields have made significant progress, but still face many challenges in terms of data quality, algorithms, and computing power.展开更多
Capabilities to assimilate Geostationary Operational Environmental Satellite “R-series ”(GOES-R) Geostationary Lightning Mapper(GLM) flash extent density(FED) data within the operational Gridpoint Statistical Interp...Capabilities to assimilate Geostationary Operational Environmental Satellite “R-series ”(GOES-R) Geostationary Lightning Mapper(GLM) flash extent density(FED) data within the operational Gridpoint Statistical Interpolation ensemble Kalman filter(GSI-EnKF) framework were previously developed and tested with a mesoscale convective system(MCS) case. In this study, such capabilities are further developed to assimilate GOES GLM FED data within the GSI ensemble-variational(EnVar) hybrid data assimilation(DA) framework. The results of assimilating the GLM FED data using 3DVar, and pure En3DVar(PEn3DVar, using 100% ensemble covariance and no static covariance) are compared with those of EnKF/DfEnKF for a supercell storm case. The focus of this study is to validate the correctness and evaluate the performance of the new implementation rather than comparing the performance of FED DA among different DA schemes. Only the results of 3DVar and pEn3DVar are examined and compared with EnKF/DfEnKF. Assimilation of a single FED observation shows that the magnitude and horizontal extent of the analysis increments from PEn3DVar are generally larger than from EnKF, which is mainly caused by using different localization strategies in EnFK/DfEnKF and PEn3DVar as well as the integration limits of the graupel mass in the observation operator. Overall, the forecast performance of PEn3DVar is comparable to EnKF/DfEnKF, suggesting correct implementation.展开更多
In source detection in the Tianlai project,locating the interferometric fringe in visibility data accurately will influence downstream tasks drastically,such as physical parameter estimation and weak source exploratio...In source detection in the Tianlai project,locating the interferometric fringe in visibility data accurately will influence downstream tasks drastically,such as physical parameter estimation and weak source exploration.Considering that traditional locating methods are time-consuming and supervised methods require a great quantity of expensive labeled data,in this paper,we first investigate characteristics of interferometric fringes in the simulation and real scenario separately,and integrate an almost parameter-free unsupervised clustering method and seeding filling or eraser algorithm to propose a hierarchical plug and play method to improve location accuracy.Then,we apply our method to locate single and multiple sources’interferometric fringes in simulation data.Next,we apply our method to real data taken from the Tianlai radio telescope array.Finally,we compare with unsupervised methods that are state of the art.These results show that our method has robustness in different scenarios and can improve location measurement accuracy effectively.展开更多
With the development of Industry 4.0 and big data technology,the Industrial Internet of Things(IIoT)is hampered by inherent issues such as privacy,security,and fault tolerance,which pose certain challenges to the rapi...With the development of Industry 4.0 and big data technology,the Industrial Internet of Things(IIoT)is hampered by inherent issues such as privacy,security,and fault tolerance,which pose certain challenges to the rapid development of IIoT.Blockchain technology has immutability,decentralization,and autonomy,which can greatly improve the inherent defects of the IIoT.In the traditional blockchain,data is stored in a Merkle tree.As data continues to grow,the scale of proofs used to validate it grows,threatening the efficiency,security,and reliability of blockchain-based IIoT.Accordingly,this paper first analyzes the inefficiency of the traditional blockchain structure in verifying the integrity and correctness of data.To solve this problem,a new Vector Commitment(VC)structure,Partition Vector Commitment(PVC),is proposed by improving the traditional VC structure.Secondly,this paper uses PVC instead of the Merkle tree to store big data generated by IIoT.PVC can improve the efficiency of traditional VC in the process of commitment and opening.Finally,this paper uses PVC to build a blockchain-based IIoT data security storage mechanism and carries out a comparative analysis of experiments.This mechanism can greatly reduce communication loss and maximize the rational use of storage space,which is of great significance for maintaining the security and stability of blockchain-based IIoT.展开更多
The 6th generation mobile networks(6G)network is a kind of multi-network interconnection and multi-scenario coexistence network,where multiple network domains break the original fixed boundaries to form connections an...The 6th generation mobile networks(6G)network is a kind of multi-network interconnection and multi-scenario coexistence network,where multiple network domains break the original fixed boundaries to form connections and convergence.In this paper,with the optimization objective of maximizing network utility while ensuring flows performance-centric weighted fairness,this paper designs a reinforcement learning-based cloud-edge autonomous multi-domain data center network architecture that achieves single-domain autonomy and multi-domain collaboration.Due to the conflict between the utility of different flows,the bandwidth fairness allocation problem for various types of flows is formulated by considering different defined reward functions.Regarding the tradeoff between fairness and utility,this paper deals with the corresponding reward functions for the cases where the flows undergo abrupt changes and smooth changes in the flows.In addition,to accommodate the Quality of Service(QoS)requirements for multiple types of flows,this paper proposes a multi-domain autonomous routing algorithm called LSTM+MADDPG.Introducing a Long Short-Term Memory(LSTM)layer in the actor and critic networks,more information about temporal continuity is added,further enhancing the adaptive ability changes in the dynamic network environment.The LSTM+MADDPG algorithm is compared with the latest reinforcement learning algorithm by conducting experiments on real network topology and traffic traces,and the experimental results show that LSTM+MADDPG improves the delay convergence speed by 14.6%and delays the start moment of packet loss by 18.2%compared with other algorithms.展开更多
Traditional Io T systems suffer from high equipment management costs and difficulty in trustworthy data sharing caused by centralization.Blockchain provides a feasible research direction to solve these problems. The m...Traditional Io T systems suffer from high equipment management costs and difficulty in trustworthy data sharing caused by centralization.Blockchain provides a feasible research direction to solve these problems. The main challenge at this stage is to integrate the blockchain from the resourceconstrained Io T devices and ensure the data of Io T system is credible. We provide a general framework for intelligent Io T data acquisition and sharing in an untrusted environment based on the blockchain, where gateways become Oracles. A distributed Oracle network based on Byzantine Fault Tolerant algorithm is used to provide trusted data for the blockchain to make intelligent Io T data trustworthy. An aggregation contract is deployed to collect data from various Oracle and share the credible data to all on-chain users. We also propose a gateway data aggregation scheme based on the REST API event publishing/subscribing mechanism which uses SQL to achieve flexible data aggregation. The experimental results show that the proposed scheme can alleviate the problem of limited performance of Io T equipment, make data reliable, and meet the diverse data needs on the chain.展开更多
Data stream clustering is integral to contemporary big data applications.However,addressing the ongoing influx of data streams efficiently and accurately remains a primary challenge in current research.This paper aims...Data stream clustering is integral to contemporary big data applications.However,addressing the ongoing influx of data streams efficiently and accurately remains a primary challenge in current research.This paper aims to elevate the efficiency and precision of data stream clustering,leveraging the TEDA(Typicality and Eccentricity Data Analysis)algorithm as a foundation,we introduce improvements by integrating a nearest neighbor search algorithm to enhance both the efficiency and accuracy of the algorithm.The original TEDA algorithm,grounded in the concept of“Typicality and Eccentricity Data Analytics”,represents an evolving and recursive method that requires no prior knowledge.While the algorithm autonomously creates and merges clusters as new data arrives,its efficiency is significantly hindered by the need to traverse all existing clusters upon the arrival of further data.This work presents the NS-TEDA(Neighbor Search Based Typicality and Eccentricity Data Analysis)algorithm by incorporating a KD-Tree(K-Dimensional Tree)algorithm integrated with the Scapegoat Tree.Upon arrival,this ensures that new data points interact solely with clusters in very close proximity.This significantly enhances algorithm efficiency while preventing a single data point from joining too many clusters and mitigating the merging of clusters with high overlap to some extent.We apply the NS-TEDA algorithm to several well-known datasets,comparing its performance with other data stream clustering algorithms and the original TEDA algorithm.The results demonstrate that the proposed algorithm achieves higher accuracy,and its runtime exhibits almost linear dependence on the volume of data,making it more suitable for large-scale data stream analysis research.展开更多
Mg alloys possess an inherent plastic anisotropy owing to the selective activation of deformation mechanisms depending on the loading condition.This characteristic results in a diverse range of flow curves that vary w...Mg alloys possess an inherent plastic anisotropy owing to the selective activation of deformation mechanisms depending on the loading condition.This characteristic results in a diverse range of flow curves that vary with a deformation condition.This study proposes a novel approach for accurately predicting an anisotropic deformation behavior of wrought Mg alloys using machine learning(ML)with data augmentation.The developed model combines four key strategies from data science:learning the entire flow curves,generative adversarial networks(GAN),algorithm-driven hyperparameter tuning,and gated recurrent unit(GRU)architecture.The proposed model,namely GAN-aided GRU,was extensively evaluated for various predictive scenarios,such as interpolation,extrapolation,and a limited dataset size.The model exhibited significant predictability and improved generalizability for estimating the anisotropic compressive behavior of ZK60 Mg alloys under 11 annealing conditions and for three loading directions.The GAN-aided GRU results were superior to those of previous ML models and constitutive equations.The superior performance was attributed to hyperparameter optimization,GAN-based data augmentation,and the inherent predictivity of the GRU for extrapolation.As a first attempt to employ ML techniques other than artificial neural networks,this study proposes a novel perspective on predicting the anisotropic deformation behaviors of wrought Mg alloys.展开更多
Data is regarded as a valuable asset,and sharing data is a prerequisite for fully exploiting the value of data.However,the current medical data sharing scheme lacks a fair incentive mechanism,and the authenticity of d...Data is regarded as a valuable asset,and sharing data is a prerequisite for fully exploiting the value of data.However,the current medical data sharing scheme lacks a fair incentive mechanism,and the authenticity of data cannot be guaranteed,resulting in low enthusiasm of participants.A fair and trusted medical data trading scheme based on smart contracts is proposed,which aims to encourage participants to be honest and improve their enthusiasm for participation.The scheme uses zero-knowledge range proof for trusted verification,verifies the authenticity of the patient’s data and the specific attributes of the data before the transaction,and realizes privacy protection.At the same time,the game pricing strategy selects the best revenue strategy for all parties involved and realizes the fairness and incentive of the transaction price.The smart contract is used to complete the verification and game bargaining process,and the blockchain is used as a distributed ledger to record the medical data transaction process to prevent data tampering and transaction denial.Finally,by deploying smart contracts on the Ethereum test network and conducting experiments and theoretical calculations,it is proved that the transaction scheme achieves trusted verification and fair bargaining while ensuring privacy protection in a decentralized environment.The experimental results show that the model improves the credibility and fairness of medical data transactions,maximizes social benefits,encourages more patients and medical institutions to participate in the circulation of medical data,and more fully taps the potential value of medical data.展开更多
El Nino-Southern Oscillation(ENSO),the leading mode of global interannual variability,usually intensifies the Hadley Circulation(HC),and meanwhile constrains its meridional extension,leading to an equatorward movement...El Nino-Southern Oscillation(ENSO),the leading mode of global interannual variability,usually intensifies the Hadley Circulation(HC),and meanwhile constrains its meridional extension,leading to an equatorward movement of the jet system.Previous studies have investigated the response of HC to ENSO events using different reanalysis datasets and evaluated their capability in capturing the main features of ENSO-associated HC anomalies.However,these studies mainly focused on the global HC,represented by a zonal-mean mass stream function(MSF).Comparatively fewer studies have evaluated HC responses from a regional perspective,partly due to the prerequisite of the Stokes MSF,which prevents us from integrating a regional HC.In this study,we adopt a recently developed technique to construct the three-dimensional structure of HC and evaluate the capability of eight state-of-the-art reanalyses in reproducing the regional HC response to ENSO events.Results show that all eight reanalyses reproduce the spatial structure of HC responses well,with an intensified HC around the central-eastern Pacific but weakened circulations around the Indo-Pacific warm pool and tropical Atlantic.The spatial correlation coefficient of the three-dimensional HC anomalies among the different datasets is always larger than 0.93.However,these datasets may not capture the amplitudes of the HC responses well.This uncertainty is especially large for ENSO-associated equatorially asymmetric HC anomalies,with the maximum amplitude in Climate Forecast System Reanalysis(CFSR)being about 2.7 times the minimum value in the Twentieth Century Reanalysis(20CR).One should be careful when using reanalysis data to evaluate the intensity of ENSO-associated HC anomalies.展开更多
Effective data communication is a crucial aspect of the Social Internet of Things(SIoT)and continues to be a significant research focus.This paper proposes a data forwarding algorithm based on Multidimensional Social ...Effective data communication is a crucial aspect of the Social Internet of Things(SIoT)and continues to be a significant research focus.This paper proposes a data forwarding algorithm based on Multidimensional Social Relations(MSRR)in SIoT to solve this problem.The proposed algorithm separates message forwarding into intra-and cross-community forwarding by analyzing interest traits and social connections among nodes.Three new metrics are defined:the intensity of node social relationships,node activity,and community connectivity.Within the community,messages are sent by determining which node is most similar to the sender by weighing the strength of social connections and node activity.When a node performs cross-community forwarding,the message is forwarded to the most reasonable relay community by measuring the node activity and the connection between communities.The proposed algorithm was compared to three existing routing algorithms in simulation experiments.Results indicate that the proposed algorithmsubstantially improves message delivery efficiency while lessening network overhead and enhancing connectivity and coordination in the SIoT context.展开更多
基金supported by the Science and Technology Research and Development Plan of China State Railway Group Co.,Ltd.(L2023Z001).
文摘This study addressed the issues related to the collection and management of basic data for railway green performance. A railway green performance basic database has been constructed based on metadata and data exchange schemas. A data classification system has been established from the perspectives of businesses, processes,and entities. A BIM(Building Information Modelling) model data extraction scheme is proposed based on field similarity matching and a document content extraction scheme is proposed based on image recognition. A railway green performance basic data collection system has been developed, achieving efficient collection and integrated management of railway green performance basic data. This system can provide data support for applications such as railway carbon emissions accounting, green cost-benefit analysis, and evaluation of green design solutions.
文摘A book entitled BASIC DATA OF CHINA'S POPULATION edited by the Data User Service(DUS),a UNFPA-funded project implemented by the China Population Information and Research Centre(CPIRC),has recently been published by the China Population Publishing House.
基金Special Project for Renovation and Procurement of Donghua University,Ministry of Education,China (No. CG202002845)。
文摘In the era of big data, data application based on data governance has become an inevitable trend in the construction of smart campus in higher education. In this paper, a set of data governance system framework covering the whole life cycle of data suitable for higher education is proposed, and based on this, the ideas and methods of data governance are applied to the construction of data management system for the basic development status of faculties by combining the practice of data governance of Donghua University.It forms a closed-loop management of data in all aspects, such as collection, information feedback, and statistical analysis of the basic development status data of the college. While optimizing the management business of higher education, the system provides a scientific and reliable basis for precise decision-making and strategic development of higher education.
基金sponsored by the National Natural Science Foundation of China under grant number No. 62172353, No. 62302114, No. U20B2046 and No. 62172115Innovation Fund Program of the Engineering Research Center for Integration and Application of Digital Learning Technology of Ministry of Education No.1331007 and No. 1311022+1 种基金Natural Science Foundation of the Jiangsu Higher Education Institutions Grant No. 17KJB520044Six Talent Peaks Project in Jiangsu Province No.XYDXX-108
文摘With the rapid development of information technology,IoT devices play a huge role in physiological health data detection.The exponential growth of medical data requires us to reasonably allocate storage space for cloud servers and edge nodes.The storage capacity of edge nodes close to users is limited.We should store hotspot data in edge nodes as much as possible,so as to ensure response timeliness and access hit rate;However,the current scheme cannot guarantee that every sub-message in a complete data stored by the edge node meets the requirements of hot data;How to complete the detection and deletion of redundant data in edge nodes under the premise of protecting user privacy and data dynamic integrity has become a challenging problem.Our paper proposes a redundant data detection method that meets the privacy protection requirements.By scanning the cipher text,it is determined whether each sub-message of the data in the edge node meets the requirements of the hot data.It has the same effect as zero-knowledge proof,and it will not reveal the privacy of users.In addition,for redundant sub-data that does not meet the requirements of hot data,our paper proposes a redundant data deletion scheme that meets the dynamic integrity of the data.We use Content Extraction Signature(CES)to generate the remaining hot data signature after the redundant data is deleted.The feasibility of the scheme is proved through safety analysis and efficiency analysis.
基金This research was funded by the National Nature Sciences Foundation of China(Grant No.42250410321).
文摘Missing value is one of the main factors that cause dirty data.Without high-quality data,there will be no reliable analysis results and precise decision-making.Therefore,the data warehouse needs to integrate high-quality data consistently.In the power system,the electricity consumption data of some large users cannot be normally collected resulting in missing data,which affects the calculation of power supply and eventually leads to a large error in the daily power line loss rate.For the problem of missing electricity consumption data,this study proposes a group method of data handling(GMDH)based data interpolation method in distribution power networks and applies it in the analysis of actually collected electricity data.First,the dependent and independent variables are defined from the original data,and the upper and lower limits of missing values are determined according to prior knowledge or existing data information.All missing data are randomly interpolated within the upper and lower limits.Then,the GMDH network is established to obtain the optimal complexity model,which is used to predict the missing data to replace the last imputed electricity consumption data.At last,this process is implemented iteratively until the missing values do not change.Under a relatively small noise level(α=0.25),the proposed approach achieves a maximum error of no more than 0.605%.Experimental findings demonstrate the efficacy and feasibility of the proposed approach,which realizes the transformation from incomplete data to complete data.Also,this proposed data interpolation approach provides a strong basis for the electricity theft diagnosis and metering fault analysis of electricity enterprises.
基金This research was financially supported by the Ministry of Trade,Industry,and Energy(MOTIE),Korea,under the“Project for Research and Development with Middle Markets Enterprises and DNA(Data,Network,AI)Universities”(AI-based Safety Assessment and Management System for Concrete Structures)(ReferenceNumber P0024559)supervised by theKorea Institute for Advancement of Technology(KIAT).
文摘Time-series data provide important information in many fields,and their processing and analysis have been the focus of much research.However,detecting anomalies is very difficult due to data imbalance,temporal dependence,and noise.Therefore,methodologies for data augmentation and conversion of time series data into images for analysis have been studied.This paper proposes a fault detection model that uses time series data augmentation and transformation to address the problems of data imbalance,temporal dependence,and robustness to noise.The method of data augmentation is set as the addition of noise.It involves adding Gaussian noise,with the noise level set to 0.002,to maximize the generalization performance of the model.In addition,we use the Markov Transition Field(MTF)method to effectively visualize the dynamic transitions of the data while converting the time series data into images.It enables the identification of patterns in time series data and assists in capturing the sequential dependencies of the data.For anomaly detection,the PatchCore model is applied to show excellent performance,and the detected anomaly areas are represented as heat maps.It allows for the detection of anomalies,and by applying an anomaly map to the original image,it is possible to capture the areas where anomalies occur.The performance evaluation shows that both F1-score and Accuracy are high when time series data is converted to images.Additionally,when processed as images rather than as time series data,there was a significant reduction in both the size of the data and the training time.The proposed method can provide an important springboard for research in the field of anomaly detection using time series data.Besides,it helps solve problems such as analyzing complex patterns in data lightweight.
文摘Large-scale wireless sensor networks(WSNs)play a critical role in monitoring dangerous scenarios and responding to medical emergencies.However,the inherent instability and error-prone nature of wireless links present significant challenges,necessitating efficient data collection and reliable transmission services.This paper addresses the limitations of existing data transmission and recovery protocols by proposing a systematic end-to-end design tailored for medical event-driven cluster-based large-scale WSNs.The primary goal is to enhance the reliability of data collection and transmission services,ensuring a comprehensive and practical approach.Our approach focuses on refining the hop-count-based routing scheme to achieve fairness in forwarding reliability.Additionally,it emphasizes reliable data collection within clusters and establishes robust data transmission over multiple hops.These systematic improvements are designed to optimize the overall performance of the WSN in real-world scenarios.Simulation results of the proposed protocol validate its exceptional performance compared to other prominent data transmission schemes.The evaluation spans varying sensor densities,wireless channel conditions,and packet transmission rates,showcasing the protocol’s superiority in ensuring reliable and efficient data transfer.Our systematic end-to-end design successfully addresses the challenges posed by the instability of wireless links in large-scaleWSNs.By prioritizing fairness,reliability,and efficiency,the proposed protocol demonstrates its efficacy in enhancing data collection and transmission services,thereby offering a valuable contribution to the field of medical event-drivenWSNs.
文摘In the field of pediatric nursing, in the perinatal period, numerous ethical issues arise alongside the advancement of medical technology. However, sufficient education on bioethics is not provided in the pediatric nursing domain of basic nursing education. The purpose of this research is to examine the current status of bioethics education in the pediatric nursing domain of basic nursing education and explore the challenges perceived by the pediatric nursing faculty regarding bioethics education. The research method was a questionnaire survey on 100 randomly selected pediatric nursing faculty members from nursing universities across Japan. The results revealed that although bioethics issues were considered important, the emphasis remained primarily on addressing bioethics as part of nursing that respects children’s rights. Furthermore, respondents expressed difficulties regarding teaching methods and content related to bioethics.
文摘Guest Editors Prof.Ling Tian Prof.Jian-Hua Tao University of Electronic Science and Technology of China Tsinghua University lingtian@uestc.edu.cn jhtao@tsinghua.edu.cn Dr.Bin Zhou National University of Defense Technology binzhou@nudt.edu.cn, Since the concept of “Big Data” was first introduced in Nature in 2008, it has been widely applied in fields, such as business, healthcare, national defense, education, transportation, and security. With the maturity of artificial intelligence technology, big data analysis techniques tailored to various fields have made significant progress, but still face many challenges in terms of data quality, algorithms, and computing power.
基金supported by NOAA JTTI award via Grant #NA21OAR4590165, NOAA GOESR Program funding via Grant #NA16OAR4320115provided by NOAA/Office of Oceanic and Atmospheric Research under NOAA-University of Oklahoma Cooperative Agreement #NA11OAR4320072, U.S. Department of Commercesupported by the National Oceanic and Atmospheric Administration (NOAA) of the U.S. Department of Commerce via Grant #NA18NWS4680063。
文摘Capabilities to assimilate Geostationary Operational Environmental Satellite “R-series ”(GOES-R) Geostationary Lightning Mapper(GLM) flash extent density(FED) data within the operational Gridpoint Statistical Interpolation ensemble Kalman filter(GSI-EnKF) framework were previously developed and tested with a mesoscale convective system(MCS) case. In this study, such capabilities are further developed to assimilate GOES GLM FED data within the GSI ensemble-variational(EnVar) hybrid data assimilation(DA) framework. The results of assimilating the GLM FED data using 3DVar, and pure En3DVar(PEn3DVar, using 100% ensemble covariance and no static covariance) are compared with those of EnKF/DfEnKF for a supercell storm case. The focus of this study is to validate the correctness and evaluate the performance of the new implementation rather than comparing the performance of FED DA among different DA schemes. Only the results of 3DVar and pEn3DVar are examined and compared with EnKF/DfEnKF. Assimilation of a single FED observation shows that the magnitude and horizontal extent of the analysis increments from PEn3DVar are generally larger than from EnKF, which is mainly caused by using different localization strategies in EnFK/DfEnKF and PEn3DVar as well as the integration limits of the graupel mass in the observation operator. Overall, the forecast performance of PEn3DVar is comparable to EnKF/DfEnKF, suggesting correct implementation.
基金supported by the National Natural Science Foundation of China(NSFC,grant Nos.42172323 and 12371454)。
文摘In source detection in the Tianlai project,locating the interferometric fringe in visibility data accurately will influence downstream tasks drastically,such as physical parameter estimation and weak source exploration.Considering that traditional locating methods are time-consuming and supervised methods require a great quantity of expensive labeled data,in this paper,we first investigate characteristics of interferometric fringes in the simulation and real scenario separately,and integrate an almost parameter-free unsupervised clustering method and seeding filling or eraser algorithm to propose a hierarchical plug and play method to improve location accuracy.Then,we apply our method to locate single and multiple sources’interferometric fringes in simulation data.Next,we apply our method to real data taken from the Tianlai radio telescope array.Finally,we compare with unsupervised methods that are state of the art.These results show that our method has robustness in different scenarios and can improve location measurement accuracy effectively.
基金supported by China’s National Natural Science Foundation(Nos.62072249,62072056)This work is also funded by the National Science Foundation of Hunan Province(2020JJ2029).
文摘With the development of Industry 4.0 and big data technology,the Industrial Internet of Things(IIoT)is hampered by inherent issues such as privacy,security,and fault tolerance,which pose certain challenges to the rapid development of IIoT.Blockchain technology has immutability,decentralization,and autonomy,which can greatly improve the inherent defects of the IIoT.In the traditional blockchain,data is stored in a Merkle tree.As data continues to grow,the scale of proofs used to validate it grows,threatening the efficiency,security,and reliability of blockchain-based IIoT.Accordingly,this paper first analyzes the inefficiency of the traditional blockchain structure in verifying the integrity and correctness of data.To solve this problem,a new Vector Commitment(VC)structure,Partition Vector Commitment(PVC),is proposed by improving the traditional VC structure.Secondly,this paper uses PVC instead of the Merkle tree to store big data generated by IIoT.PVC can improve the efficiency of traditional VC in the process of commitment and opening.Finally,this paper uses PVC to build a blockchain-based IIoT data security storage mechanism and carries out a comparative analysis of experiments.This mechanism can greatly reduce communication loss and maximize the rational use of storage space,which is of great significance for maintaining the security and stability of blockchain-based IIoT.
文摘The 6th generation mobile networks(6G)network is a kind of multi-network interconnection and multi-scenario coexistence network,where multiple network domains break the original fixed boundaries to form connections and convergence.In this paper,with the optimization objective of maximizing network utility while ensuring flows performance-centric weighted fairness,this paper designs a reinforcement learning-based cloud-edge autonomous multi-domain data center network architecture that achieves single-domain autonomy and multi-domain collaboration.Due to the conflict between the utility of different flows,the bandwidth fairness allocation problem for various types of flows is formulated by considering different defined reward functions.Regarding the tradeoff between fairness and utility,this paper deals with the corresponding reward functions for the cases where the flows undergo abrupt changes and smooth changes in the flows.In addition,to accommodate the Quality of Service(QoS)requirements for multiple types of flows,this paper proposes a multi-domain autonomous routing algorithm called LSTM+MADDPG.Introducing a Long Short-Term Memory(LSTM)layer in the actor and critic networks,more information about temporal continuity is added,further enhancing the adaptive ability changes in the dynamic network environment.The LSTM+MADDPG algorithm is compared with the latest reinforcement learning algorithm by conducting experiments on real network topology and traffic traces,and the experimental results show that LSTM+MADDPG improves the delay convergence speed by 14.6%and delays the start moment of packet loss by 18.2%compared with other algorithms.
基金supported by the open research fund of Key Lab of Broadband Wireless Communication and Sensor Network Technology(Nanjing University of Posts and Telecommunications),Ministry of Education(No.JZNY202114)Postgraduate Research&Practice Innovation Program of Jiangsu Province(No.KYCX210734).
文摘Traditional Io T systems suffer from high equipment management costs and difficulty in trustworthy data sharing caused by centralization.Blockchain provides a feasible research direction to solve these problems. The main challenge at this stage is to integrate the blockchain from the resourceconstrained Io T devices and ensure the data of Io T system is credible. We provide a general framework for intelligent Io T data acquisition and sharing in an untrusted environment based on the blockchain, where gateways become Oracles. A distributed Oracle network based on Byzantine Fault Tolerant algorithm is used to provide trusted data for the blockchain to make intelligent Io T data trustworthy. An aggregation contract is deployed to collect data from various Oracle and share the credible data to all on-chain users. We also propose a gateway data aggregation scheme based on the REST API event publishing/subscribing mechanism which uses SQL to achieve flexible data aggregation. The experimental results show that the proposed scheme can alleviate the problem of limited performance of Io T equipment, make data reliable, and meet the diverse data needs on the chain.
基金This research was funded by the National Natural Science Foundation of China(Grant No.72001190)by the Ministry of Education’s Humanities and Social Science Project via the China Ministry of Education(Grant No.20YJC630173)by Zhejiang A&F University(Grant No.2022LFR062).
文摘Data stream clustering is integral to contemporary big data applications.However,addressing the ongoing influx of data streams efficiently and accurately remains a primary challenge in current research.This paper aims to elevate the efficiency and precision of data stream clustering,leveraging the TEDA(Typicality and Eccentricity Data Analysis)algorithm as a foundation,we introduce improvements by integrating a nearest neighbor search algorithm to enhance both the efficiency and accuracy of the algorithm.The original TEDA algorithm,grounded in the concept of“Typicality and Eccentricity Data Analytics”,represents an evolving and recursive method that requires no prior knowledge.While the algorithm autonomously creates and merges clusters as new data arrives,its efficiency is significantly hindered by the need to traverse all existing clusters upon the arrival of further data.This work presents the NS-TEDA(Neighbor Search Based Typicality and Eccentricity Data Analysis)algorithm by incorporating a KD-Tree(K-Dimensional Tree)algorithm integrated with the Scapegoat Tree.Upon arrival,this ensures that new data points interact solely with clusters in very close proximity.This significantly enhances algorithm efficiency while preventing a single data point from joining too many clusters and mitigating the merging of clusters with high overlap to some extent.We apply the NS-TEDA algorithm to several well-known datasets,comparing its performance with other data stream clustering algorithms and the original TEDA algorithm.The results demonstrate that the proposed algorithm achieves higher accuracy,and its runtime exhibits almost linear dependence on the volume of data,making it more suitable for large-scale data stream analysis research.
基金Korea Institute of Energy Technology Evaluation and Planning(KETEP)grant funded by the Korea government(Grant No.20214000000140,Graduate School of Convergence for Clean Energy Integrated Power Generation)Korea Basic Science Institute(National Research Facilities and Equipment Center)grant funded by the Ministry of Education(2021R1A6C101A449)the National Research Foundation of Korea grant funded by the Ministry of Science and ICT(2021R1A2C1095139),Republic of Korea。
文摘Mg alloys possess an inherent plastic anisotropy owing to the selective activation of deformation mechanisms depending on the loading condition.This characteristic results in a diverse range of flow curves that vary with a deformation condition.This study proposes a novel approach for accurately predicting an anisotropic deformation behavior of wrought Mg alloys using machine learning(ML)with data augmentation.The developed model combines four key strategies from data science:learning the entire flow curves,generative adversarial networks(GAN),algorithm-driven hyperparameter tuning,and gated recurrent unit(GRU)architecture.The proposed model,namely GAN-aided GRU,was extensively evaluated for various predictive scenarios,such as interpolation,extrapolation,and a limited dataset size.The model exhibited significant predictability and improved generalizability for estimating the anisotropic compressive behavior of ZK60 Mg alloys under 11 annealing conditions and for three loading directions.The GAN-aided GRU results were superior to those of previous ML models and constitutive equations.The superior performance was attributed to hyperparameter optimization,GAN-based data augmentation,and the inherent predictivity of the GRU for extrapolation.As a first attempt to employ ML techniques other than artificial neural networks,this study proposes a novel perspective on predicting the anisotropic deformation behaviors of wrought Mg alloys.
基金This research was funded by the Natural Science Foundation of Hebei Province(F2021201052)。
文摘Data is regarded as a valuable asset,and sharing data is a prerequisite for fully exploiting the value of data.However,the current medical data sharing scheme lacks a fair incentive mechanism,and the authenticity of data cannot be guaranteed,resulting in low enthusiasm of participants.A fair and trusted medical data trading scheme based on smart contracts is proposed,which aims to encourage participants to be honest and improve their enthusiasm for participation.The scheme uses zero-knowledge range proof for trusted verification,verifies the authenticity of the patient’s data and the specific attributes of the data before the transaction,and realizes privacy protection.At the same time,the game pricing strategy selects the best revenue strategy for all parties involved and realizes the fairness and incentive of the transaction price.The smart contract is used to complete the verification and game bargaining process,and the blockchain is used as a distributed ledger to record the medical data transaction process to prevent data tampering and transaction denial.Finally,by deploying smart contracts on the Ethereum test network and conducting experiments and theoretical calculations,it is proved that the transaction scheme achieves trusted verification and fair bargaining while ensuring privacy protection in a decentralized environment.The experimental results show that the model improves the credibility and fairness of medical data transactions,maximizes social benefits,encourages more patients and medical institutions to participate in the circulation of medical data,and more fully taps the potential value of medical data.
基金supported by the National Key Research and Development Program of China(Grant No.2018YFA0605703)the National Natural Science Foundation of China(Grant Nos.42176243,41976193 and 41676190)supported by National Natural Science Foundation of China(Grant No.41975079)。
文摘El Nino-Southern Oscillation(ENSO),the leading mode of global interannual variability,usually intensifies the Hadley Circulation(HC),and meanwhile constrains its meridional extension,leading to an equatorward movement of the jet system.Previous studies have investigated the response of HC to ENSO events using different reanalysis datasets and evaluated their capability in capturing the main features of ENSO-associated HC anomalies.However,these studies mainly focused on the global HC,represented by a zonal-mean mass stream function(MSF).Comparatively fewer studies have evaluated HC responses from a regional perspective,partly due to the prerequisite of the Stokes MSF,which prevents us from integrating a regional HC.In this study,we adopt a recently developed technique to construct the three-dimensional structure of HC and evaluate the capability of eight state-of-the-art reanalyses in reproducing the regional HC response to ENSO events.Results show that all eight reanalyses reproduce the spatial structure of HC responses well,with an intensified HC around the central-eastern Pacific but weakened circulations around the Indo-Pacific warm pool and tropical Atlantic.The spatial correlation coefficient of the three-dimensional HC anomalies among the different datasets is always larger than 0.93.However,these datasets may not capture the amplitudes of the HC responses well.This uncertainty is especially large for ENSO-associated equatorially asymmetric HC anomalies,with the maximum amplitude in Climate Forecast System Reanalysis(CFSR)being about 2.7 times the minimum value in the Twentieth Century Reanalysis(20CR).One should be careful when using reanalysis data to evaluate the intensity of ENSO-associated HC anomalies.
基金supported by the NationalNatural Science Foundation of China(61972136)the Hubei Provincial Department of Education Outstanding Youth Scientific Innovation Team Support Foundation(T201410,T2020017)+1 种基金the Natural Science Foundation of Xiaogan City(XGKJ2022010095,XGKJ2022010094)the Science and Technology Research Project of Education Department of Hubei Province(No.Q20222704).
文摘Effective data communication is a crucial aspect of the Social Internet of Things(SIoT)and continues to be a significant research focus.This paper proposes a data forwarding algorithm based on Multidimensional Social Relations(MSRR)in SIoT to solve this problem.The proposed algorithm separates message forwarding into intra-and cross-community forwarding by analyzing interest traits and social connections among nodes.Three new metrics are defined:the intensity of node social relationships,node activity,and community connectivity.Within the community,messages are sent by determining which node is most similar to the sender by weighing the strength of social connections and node activity.When a node performs cross-community forwarding,the message is forwarded to the most reasonable relay community by measuring the node activity and the connection between communities.The proposed algorithm was compared to three existing routing algorithms in simulation experiments.Results indicate that the proposed algorithmsubstantially improves message delivery efficiency while lessening network overhead and enhancing connectivity and coordination in the SIoT context.