Funding: This research is supported partially by the Papua New Guinea Science and Technology Secretariat (PNGSTS) under project grant No. 1-3962 PNGSTS.
Abstract: Big data has had significant impacts on our lives, economies, academia, and industries over the past decade. The current questions are: What is the future of big data? What era do we live in? This article addresses these questions by looking at meta as an operation and argues that we are living in the era of big intelligence, analyzing the progression from meta(big data) to big intelligence. More specifically, this article analyzes big data from an evolutionary perspective. The article overviews data, information, knowledge, and intelligence (DIKI) and reveals their relationships. After analyzing meta as an operation, this article explores Meta(DIKI) and its relationships. It reveals 5 Bigs, consisting of big data, big information, big knowledge, big intelligence, and big analytics. Applying meta to the 5 Bigs, this article infers that Big Data 4.0 = meta(big data) = big intelligence. This article analyzes how intelligent big analytics supports big intelligence. The proposed approach in this research might facilitate the research and development of big data, big data analytics, business intelligence, artificial intelligence, and data science.
Abstract: As an introductory course for the emerging major of big data management and application, "Introduction to Big Data" has not yet formed a curriculum standard and implementation plan that is widely accepted and used. To this end, we discuss some of our explorations and attempts in the construction and teaching of big data courses for the major of big data management and application, from the perspectives of course planning, course implementation, and course summary. Interviews with students and feedback from questionnaires show that students are highly satisfied with the teaching measures and programs currently adopted.
Abstract: There are challenges to the reliability evaluation of insulated gate bipolar transistors (IGBT) in electric vehicles, such as junction temperature measurement and limited computational and storage resources. In this paper, a junction temperature estimation approach based on a neural network, at no additional cost, is proposed, and the lifetime calculation for IGBT using electric vehicle big data is performed. The inputs are the direct current (DC) voltage, operating current, switching frequency, negative thermal coefficient thermistor (NTC) temperature, and IGBT lifetime; the output is the junction temperature (Tj). With the rainflow counting method, the classified irregular temperature cycles are brought into the life model to obtain the failure cycles. The fatigue accumulation method is then used to calculate the IGBT lifetime. To cope with the limited computational and storage resources of electric vehicle controllers, the IGBT lifetime calculation runs on a big data platform. The lifetime is then transmitted wirelessly to electric vehicles as an input for the neural network. Thus, the junction temperature of IGBT under long-term operating conditions can be accurately estimated. A test platform combining the motor controller with the vehicle big data server is built for the IGBT accelerated aging test. Subsequently, IGBT lifetime predictions are derived from the junction temperature estimates of the neural network method and the thermal network method. The experiment shows that the lifetime prediction based on a neural network with big data achieves higher accuracy than that of the thermal network, which improves the reliability evaluation of the system.
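As a concrete illustration of the lifetime pipeline described in this abstract (rainflow-counted temperature swings fed into a life model, then fatigue accumulation), the sketch below applies Miner's rule with a Coffin-Manson-style cycles-to-failure model. The coefficients and cycle counts are made-up placeholders, not values from the paper, and the rainflow counting itself is assumed to have already been done on the estimated junction-temperature trace.

```python
def cycles_to_failure(delta_tj, a=3.0e14, n=5.0):
    """Coffin-Manson-style life model: Nf = a * dTj^-n.
    a and n are illustrative placeholders, not the paper's fitted values."""
    return a * delta_tj ** (-n)

def accumulated_damage(rainflow_cycles):
    """Miner's rule: sum n_i / Nf_i over the rainflow-counted temperature
    swings; failure is predicted when the damage sum reaches 1."""
    return sum(count / cycles_to_failure(delta_tj)
               for delta_tj, count in rainflow_cycles)

# (delta_Tj in K, cycle count) pairs, as a rainflow counter would produce
# from a junction-temperature history -- synthetic example data.
cycles = [(40.0, 1200), (60.0, 300), (80.0, 25)]
damage = accumulated_damage(cycles)
print(f"accumulated damage: {damage:.3e}")
print(f"remaining life fraction: {1 - damage:.3f}")
```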
Abstract: Big data analytics has been widely adopted by large companies to achieve measurable benefits, including increased profitability, customer demand forecasting, cheaper product development, and improved stock control. Small and medium-sized enterprises (SMEs) are the backbone of the global economy, comprising 90% of businesses worldwide. However, only 10% of SMEs have adopted big data analytics, despite the competitive advantage they could achieve. Previous research has analysed the barriers to adoption, and a strategic framework has been developed to help SMEs adopt big data analytics. The framework was converted into a scoring tool, which has been applied to multiple case studies of SMEs in the UK. This paper documents the process of evaluating the framework based on structured feedback from a focus group composed of experienced practitioners. The results of the evaluation are presented with a discussion, and the paper concludes with recommendations to improve the scoring tool based on the proposed framework. The research demonstrates that this positioning tool helps SMEs achieve competitive advantage by increasing the application of business intelligence and big data analytics.
Abstract: As big data becomes an apparent challenge to handle when building a business intelligence (BI) system, there is a motivation to address this challenge in higher education institutions (HEIs). Monitoring quality in HEIs involves handling huge amounts of data coming from different sources. This paper reviews big data and analyses cases from the literature regarding quality assurance (QA) in HEIs. It also outlines a framework that can address the big data challenge in HEIs by handling QA monitoring with BI dashboards, and a prototype dashboard is presented. The dashboard was developed as a utilisation tool to monitor QA in HEIs and provide visual representations of big data. The prototype enables stakeholders to monitor compliance with QA standards while addressing the big data challenge associated with the substantial volume of data managed by HEIs' QA systems. This paper also outlines how the developed system integrates big data from social media into the monitoring dashboard.
Abstract: The role of the Big Five personality traits in relation to various health outcomes has been extensively studied. The impact of the "Big Five" on physical health is here explored for older Europeans, with a focus on examining age-group differences. The study sample included 378,500 respondents derived from the seventh data wave of the Survey of Health, Ageing and Retirement in Europe (SHARE). The physical health status of older Europeans was estimated by constructing an index considering the combined effect of well-established health indicators, such as the number of chronic diseases, mobility limitations, limitations with basic and instrumental activities of daily living, and self-perceived health. This index was used for an overall physical health assessment, for which the higher the score for an individual, the worse the health level. Then, through a dichotomization process applied to the retrieved Principal Component Analysis scores, a two-group discrimination (good or bad health status) of SHARE participants was obtained as regards their physical health condition, allowing for further constructing logistic regression models to assess the predictive significance of the "Big Five" and their protective role for physical health. Results showed that neuroticism was the most significant predictor of physical health for all age groups under consideration, while extraversion, agreeableness, and openness were not found to significantly affect the self-reported physical health levels of midlife adults aged 50 to 64. Older adults aged 65 to 79 were more prone to openness, whereas the oldest-old individuals aged 80 to 105 were mainly affected by openness and conscientiousness.
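The two-stage design described above (a PCA-based health index, dichotomised into good/bad status, then logistic regression on the Big Five) can be sketched with scikit-learn as below. The synthetic data, the column interpretations, and the median split are illustrative assumptions; the SHARE variables and the paper's exact dichotomisation rule will differ.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins for SHARE health indicators and Big Five trait scores.
n = 1000
health_indicators = rng.normal(size=(n, 5))  # e.g. chronic diseases, mobility, ADL, IADL, self-rated health
big_five = rng.normal(size=(n, 5))           # neuroticism, extraversion, openness, agreeableness, conscientiousness

# Stage 1: first principal component as the overall physical-health index
# (in the paper, higher scores mean worse health; orientation is arbitrary here).
index = PCA(n_components=1).fit_transform(health_indicators).ravel()

# Stage 2: dichotomise into good (0) vs bad (1) health -- a median split is
# assumed for illustration -- then regress the status on the Big Five.
bad_health = (index > np.median(index)).astype(int)
model = LogisticRegression().fit(big_five, bad_health)
print("log-odds per trait:", dict(zip(
    ["neuro", "extra", "open", "agree", "consc"], model.coef_[0].round(3))))
```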
Abstract: Analyzing big data, especially medical data, helps to provide good health care to patients and to confront the risks of death. The COVID-19 pandemic has had a significant impact on public health worldwide, emphasizing the need for effective risk prediction models. Machine learning (ML) techniques have shown promise in analyzing complex data patterns and predicting disease outcomes. The accuracy of these techniques is greatly affected by their parameter settings, so hyperparameter optimization plays a crucial role in improving model performance. In this work, the Particle Swarm Optimization (PSO) algorithm was used to search the hyperparameter space effectively and improve the predictive power of the machine learning models by identifying the hyperparameters that provide the highest accuracy. A dataset with a variety of clinical and epidemiological characteristics linked to COVID-19 cases was used in this study. Various machine learning models, including Random Forests, Decision Trees, Support Vector Machines, and Neural Networks, were utilized to capture the complex relationships present in the data. To evaluate the predictive performance of the models, the accuracy metric was employed. The experimental findings showed that the suggested method of estimating COVID-19 risk is effective: compared with baseline models, the optimized machine learning models performed better and produced better results.
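To make the PSO-over-hyperparameters idea concrete, the sketch below runs a small swarm over two Random Forest hyperparameters, using cross-validated accuracy as the fitness, as the abstract describes. The dataset, the search ranges, and the swarm constants are all illustrative assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the COVID-19 clinical dataset used in the paper.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Search space: (n_estimators, max_depth), treated as continuous by the
# swarm and rounded when the model is built. Bounds are assumptions.
lo, hi = np.array([10.0, 2.0]), np.array([200.0, 20.0])

def fitness(p):
    """Cross-validated accuracy for one particle position."""
    clf = RandomForestClassifier(n_estimators=int(round(p[0])),
                                 max_depth=int(round(p[1])), random_state=0)
    return cross_val_score(clf, X, y, cv=3, scoring="accuracy").mean()

rng = np.random.default_rng(0)
n_particles, n_iters, w, c1, c2 = 8, 10, 0.7, 1.5, 1.5
pos = rng.uniform(lo, hi, size=(n_particles, 2))
vel = np.zeros_like(pos)
pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmax()]

for _ in range(n_iters):
    r1, r2 = rng.random((n_particles, 2)), rng.random((n_particles, 2))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, lo, hi)        # keep particles inside the bounds
    fit = np.array([fitness(p) for p in pos])
    improved = fit > pbest_fit
    pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
    gbest = pbest[pbest_fit.argmax()]

print("best (n_estimators, max_depth):", gbest.round(1),
      "accuracy:", pbest_fit.max().round(3))
```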
Abstract: This article discusses the current status and development strategies of computer science and technology in the context of big data. First, it explains the relationship between big data and computer science and technology, focusing on the current application status of computer science and technology in big data, including data storage, data processing, and data analysis. It then proposes development strategies for big data processing. Computer science and technology play a vital role in big data processing by providing strong technical support.
Funding: Supported in part by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2019R1A2C1006159 and NRF-2021R1A6A1A03039493), and by the 2021 Yeungnam University Research Grant.
Abstract: Networks are fundamental to our modern world, and they appear throughout science and society. Access to massive amounts of data presents a unique opportunity to the research community. As networks grow in size, their complexity increases, and our ability to analyze them using the current state of the art is at severe risk of failing to keep pace. Therefore, this paper initiates a discussion on graph signal processing for large-scale data analysis. We first provide a comprehensive overview of core ideas in graph signal processing (GSP) and their connection to conventional digital signal processing (DSP). We then summarize recent developments in basic GSP tools, including methods for graph filtering, graph learning, graph signals, the graph Fourier transform (GFT), spectra, and graph frequencies. Graph filtering is a basic task that allows for isolating the contribution of individual frequencies and therefore enables the removal of noise. We then consider a graph filter as a model that helps to extend the application of GSP methods to large datasets. To show the suitability and effectiveness of the approach, we first created a noisy graph signal and then applied the filter to it. After several rounds of simulation, we see that the filtered signal appears smoother and is closer to the original noise-free distance-based signal. Through this example application, we demonstrate that graph filtering is efficient for big data analytics.
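The denoising experiment sketched in this abstract (a noisy graph signal passed through a low-pass graph filter defined via the GFT) can be reproduced in a few lines of NumPy. The random geometric graph, signal model, noise level, and cutoff below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random geometric graph: connect points closer than a threshold, and use
# the node coordinates to define a smooth "distance-based" ground truth.
n = 100
xy = rng.random((n, 2))
d = np.linalg.norm(xy[:, None] - xy[None, :], axis=-1)
A = ((d < 0.25) & (d > 0)).astype(float)      # adjacency matrix
L = np.diag(A.sum(1)) - A                      # combinatorial Laplacian

# Graph Fourier transform: Laplacian eigenvectors are the graph frequencies
# (eigh returns them ordered from lowest to highest frequency).
eigvals, U = np.linalg.eigh(L)

signal = np.sin(2 * np.pi * xy[:, 0])          # smooth ground-truth signal
noisy = signal + 0.3 * rng.normal(size=n)

# Ideal low-pass graph filter: keep only the k lowest graph frequencies.
k = 15
h = np.zeros(n)
h[:k] = 1.0
filtered = U @ (h * (U.T @ noisy))

print("noisy MSE:   ", np.mean((noisy - signal) ** 2).round(4))
print("filtered MSE:", np.mean((filtered - signal) ** 2).round(4))
```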
Funding: Funding for this study was received from Taif University, Taif, Saudi Arabia, under Grant No. TURSP-2020/150.
Abstract: In recent years, it has been observed that the disclosure of information increases the risk of terrorism. Without restricting the accessibility of information, providing security is difficult, so there is a pressing need to fill the gap between the security and the accessibility of information. In fact, security tools should be usable for improving both the security and the accessibility of information. Though security and accessibility do not directly influence each other, some of their factors do so indirectly. Attributes play an important role in bridging the gap between security and accessibility. In this paper, we identify the key attributes of accessibility and security that impact each other directly and indirectly, such as confidentiality, integrity, availability, and severity. The significance of each attribute, on the basis of its obtained weight, is important for its effect on security during the big data security life cycle. To evaluate the proposed work, the researchers utilised the Fuzzy Analytic Hierarchy Process (Fuzzy AHP). The findings show that Fuzzy AHP is a very accurate mechanism for determining the best security solution in a real-time healthcare context. The study also looks at the rapidly evolving security technologies in healthcare that could help improve healthcare services, and at the future prospects of this area.
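The attribute-weighting step behind such a Fuzzy AHP analysis can be sketched as below, using Buckley's geometric-mean method on a triangular-fuzzy pairwise comparison matrix. The three attributes shown and every judgment in the matrix are illustrative placeholders, not the paper's comparisons.

```python
import numpy as np

# Triangular fuzzy pairwise comparisons (l, m, u) over three attributes:
# confidentiality, integrity, availability. Judgments are made up for
# demonstration and satisfy the usual reciprocity condition.
F = np.array([
    [(1, 1, 1),       (2, 3, 4),       (1, 2, 3)],
    [(1/4, 1/3, 1/2), (1, 1, 1),       (1, 2, 3)],
    [(1/3, 1/2, 1),   (1/3, 1/2, 1),   (1, 1, 1)],
])  # shape (3, 3, 3): rows x columns x (l, m, u)

# Buckley's method: fuzzy geometric mean per row, then centroid
# defuzzification and normalisation to crisp attribute weights.
row_geo = F.prod(axis=1) ** (1 / F.shape[0])   # fuzzy geometric means
total = row_geo.sum(axis=0)                    # fuzzy column sum (L, M, U)
fuzzy_w = row_geo / total[::-1]                # (l/U, m/M, u/L) per row
crisp = fuzzy_w.mean(axis=1)                   # centroid of each triangle
weights = crisp / crisp.sum()

for name, w in zip(["confidentiality", "integrity", "availability"], weights):
    print(f"{name}: {w:.3f}")
```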
Abstract: A vast amount of data (known as big data) may now be collected and stored from a variety of data sources, including event logs, the internet, smartphones, databases, sensors, cloud computing, and Internet of Things (IoT) devices. The term "big data security" refers to all the safeguards and instruments used to protect both the data and the analytics processes against intrusions, theft, and other hostile actions that could endanger or adversely influence them. Beyond being a high-value and desirable target, big data poses particular protection difficulties. Big data security does not fundamentally differ from conventional data security: its issues stem from extraneous distinctions rather than fundamental ones. This study meticulously outlines the numerous security difficulties big data analytics now faces and encourages additional joint research on reducing big data security challenges utilizing the Web Ontology Language (OWL). Although we focus on the security challenges of big data in this paper, we also briefly cover the broader challenges of big data. The proposed OWL-based classification of big data security, built with the Protégé software, has 32 classes and 45 subclasses.
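A small ontology class hierarchy of the kind the paper builds in Protégé can also be expressed programmatically; the sketch below uses the owlready2 library as an assumed tool. The class names are illustrative stand-ins; the paper's actual ontology comprises 32 classes and 45 subclasses.

```python
from owlready2 import get_ontology, Thing

# Minimal sketch of an OWL class hierarchy for big data security.
# The IRI and all class names below are hypothetical examples.
onto = get_ontology("http://example.org/bigdata-security.owl")

with onto:
    class SecurityChallenge(Thing): pass
    class DataPrivacy(SecurityChallenge): pass
    class AccessControl(SecurityChallenge): pass
    class InfrastructureSecurity(SecurityChallenge): pass
    class DistributedComputationSecurity(InfrastructureSecurity): pass

onto.save(file="bigdata-security.owl", format="rdfxml")
print(list(onto.classes()))
```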
Funding: This work was partially supported by the Natural Science Foundation of Beijing Municipality (No. 4222038), by the Open Research Project of the State Key Laboratory of Media Convergence and Communication (Communication University of China), by the National Key R&D Program of China (No. 2021YFF0307600), and by the Fundamental Research Funds for the Central Universities.
Abstract: In the analysis of big data, deep learning is a crucial technique. Big data analysis tasks are typically carried out on the cloud, since it offers strong computing capabilities and storage. Nevertheless, there is a contradiction between the open nature of the cloud and the demand that data owners maintain their privacy. To use cloud resources for privacy-preserving data training, a viable method must be found. A privacy-preserving deep learning model (PPDLM) is suggested in this research to address this issue. To preserve data privacy, we first encrypted the data using a homomorphic encryption (HE) approach. Moreover, the deep learning algorithm's activation function, the sigmoid function, is approximated using the least-squares method to handle the non-addition and non-multiplication operations that homomorphic encryption does not allow. Finally, experimental results show that PPDLM has a significant effect on the protection of data privacy. Compared with a non-privacy-preserving deep learning model (NPPDLM), PPDLM has higher computational efficiency.
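The key trick here is that HE schemes evaluate only additions and multiplications, so the sigmoid must be replaced by a polynomial. A minimal sketch of a least-squares polynomial fit is shown below; the degree and fitting interval are assumptions for illustration, not the paper's choices.

```python
import numpy as np

# HE supports only additions and multiplications, so the sigmoid activation
# is replaced by a low-degree polynomial fitted by least squares.
# Degree 3 and the interval [-8, 8] are illustrative assumptions.
x = np.linspace(-8, 8, 400)
sigmoid = 1 / (1 + np.exp(-x))

coeffs = np.polyfit(x, sigmoid, deg=3)   # least-squares polynomial fit
poly = np.polyval(coeffs, x)

print("coefficients (x^3..x^0):", coeffs.round(5))
print("max abs error on [-8, 8]:", np.abs(poly - sigmoid).max().round(4))
```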
Funding: Financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brazil (CAPES), Finance Code 001, and by the Spanish Ministry of Science and Innovation under grant PID2019-105381 GA-100 (iScience).
Abstract: Big data is a concept that deals with large or complex data sets by using data analysis tools (e.g., data mining, machine learning) to systematically analyze information extracted from several sources. Big data has attracted wide attention from academia, for example, in supporting patients and health professionals by improving the accuracy of decision-making, diagnosis, and disease prediction. This research aimed to perform a Bibliometric Performance and Network Analysis (BPNA), supported by a Scoping Review (SR), to depict the strategic themes, thematic evolution structure, and main challenges and opportunities related to the concept of big data applied in the healthcare sector. With this goal in mind, 4857 documents from the Web of Science covering the period between 2009 and June 2020 were analyzed with the support of the SciMAT software. The bibliometric performance showed the number of publications and citations over time, scientific productivity, and the geographic distribution of publications and research fields. The strategic diagram yielded 20 clusters and their relative importance in terms of centrality and density. The thematic evolution structure presented the most important themes and how they changed over time. Lastly, we present the main challenges and future opportunities of big data in healthcare.
Funding: National Key R&D Program of China (Grant No. 2018YFB1701701); Sailing Talent Program; Guangdong Provincial Science and Technologies Program of China (Grant No. 2017B090922008); Special Grand Grant from the Tianjin City Government of China.
Abstract: Big data on product sales are an emerging resource for supporting modular product design to meet customers' diversified requirements for product specification combinations. To better facilitate decision-making in modular product design, correlations among specifications and components, originating from customers' conscious and subconscious preferences, can be investigated using big data on product sales. This study proposes a framework and associated methods for supporting modular product design decisions based on correlation analysis of product specifications and components using big sales data. The correlations of the product specifications are determined by analyzing the collected product sales data. By building the relations between the product components and specifications, a matrix for measuring the correlation among product components is formed for component clustering. Six rules for supporting the decision-making of modular product design are proposed based on frequency analysis of the specification values per component cluster. A case study of electric vehicles illustrates the application of the proposed method.
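The component-clustering step can be sketched by converting a component-correlation matrix into distances and grouping components hierarchically: highly correlated components land in the same candidate module. The component names and correlation values below are synthetic illustrations, not derived from the paper's sales data.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Illustrative component-correlation matrix (in the paper, derived from
# sales-data correlations among specifications); values here are synthetic.
components = ["battery", "motor", "inverter", "charger", "chassis"]
C = np.array([
    [1.0, 0.8, 0.7, 0.6, 0.1],
    [0.8, 1.0, 0.9, 0.5, 0.2],
    [0.7, 0.9, 1.0, 0.4, 0.1],
    [0.6, 0.5, 0.4, 1.0, 0.3],
    [0.1, 0.2, 0.1, 0.3, 1.0],
])

# Turn correlation into a distance and cluster hierarchically.
D = 1 - C
condensed = D[np.triu_indices_from(D, k=1)]   # condensed distance vector
Z = linkage(condensed, method="average")
labels = fcluster(Z, t=0.5, criterion="distance")

for comp, lab in zip(components, labels):
    print(f"{comp}: module {lab}")
```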
Funding: This work was funded by the Graduate Scientific Research School at Yarmouk University under Grant Number 82/2020.
Abstract: There are quintillions of data points on deoxyribonucleic acid (DNA) and proteins in publicly accessible data banks, and that number is expanding at an exponential rate. Many scientific fields, such as bioinformatics and drug discovery, rely on such data; nevertheless, gathering and extracting data from these resources is a tough undertaking. These data should go through several processes, including mining, data processing, analysis, and classification. This study proposes software that extracts data from big data repositories automatically, with the particular ability to repeat data extraction phases as many times as needed without human intervention. The software simulates the extraction of data from web-based (point-and-click) resources or graphical user interfaces that cannot be accessed using command-line tools. The software was evaluated by creating a novel database of 34 parameters for 1360 physicochemical properties of antimicrobial peptide (AMP) sequences (46,240 hits) from various MARVIN software panels, which can later be utilized to develop novel AMPs. Furthermore, for machine learning research, the program was validated by extracting 10,000 protein tertiary structures from the Protein Data Bank. As a result, data collection from the web will become faster and less expensive, with no need for manual data extraction. The software is critical as a first step in preparing large datasets for subsequent stages of analysis, such as those using machine- and deep-learning applications.
Funding: Supported by the National Key R&D Program of China (No. 2021YFF0901002), the National Natural Science Foundation of China (No. 61802291), the Fundamental Research Funds for the Provincial Universities of Zhejiang (GK199900299012-025), and the Fundamental Research Funds for the Central Universities (No. JB210311).
Abstract: A large number of Web APIs have been released as services in mobile communications, but the service provided by a single Web API is usually limited. To enrich the services in mobile communications, developers have combined Web APIs to develop new services known as mashups. The emergence of mashups greatly increases the number of services in mobile communications, especially in mobile networks and the Internet of Things (IoT), and has encouraged companies and individuals to develop even more mashups, leading to a dramatic increase in their number. Such a trend brings with it big data, such as the massive text data from the mashups themselves and continually generated usage data. Thus, how to determine the most suitable mashups from big data has become a challenging problem. In this paper, we propose a mashup recommendation framework for big data in mobile networks and the IoT. The proposed framework is driven by machine learning techniques, including neural embedding, clustering, and matrix factorization. We employ neural embedding to learn distributed representations of mashups and propose to use cluster analysis to learn the relationships among the mashups. We also develop a novel Joint Matrix Factorization (JMF) model to complete the mashup recommendation task, for which we design a new objective function and an optimization algorithm. We then crawl a real-world large mashup dataset and perform experiments. The experimental results demonstrate that our framework achieves high accuracy in mashup recommendation and performs better than all compared baselines.
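The JMF model's joint objective is specific to the paper, but its matrix-factorization core, completing a sparse usage matrix from low-rank factors, can be sketched with plain gradient descent. The usage matrix, rank, and learning constants below are illustrative assumptions; the paper's model additionally couples in the embedding and clustering terms.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sparse user x mashup usage matrix (synthetic); 0 marks unobserved cells.
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)
mask = R > 0

# Plain matrix factorization R ~ P @ Q.T trained by gradient descent --
# the core of JMF, minus its joint terms for mashup text and clusters.
k, lr, reg, epochs = 2, 0.01, 0.02, 2000
P = 0.1 * rng.standard_normal((R.shape[0], k))
Q = 0.1 * rng.standard_normal((R.shape[1], k))

for _ in range(epochs):
    E = mask * (R - P @ Q.T)          # error on observed entries only
    P += lr * (E @ Q - reg * P)       # gradient step with L2 regularization
    Q += lr * (E.T @ P - reg * Q)

pred = P @ Q.T
print("predicted scores for unobserved cells:")
print(np.where(mask, np.nan, pred).round(2))
```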
Funding: Financially supported by the General Program of the National Natural Science Foundation of China (No. 52274326), the Fundamental Research Funds for the Central Universities (Nos. 2125018 and 2225008), and the China Baowu Low Carbon Metallurgy Innovation Foundation (BWLCF202109).
Abstract: Blast furnace (BF) ironmaking is the most typical "black box" process, and its complexity and uncertainty bring great challenges for furnace condition judgment and BF operation. Rich data resources for BF ironmaking are available, and the rapid development of data science and intelligent technology will provide an effective means to solve the uncertainty problem in the BF ironmaking process. This work focused on the application of artificial intelligence technology in BF ironmaking. Current intelligent BF ironmaking technology was summarized and analyzed from five aspects: BF data management, the analyses of time delay and correlation, the prediction of BF key variables, the evaluation of BF status, and the multi-objective intelligent optimization of BF operations. Solutions and suggestions were offered for the problems in current progress, and outlooks for future prospects and technological breakthroughs were added. To effectively improve BF data quality, we comprehensively considered the data problems and the characteristics of the algorithms and selected the data processing method scientifically. For analyzing important BF characteristics, the effect of the time delay was eliminated to ensure an accurate logical relationship between the BF parameters and economic indicators. As for BF parameter prediction and BF status evaluation, a BF intelligence model that integrates data information and the process mechanism was built to achieve accurate prediction of BF key indexes and scientific evaluation of BF status. During the optimization of BF parameters, low risk, low cost, and high return were used as the optimization criteria, and while pursuing the optimization effect, feasibility and site operation cost were considered comprehensively. This work will help increase process operators' overall awareness and understanding of intelligent BF technology. Additionally, combining big data technology with the process will improve the practicality of data models in actual production and promote the application of intelligent technology in BF ironmaking.
Abstract: Since the concept of "Big Data" was first introduced in Nature in 2008, it has been widely applied in fields such as business, healthcare, national defense, education, transportation, and security. With the maturity of artificial intelligence technology, big data analysis techniques tailored to various fields have made significant progress, but they still face many challenges in terms of data quality, algorithms, and computing power.