43,243 articles found
Exploration and Practice of Big Data Introductory Courses for Big Data Management and Application Majors
1
Authors: Tinghui Huang, Junchao Dong, Liang Min. Journal of Contemporary Educational Research, 2024, No. 2, pp. 131-137 (7 pages)
As an introductory course for the emerging major of big data management and application, “Introduction to Big Data” has not yet formed a curriculum standard and implementation plan that is widely accepted and used. To this end, we discuss some of our explorations and attempts in the construction and teaching of big data courses for the big data management and application major from the perspectives of course planning, course implementation, and course summary. Interviews with students and questionnaire feedback indicate that students are highly satisfied with some of the teaching measures and programs currently adopted.
Keywords: big data management and application; “Introduction to Big Data”; teaching reform; curriculum exploration
Big data challenge for monitoring quality in higher education institutions using business intelligence dashboards
2
Authors: Ali Sorour, Anthony S. Atkins. Journal of Electronic Science and Technology [EI, CAS, CSCD], 2024, No. 1, pp. 25-41 (17 pages)
As big data becomes an apparent challenge when building a business intelligence (BI) system, there is motivation to address this issue in higher education institutions (HEIs). Monitoring quality in HEIs involves handling huge amounts of data coming from different sources. This paper reviews big data and analyses cases from the literature regarding quality assurance (QA) in HEIs. It also outlines a framework that can address the big data challenge in HEIs to handle QA monitoring using BI dashboards, and a prototype dashboard is presented. The dashboard was developed using a utilisation tool to monitor QA in HEIs to provide visual representations of big data. The prototype dashboard enables stakeholders to monitor compliance with QA standards while addressing the big data challenge associated with the substantial volume of data managed by HEIs’ QA systems. This paper also outlines how the developed system integrates big data from social media into the monitoring dashboard.
Keywords: big data; business intelligence (BI); dashboards; higher education (HE); quality assurance (QA); social media
Evaluation of a software positioning tool to support SMEs in adoption of big data analytics
3
Authors: Matthew Willetts, Anthony S. Atkins. Journal of Electronic Science and Technology [EI, CAS, CSCD], 2024, No. 1, pp. 13-24 (12 pages)
Big data analytics has been widely adopted by large companies to achieve measurable benefits including increased profitability, customer demand forecasting, cheaper development of products, and improved stock control. Small and medium sized enterprises (SMEs) are the backbone of the global economy, comprising 90% of businesses worldwide. However, only 10% of SMEs have adopted big data analytics despite the competitive advantage they could achieve. Previous research has analysed the barriers to adoption, and a strategic framework has been developed to help SMEs adopt big data analytics. The framework was converted into a scoring tool which has been applied to multiple case studies of SMEs in the UK. This paper documents the process of evaluating the framework based on structured feedback from a focus group composed of experienced practitioners. The results of the evaluation are presented and discussed, and the paper concludes with recommendations to improve the scoring tool based on the proposed framework. The research demonstrates that this positioning tool helps SMEs achieve competitive advantages by increasing the application of business intelligence and big data analytics.
Keywords: big data analytics; evaluation; small and medium sized enterprises (SMEs); strategic framework
Big Data 4.0: The Era of Big Intelligence
4
Author: Zhaohao Sun. Journal of Computer Science Research, 2024, No. 1, pp. 1-15 (15 pages)
Big data has had significant impacts on our lives, economies, academia and industries over the past decade. The current questions are: What is the future of big data? What era do we live in? This article addresses these questions by looking at meta as an operation and argues that we are living in the era of big intelligence, through an analysis from meta(big data) to big intelligence. More specifically, this article analyzes big data from an evolutionary perspective. The article overviews data, information, knowledge, and intelligence (DIKI) and reveals their relationships. After analyzing meta as an operation, this article explores meta(DIKI) and its relationships. It reveals 5 Bigs consisting of big data, big information, big knowledge, big intelligence and big analytics. Applying meta to the 5 Bigs, this article infers that Big Data 4.0 = meta(big data) = big intelligence. This article analyzes how intelligent big analytics supports big intelligence. The proposed approach might facilitate the research and development of big data, big data analytics, business intelligence, artificial intelligence, and data science.
Keywords: big data 4.0; big analytics; business intelligence; artificial intelligence; data science
A Review of the Status and Development Strategies of Computer Science and Technology Under the Background of Big Data
5
Author: Junlin Zhang. Journal of Electronic Research and Application, 2024, No. 2, pp. 49-53 (5 pages)
This article discusses the current status and development strategies of computer science and technology in the context of big data. Firstly, it explains the relationship between big data and computer science and technology, focusing on analyzing the current application status of computer science and technology in big data, including data storage, data processing, and data analysis. Then, it proposes development strategies for big data processing. Computer science and technology play a vital role in big data processing by providing strong technical support.
Keywords: big data; computer science and technology; data storage; data processing; data visualization
Big Data Analytics Using Graph Signal Processing
6
Authors: Farhan Amin, Omar M. Barukab, Gyu Sang Choi. Computers, Materials & Continua [SCIE, EI], 2023, No. 1, pp. 489-502 (14 pages)
Networks are fundamental to our modern world and appear throughout science and society. Access to a massive amount of data presents a unique opportunity to the research community. As networks grow in size, their complexity increases, and our ability to analyze them using the current state of the art is at severe risk of failing to keep pace. Therefore, this paper initiates a discussion on graph signal processing for large-scale data analysis. We first provide a comprehensive overview of core ideas in graph signal processing (GSP) and their connection to conventional digital signal processing (DSP). We then summarize recent developments in basic GSP tools, including methods for graph filtering or graph learning, graph signals, the graph Fourier transform (GFT), spectrum, graph frequency, etc. Graph filtering is a basic task that allows for isolating the contribution of individual frequencies and therefore enables the removal of noise. We then consider a graph filter as a model that helps to extend the application of GSP methods to large datasets. To show its suitability and effectiveness, we first created a noisy graph signal and then applied the filter to it. After several rounds of simulation, we see that the filtered signal appears smoother and is closer to the original noise-free distance-based signal. Using this example application, we demonstrated that graph filtering is efficient for big data analytics.
Keywords: big data; data science; big data processing; graph signal processing; social networks
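As a rough illustration of the graph filtering idea in the abstract above (a noisy graph signal is passed through a low-pass graph filter and compared with the noise-free signal), the sketch below uses only the Python standard library. The path graph, the sine-shaped clean signal, the filter form h(L) = (I - alpha·L)^steps and all parameter values are assumptions chosen for illustration; they are not taken from the paper.

```python
import math
import random

def path_graph(n):
    """Adjacency lists of a toy path graph 0 - 1 - ... - (n-1) (assumed example graph)."""
    return [[j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)]

def apply_laplacian(adj, x):
    """Return L x for the combinatorial Laplacian L = D - A."""
    return [len(nb) * x[i] - sum(x[j] for j in nb) for i, nb in enumerate(adj)]

def low_pass(adj, x, alpha=0.3, steps=5):
    """Apply h(L) = (I - alpha * L)^steps, a polynomial filter that damps high graph frequencies."""
    y = list(x)
    for _ in range(steps):
        ly = apply_laplacian(adj, y)
        y = [yi - alpha * li for yi, li in zip(y, ly)]
    return y

def mse(a, b):
    """Mean squared error between two signals of equal length."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

def demo(n=100, noise=0.3, seed=0):
    """Return (error before filtering, error after filtering) against the clean signal."""
    rng = random.Random(seed)
    adj = path_graph(n)
    clean = [math.sin(2 * math.pi * i / n) for i in range(n)]   # smooth graph signal
    noisy = [c + rng.gauss(0.0, noise) for c in clean]          # add Gaussian noise
    filtered = low_pass(adj, noisy)
    return mse(noisy, clean), mse(filtered, clean)

if __name__ == "__main__":
    before, after = demo()
    print(f"MSE before filtering: {before:.4f}, after: {after:.4f}")
```

On a path graph the Laplacian eigenvalues lie below 4, so with alpha = 0.3 each step scales a frequency component by a factor between -0.2 and 1; smooth components pass almost unchanged while rapidly varying ones are suppressed.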
Evaluating Security of Big Data Through Fuzzy Based Decision-Making Technique
7
Authors: Fawaz Alassery, Ahmed Alzahrani, Asif Irshad Khan, Kanika Sharma, Masood Ahmad, Raees Ahmad Khan. Computer Systems Science & Engineering [SCIE, EI], 2023, No. 1, pp. 859-872 (14 pages)
In recent years, it has been observed that the disclosure of information increases the risk of terrorism. Without restricting the accessibility of information, providing security is difficult. So, there is a demand to fill the gap between security and accessibility of information. In fact, security tools should be usable for improving the security as well as the accessibility of information. Though security and accessibility do not directly influence each other, some of their factors are indirectly influenced by each other. Attributes play an important role in bridging the gap between security and accessibility. In this paper, we identify the key attributes of accessibility and security that impact each other directly and indirectly, such as confidentiality, integrity, availability, and severity. The significance of every attribute, based on its obtained weight, is important for its effect on security during the big data security life cycle process. To evaluate the proposed work, the researchers utilised the Fuzzy Analytic Hierarchy Process (Fuzzy AHP). The findings show that the Fuzzy AHP is a very accurate mechanism for determining the best security solution in a real-time healthcare context. The study also looks at the rapidly evolving security technologies in healthcare that could help improve healthcare services and the future prospects in this area.
Keywords: information security; big data; big data security life cycle; fuzzy AHP
Classification of Big Data Security Based on Ontology Web Language
8
Authors: Alsadig Mohammed Adam Abdallah, Amir Mohamed Talib. Journal of Information Security, 2023, No. 1, pp. 76-91 (16 pages)
A vast amount of data (known as big data) may now be collected and stored from a variety of data sources, including event logs, the internet, smartphones, databases, sensors, cloud computing, and Internet of Things (IoT) devices. The term “big data security” refers to all the safeguards and instruments used to protect both the data and analytics processes against intrusions, theft, and other hostile actions that could endanger or adversely influence them. Beyond being a high-value and desirable target, big data poses particular protection difficulties. Big data security does not fundamentally differ from conventional data security. Big data security issues are caused by extraneous distinctions rather than fundamental ones. This study outlines the numerous security difficulties that big data analytics now faces and encourages additional joint research on reducing big data security challenges using Ontology Web Language (OWL). Although we focus on the security challenges of big data in this essay, we also briefly cover the broader challenges of big data. The proposed classification of big data security based on Ontology Web Language, produced with the Protégé software, has 32 classes and 45 subclasses.
Keywords: big data; big data security; information security; data security; Ontology Web Language; Protégé
Big data in healthcare: Conceptual network structure, key challenges and opportunities
9
Authors: Leonardo B. Furstenau, Pedro Leivas, Michele Kremer Sott, Michael S. Dohan, Jose Ricardo Lopez-Robles, Manuel J. Cobo, Nicola Luigi Bragazzi, Kim-Kwang Raymond Choo. Digital Communications and Networks [SCIE, CSCD], 2023, No. 4, pp. 856-868 (13 pages)
Big data is a concept that deals with large or complex data sets by using data analysis tools (e.g., data mining, machine learning) to systematically analyze information extracted from several sources. Big data has attracted wide attention from academia, for example, in supporting patients and health professionals by improving the accuracy of decision-making, diagnosis and disease prediction. This research aimed to perform a Bibliometric Performance and Network Analysis (BPNA) supported by a Scoping Review (SR) to depict the strategic themes, thematic evolution structure, main challenges and opportunities related to the concept of big data applied in the healthcare sector. With this goal in mind, 4857 documents from the Web of Science covering the period from 2009 to June 2020 were analyzed with the support of SciMAT software. The bibliometric performance showed the number of publications and citations over time, scientific productivity and the geographic distribution of publications and research fields. The strategic diagram yielded 20 clusters and their relative importance in terms of centrality and density. The thematic evolution structure presented the most important themes and how they change over time. Lastly, we presented the main challenges and future opportunities of big data in healthcare.
Keywords: big data; healthcare digitalization; bibliometric; strategic intelligence; co-word analysis; SciMAT
Privacy-Preserving Deep Learning on Big Data in Cloud
10
Authors: Yongkai Fan, Wanyu Zhang, Jianrong Bai, Xia Lei, Kuanching Li. China Communications [SCIE, CSCD], 2023, No. 11, pp. 176-186 (11 pages)
In the analysis of big data, deep learning is a crucial technique. Big data analysis tasks are typically carried out on the cloud since it offers strong computing capabilities and storage areas. Nevertheless, there is a contradiction between the open nature of the cloud and the demand that data owners maintain their privacy. To use cloud resources for privacy-preserving data training, a viable method must be found. A privacy-preserving deep learning model (PPDLM) is suggested in this research to address this privacy issue. To preserve data privacy, we first encrypted the data using a homomorphic encryption (HE) approach. Moreover, the deep learning algorithm’s activation function, the sigmoid function, is handled with the least-squares method to process the non-addition and non-multiplication operations that homomorphic encryption does not allow. Finally, experimental results show that PPDLM has a significant effect on the protection of data privacy information. Compared with the Non-Privacy Preserving Deep Learning Model (NPPDLM), PPDLM has higher computational efficiency.
Keywords: big data; cloud computing; deep learning; homomorphic encryption; privacy-preserving
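The abstract above mentions using the least-squares method so that the sigmoid activation involves only operations a homomorphic scheme supports (additions and multiplications). One common way to read this is a least-squares polynomial fit of the sigmoid; the sketch below shows that reading only. The cubic form 0.5 + a·x + b·x³, the interval [-4, 4] and the grid size are assumptions made for illustration, not details from the paper, and no actual encryption is performed.

```python
import math

def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def fit_odd_cubic(lo=-4.0, hi=4.0, n=401):
    """Least-squares fit of sigmoid(x) ~ 0.5 + a*x + b*x**3 on a uniform grid.

    sigmoid(x) - 0.5 is an odd function, so on a symmetric grid only the
    odd basis functions x and x**3 are needed (an assumption of this sketch).
    """
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    # Entries of the 2x2 normal equations for the basis {x, x**3}.
    s2 = sum(x ** 2 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    s6 = sum(x ** 6 for x in xs)
    t1 = sum(x * (sigmoid(x) - 0.5) for x in xs)
    t3 = sum(x ** 3 * (sigmoid(x) - 0.5) for x in xs)
    det = s2 * s6 - s4 * s4
    a = (t1 * s6 - t3 * s4) / det   # Cramer's rule for the linear coefficient
    b = (s2 * t3 - s4 * t1) / det   # Cramer's rule for the cubic coefficient
    return a, b

def poly_sigmoid(x, a, b):
    """Polynomial surrogate: uses only additions and multiplications."""
    return 0.5 + a * x + b * x * x * x

def max_error(a, b, lo=-4.0, hi=4.0, n=801):
    """Largest absolute gap between the surrogate and the true sigmoid on a grid."""
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return max(abs(poly_sigmoid(x, a, b) - sigmoid(x)) for x in xs)

if __name__ == "__main__":
    a, b = fit_odd_cubic()
    print(f"a = {a:.4f}, b = {b:.5f}, max error on [-4, 4] = {max_error(a, b):.3f}")
```

The surrogate is only reasonable inside the fitting interval; inputs would have to be scaled into that range, and a higher degree or a different interval trades accuracy against the number of multiplications performed on encrypted values.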
Key issues and progress of industrial big data-based intelligent blast furnace ironmaking technology
11
Authors: Quan Shi, Jue Tang, Mansheng Chu. International Journal of Minerals, Metallurgy and Materials [SCIE, EI, CAS, CSCD], 2023, No. 9, pp. 1651-1666 (16 pages)
Blast furnace (BF) ironmaking is the most typical “black box” process, and its complexity and uncertainty bring forth great challenges for furnace condition judgment and BF operation. Rich data resources for BF ironmaking are available, and the rapid development of data science and intelligent technology will provide an effective means to solve the uncertainty problem in the BF ironmaking process. This work focused on the application of artificial intelligence technology in BF ironmaking. The current intelligent BF ironmaking technology was summarized and analyzed from five aspects. These aspects include BF data management, the analyses of time delay and correlation, the prediction of BF key variables, the evaluation of BF status, and the multi-objective intelligent optimization of BF operations. Solutions and suggestions were offered for the problems in the current progress, and some outlooks for future prospects and technological breakthroughs were added. To effectively improve the BF data quality, we comprehensively considered the data problems and the characteristics of algorithms and selected the data processing method scientifically. For analyzing important BF characteristics, the effect of the delay was eliminated to ensure an accurate logical relationship between the BF parameters and economic indicators. As for BF parameter prediction and BF status evaluation, a BF intelligence model that integrates data information and process mechanism was built to effectively achieve the accurate prediction of BF key indexes and the scientific evaluation of BF status. During the optimization of BF parameters, low risk, low cost, and high return were used as the optimization criteria, and while pursuing the optimization effect, the feasibility and site operation cost were considered comprehensively. This work will help increase the process operator’s overall awareness and understanding of intelligent BF technology. Additionally, combining big data technology with the process will improve the practicality of data models in actual production and promote the application of intelligent technology in BF ironmaking.
Keywords: BF ironmaking; intelligent BF; industrial big data; machine learning; integrated mechanism and data
Towards machine-learning-driven effective mashup recommendations from big data in mobile networks and the Internet-of-Things
12
Authors: Yueshen Xu, Zhiying Wang, Honghao Gao, Zhiping Jiang, Yuyu Yin, Rui Li. Digital Communications and Networks [SCIE, CSCD], 2023, No. 1, pp. 138-145 (8 pages)
A large number of Web APIs have been released as services in mobile communications, but the service provided by a single Web API is usually limited. To enrich the services in mobile communications, developers have combined Web APIs to develop new services, which are known as mashups. The emergence of mashups greatly increases the number of services in mobile communications, especially in mobile networks and the Internet-of-Things (IoT), and has encouraged companies and individuals to develop even more mashups, which has led to a dramatic increase in the number of mashups. Such a trend brings with it big data, such as the massive text data from the mashups themselves and continually generated usage data. Thus, the question of how to determine the most suitable mashups from big data has become a challenging problem. In this paper, we propose a mashup recommendation framework from big data in mobile networks and the IoT. The proposed framework is driven by machine learning techniques, including neural embedding, clustering, and matrix factorization. We employ neural embedding to learn the distributed representation of mashups and propose to use cluster analysis to learn the relationships among the mashups. We also develop a novel Joint Matrix Factorization (JMF) model to complete the mashup recommendation task, for which we design a new objective function and an optimization algorithm. We then crawl a large real-world mashup dataset and perform experiments. The experimental results demonstrate that our framework achieves high accuracy in mashup recommendation and performs better than all compared baselines.
Keywords: mashup recommendation; big data; machine learning; mobile networks; Internet-of-Things
Big Data Bot with a Special Reference to Bioinformatics
13
Authors: Ahmad M. Al-Omari, Shefa M. Tawalbeh, Yazan H. Akkam, Mohammad Al-Tawalbeh, Shima’a Younis, Abdullah A. Mustafa, Jonathan Arnold. Computers, Materials & Continua [SCIE, EI], 2023, No. 5, pp. 4155-4173 (19 pages)
There are quintillions of data on deoxyribonucleic acid (DNA) and protein in publicly accessible data banks, and that number is expanding at an exponential rate. Many scientific fields, such as bioinformatics and drug discovery, rely on such data; nevertheless, gathering and extracting data from these resources is a tough undertaking. This data should go through several processes, including mining, data processing, analysis, and classification. This study proposes software that extracts data from big data repositories automatically and with the particular ability to repeat data extraction phases as many times as needed without human intervention. This software simulates the extraction of data from web-based (point-and-click) resources or graphical user interfaces that cannot be accessed using command-line tools. The software was evaluated by creating a novel database of 34 parameters for 1360 physicochemical properties of antimicrobial peptides (AMP) sequences (46240 hits) from various MARVIN software panels, which can be later utilized to develop novel AMPs. Furthermore, for machine learning research, the program was validated by extracting 10,000 protein tertiary structures from the Protein Data Bank. As a result, data collection from the web will become faster and less expensive, with no need for manual data extraction. The software is critical as a first step to preparing large datasets for subsequent stages of analysis, such as those using machine and deep-learning applications.
Keywords: bioinformatics; big data; data extraction; bot; drug design
Filter and Embedded Feature Selection Methods to Meet Big Data Visualization Challenges
14
Authors: Kamal A. ElDahshan, AbdAllah A. AlHabshy, Luay Thamer Mohammed. Computers, Materials & Continua [SCIE, EI], 2023, No. 1, pp. 817-839 (23 pages)
This study focuses on meeting the challenges of big data visualization by using data reduction methods based on feature selection, in order to reduce the volume of big data and minimize model training time (Tt) while maintaining data quality. We address these challenges using the embedded “Select from model (SFM)” method with the “Random forest importance (RFI)” algorithm and compare it with the filter “Select percentile (SP)” method based on the chi-square (“Chi2”) tool for selecting the most important features, which are then fed into a classification process using the logistic regression (LR) algorithm and the k-nearest neighbor (KNN) algorithm. The classification accuracy (AC) of LR is also compared with that of KNN in Python on eight data sets to see which method produces the best rating when feature selection methods are applied. The study concludes that feature selection methods have a significant impact on the analysis and visualization of the data after removing repetitive data and data that do not affect the goal. After several comparisons, the study suggests SFMLR: SFM based on the RFI algorithm for feature selection, with the LR algorithm for data classification. The proposal proved its efficacy when its results were compared with recent literature.
Keywords: data reduction; feature selection; select from model; select percentile; big data visualization; data visualization
Self-Tuning Parameters for Decision Tree Algorithm Based on Big Data Analytics
15
Authors: Manar Mohamed Hafez, Essam Eldin F. Elfakharany, Amr A. Abohany, Mostafa Thabet. Computers, Materials & Continua [SCIE, EI], 2023, No. 4, pp. 943-958 (16 pages)
Big data is usually unstructured, and many applications require analysis in real time. The decision tree (DT) algorithm is widely used to analyze big data. Selecting the optimal depth of a DT is a time-consuming process, as it requires many iterations. In this paper, we have designed a modified version of the DT. The tree aims to achieve optimal depth by self-tuning its running parameters and improving the accuracy. The efficiency of the modified DT was verified using two datasets (airport and fire datasets). The airport dataset has 500000 instances and the fire dataset has 600000 instances. A comparison between the modified DT and the standard DT shows that the modified DT performs better. This comparison was conducted on multiple nodes with the Apache Spark tool using Amazon Web Services, resulting in an accuracy increase of 6.85% for the first dataset and 8.85% for the airport dataset. In conclusion, the modified DT showed better accuracy in handling different-sized datasets compared to the standard DT algorithm.
Keywords: big data; classification; decision tree; Amazon Web Services
Research on the optimization strategy of customers’ electricity consumption based on big data
16
Authors: Jiangping Liu, Zong Wang, Hui Hu, Shaoxiang Xu, Jiabin Wang, Ying Liu. Global Energy Interconnection [EI, CSCD], 2023, No. 3, pp. 273-284 (12 pages)
Current power systems face significant challenges in supporting large-scale access to new energy sources, and the potential of existing flexible resources needs to be fully explored from the power supply, grid, and customer perspectives. This paper proposes a multi-objective electricity consumption optimization strategy considering the correlation between equipment and electricity consumption. It constructs a multi-objective electricity consumption optimization model that considers the correlation between equipment and electricity consumption to maximize economy and comfort. The results show that the proposed method can accurately assess the potential for electricity consumption optimization and obtain an optimal multi-objective electricity consumption strategy based on customers’ actual electricity consumption demand.
Keywords: big data; electricity consumption optimization; load elasticity; electricity consumption relevance
Systematic Survey on Big Data Analytics and Artificial Intelligence for COVID-19 Containment
17
Authors: Saeed M. Alshahrani, Jameel Almalki, Waleed Alshehri, Rashid Mehmood, Marwan Albahar, Najlaa Jannah, Nayyar Ahmed Khan. Computer Systems Science & Engineering [SCIE, EI], 2023, No. 11, pp. 1793-1817 (25 pages)
Artificial Intelligence (AI) has gained popularity in applications for the containment of the COVID-19 pandemic. Several AI techniques provide efficient mechanisms for handling pandemic situations. AI methods, protocols, data sets, and various validation mechanisms empower users towards proper decision-making and procedures to handle the situation. Despite so many tools, there are still conditions in which AI has a long way to go. To increase the adaptability and potential of these techniques, a combination of AI and big data is currently gaining popularity. This paper surveys and analyzes the methods within the various computational paradigms used by different researchers and national governments, such as China and South Korea, to fight against this pandemic. The process of vaccine development requires multiple medical experiments. This process requires analyzing datasets from different parts of the world. Deep learning and the Internet of Things (IoT) revolutionized the field of disease diagnosis and disease prediction. The accurate observations from different datasets across the world empowered the process of drug development and drug repurposing. To overcome the issues generated by the pandemic, using sophisticated computing paradigms such as AI, Machine Learning (ML), deep learning, robotics and big data is essential.
Keywords: COVID-19; IoT; artificial intelligence; big data; coronavirus; deep learning; robotics; machine learning
Big Data Analytics: Deep Content-Based Prediction with Sampling Perspective
18
Authors: Waleed Albattah, Saleh Albahli. Computer Systems Science & Engineering [SCIE, EI], 2023, No. 4, pp. 531-544 (14 pages)
The world of information technology is more than ever being flooded with huge amounts of data, nearly 2.5 quintillion bytes every day. This large stream of data is called big data, and the amount is increasing each day. This research uses a technique called sampling, which selects a representative subset of the data points, manipulates and analyzes this subset to identify patterns and trends in the larger dataset being examined, and finally, creates models. Sampling uses a small proportion of the original data for analysis and model training, so that it is relatively faster while maintaining data integrity and achieving accurate results. Two deep neural networks, AlexNet and DenseNet, were used in this research to test two sampling techniques, namely sampling with replacement and reservoir sampling. The dataset used for this research was divided into three classes: acceptable, flagged as easy, and flagged as hard. The base models were trained with the whole dataset, whereas the other models were trained on 50% of the original dataset. There were four combinations of model and sampling technique. The F-measure for the AlexNet model was 0.807 while that for the DenseNet model was 0.808. Combination 1 was the AlexNet model and sampling with replacement, achieving an average F-measure of 0.8852. Combination 3 was the AlexNet model and reservoir sampling. It had an average F-measure of 0.8545. Combination 2 was the DenseNet model and sampling with replacement, achieving an average F-measure of 0.8017. Finally, combination 4 was the DenseNet model and reservoir sampling. It had an average F-measure of 0.8111. Overall, we conclude that both models trained on a sampled dataset gave equal or better results compared to the base models, which used the whole dataset.
Keywords: sampling; big data; deep learning; AlexNet; DenseNet
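The two sampling techniques named in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy integer dataset, and the fixed seed are assumptions made for illustration. The reservoir sampler follows the classic Algorithm R, and the 50% sample size mirrors the proportion stated in the abstract.

```python
import random


def sample_with_replacement(items, k, rng):
    """Draw k items uniformly at random; the same item may be drawn more than once."""
    return [items[rng.randrange(len(items))] for _ in range(k)]


def reservoir_sample(stream, k, rng):
    """Algorithm R: keep a uniform sample of k items from a stream of unknown length."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # both ends inclusive
            if j < k:
                reservoir[j] = item
    return reservoir


rng = random.Random(0)      # fixed seed, an assumption made for repeatability
data = list(range(1000))    # hypothetical stand-in for the training set
half = len(data) // 2       # 50% of the data, as in the abstract

with_repl = sample_with_replacement(data, half, rng)
reservoir = reservoir_sample(iter(data), half, rng)

print(len(with_repl), len(set(with_repl)))   # duplicates are possible here
print(len(reservoir), len(set(reservoir)))   # reservoir items are always distinct
```

Reservoir sampling never needs the dataset length in advance, which is why it suits streamed data, whereas sampling with replacement needs random access to the full collection.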
Metaheuristic Based Clustering with Deep Learning Model for Big Data Classification
19
Authors: R. Krishnaswamy, Kamalraj Subramaniam, V. Nandini, K. Vijayalakshmi, Seifedine Kadry, Yunyoung Nam. Computer Systems Science & Engineering (SCIE, EI), 2023, No. 1, pp. 391-406 (16 pages)
Recently, a massive quantity of data has been produced from a number of distinct sources, and the volume of data created daily on the Internet has exceeded two exabytes. At the same time, clustering is one of the efficient techniques for mining big data to extract the useful and hidden patterns that exist in it. Density-based clustering techniques have gained significant attention because they help to effectively recognize complex patterns in spatial datasets. Big data clustering is a nontrivial process owing to the increasing quantity of data, a problem that can be addressed with the MapReduce tool. With this motivation, this paper presents an efficient MapReduce-based hybrid density-based clustering and classification algorithm for big data analytics (MR-HDBCC). The proposed MR-HDBCC technique is executed on the MapReduce tool for handling big data. In addition, the MR-HDBCC technique involves three distinct processes, namely pre-processing, clustering, and classification. The proposed model utilizes the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) technique, which is capable of detecting arbitrarily shaped and diverse clusters in noisy data. To improve the performance of the DBSCAN technique, a hybrid model using the cockroach swarm optimization (CSO) algorithm is developed to explore the search space and determine the optimal parameters for density-based clustering. Finally, a bidirectional gated recurrent neural network (BGRNN) is employed for the classification of big data. The experimental validation of the proposed MR-HDBCC technique is carried out on a benchmark dataset, and the simulation outcomes demonstrate the promising performance of the proposed model in terms of different measures.
Keywords: big data; data classification; clustering; MapReduce; DBSCAN algorithm
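As a rough illustration of the DBSCAN step mentioned in the abstract above, the following Python sketch implements plain single-machine DBSCAN. It is not the MR-HDBCC method: there is no MapReduce distribution and no cockroach swarm optimization, and the parameters `eps` and `min_pts` are hand-picked for a hypothetical toy dataset rather than tuned.

```python
from math import dist

NOISE = -1


def region_query(points, idx, eps):
    """Return indices of all points within eps of points[idx] (including itself)."""
    return [j for j, q in enumerate(points) if dist(points[idx], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later be relabeled as a border point
            continue
        labels[i] = cluster            # i is a core point: start a new cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue               # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
        cluster += 1
    return labels


# Hypothetical toy data: two dense squares and one isolated outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The result depends strongly on `eps` and `min_pts`, which is the motivation the abstract gives for searching those parameters with an optimizer instead of fixing them by hand.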
QLGWONM:Quantum Leaping GWO for Feature Selection in Big Data Analytics
20
Authors: Rachna Kulhare, S. Veenadhari. Journal of Harbin Institute of Technology (New Series) (CAS), 2023, No. 4, pp. 85-98 (14 pages)
In learning and classification problems, feature selection (FS) is critical in finding features that are both meaningful and non-redundant. Today, big data is an integral aspect of all industry sectors. Firms in every industry, such as power, finance, commerce, electronics, and communications, create massive amounts of heterogeneous data that need to be handled effectively and evaluated correctly. When it comes to big data, feature selection approaches are regarded as a game changer, since they can help minimize the complexity of genetic data, making it simpler to study and to translate into meaningful information. To enhance classification performance, feature selection is performed to remove unnecessary and redundant characteristics from the dataset. In this paper, we present a novel Grey Wolf approach based on quantum leaping neighbor memeplexes, termed QLGWONM, for feature selection and reduction to achieve better classification accuracy. The paper also implemented other optimization algorithms, namely particle swarm optimization (PSO), the slime mould algorithm (SMA), the salp swarm algorithm (SSA), the artificial butterfly algorithm (ABA), whale optimization (WO), the crow search optimization algorithm (CSA), and Jaya models. QLGWONM outperformed these algorithms. The QLGWONM model achieved an accuracy of 100% on the Brain Tumor, CNS, and Lung datasets, 97.1% on the Ionosphere dataset, and 99% on NSL-KDD. In addition, some state-of-the-art comparisons were evaluated, and QLGWONM gave better results than the other existing algorithms.
Keywords: big data; feature extraction; machine learning; GWO; classification
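To make the idea of wrapper-style feature selection with a grey wolf optimizer more concrete, here is a minimal Python sketch of the basic GWO update used to search over feature subsets. It is not QLGWONM: the quantum leaping and neighbor memeplex components described in the abstract above are omitted. The 0.5 threshold for turning positions into feature masks, the `toy_fitness` function, and the set of "relevant" feature indices are all hypothetical choices; in practice the fitness would typically be a classifier's validation accuracy.

```python
import random


def to_mask(position):
    """Binarize a continuous position: feature d is selected if position[d] > 0.5."""
    return [x > 0.5 for x in position]


def gwo_feature_selection(fitness, n_features, n_wolves=8, n_iters=30, seed=0):
    """Basic grey wolf optimizer maximizing fitness(mask); requires n_wolves >= 3."""
    rng = random.Random(seed)
    wolves = [[rng.random() for _ in range(n_features)] for _ in range(n_wolves)]
    best_pos, best_fit = None, float("-inf")
    for t in range(n_iters + 1):
        # Rank the pack and copy the three leaders (alpha, beta, delta).
        ranked = sorted(wolves, key=lambda w: fitness(to_mask(w)), reverse=True)
        alpha, beta, delta = (list(w) for w in ranked[:3])
        alpha_fit = fitness(to_mask(alpha))
        if alpha_fit > best_fit:
            best_pos, best_fit = alpha, alpha_fit  # remember the best solution so far
        if t == n_iters:
            break  # the last pass only evaluates the final pack
        a = 2.0 * (1 - t / n_iters)  # decreases linearly from 2 to 0
        for w in wolves:
            for d in range(n_features):
                steps = []
                for leader in (alpha, beta, delta):
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    D = abs(C * leader[d] - w[d])
                    steps.append(leader[d] - A * D)
                # Move toward the average of the three leader-guided positions.
                w[d] = min(1.0, max(0.0, sum(steps) / 3))
    return to_mask(best_pos), best_fit


RELEVANT = {0, 2, 5}  # hypothetical "useful" features in an 8-feature toy problem


def toy_fitness(mask):
    """Reward relevant features and charge a small cost per selected feature."""
    return sum(mask[i] for i in RELEVANT) - 0.1 * sum(mask)


mask, fit = gwo_feature_selection(toy_fitness, n_features=8)
print([i for i, keep in enumerate(mask) if keep], fit)
```

The per-feature penalty in `toy_fitness` plays the role of the redundancy-reduction objective: a subset scores higher only if each selected feature contributes more than it costs.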