Despite concerted efforts to create employment opportunities and the economic growth realized between 2000 and 2005, the unemployment rate in Namibia currently stands at 27.4%, according to the Labour Force Survey released in April 2013. The percentage of employed males in Namibia stands at 41.6%, while that of employed females stands at 28.8%, according to the National Human Resources Plan of May 2013. Analysts have blamed adverse climatic conditions, limited skills, poor access to finance, and the structure of the economy. The frustration and discomfort caused by unemployment, especially among the youth, can threaten the country's peace and stability, as it lowers the standard of living, raises crime and drug abuse, and strains family well-being. To date, studies on employment in Namibia have mainly concentrated on micro- and macro-econometric approaches. It is important to examine how bio-demographic characteristics affect employment. This paper uses data from the 2010 Income and Expenditure Survey to establish the bio-demographic determinants of employment by fitting a binary logistic model. The outcome variable is employment status, which is dichotomous. The independent variables, guided by a review of related literature and by the availability of data in the Income and Expenditure Survey data set, included age group, region, place of residence, marital status, education level, and gender. Results indicated that employment prospects in Namibia were influenced by region, gender, marital status, and education level.
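A binary logistic model of the kind fitted in this study is estimated by maximizing the log-likelihood. The sketch below uses invented toy rows (not the actual survey data), with education level and age group as stand-in covariates; it is a minimal illustration, not the paper's method.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit a binary logistic model by stochastic gradient ascent on the log-likelihood."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = yi - p  # gradient of the log-likelihood w.r.t. the linear predictor
            w = [wj + lr * err * xj for wj, xj in zip(w, xi)]
            b += lr * err
    return w, b

# Hypothetical toy rows: [education_level, age_group]; outcome 1 = employed.
X = [[0, 1], [1, 1], [2, 2], [3, 2], [0, 3], [3, 3]]
y = [0, 0, 1, 1, 0, 1]
w, b = fit_logistic(X, y)
p_employed = sigmoid(sum(wj * xj for wj, xj in zip(w, [3, 2])) + b)
```

With real survey data one would instead use a dedicated routine (e.g. statsmodels' `Logit` or scikit-learn's `LogisticRegression`) and dummy-code the categorical covariates such as region and marital status.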
Purpose: The purpose of this study is to develop and compare model choice strategies in the context of logistic regression. Model choice means the choice of the covariates to be included in the model. Design/methodology/approach: The study is based on Monte Carlo simulations. The methods are compared in terms of three measures of accuracy: specificity and two kinds of sensitivity. A loss function combining sensitivity and specificity is introduced and used for a final comparison. Findings: The choice of method depends on how much the user emphasizes sensitivity against specificity. It also depends on the sample size. For a typical logistic regression setting with a moderate sample size and a small to moderate effect size, either BIC, BICc, or Lasso seems to be optimal. Research limitations: Numerical simulations cannot cover the whole range of data-generating processes occurring with real-world data, so more simulations are needed. Practical implications: Researchers can refer to these results if they believe that their data-generating process is somewhat similar to one of the scenarios presented in this paper. Alternatively, they could run their own simulations and calculate the loss function. Originality/value: This is a systematic comparison of model choice algorithms and heuristics in the context of logistic regression. The distinction between two types of sensitivity and a comparison based on a loss function are methodological novelties.
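BIC-based model choice, one of the strategies found competitive here, trades fit against complexity by charging each extra covariate ln(n). A small sketch with hypothetical maximized log-likelihoods (not the paper's simulation results):

```python
import math

def bic(log_likelihood, k, n):
    """Bayesian information criterion: k*ln(n) - 2*lnL; smaller is better."""
    return k * math.log(n) - 2.0 * log_likelihood

# Hypothetical fitted candidates: (covariate set, maximized log-likelihood, #parameters).
candidates = [("intercept only", -70.0, 1),
              ("x1",             -55.0, 2),
              ("x1 + x2",        -54.0, 3)]
n = 100
best = min(candidates, key=lambda m: bic(m[1], m[2], n))
```

In this toy example the second covariate improves the log-likelihood by only 1.0, which does not pay its ln(100) ≈ 4.6 penalty, so BIC prefers the single-covariate model — the specificity-favoring behavior the findings describe.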
In this paper, a logistic regression (LR) statistical analysis is presented for a set of variables used in experimental measurements in reversed field pinch (RFP) machines of the mode commonly known as the "slinky mode" (SM), observed to travel around the torus in the Madison Symmetric Torus (MST). The LR analysis is used with the modified Sine-Gordon dynamic equation model to predict with high confidence whether the slinky mode will lock or not lock, compared against the experimentally measured motion of the slinky mode. It is observed that under certain conditions the slinky mode "locks" at or near the intersection of poloidal and/or toroidal gaps in MST. However, a locked mode ceases to travel around the torus, while an unlocked mode keeps traveling without a change in energy, making it hard to determine an exact set of conditions for predicting locking/unlocking behaviour. The significant key model parameters determined by the LR analysis are shown to improve the Sine-Gordon model's ability to determine the locking/unlocking of magnetohydrodynamic (MHD) modes. The LR analysis of measured variables provides high confidence in anticipating locking versus unlocking of the slinky mode, as demonstrated by comparisons between simulations and the experimentally measured motion of the slinky mode in MST.
For the composition analysis and identification of ancient glass products, L1 regularization, K-Means cluster analysis, the elbow rule, and other methods were used comprehensively to build logistic regression, cluster analysis, and hyper-parameter test models, and SPSS, Python, and other tools were used to obtain the classification rules of glass products under different fluxes, the sub-classification under different chemical compositions, the hyper-parameter K value test, and a rationality analysis. This research can provide theoretical support for the protection and restoration of ancient glass relics.
Bedding structural planes significantly influence the mechanical properties and stability of engineering rock masses. This study conducts uniaxial compression tests on layered sandstone with various bedding angles (0°, 15°, 30°, 45°, 60°, 75°, and 90°) to explore the impact of bedding angle on the deformational mechanical response, failure mode, and damage evolution processes of rocks. It develops a damage model based on the Logistic equation, derived from the degradation of the modulus and considering the combined effect of the sandstone bedding dip angle and load. This model is employed to study the damage accumulation state and its evolution within the layered rock mass. This research also introduces a piecewise constitutive model that considers the initial compaction characteristics to simulate the whole deformation process of layered sandstone under uniaxial compression. The results revealed that as the bedding angle increases from 0° to 90°, the uniaxial compressive strength and elastic modulus of layered sandstone significantly decrease, slightly increase, and then decline again. The corresponding failure modes transition from splitting tensile failure to slipping shear failure and back to splitting tensile failure. As indicated by the modulus degradation, the damage characteristics can be categorized into four stages: no initial damage, damage initiation, damage acceleration, and damage deceleration to termination. The theoretical damage model based on the Logistic equation effectively simulates and predicts the entire damage evolution process. Moreover, the theoretical constitutive model curves closely align with the actual stress-strain curves of layered sandstone under uniaxial compression. The introduced constitutive model is concise, with few parameters, a straightforward parameter determination process, and a clear physical interpretation. This study offers valuable insights into the theory of layered rock mechanics and holds implications for ensuring the safety of rock engineering.
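A Logistic-equation damage variable of the kind developed here is an S-shaped function of strain, which naturally produces the four stages described: near-zero damage, initiation, acceleration, and deceleration toward saturation. The sketch below uses invented parameters, not values fitted to the paper's sandstone data.

```python
import math

def logistic_damage(strain, k=400.0, strain_mid=0.02):
    """Logistic damage evolution: S-shaped accumulation from 0 (intact) to 1 (failed).
    k sets how sharply damage accelerates; strain_mid is the inflection strain.
    Both values are illustrative only."""
    return 1.0 / (1.0 + math.exp(-k * (strain - strain_mid)))

# Effective stress with damage coupling: sigma = E * strain * (1 - D).
E = 20e3  # MPa, illustrative elastic modulus (not a fitted value)
curve = [(eps, E * eps * (1.0 - logistic_damage(eps)))
         for eps in [i * 0.001 for i in range(0, 41)]]
```

Because D stays near zero at small strain and saturates near one, the resulting stress-strain curve rises, peaks, and softens — the qualitative shape the damage model is meant to reproduce.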
Malware is an ever-present and dynamic threat to networks and computer systems in cybersecurity, and because of its complexity and evasiveness, it is challenging to identify using traditional signature-based detection approaches. This article discusses the growing danger to cybersecurity posed by malware hidden in PDF files, highlighting the shortcomings of conventional detection techniques and the difficulties presented by adversarial methodologies. The article presents a new method that improves PDF malware detection by using document analysis and a Logistic Model Tree. Using a dataset from the Canadian Institute for Cybersecurity, a comparative analysis is carried out with well-known machine learning models, such as the Credal Decision Tree, Naïve Bayes, Average One Dependency Estimator, Locally Weighted Learning, and Stochastic Gradient Descent. Going beyond traditional structural and JavaScript-centric PDF analysis, the research makes a substantial contribution to the area by boosting precision and resilience in malware detection. The use of the Logistic Model Tree, a thorough feature selection approach, and an increased focus on PDF file attributes all contribute to the efficiency of PDF malware detection. The paper emphasizes the Logistic Model Tree's critical role in tackling increasing cybersecurity threats and proposes a viable answer to practical issues in the sector. The results reveal that the Logistic Model Tree is superior, with an improved accuracy of 97.46% when compared to benchmark models, demonstrating its usefulness in addressing the ever-changing threat landscape.
The burning of crop residues in fields is a significant global biomass burning activity, a key element of the terrestrial carbon cycle, and an important source of atmospheric trace gases and aerosols. Accurate estimation of cropland burned area is both crucial and challenging, especially for the small and fragmented burn scars in China. Here we developed an automated burned area mapping algorithm implemented with Sentinel-2 Multi Spectral Instrument (MSI) data and tested its effectiveness on the Songnen Plain, Northeast China, using satellite imagery from 2020. We employed a logistic regression method for integrating multiple spectral data into a synthetic indicator, and compared the results with manually interpreted burned area reference maps and the Moderate-Resolution Imaging Spectroradiometer (MODIS) MCD64A1 burned area product. The overall accuracy of the single-variable logistic regression was 77.38% to 86.90% and 73.47% to 97.14% for the 52TCQ and 51TYM cases, respectively. In comparison, multiple-variable logistic regression on Sentinel-2 images improved the accuracy of the burned area map to 87.14% and 98.33% for the 52TCQ and 51TYM cases, respectively. The balance of omission error and commission error was also improved. The integration of multiple spectral data combined with a logistic regression method proves to be effective for burned area detection, offering a highly automated process with an automatic threshold determination mechanism. The method exhibits excellent extensibility and flexibility, taking the image tile as the operating unit. It is suitable for burned area detection at a regional scale and can also be implemented with other satellite data.
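The synthetic indicator described here is the logistic transform of a weighted combination of spectral variables, thresholded to label each pixel burned or unburned. The sketch below uses hypothetical coefficients and pixel values; real weights would be fitted against reference burned/unburned samples.

```python
import math

def burn_probability(features, weights, bias):
    """Combine multiple spectral variables into one synthetic indicator in (0, 1)."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients for two spectral inputs (e.g. a burn-sensitive index
# and a SWIR reflectance); invented for illustration, not fitted values.
weights, bias = [-8.0, 3.5], 1.0
pixels = {"burned_scar": [-0.4, 0.6], "green_crop": [0.5, 0.1]}
labels = {name: burn_probability(f, weights, bias) > 0.5
          for name, f in pixels.items()}
```

The 0.5 cut-off on the probability corresponds to the automatic threshold on the synthetic indicator that the algorithm determines per image tile.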
This research introduces a novel approach to improve and optimize the prediction of consumer purchase behaviors on e-commerce platforms. The study introduces the fundamental concepts of the logistic regression algorithm and analyzes user data obtained from an e-commerce platform. The original data were preprocessed, and a consumer purchase prediction model was developed for the e-commerce platform using the logistic regression method. The comparison study used the classic random forest approach, further enhanced by the K-fold cross-validation method. The model's classification accuracy was evaluated using performance indicators including the accuracy rate, precision rate, recall rate, and F1 score, and the significance of the findings was determined through visual examination. The findings suggest that employing the logistic regression algorithm to forecast customer purchase behaviors on e-commerce platforms can improve the efficacy of the approach and yield more accurate predictions. This study serves as a valuable resource for improving the precision of forecasting customers' purchase behaviors on e-commerce platforms and has significant practical implications for optimizing their operational efficiency.
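The K-fold splitting and the accuracy/precision/recall/F1 indicators used in this evaluation can be sketched in a few lines; the labels below are toy values, not the platform's data.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds for cross-validation."""
    base, rem = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < rem else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

# Toy demonstration: 1 = purchased, 0 = did not purchase.
acc, prec, rec, f1 = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

In practice one would train on k-1 folds and score the held-out fold, averaging the metrics over the k rotations.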
The Internet of Things (IoT) is a popular social network in which devices are virtually connected to communicate and share information. It is applied widely in business enterprises and government sectors for delivering services to customers, clients, and citizens. However, the interaction succeeds only on the basis of the trust that each device has in the others; trust is therefore essential for such a social network. Because the Internet of Things has access to sensitive information, it is exposed to many threats that put data management at risk. This issue is addressed by trust management, which helps decide the trustworthiness of a requestor and provider before communication and sharing. Several trust-based systems exist for different domains, using dynamic weighting, fuzzy classification, and Bayesian inference, with very few based on regression analysis for IoT. The proposed algorithm is based on logistic regression, which provides a strong statistical grounding for trust prediction. To substantiate the case for regression-based trust, its performance is compared with an equivalent Bayesian analysis using the Beta distribution. The performance is studied in a simulated IoT setup with Quality of Service (QoS) and social parameters for the nodes, and the proposed model performs better on various metrics. An IoT connects heterogeneous devices, such as tags and sensor devices, for sharing information and availing different application services. The most salient requirements of an IoT system are scalability, extendibility, compatibility, and resiliency against attack. In addition to these features, the existing work finds a way to integrate direct and indirect trust so as to converge quickly and to estimate the bias due to attacks.
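The Beta-distribution Bayesian baseline referred to above is commonly computed as the posterior mean of a Beta distribution over interaction outcomes; a minimal sketch with illustrative interaction counts (not the simulation's parameters):

```python
def beta_trust(successes, failures):
    """Expected trust under a Beta(successes + 1, failures + 1) posterior
    with a uniform prior: the posterior mean (s + 1) / (s + f + 2)."""
    return (successes + 1) / (successes + failures + 2)

# A node with 8 good and 2 bad past interactions vs. a newcomer with none.
experienced = beta_trust(8, 2)  # 9/12 = 0.75
newcomer = beta_trust(0, 0)     # 1/2: maximal uncertainty
```

The uniform prior gives an unknown node a neutral trust of 0.5, and accumulated evidence moves the estimate toward the observed success ratio — the behavior against which the logistic-regression predictor is compared.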
BACKGROUND The spread of the severe acute respiratory syndrome coronavirus 2 outbreak worldwide has caused concern regarding the mortality rate caused by the infection. The determinants of mortality on a global scale cannot be fully understood due to lack of information. AIM To identify key factors that may explain the variability in case lethality across countries. METHODS We identified 21 potential risk factors for the coronavirus disease 2019 (COVID-19) case fatality rate for all countries with available data. We examined univariate relationships of each variable with the case fatality rate (CFR) and screened all independent variables to identify candidates for the final multiple model. Multiple regression analysis was used to assess the strength of the relationships. RESULTS The mean COVID-19 mortality was 1.52% ± 1.72%. There was a statistically significant inverse correlation of health expenditure and the number of computed tomography scanners per 1 million population with CFR, and a significant direct correlation of literacy and air pollution with CFR. The final model can predict approximately 97% of the changes in CFR. CONCLUSION The current study identifies some new predictors of the mortality rate and could thus help decision-makers develop health policies to fight COVID-19.
Efficient water quality monitoring and ensuring the safety of drinking water by government agencies in areas where the resource is constantly depleted due to anthropogenic or natural factors cannot be overemphasized. The above statement holds for West Texas, precisely Midland and Odessa. Two machine learning regression algorithms (Random Forest and XGBoost) were employed to develop models for the prediction of total dissolved solids (TDS) and sodium adsorption ratio (SAR) for efficient water quality monitoring of two vital aquifers: the Edwards-Trinity (Plateau) and Ogallala aquifers. These two aquifers have contributed immensely to providing water for uses ranging from domestic and agricultural to industrial. The data were obtained from the Texas Water Development Board (TWDB). The XGBoost and Random Forest models used in this study gave accurate predictions of the observed data (TDS and SAR) for both aquifers, with R² values consistently greater than 0.83. The Random Forest model gave the better prediction of TDS and SAR concentration, with an average R, MAE, RMSE, and MSE of 0.977, 0.015, 0.029, and 0.00, respectively. For XGBoost, an average R, MAE, RMSE, and MSE of 0.953, 0.016, 0.037, and 0.00, respectively, were achieved. The overall performance of the models was impressive. This study shows that Random Forest and XGBoost are appropriate for water quality prediction and monitoring in an area of intense hydrocarbon activity like Midland, Odessa, and West Texas at large.
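The R², MAE, RMSE, and MSE figures reported above follow from standard definitions; a sketch with invented TDS-like values (not the TWDB data):

```python
import math

def regression_metrics(y_true, y_pred):
    """Standard regression-error summary: MAE, MSE, RMSE, and R-squared."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    r2 = 1.0 - sum(e * e for e in errors) / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}

# Hypothetical observed vs. predicted TDS values (mg/L), for illustration only.
m = regression_metrics([1200.0, 900.0, 1500.0], [1150.0, 950.0, 1450.0])
```

Note that MAE and RMSE carry the units of the target, so the small values quoted in the abstract suggest the targets were normalized before modeling.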
Possible changes in the structure and seasonal variability of the subtropical ridge may lead to changes in the rainfall variability modes over the Caribbean region. This creates additional difficulties for water resource planning, so obtaining seasonal prediction models that characterize these variations in detail is a concern, especially for island states. This research proposes the construction of statistical-dynamical models based on PCA regression methods. The predictand is the monthly accumulated precipitation, while the six predictors are extracted from the ECMWF-SEAS5 ensemble mean forecasts with a lag of one month with respect to the target month. In the construction of the models, two sequential training schemes are evaluated, and only the shorter one preserves the seasonal characteristics of the predictand. The evaluation metrics used, combining cell-point and dichotomous methodologies, suggest that the predictors related to sea surface temperature do not adequately represent the seasonal variability of the predictand, whereas others, such as the temperature at 850 hPa and the outgoing longwave radiation, are represented with a good approximation regardless of the model chosen. In this sense, the models built with the nearest-neighbor methodology were the most efficient. Using the individual models with the best results, an ensemble is built that improves the individual skill of the member models by correcting the dynamic model's underestimation of precipitation during the wet season, although overestimation problems persist for thresholds lower than 50 mm.
This study aims to analyze and predict the relationship between the average price per box in the cigarette market of City A and government procurement, providing a scientific basis and support for decision-making. By reviewing relevant theories and literature, qualitative prediction methods, regression prediction models, and other related theories were explored. Through the analysis of annual cigarette sales data and government procurement data in City A, a comprehensive understanding of the development of the tobacco industry and the economic trends of tobacco companies in the county was obtained. By predicting and analyzing the average price per box of cigarette sales across different years, prediction results were derived and compared with actual sales data. The results indicate that the correlation coefficient between the average price per box of cigarette sales and government procurement is 0.982, implying that government procurement explains 96.4% of the variation in the average price per box of cigarettes (0.982² ≈ 0.964). These findings offer an in-depth exploration of the relationship between the average price per box of cigarettes in City A and government procurement, providing a scientific foundation for corporate decision-making and market operations.
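The link between the reported correlation of 0.982 and the 96.4% figure is simply the squared Pearson correlation, r². A sketch of the computation on an invented yearly series (not City A's data):

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical yearly series: government procurement vs. average price per box.
procurement = [10.0, 12.0, 15.0, 18.0, 21.0]
avg_price = [200.0, 238.0, 301.0, 355.0, 420.0]
r = pearson_r(procurement, avg_price)
variance_explained = r * r  # the share-of-variation figure quoted in such studies
```

Because the toy series is nearly linear, r comes out close to 1 and r² close to 1 as well, mirroring the 0.982 → 0.964 relationship in the abstract.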
This paper focuses on ozone prediction in the atmosphere using a machine learning approach. We utilize air pollutant and meteorological variable datasets from the El Paso area to classify ozone levels as high or low. Logistic regression (LR) and artificial neural network (ANN) algorithms are trained on the datasets, and both demonstrate remarkably high classification accuracy in predicting ozone levels on a given day: the ANN and LR models reach accuracies of 89.3% and 88.4%, respectively. The AUC values for both models are also comparable, with the ANN achieving 95.4% and the LR obtaining 95.2%. The lower the cross-entropy loss (log loss), the better the model's predictions: our ANN model yields a log loss of 3.74, while the LR model shows a log loss of 6.03. The prediction time for the ANN model is approximately 0.00 seconds, whereas the LR model takes 0.02 seconds. Our odds ratio analysis indicates that features such as solar radiation, standard deviation of wind direction, outdoor temperature, dew point temperature, and PM10 contribute to high ozone levels in El Paso, Texas. Based on metrics such as accuracy, error rate, log loss, and prediction time, the ANN model proves to be faster and more suitable for ozone classification in the El Paso, Texas area.
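The two quantities this comparison leans on come from simple formulas: the odds ratio for a logistic coefficient β is e^β, and log loss is the mean cross-entropy of predicted probabilities. A sketch with hypothetical values (the coefficient below is invented, not the study's fitted value):

```python
import math

def odds_ratio(beta):
    """Odds ratio for a one-unit increase in a logistic-regression feature."""
    return math.exp(beta)

def log_loss(y_true, p_pred):
    """Mean cross-entropy; lower means the probabilities better match the labels."""
    return -sum(t * math.log(p) + (1 - t) * math.log(1 - p)
                for t, p in zip(y_true, p_pred)) / len(y_true)

# Hypothetical coefficient for solar radiation in a high-ozone model:
or_solar = odds_ratio(0.8)  # > 1: higher radiation raises the odds of high ozone

# Confident, correct probabilities score a lower loss than hedged ones.
confident = log_loss([1, 0, 1], [0.9, 0.1, 0.8])
hedged = log_loss([1, 0, 1], [0.6, 0.4, 0.6])
```

An odds ratio above 1 marks a feature that pushes toward the high-ozone class, which is how features such as solar radiation are flagged in the analysis.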
This paper presents a case study on the IPUMS NHIS database, which provides data from censuses and surveys on the health of the U.S. population, including data related to COVID-19. By addressing gaps in previous studies, we propose a machine learning approach to train predictive models for identifying and measuring factors that affect the severity of COVID-19 symptoms. Our experiments focus on four groups of factors: demographic, socio-economic, health-condition, and COVID-19-vaccination factors. By analyzing the sensitivity of the variables used to train the models and applying VEC (variable effect characteristics) analysis to the variable values, we identify and measure the importance of various factors that influence the severity of COVID-19 symptoms.
Objective: To compare the predictive value of decision tree and logistic regression models for pregnancy outcomes in patients undergoing in vitro fertilization and embryo transfer (IVF-ET). Methods: A total of 350 patients who underwent IVF-ET at Heping Hospital Affiliated to Changzhi Medical College between January 2021 and October 2022 were enrolled and divided, according to pregnancy outcome, into a successful-pregnancy group (215 cases) and a failed-pregnancy group (135 cases). Clinical data were collected, and logistic regression and decision tree models for predicting pregnancy outcome in IVF-ET patients were established; decision tree models were built both with and without conditioning on the logistic regression results (decision tree 1 and decision tree 2, respectively), and model performance was evaluated with receiver operating characteristic (ROC) curves. Results: Of the 350 patients, 61.43% achieved pregnancy and 38.57% did not. The failed-pregnancy group had higher proportions of patients aged ≥35 years, with infertility duration ≥5 years, with ≥1 prior cycle, and with psychological or psychiatric disorders, as well as a higher serum progesterone level on the day of HCG administration, whereas the proportions of patients with ≥10 oocytes retrieved and a fertilization rate ≥75%, the endometrial thickness on the HCG day, and the number of high-quality embryos were lower than in the successful-pregnancy group (P<0.05). Multivariate logistic regression showed that age, serum progesterone level on the HCG day, number of high-quality embryos, and psychological or psychiatric disorders were all influencing factors for pregnancy outcome in IVF-ET patients (P<0.05). The decision tree model identified age, serum progesterone level on the HCG day, and number of high-quality embryos as influencing factors. The area under the curve (AUC) of the logistic regression model was 0.832, with a predictive sensitivity, specificity, and accuracy of 87.3%, 71.4%, and 83.5%, respectively; decision tree 1 had an AUC of 0.859, with 85.1%, 76.8%, and 85.6%; decision tree 2 had an AUC of 0.820, with 83.7%, 73.2%, and 82.4%. The AUC of decision tree 1 was greater than that of decision tree 2 (P<0.05), but the difference from the logistic regression model's AUC was not statistically significant (P>0.05). Conclusion: Both the logistic regression model and the decision tree model have predictive value for pregnancy outcomes in IVF-ET patients.
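The AUC comparisons above can be reproduced with the rank-based (Mann-Whitney) definition of AUC: the probability that a randomly chosen positive case outscores a randomly chosen negative one. The sketch below uses hypothetical predicted probabilities, not the study's patient data.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count 1/2),
    equivalent to the normalized Mann-Whitney U statistic."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted pregnancy-success probabilities for 6 patients
# (1 = successful pregnancy, 0 = failed).
auc = roc_auc([1, 1, 1, 0, 0, 0], [0.9, 0.8, 0.6, 0.7, 0.4, 0.2])
```

An AUC of 1.0 would mean perfect ranking of successes above failures, 0.5 no better than chance; the study's values of 0.82 to 0.86 sit well above chance for all three models.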
Funding (layered sandstone study): Projects 52074299 and 41941018, supported by the National Natural Science Foundation of China; Project 2023JCCXSB02, supported by the Fundamental Research Funds for the Central Universities, China.
Funding: This research work was funded by Institutional Fund Projects under Grant No. (IFPIP: 211-611-1443).
Abstract: Malware is an ever-present and dynamic threat to networks and computer systems in cybersecurity, and because of its complexity and evasiveness, it is challenging to identify using traditional signature-based detection approaches. This article discusses the growing danger to cybersecurity posed by malware hidden in PDF files, highlighting the shortcomings of conventional detection techniques and the difficulties presented by adversarial methodologies. The article presents a new method that improves PDF malware detection by using document analysis and a Logistic Model Tree. Using a dataset from the Canadian Institute for Cybersecurity, a comparative analysis is carried out with well-known machine learning models, such as Credal Decision Tree, Naïve Bayes, Average One Dependency Estimator, Locally Weighted Learning, and Stochastic Gradient Descent. By moving beyond traditional structural and JavaScript-centric PDF analysis, the research makes a substantial contribution to the area by boosting precision and resilience in malware detection. The use of the Logistic Model Tree, a thorough feature selection approach, and an increased focus on PDF file attributes all contribute to the efficiency of PDF malware detection. The paper emphasizes the Logistic Model Tree's critical role in tackling growing cybersecurity threats and proposes a viable answer to practical issues in the sector. The results reveal that the Logistic Model Tree is superior, with an improved accuracy of 97.46% when compared to benchmark models, demonstrating its usefulness in addressing the ever-changing threat landscape.
Funding: Under the auspices of the National Natural Science Foundation of China (No. 42101414) and the Natural Science Fund for Outstanding Young Scholars in Jilin Province (No. 20230508106RC).
Abstract: The burning of crop residues in fields is a significant global biomass burning activity; it is a key element of the terrestrial carbon cycle and an important source of atmospheric trace gases and aerosols. Accurate estimation of cropland burned area is both crucial and challenging, especially for the small and fragmented burn scars in China. Here we developed an automated burned area mapping algorithm implemented with Sentinel-2 Multi Spectral Instrument (MSI) data, and its effectiveness was tested on the Songnen Plain, Northeast China, as a case study using satellite imagery from 2020. We employed a logistic regression method to integrate multiple spectral data into a synthetic indicator, and compared the results with manually interpreted burned area reference maps and the Moderate-Resolution Imaging Spectroradiometer (MODIS) MCD64A1 burned area product. The overall accuracy of the single-variable logistic regression was 77.38% to 86.90% and 73.47% to 97.14% for the 52TCQ and 51TYM cases, respectively. In comparison, multiple-variable logistic regression on Sentinel-2 images improved the accuracy of the burned area map to 87.14% and 98.33% for the 52TCQ and 51TYM cases, respectively. The balance of omission error and commission error was also improved. The integration of multiple spectral data through a logistic regression method proves to be effective for burned area detection, offering a highly automated process with an automatic threshold determination mechanism. The method exhibits excellent extensibility and flexibility, taking the image tile as the operating unit; it is suitable for burned area detection at a regional scale and can also be implemented with other satellite data.
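The core of the mapping step, fitting a logistic model to several spectral variables and thresholding the resulting synthetic indicator into a burned/unburned map, can be sketched as follows. The two per-pixel features and their cluster parameters are invented stand-ins, not the paper's actual Sentinel-2 band combinations.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Invented per-pixel spectral features (e.g. differenced burn-sensitive
# indices); the paper's real band combinations are not reproduced here.
n = 500
burned = rng.normal([-0.3, -0.2], 0.1, size=(n, 2))
unburned = rng.normal([0.2, 0.4], 0.1, size=(n, 2))
X = np.vstack([burned, unburned])
y = np.r_[np.ones(n), np.zeros(n)]

# Fit the multiple-variable logistic model; predict_proba yields the
# synthetic indicator in [0, 1], thresholded into a burned-area map.
model = LogisticRegression().fit(X, y)
indicator = model.predict_proba(X)[:, 1]
burn_map = indicator > 0.5
accuracy = (burn_map == y).mean()
```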
Abstract: This research introduces a novel approach to improving and optimizing the prediction of consumer purchase behavior on e-commerce platforms. The study first presents the fundamental concepts of the logistic regression algorithm and then analyzes user data obtained from an e-commerce platform. The original data were preprocessed, and a consumer purchase prediction model was developed for the e-commerce platform using the logistic regression method. The comparison study used the classic random forest approach, further enhanced by the K-fold cross-validation method. The classification accuracy of the models was evaluated using performance indicators that included the accuracy rate, the precision rate, the recall rate, and the F1 score, and the significance of the findings was determined through visual examination. The findings suggest that employing the logistic regression algorithm to forecast customer purchase behavior on e-commerce platforms can improve the efficacy of the approach and yield more accurate predictions. This study serves as a valuable resource for improving the precision of forecasting customers' purchase behavior on e-commerce platforms and has significant practical implications for optimizing their operational efficiency.
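A minimal sketch of the comparison described above, assuming a synthetic stand-in for the platform's user data (which is not public): both classifiers are scored with 5-fold cross-validation, and the scoring metric can be swapped to reproduce the other reported indicators.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the preprocessed e-commerce user features and the
# binary purchase label.
X, y = make_classification(n_samples=1000, n_features=10, n_informative=6,
                           random_state=0)

results = {}
for name, clf in [("logistic", LogisticRegression(max_iter=1000)),
                  ("random_forest", RandomForestClassifier(random_state=0))]:
    # 5-fold cross-validation; use scoring="precision", "recall" or "f1"
    # to reproduce the study's other performance indicators.
    results[name] = cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
```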
Abstract: The Internet of Things (IoT) is a popular social network in which devices are virtually connected for communicating and sharing information. It is widely applied in business enterprises and government sectors for delivering services to customers, clients, and citizens. However, interaction succeeds only if each device trusts the others; trust is therefore essential in such a network. Because the Internet of Things has access to sensitive information, it is exposed to many threats that put data management at risk. This issue is addressed by trust management, which helps to decide on the trustworthiness of a requestor and a provider before communication and sharing. Several trust-based systems exist for different domains, using the dynamic weight method, fuzzy classification, Bayesian inference, and, in very few cases, regression analysis for IoT. The proposed algorithm is based on logistic regression, which provides a strong statistical foundation for trust prediction. To strengthen the case for regression-based trust, we compared its performance with an equivalent Bayesian analysis using the Beta distribution. The performance is studied in a simulated IoT setup with Quality of Service (QoS) and social parameters for the nodes; the proposed model performs better in terms of various metrics. An IoT connects heterogeneous devices such as tags and sensor devices for sharing information and accessing different application services. The most salient requirements of an IoT system are scalability, extendibility, compatibility, and resiliency against attack. In addition to these features, the existing works find ways to integrate direct and indirect trust so as to converge quickly and to estimate the bias caused by attacks.
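The Beta-distribution Bayesian baseline mentioned above admits a compact sketch: under a Beta-Bernoulli model, each positive or negative interaction outcome updates the Beta parameters, and the posterior mean serves as the trust score. The interaction history below is hypothetical.

```python
# Beta-Bernoulli trust update: start from an uninformative Beta(1, 1) prior
# and count positive/negative interaction outcomes (1 = satisfactory).
alpha, beta = 1.0, 1.0
history = [1, 1, 0, 1, 1, 1, 0, 1]  # hypothetical interaction outcomes
for outcome in history:
    alpha += outcome        # successes raise alpha
    beta += 1 - outcome     # failures raise beta

trust = alpha / (alpha + beta)  # posterior mean trust score in (0, 1)
```

With 6 successes and 2 failures on top of the Beta(1, 1) prior, the posterior mean is 7/10 = 0.7.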
Abstract: BACKGROUND The worldwide spread of the severe acute respiratory syndrome coronavirus 2 outbreak has caused concern regarding the mortality rate of the infection. The determinants of mortality on a global scale cannot be fully understood due to lack of information. AIM To identify key factors that may explain the variability in case lethality across countries. METHODS We identified 21 potential risk factors for the coronavirus disease 2019 (COVID-19) case fatality rate (CFR) for all countries with available data. We examined univariate relationships of each variable with the CFR and screened all independent variables to identify candidates for our final multiple model. Multiple regression analysis was used to assess the strength of the relationships. RESULTS The mean COVID-19 mortality was 1.52% ± 1.72%. There was a statistically significant inverse correlation of health expenditure and the number of computed tomography scanners per 1 million population with the CFR, and a significant direct correlation of literacy and air pollution with the CFR. The final model can predict approximately 97% of the variation in the CFR. CONCLUSION The current study identifies several new predictors of the mortality rate and could thus help decision-makers develop health policies to fight COVID-19.
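The multiple-regression step can be illustrated with ordinary least squares on synthetic country-level data. The predictor names and effect signs follow the abstract (health expenditure and CT scanners inverse, literacy and air pollution direct), but all numbers are simulated, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 120  # illustrative number of countries with available data
# Standardized stand-ins for health expenditure, CT scanners per million,
# literacy, and air pollution; effect signs follow the abstract.
X = rng.normal(size=(n, 4))
cfr = (1.5 - 0.4 * X[:, 0] - 0.3 * X[:, 1]
       + 0.2 * X[:, 2] + 0.25 * X[:, 3] + rng.normal(0, 0.1, n))

# Ordinary least squares with an intercept column via lstsq.
A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, cfr, rcond=None)
fitted = A @ beta
# Coefficient of determination: share of CFR variation the model explains.
r2 = 1 - ((cfr - fitted) ** 2).sum() / ((cfr - cfr.mean()) ** 2).sum()
```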
Abstract: Efficient water quality monitoring and ensuring the safety of drinking water by government agencies in areas where the resource is constantly depleted due to anthropogenic or natural factors cannot be overemphasized. The above statement holds for West Texas, and for Midland and Odessa precisely. Two machine learning regression algorithms (Random Forest and XGBoost) were employed to develop models for the prediction of total dissolved solids (TDS) and sodium adsorption ratio (SAR) for efficient water quality monitoring of two vital aquifers: the Edwards-Trinity (Plateau) and Ogallala aquifers. These two aquifers have contributed immensely to providing water for uses ranging from domestic and agricultural to industrial. The data were obtained from the Texas Water Development Board (TWDB). The XGBoost and Random Forest models used in this study gave accurate predictions of the observed data (TDS and SAR) for both aquifers, with R² values consistently greater than 0.83. The Random Forest model gave a better prediction of TDS and SAR concentrations, with an average R, MAE, RMSE, and MSE of 0.977, 0.015, 0.029, and 0.00, respectively. For XGBoost, an average R, MAE, RMSE, and MSE of 0.953, 0.016, 0.037, and 0.00, respectively, were achieved. The overall performance of the models was impressive. This study clearly shows that Random Forest and XGBoost are appropriate for water quality prediction and monitoring in areas of intense hydrocarbon activity like Midland, Odessa, and West Texas at large.
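A hedged sketch of the Random Forest workflow on simulated groundwater features (the TWDB records themselves are not reproduced here), reporting the same R², MAE, and RMSE metrics used in the study:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
# Hypothetical well features standing in for the TWDB groundwater records;
# the simulated target plays the role of TDS concentration.
X = rng.uniform(size=(600, 4))
tds = 3.0 * X[:, 0] + X[:, 1] + rng.normal(0, 0.05, 600)

X_tr, X_te, y_tr, y_te = train_test_split(X, tds, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
pred = model.predict(X_te)

# Held-out evaluation with the study's metrics.
r2 = r2_score(y_te, pred)
mae = mean_absolute_error(y_te, pred)
rmse = mean_squared_error(y_te, pred) ** 0.5
```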
Abstract: Possible changes in the structure and seasonal variability of the subtropical ridge may lead to changes in the rainfall variability modes over the Caribbean region. This creates additional difficulties for water resource planning; therefore, obtaining seasonal prediction models that characterize these variations in detail is a concern, especially for island states. This research proposes the construction of statistical-dynamical models based on PCA regression methods. The monthly accumulated precipitation is used as the predictand, while the predictors (6) are extracted from the ECMWF-SEAS5 ensemble-mean forecasts with a lag of one month with respect to the target month. In the construction of the models, two sequential training schemes are evaluated; only the shorter one preserves the seasonal characteristics of the predictand. The evaluation metrics used, combining cell-point and dichotomous methodologies, suggest that the predictors related to sea surface temperature do not adequately represent the seasonal variability of the predictand, whereas others, such as the temperature at 850 hPa and the outgoing longwave radiation, are represented with a good approximation regardless of the model chosen. In this sense, the models built with the nearest-neighbor methodology were the most efficient. Using the individual models with the best results, an ensemble is built that improves the individual skill of the selected member models by correcting the dynamical model's underestimation of precipitation during the wet season, although overestimation problems persist for thresholds lower than 50 mm.
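The PCA-regression construction can be sketched as principal-component regression on a simulated low-rank predictor field. The number of modes, grid size, and predictand relationship below are assumptions for illustration, not the ECMWF-SEAS5 configuration used in the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(3)
# Simulated low-rank predictor field (e.g. a T850 or OLR field flattened
# per month) standing in for the lagged ensemble-mean forecasts.
factors = rng.normal(size=(120, 3))                   # 120 months, 3 modes
X = factors @ rng.normal(size=(3, 30)) + 0.2 * rng.normal(size=(120, 30))
precip = factors[:, 0] + 0.3 * rng.normal(size=120)   # predictand

# Principal-component regression: compress the field to a few PCs, then
# regress the monthly accumulated precipitation on them.
pcr = make_pipeline(PCA(n_components=3), LinearRegression()).fit(X, precip)
r2 = pcr.score(X, precip)  # in-sample coefficient of determination
```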
Funding: National Social Science Fund Project “Research on the Operational Risks and Prevention of the Government Procurement of Community Services Project System” (Project No. 21CSH018); Research and Application of SDM Cigarette Supply Strategy Based on Consumer Data Analysis (Project No. 2023ASXM07).
Abstract: This study aims to analyze and predict the relationship between the average price per box in the cigarette market of City A and government procurement, providing a scientific basis and support for decision-making. By reviewing relevant theories and literature, qualitative prediction methods, regression prediction models, and other related theories were explored. Through the analysis of annual cigarette sales data and government procurement data in City A, a comprehensive understanding of the development of the tobacco industry and the economic trends of tobacco companies in the county was obtained. By predicting and analyzing the average price per box of cigarette sales across different years, the corresponding prediction results were derived and compared with actual sales data. The prediction results indicate that the correlation coefficient between the average price per box of cigarette sales and government procurement is 0.982, implying that government procurement accounts for 96.4% of the variation in the average price per box of cigarettes. These findings offer an in-depth exploration of the relationship between the average price per box of cigarettes in City A and government procurement, providing a scientific foundation for corporate decision-making and market operations.
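The link between the abstract's two figures is that the share of variance explained is the squared correlation coefficient (0.982² ≈ 0.964, i.e. 96.4%). A tiny numeric illustration with invented series rather than the City A data:

```python
import numpy as np

# Invented annual series: a government-procurement indicator and the
# average price per box it largely drives (not the City A data).
procurement = np.array([10.0, 12.0, 13.0, 15.0, 16.0, 18.0, 20.0])
price = 2.5 * procurement + np.array([0.3, -0.4, 0.5, -0.2, 0.4, -0.3, 0.1])

# Pearson correlation between the two series.
r = np.corrcoef(procurement, price)[0, 1]
# Squaring the correlation gives the percentage of variation explained,
# exactly as 0.982 -> ~96.4% in the abstract.
share = r ** 2 * 100
```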
Abstract: This paper focuses on ozone prediction in the atmosphere using a machine learning approach. We utilize air pollutant and meteorological variable datasets from the El Paso area to classify ozone levels as high or low. Logistic regression (LR) and artificial neural network (ANN) algorithms are trained on the datasets. The models demonstrate a remarkably high classification accuracy of 89.3% in predicting ozone levels on a given day. Evaluation metrics reveal that the ANN and LR models exhibit accuracies of 89.3% and 88.4%, respectively. Additionally, the AUC values for both models are comparable, with the ANN achieving 95.4% and the LR obtaining 95.2%. The lower the cross-entropy loss (log loss), the better the model's performance; our ANN model yields a log loss of 3.74, while the LR model shows a log loss of 6.03. The prediction time for the ANN model is approximately 0.00 seconds, whereas the LR model takes 0.02 seconds. Our odds ratio analysis indicates that features such as solar radiation, standard deviation of wind direction, outdoor temperature, dew point temperature, and PM10 contribute to high ozone levels in El Paso, Texas. Based on metrics such as accuracy, error rate, log loss, and prediction time, the ANN model proves to be faster and more suitable for ozone classification in the El Paso area.
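The odds ratio analysis mentioned above follows directly from exponentiating the logistic regression coefficients: an odds ratio above 1 means the feature raises the odds of a high-ozone day. A minimal sketch on simulated data; the feature roles and effect sizes are assumptions, not the El Paso measurements.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

rng = np.random.default_rng(1)
# Standardized stand-ins for solar radiation, std. dev. of wind direction,
# outdoor temperature, dew point, and PM10 (simulated, not measured).
X = rng.normal(size=(800, 5))
logit = 1.2 * X[:, 0] + 0.8 * X[:, 4]                  # assumed true effects
y = rng.uniform(size=800) < 1 / (1 + np.exp(-logit))   # high-ozone label

model = LogisticRegression().fit(X, y)
odds_ratios = np.exp(model.coef_[0])  # exp(coef): OR > 1 raises the odds
loss = log_loss(y, model.predict_proba(X))  # cross-entropy of fitted model
```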
Abstract: This paper presents a case study on the IPUMS NHIS database, which provides data from censuses and surveys on the health of the U.S. population, including data related to COVID-19. By addressing gaps in previous studies, we propose a machine learning approach to train predictive models for identifying and measuring factors that affect the severity of COVID-19 symptoms. Our experiments focus on four groups of factors: demographic, socio-economic, health-condition, and COVID-19-vaccination related. By analysing the sensitivity of the variables used to train the models and applying VEC (variable effect characteristics) analysis to the variable values, we identify and measure the importance of various factors that influence the severity of COVID-19 symptoms.
Abstract: Objective: To compare the predictive value of decision tree and logistic regression models for pregnancy outcomes in patients undergoing in vitro fertilization and embryo transfer (IVF-ET). Methods: A total of 350 patients who underwent IVF-ET at Heping Hospital Affiliated to Changzhi Medical College between January 2021 and October 2022 were enrolled and divided, according to pregnancy outcome, into a successful-pregnancy group (215 cases) and a failed-pregnancy group (135 cases). Clinical data were collected, and logistic regression and decision tree models for predicting pregnancy outcome in IVF-ET patients were established; decision tree models were built both with and without conditioning on the logistic regression results (decision tree 1 and decision tree 2), and the predictive performance of the models was evaluated with receiver operating characteristic (ROC) curves. Results: Of the 350 patients, 61.43% achieved pregnancy and 38.57% did not. The failed-pregnancy group had higher proportions of patients aged ≥35 years, with infertility duration ≥5 years, with ≥1 previous cycle, and with psychological or psychiatric disorders, as well as a higher serum progesterone level on the day of HCG administration, whereas the proportions of patients with ≥10 oocytes retrieved and a fertilization rate ≥75%, the endometrial thickness on the HCG day, and the number of good-quality embryos were lower than in the successful-pregnancy group (P<0.05). Multivariate logistic regression showed that age, serum progesterone level on the HCG day, number of good-quality embryos, and psychological or psychiatric disorders were all influencing factors of pregnancy outcome in IVF-ET patients (P<0.05). The decision tree model indicated age, serum progesterone level on the HCG day, and number of good-quality embryos as influencing factors. The area under the curve (AUC) of the logistic regression model was 0.832, with a predictive sensitivity, specificity, and accuracy of 87.3%, 71.4%, and 83.5%, respectively; decision tree 1 had an AUC of 0.859, with 85.1%, 76.8%, and 85.6%; decision tree 2 had an AUC of 0.820, with 83.7%, 73.2%, and 82.4%. The AUC of decision tree 1 was greater than that of decision tree 2 (P<0.05) but did not differ significantly from that of the logistic regression model (P>0.05). Conclusion: Both the logistic regression model and the decision tree model have predictive value for the pregnancy outcomes of IVF-ET patients.
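A runnable sketch of the model comparison above, scoring both a logistic regression and a decision tree by ROC AUC. The covariates are simulated stand-ins for the clinical variables (age, HCG-day serum progesterone, number of good-quality embryos, psychological status); only the sample size loosely mirrors the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Simulated stand-in for the 350 patients' covariates and the binary
# pregnancy outcome (the hospital data are not public).
X, y = make_classification(n_samples=350, n_features=6, n_informative=4,
                           weights=[0.4, 0.6], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

aucs = {}
for name, clf in [("logistic", LogisticRegression(max_iter=1000)),
                  ("decision_tree", DecisionTreeClassifier(max_depth=4,
                                                           random_state=0))]:
    proba = clf.fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
    aucs[name] = roc_auc_score(y_te, proba)  # area under the ROC curve
```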