With rapid industrialization and urbanization, air quality has become an increasingly prominent problem and an important factor affecting quality of life and sustainable social development. Studying the factors that influence air quality is therefore important for improving air quality and protecting the ecological environment. This paper examines the factors affecting air quality in Chongqing, using data from the Chongqing Air Quality Online Detection and Analysis Platform that were preprocessed and analyzed with statistical methods. With PM2.5, PM10, SO2, NO2, CO and O3 as independent variables and the air quality index (AQI) as the dependent variable, a multiple linear regression model is constructed to study how these pollutants affect air quality and to identify the main pollutants in Chongqing over the past year. On this basis, targeted recommendations are put forward, providing a scientific basis and strategies for improving air quality in Chongqing.
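The regression setup described in this abstract can be sketched as ordinary least squares solved through the normal equations. This is only an illustration: the data are synthetic and noise-free, the coefficients in `TRUE_COEFS` are hypothetical, and the helper names `solve_linear_system` and `fit_ols` are not from the paper.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations update both sides together.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Return [intercept, b1, ..., bk] from the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


POLLUTANTS = ["PM2.5", "PM10", "SO2", "NO2", "CO", "O3"]
# Hypothetical intercept followed by one slope per pollutant.
TRUE_COEFS = [5.0, 0.6, 0.2, 0.1, 0.3, 4.0, 0.15]

# Synthetic concentrations; these are NOT measurements from the platform.
rng = random.Random(0)
rows = [
    [rng.uniform(10, 100), rng.uniform(20, 150), rng.uniform(2, 20),
     rng.uniform(10, 60), rng.uniform(0.3, 1.5), rng.uniform(20, 160)]
    for _ in range(40)
]
targets = [TRUE_COEFS[0] + sum(c * v for c, v in zip(TRUE_COEFS[1:], r)) for r in rows]

coefs = fit_ols(rows, targets)
for name, value in zip(["intercept"] + POLLUTANTS, coefs):
    print(f"{name}: {value:.4f}")
```

Because the synthetic targets are an exact linear combination of the inputs, the fit recovers the hypothetical coefficients. With real monitoring data the residuals would be non-zero, and the size and significance of each slope would be what identifies the dominant pollutants.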
This paper applies graphical models to environmental pollution, using the Gaussian graphical model and the AP clustering algorithm to analyze the main factors influencing air quality in Zhejiang. The analysis focuses on how meteorological and economic factors affect the air quality index. On the economic side, it incorporates the concept of the environmental Kuznets curve (EKC) to explain more rigorously the impact of economic development on environmental pollution and its current state. Spatial interpolation is used to estimate the air quality distribution over a region, which addresses the limited number of air quality monitoring stations. The results show that air pollution is relatively serious in the Hangjiahu area, while Zhoushan has the best air quality. Air quality in Zhejiang is affected by both meteorological factors, such as atmospheric pressure and relative humidity, and economic factors, such as the share of secondary industry and population size.
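The abstract does not say which spatial interpolation method was used. As one common possibility, the sketch below assumes inverse distance weighting (IDW); the station coordinates, the AQI readings and the function name `idw_estimate` are all hypothetical.

```python
import math


def idw_estimate(stations, target, power=2.0):
    """Estimate a value at `target` from (x, y, value) stations by inverse distance weighting."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        dist = math.hypot(x - target[0], y - target[1])
        if dist == 0.0:
            # The target coincides with a station, so use its reading directly.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical stations: (x, y, AQI) in arbitrary planar units.
stations = [
    (0.0, 0.0, 60.0),
    (10.0, 0.0, 80.0),
    (0.0, 10.0, 40.0),
    (10.0, 10.0, 100.0),
]

center = idw_estimate(stations, (5.0, 5.0))
at_station = idw_estimate(stations, (0.0, 0.0))
print(center, at_station)
```

Nearby stations dominate the estimate, and a larger `power` makes the surface more local. Evaluating `idw_estimate` over a grid of target points gives a continuous air quality map from a small number of monitoring sites.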
In recent years, personalized medicine has attracted wide attention from researchers, and predicting anticancer drug sensitivity is one of its major challenges. This paper uses CCLE as the dataset for studying anticancer drug sensitivity, selecting gene expression data and drug sensitivity data across different cell lines. We design a hybrid deep learning and machine learning method, PCA Transformer (PCAT), to predict anticancer drug sensitivity. First, a PCA model extracts the important variables from the gene expression data, reducing the gene dimension from about 50,000 to 500. A Transformer neural network is then built on the reduced gene expression values to predict drug sensitivity. Model performance is evaluated by root mean square error (RMSE), and the model built with the best-performing number of latent variables is taken as the final model. To verify the performance of PCA Transformer, the Transformer model is compared with random forest (RF) and support vector regression (SVR); to rule out the influence of the dimensionality reduction method, PCA is used for dimensionality reduction throughout. The specific combinations are PCA Transformer, PCA + SVR and PCA + RF. Finally, the results are compared with those of a previous method (ISIRS) and optimized. For the 24 drugs in CCLE, the proposed method achieves an average RMSE of 0.7564; 6 drugs have an RMSE below 0.5 (L-685458, PF2341066, etc.) and 18 drugs have an RMSE below 1. The average RMSEs of the compared methods are 0.8284 (PCA + SVR), 0.8757 (PCA + RF) and 0.9258 (ISIRS), indicating that the proposed method has stronger generalization ability.
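The evaluation reported here (per-drug RMSE, its average, and the number of drugs under 0.5 and under 1) can be sketched as follows. The drug labels and RMSE values are placeholders chosen for illustration, not results from the paper, and `rmse` and `summarize` are hypothetical helper names.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def summarize(per_drug_rmse):
    """Average RMSE plus counts of drugs below the 0.5 and 1.0 thresholds."""
    values = list(per_drug_rmse.values())
    return {
        "mean": sum(values) / len(values),
        "below_0.5": sum(v < 0.5 for v in values),
        "below_1.0": sum(v < 1.0 for v in values),
    }


# Placeholder per-drug errors; a real run would compute these with rmse().
per_drug = {"drug_A": 0.4, "drug_B": 0.8, "drug_C": 1.2}
summary = summarize(per_drug)
example_error = rmse([1.0, 2.0], [2.0, 3.0])
print(summary, example_error)
```

Applying the same `summarize` call to each pipeline (PCA Transformer, PCA + SVR, PCA + RF) puts the models on a common footing, since only the regressor changes while the dimensionality reduction stays fixed.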
With the development of information technology and big data, agricultural product price forecasting plays an increasingly important role in market analysis and decision-making. This paper crawls agricultural product data from 1 January 2022 to 23 June 2024 and, based on these data, builds three models for prediction: a linear regression model, an ARIMA model and a random forest model. The results show that when time and place are used as independent variables to predict price, the random forest model outperforms the other two, effectively capturing price trends and providing decision support for market participants. Based on the random forest model, a web front end is built with the Flask framework; users only need to select the place of origin and the date on the page to see the predicted price for that day.
In financial markets, extreme events often cause large losses to investors, and building an effective extreme value model can reduce the impact of extreme risk on investors. This paper considers extreme tail risk and, based on extreme value theory and the SGEL distribution, approximates the excess distribution in the POT model by the SGEL distribution, proposing the POT-SGEL model. The POT-SGEL model is applied to estimate the extreme VaR of the daily log returns of the S&P 100 index. Comparison with the POT model shows that the POT-SGEL model can estimate extreme VaR and, to some extent, performs better than the POT model.
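For reference, the baseline POT model that the paper compares against is usually fitted with a generalized Pareto distribution (GPD) for the excesses. The sketch below shows the standard GPD-based VaR estimator only; it does not implement the SGEL variant, whose form is not given in the abstract. All parameter values are hypothetical and `pot_var` is an illustrative name.

```python
import math


def pot_var(threshold, scale, shape, n_total, n_exceed, level):
    """Standard POT/GPD VaR estimate at confidence `level` for the loss tail.

    threshold: the POT threshold u
    scale, shape: fitted GPD parameters (beta > 0, xi)
    n_total: total number of observations
    n_exceed: number of observations above the threshold
    """
    ratio = (n_total / n_exceed) * (1.0 - level)
    if shape == 0.0:
        # Limit of the GPD formula as the shape parameter tends to zero.
        return threshold - scale * math.log(ratio)
    return threshold + (scale / shape) * (ratio ** (-shape) - 1.0)


# Hypothetical fit: 50 of 1000 daily losses exceed a 2% threshold.
var_95 = pot_var(0.02, 0.01, 0.2, 1000, 50, 0.95)
var_99 = pot_var(0.02, 0.01, 0.2, 1000, 50, 0.99)
print(var_95, var_99)
```

At the 95% level the estimate equals the threshold itself, because exactly 5% of the observations exceed it in this example; higher confidence levels push the estimate further into the tail. A POT-SGEL estimator would replace the GPD quantile with the corresponding SGEL quantile.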