Quantitative research on the effect of coal mining on the soil organic carbon (SOC) pool at the regional scale benefits the scientific management of SOC pools in coal mining areas and the realization of low-carbon coal mining. Moreover, a spatial prediction model of SOC content suited to coal mining subsidence areas is a scientific problem that must be solved. Taking the Changhe River Basin of Jincheng City, Shanxi Province, China, as the study area, this paper proposes a radial basis function neural network model combined with the ordinary kriging method. The model takes topography and vegetation factors, which strongly influence soil properties in mining areas, as input parameters to predict the spatial distribution of SOC in the 0-20 and 20-40 cm soil layers of the study area. Compared with the direct kriging method, the mean error, the mean absolute error and the root mean square error between the predicted and measured SOC content are lower for the radial basis function neural network. Based on the fit between predicted and measured values, the R^2 values obtained by the radial basis function neural network are 0.81 and 0.70, respectively, higher than the values of 0.44 and 0.36 obtained by the direct kriging method. Therefore, a model that combines the artificial neural network with kriging and considers environmental factors can improve the prediction accuracy of SOC content in mining areas.
Understanding the mechanisms and risks of forest fires by building a spatial prediction model is an important means of controlling forest fires. Non-fire point data are important training data for constructing a model, and their quality significantly impacts the prediction performance of the model. However, non-fire point data obtained using existing sampling methods generally suffer from low representativeness. Therefore, this study proposes a non-fire point data sampling method based on geographical similarity to improve the quality of non-fire point samples. The method is based on the idea that the less similar the geographical environment between a sample point and a point where a fire has already occurred, the greater the confidence that it is a non-fire point sample. Yunnan Province, China, with a high frequency of forest fires, was used as the study area. We compared the prediction performance of traditional sampling methods and the proposed method using three commonly used forest fire risk prediction models: logistic regression (LR), support vector machine (SVM), and random forest (RF). The results show that the modeling and prediction accuracies of the forest fire prediction models established based on the proposed sampling method are significantly improved compared with those of the traditional sampling method. Specifically, in 2010, the modeling and prediction accuracies improved by 19.1% and 32.8%, respectively, and in 2020, they improved by 13.1% and 24.3%, respectively. Therefore, we believe that collecting non-fire point samples based on the principle of geographical similarity is an effective way to improve the quality of forest fire samples, and thus enhance the prediction of forest fire risk.
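The sampling idea in this abstract (lower environmental similarity to known fire points means higher confidence that a point is a non-fire sample) can be sketched as follows. This is only an illustration under stated assumptions: the Gaussian similarity kernel, the minimum-over-covariates rule, the `scales` parameter, the 0.8 threshold and all function names are hypothetical and are not taken from the paper.

```python
import math

def env_similarity(a, b, scales):
    """Similarity in [0, 1] between two covariate vectors.

    Assumption: a Gaussian kernel per covariate, combined with the
    minimum (the most dissimilar covariate limits the similarity).
    """
    return min(math.exp(-0.5 * ((x - y) / s) ** 2)
               for x, y, s in zip(a, b, scales))

def non_fire_confidence(candidate, fire_points, scales):
    """Confidence that a candidate is a non-fire point.

    It is one minus the similarity to the most similar known fire point.
    """
    return 1.0 - max(env_similarity(candidate, f, scales) for f in fire_points)

def sample_non_fire(candidates, fire_points, scales, threshold=0.8):
    """Keep candidates whose non-fire confidence reaches the threshold."""
    return [c for c in candidates
            if non_fire_confidence(c, fire_points, scales) >= threshold]

# Hypothetical covariates: (temperature in deg C, fuel moisture fraction)
fires = [(30.0, 0.2)]
cands = [(30.0, 0.2), (10.0, 0.9)]
picked = sample_non_fire(cands, fires, scales=(5.0, 0.2))
```

Here the first candidate shares its environment with a fire point and is rejected, while the second is environmentally distant and is kept.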
The Hue-Saturation-Intensity (HSI) color model, a psychologically appealing color model, was employed to visualize uncertainty, represented by relative prediction error, in a case study of spatial prediction of topsoil pH in peri-urban Beijing. A two-dimensional legend was designed to accompany the visualization: the vertical axis (hues) shows the predicted values and the horizontal axis (whiteness) shows the prediction error. Moreover, different ways of visualizing uncertainty are briefly reviewed in this paper. This case study indicated that visualizing both predictions and prediction uncertainty makes it possible to enhance visual exploration of data uncertainty and to compare different prediction methods or predictions of totally different variables. Whitish regions of the visualization map can be simply interpreted as unsatisfactory prediction results, where additional samples or more suitable prediction models may be needed for better prediction results.
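The two-dimensional legend described in this abstract (hue for the predicted value, whiteness for the prediction error) can be approximated with a few lines of code. The sketch below is an assumption-laden stand-in: it uses the HLS model from Python's standard `colorsys` module rather than the HSI model of the paper, and the hue range, the clamping of relative error to [0, 1] and the function name `uncertainty_color` are all hypothetical.

```python
import colorsys

def uncertainty_color(value, vmin, vmax, rel_error):
    """Return an (r, g, b) tuple in [0, 1] for one map cell.

    Hue encodes the predicted value; lightness moves toward white as the
    relative prediction error grows (assumed to be clamped to [0, 1]).
    """
    t = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    hue = t * (2.0 / 3.0)          # 0 = red at vmin, 2/3 = blue at vmax
    e = min(max(rel_error, 0.0), 1.0)
    lightness = 0.5 + 0.5 * e      # 0.5 = pure hue, 1.0 = white
    return colorsys.hls_to_rgb(hue, lightness, 1.0)

# Hypothetical pH range 4 to 9
certain = uncertainty_color(4.0, 4.0, 9.0, 0.0)     # fully saturated hue
uncertain = uncertainty_color(4.0, 4.0, 9.0, 0.5)   # same hue, paler
unknown = uncertainty_color(6.5, 4.0, 9.0, 1.0)     # white
```

Cells with large relative error fade to white regardless of hue, which matches the reading of whitish regions as unsatisfactory predictions.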
Fuzzy classification combined with spatial prediction was used to assess the state of soil pollution in the peri-urban Beijing area. Total concentrations of As, Cr, Cd, Hg, and Pb were determined in 220 topsoil samples (0-20 cm) collected using a grid design in a study area of 2 600 km^2. Heavy metal concentrations were grouped into three classes according to the optimum number of classes and fuzziness exponent using the fuzzy c-means (FCM) algorithm. Membership values were interpolated using ordinary kriging. Soils polluted by the measured heavy metals were concentrated in the northwest corner and eastern part of the study area, especially the southeastern part close to the urban zone, whereas the soils free of pollution were mainly distributed in the southwestern part. The soils with potential risk of heavy metal pollution were located in isolated spots, mainly in the northern part and southeastern corner of the study region. Compared with conventional single geostatistical kriging methods, the FCM algorithm combined with geostatistical techniques could produce a prediction with a quantitative uncertainty evaluation and higher reliability. The successful prediction of soil pollution achieved with the FCM algorithm in this study indicated that fuzzy set theory has great potential for use in other areas of soil science.
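The fuzzy c-means step can be illustrated with a minimal sketch. It is deliberately simplified and rests on assumptions: it works on one-dimensional data (the study clusters five metals jointly), uses two classes rather than three, starts from hand-picked centers, and runs a fixed number of iterations. The function names and the sample values are hypothetical; only the standard FCM membership and center update rules are used.

```python
def fcm_memberships(x, centers, m=2.0):
    """Membership of value x in each cluster for fuzziness exponent m."""
    power = 2.0 / (m - 1.0)
    d = [abs(x - c) for c in centers]
    if any(di == 0.0 for di in d):
        # x coincides with a center: give it full membership there
        hits = [1.0 if di == 0.0 else 0.0 for di in d]
        return [h / sum(hits) for h in hits]
    return [1.0 / sum((di / dj) ** power for dj in d) for di in d]

def fcm_1d(data, centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed iteration count."""
    centers = list(centers)
    for _ in range(iterations):
        u = [fcm_memberships(x, centers, m) for x in data]
        centers = [
            sum((u[k][i] ** m) * x for k, x in enumerate(data))
            / sum(u[k][i] ** m for k in range(len(data)))
            for i in range(len(centers))
        ]
    return centers, [fcm_memberships(x, centers, m) for x in data]

# Hypothetical concentrations forming a low group and a high group
data = [1.0, 1.2, 0.9, 5.0, 5.1, 4.9]
centers, memberships = fcm_1d(data, centers=[0.0, 6.0])
```

In the workflow of the abstract, membership values like these would then be interpolated across the study area with ordinary kriging.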
Geostatistics provides a coherent framework for spatial prediction and uncertainty assessment, whereby spatial dependence, as quantified by variograms, is utilized for best linear unbiased estimation of a regionalized variable at unsampled locations. Geostatistics for prediction of continuous regionalized variables is reviewed, with key methods underlying the derivation of major variants of univariate kriging described in an easy-to-follow manner. This paper will contribute to the demystification and, hence, popularization of geostatistics in geoinformatics communities.
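Best linear unbiased estimation from a variogram can be made concrete with a minimal ordinary kriging sketch. The exponential variogram model, its parameter values (`sill`, `rng`, `nugget`), the small Gaussian elimination solver and all function names are assumptions chosen for illustration; they are not taken from the paper.

```python
import math

def exp_variogram(h, sill=1.0, rng=10.0, nugget=0.0):
    """Exponential variogram model; gamma(0) is 0 by definition."""
    if h == 0.0:
        return 0.0
    return nugget + sill * (1.0 - math.exp(-3.0 * h / rng))

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ordinary_kriging(points, values, target):
    """Return (estimate, kriging variance, weights) at the target location."""
    n = len(points)
    # Kriging system: variogram matrix bordered by the unbiasedness constraint
    a = [[exp_variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [exp_variogram(math.dist(p, target)) for p in points] + [1.0]
    sol = solve(a, b)
    weights, mu = sol[:n], sol[n]   # mu is the Lagrange multiplier
    estimate = sum(w * v for w, v in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return estimate, variance, weights

pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
vals = [1.0, 2.0, 3.0, 4.0]
est, var, w = ordinary_kriging(pts, vals, (0.5, 0.5))
```

The weights sum to one (unbiasedness), and predicting at a sampled location returns the observed value with zero kriging variance.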
Identifying the ecological environment suitable for the growth of Thuja sutchuenensis and predicting other potential distribution areas are essential to protect this endangered species. After selecting 24 environmental factors that could affect the distribution of T. sutchuenensis, including climate, topography, soil and Normalized Difference Vegetation Index (NDVI), we adopted the Random Forest-MaxEnt integrated model to analyze our data. Based on the Random Forest study, the contribution of the mean temperature of the warmest quarter, mean temperature of the coldest quarter, annual mean temperature and mean temperature of the driest quarter was large. Based on MaxEnt model prediction outputs, the potential distribution map not only identified areas where T. sutchuenensis was originally recorded, such as Xuanhan County, Kai County and Chengkou County, but also identified highly suitable distribution areas where T. sutchuenensis may exist, including Wanyuan County, Sichuan Province, and the junction of Chongqing and Hubei Province. This provides a more explicit geographic range for ex situ conservation and reintroduction of T. sutchuenensis. Our results also indicate that, in addition to climate factors, topography and soil factors are important environmental factors that affect distribution. This provides a theoretical basis for subsequent laboratory construction to simulate the indoor growth of T. sutchuenensis.
Using a landslide risk evaluation model and the advantages of GIS technology in image processing and spatial analysis, a relative landslide hazard and risk evaluation system was built for the new county site of Badong. The system consists mainly of four subsystems: an information management subsystem, a hazard assessment subsystem, a vulnerability evaluation subsystem and a risk prediction subsystem. In the system, landslide hazard assessment, vulnerability evaluation and risk prediction are carried out automatically based on irregular units. Finally, the landslide hazard and risk map of the study area is compiled. Throughout the procedure, the Matter-Element Model, Artificial Neural Network and Information Model are used as assessment models. This system provides an effective way to manage landslide hazard information and predict risk for each district in the reservoir area of the Three Gorges Project. The assessment results can serve as a basis for land planning and the relocation project in Badong.
Machine learning methods are increasingly used for spatially predicting a categorical target variable when spatially exhaustive predictor variables are available within the study region. Even though these methods exhibit competitive spatial prediction performance, by construction they do not exactly honor the categorical target variable's observed values at sampling locations. On the other hand, competing geostatistical methods inherently match the categorical target variable's observed values at sampling locations perfectly. In many geoscience applications, it is often desirable to perfectly match the observed values of the categorical target variable at sampling locations, especially when the categorical target variable's measurements can be reasonably considered error-free. This paper addresses the problem of exact conditioning of machine learning methods for the spatial prediction of categorical variables. It introduces a classification random forest-based approach in which the categorical target variable is exactly conditioned to the data, thus having the exact conditioning property like competing geostatistical methods. The proposed method extends a previous work dedicated to continuous target variables by using an implicit representation of the categorical target variable. The basic idea consists of transforming the ensemble of (categorical) classification tree predictors resulting from the traditional classification random forest into an ensemble of (continuous) signed distances associated with each category of the categorical target variable. Then, an orthogonal representation of the ensemble of signed distances is created through principal component analysis, which allows the exact conditioning problem to be reformulated as a system of linear inequalities on principal component scores. Then, the sampling of new principal component scores ensuring the data's exact conditioning is performed via randomized quadratic programming. The resulting conditional signed distances are turned into an ensemble of categorical outputs, which perfectly honor the categorical target variable's observed values at sampling locations. Then, the majority vote is used to aggregate the ensemble of categorical outputs. The effectiveness of the proposed method is illustrated on a simulated dataset for which ground truth is available and showcased on a real-world dataset, including geochemical data. A comparison with geostatistical and traditional machine learning methods shows that the proposed technique can perfectly match the categorical target variable's observed values at sampling locations while maintaining competitive out-of-sample predictive performance.
Regression random forest is becoming a widely used machine learning technique for spatial prediction that shows competitive prediction performance in various geoscience fields. Like other popular machine learning methods for spatial prediction, regression random forest does not exactly honor the response variable's measured values at sampled locations. However, competing methods such as regression-kriging perfectly fit the response variable's observed values at sampled locations by construction. Exactly matching the response variable's measured values at sampled locations is often desirable in many geoscience applications. This paper presents a new approach ensuring that regression random forest perfectly matches the response variable's observed values at sampled locations. The main idea consists of using principal component analysis to create an orthogonal representation of the ensemble of regression tree predictors resulting from the traditional regression random forest. Then, the exact conditioning problem is reformulated as a Bayes-linear-Gauss problem on principal component scores. This problem has an analytical solution, making it easy to perform Monte Carlo sampling of new principal component scores and then reconstruct regression tree predictors that perfectly match the response variable's observed values at sampled locations. The average of the reconstructed regression tree predictors also precisely matches the response variable's measured values at sampled locations by construction. The proposed method's effectiveness is illustrated on the one hand using a synthetic dataset where the ground truth is available everywhere within the study region, and on the other hand using a real dataset comprising geochemical concentration data from southwest England. It is compared with regression-kriging and the traditional regression random forest. It appears that the proposed method can perfectly fit the response variable's measured values at sampled locations while achieving good out-of-sample predictive performance compared with regression-kriging and traditional regression random forest.
The spatial prediction of a continuous response variable when spatially exhaustive predictor variables are available within the region under study has become ubiquitous in many geoscience fields. The response variable is often subject to detection limits due to limitations of the measuring instrument or the sampling protocol used. Consequently, the response variable's observations are censored (left-censored, right-censored, or interval-censored). Machine learning methods dedicated to the spatial prediction of uncensored response variables cannot explicitly account for the response variable's censored observations. In such cases, they are routinely applied through ad hoc approaches such as ignoring the response variable's censored observations or replacing them with arbitrary values. Therefore, the response variable's spatial prediction may be inaccurate and sensitive to the assumptions and approximations involved in those arbitrary choices. This paper introduces a random forest-based machine learning method for spatially predicting a censored response variable, in which the response variable's censored observations are explicitly taken into account. The basic idea consists of building an ensemble of regression tree predictors by training the classical regression random forest on the subset of data containing only the response variable's uncensored observations. Then, principal component analysis applied to this ensemble allows the response variable's observations (uncensored and censored) to be translated into a system of linear equalities and inequalities. This system is solved through randomized quadratic programming, which yields an ensemble of reconstructed regression tree predictors that exactly honor the response variable's observations (uncensored and censored). The response variable's spatial prediction is then obtained by averaging this latter ensemble. The effectiveness of the proposed machine learning method is illustrated on simulated data for which ground truth is available and showcased on real-world data, including geochemical data. The results suggest that the proposed machine learning technique allows greater utilization of the response variable's censored observations than ad hoc methods.
Geoscientists are increasingly tasked with spatially predicting a target variable in the presence of auxiliary information using supervised machine learning algorithms. Typically, the target variable is observed at a few sampling locations due to the relatively time-consuming and costly process of obtaining measurements. In contrast, auxiliary variables are often exhaustively observed within the region under study through the increasing development of remote sensing platforms and sensor networks. Supervised machine learning methods do not fully leverage this large amount of auxiliary spatial data. Indeed, in these methods, the training dataset includes only labeled data locations (where both target and auxiliary variables were measured). At the same time, unlabeled data locations (where auxiliary variables were measured but not the target variable) are not considered during the model training phase. Consequently, only a limited amount of auxiliary spatial data is utilized during the model training stage. As an alternative to supervised learning, semi-supervised learning, which learns from labeled as well as unlabeled data, can be used to address this problem. However, conventional semi-supervised learning techniques do not account for the specificities of spatial data. This paper introduces a spatial semi-supervised learning framework where geostatistics and machine learning are combined to harness a large amount of unlabeled spatial data in combination with a typically smaller set of labeled spatial data. The main idea consists of leveraging the target variable's spatial autocorrelation to generate pseudo labels at unlabeled data points that are geographically close to labeled data points. This is achieved through geostatistical conditional simulation, where an ensemble of pseudo labels is generated to account for the uncertainty in the pseudo labeling process. The observed labels are augmented by this ensemble of pseudo labels to create an ensemble of pseudo training datasets. A supervised machine learning model is then trained on each pseudo training dataset, followed by an aggregation of trained models. The proposed geostatistical semi-supervised learning method is applied to synthetic and real-world spatial datasets. Its predictive performance is compared with some classical supervised and semi-supervised machine learning methods. It appears that it can effectively leverage a large amount of unlabeled spatial data to improve the target variable's spatial prediction.
Near real-time spatial prediction of earthquake-induced landslides (EQILs) can rapidly forecast where widespread landslides occur just after a violent earthquake; thus, EQIL prediction is crucial to the 72-hour 'golden window' for survivors. This work focuses on a series of earthquake events from 2008 to 2022 in the Tibetan Plateau, a well-known seismically active zone, and proposes a novel interpretable self-supervised learning (ISeL) method for the near real-time spatial prediction of EQILs. The new method introduces swap noise in the unsupervised mechanism, which can improve the generalization performance and transferability of the model, and can effectively reduce false alarms and improve accuracy through supervised fine-tuning. An interpretable module is built based on a self-attention mechanism to reveal the importance and contribution of various influencing factors to the spatial distribution of EQILs. Experimental results demonstrate that the ISeL model outperforms state-of-the-art machine learning and deep learning methods. Furthermore, the interpretable module in the ISeL method reveals the critical controlling and triggering factors. The ISeL method can also be applied in other earthquake-prone regions worldwide because of its good generalization and transferability.
Spatial prediction of any geographic phenomenon can be an intractable problem. Predicting sparse and uncertain spatial events related to many influencing factors necessitates the integration of multiple data sources. We present an innovative approach that combines data in a Discrete Global Grid System (DGGS) and uses machine learning for analysis. A DGGS provides a structured input for multiple types of spatial data, consistent over multiple scales. This data framework facilitates the training of an Artificial Neural Network (ANN) to map and predict a phenomenon. Spatial lag regression models (SLRM) are used to evaluate and rank the outputs of the ANN. In our case study, we predict hate crimes in the USA. Hate crimes get attention from mass media and the scientific community, but data on such events is sparse. We trained the ANN with data ingested in the DGGS based on a 50% sample of hate crimes as identified by the Southern Poverty Law Center (SPLC). Our spatial prediction is up to 78% accurate and verified at the state level against the independent FBI hate crime statistics with a fit of 80%. The derived risk maps are a guide to action for policy makers and law enforcement.
Green manure use in China has declined rapidly since the 1980s with the extensive use of chemical fertilizers. The deterioration of field environments and the demand for green agricultural products have drawn more attention to green manure. Human intervention and policy-oriented behaviors likely have large impacts on promoting green manure planting. However, little information is available regarding where, at what rates, and in which ways (i.e., intercropping green manure in orchards or rotating green manure in cropland) to develop green manure, and what benefits could be gained by incorporating green manure in fields at the county scale. This paper uses the Conversion of Land Use and its Effects at Small regional extent (CLUE-S) model, originally developed for simulating land use changes, to predict the spatial distribution of green manure in cropland and orchards in 2020 in Pinggu District, Beijing, China. Four types of land use for planting or not planting green manure were classified, and the future land use dynamics (mainly croplands and orchards) were considered in the prediction. Two scenarios were used to predict the spatial distribution of green manure based on data from 2011: the promotion of green manure planting in orchards (scenario 1) and the promotion of simultaneous green manure planting in orchards and croplands (scenario 2). The predictions were generally accurate based on the receiver operating characteristic (ROC) and Kappa indices, which validated the effectiveness of the CLUE-S model in the prediction. In addition, the spatial distribution of green manure was acquired. Under scenario 1, green manure was mainly located in the orchards of the middle and southern regions of Dahuashan, the western and southern regions of Wangxinzhuang, the middle region of Shandongzhuang, the eastern region of Pinggu and the middle region of Xiagezhuang. Under scenario 2, green manure planting occurred in orchards in the middle region of Wangxinzhuang, and in croplands in most regions of Daxingzhuang, southern Pinggu, northern Xiagezhuang and most of Mafang. The spatially explicit results allowed the benefits of these changes to be assessed based on different economic and ecological indicators. The economic and ecological gains of scenarios 1 and 2 were 175 691 900 and 143 000 300 CNY, respectively, which indicated that the first scenario was more beneficial for promoting the same area of green manure. These results can facilitate policies promoting green manure and guide the extensive use of green manure in local agricultural production in suitable ways.
Based on the evolution of geological dynamics and spatial chaos theory, we proposed an advanced prediction method that uses a gas desorption index of drill cuttings to predict coal and gas outbursts. We investigated and verified the prediction method with spatial series data of a gas desorption index of drill cuttings obtained from the 113112 coal roadway at the Shitai Mine. Our experimental results show that the spatial distribution of the gas desorption index of drill cuttings has some chaotic characteristics, which implies that the risk of coal and gas outbursts can be predicted by spatial chaos theory. We also found that a proper amount of sample data needs to be chosen in order to ensure the accuracy and practicality of prediction. The relative prediction error is small when the prediction pace is chosen carefully. In our experiments, the optimum number of sample points was 80 and the optimum prediction pace was 30. The corresponding advanced prediction pace basically meets the requirements of engineering applications.
Rapid and accurate acquisition of soil organic matter (SOM) information in cultivated land is important for sustainable agricultural development and carbon balance management. This study proposed a novel approach to predict SOM with high accuracy using multiyear synthetic remote sensing variables on a monthly scale. We obtained 12 monthly synthetic Sentinel-2 images covering the study area from 2016 to 2021 through the Google Earth Engine (GEE) platform, and reflectance bands and vegetation indices were extracted from these composite images. Then the random forest (RF), support vector machine (SVM) and gradient boosting regression tree (GBRT) models were tested to investigate the difference in SOM prediction accuracy under different combinations of monthly synthetic variables. Results showed that, firstly, all monthly synthetic spectral bands of Sentinel-2 showed a significant correlation with SOM (P<0.05) for the months of January, March, April, October, and November. Secondly, in terms of single-monthly composite variables, the prediction accuracy was relatively poor, with the highest R^2 value of 0.36 being observed in January. When monthly synthetic environmental variables were grouped in accordance with the four quarters of the year, the first quarter and the fourth quarter showed good performance, and any combination of three quarters was similar in estimation accuracy. The overall best performance was observed when all monthly synthetic variables were incorporated into the models. Thirdly, among the three models compared, the RF model was consistently more accurate than the SVM and GBRT models, achieving an R^2 value of 0.56. Except for band 12 in December, the importance of the remaining bands did not exhibit significant differences. This research offers a new attempt to map SOM with high accuracy and fine spatial resolution based on monthly synthetic Sentinel-2 images.
With Zengcheng City, Guangdong Province, as the object of study, 200 soil sampling points were collected for the spatial interpolation prediction of soil properties using the kriging method and the BP neural network method. After comparing the interpolation results with the measured values, the root mean square error of the prediction data was obtained. The results showed that the interpolation accuracy of the BP neural network was higher than that of the kriging method under the same circumstances, and that no smoothing effect occurred with the BP neural network method when there were few sample points. In addition, with no requirement on the distribution of sample data, the BP neural network method had stronger generalization ability than the traditional interpolation method, making it an alternative interpolation method.
A general regression neural network model combined with an iterative algorithm (GRNNI), using sparsely distributed samples and auxiliary environmental variables, was proposed to predict both the spatial distribution and variability of soil organic matter (SOM) in a bamboo forest. The auxiliary environmental variables were elevation, slope, mean annual temperature, mean annual precipitation, and normalized difference vegetation index. The prediction accuracy of this model was assessed via three accuracy indices, mean error (ME), mean absolute error (MAE), and root mean squared error (RMSE), for validation at sampling sites. Both the prediction accuracy and reliability of this model were compared to those of regression kriging (RK) and ordinary kriging (OK). The results show that the prediction accuracy of the GRNNI model was higher than that of both RK and OK. The three accuracy indices (ME, MAE, and RMSE) of the GRNNI model were lower than those of RK and OK. Relative improvements of RMSE of the GRNNI model compared with RK and OK were 13.6% and 17.5%, respectively. In addition, a more realistic spatial pattern of SOM was produced by the model because the GRNNI model was more suitable than multiple linear regression to capture the nonlinear relationship between SOM and the auxiliary environmental variables. Therefore, the GRNNI model can improve both prediction accuracy and reliability for determining the spatial distribution and variability of SOM.
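The three accuracy indices and the relative improvement of RMSE named in this abstract can be computed as in the sketch below. The sign convention for the error (predicted minus measured) and the relative improvement formula, 100 x (RMSE_reference - RMSE_model) / RMSE_reference, are common conventions assumed here; the abstract does not state them. The function names and sample values are hypothetical.

```python
import math

def accuracy_indices(measured, predicted):
    """Return (ME, MAE, RMSE) for paired measured and predicted values."""
    errors = [p - m for p, m in zip(predicted, measured)]
    n = len(errors)
    me = sum(errors) / n                             # mean error (bias)
    mae = sum(abs(e) for e in errors) / n            # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n) # root mean squared error
    return me, mae, rmse

def relative_improvement(rmse_reference, rmse_model):
    """Percentage reduction in RMSE relative to a reference method."""
    return 100.0 * (rmse_reference - rmse_model) / rmse_reference

# Hypothetical validation values
me, mae, rmse = accuracy_indices([1.0, 2.0, 3.0], [2.0, 2.0, 1.0])
gain = relative_improvement(2.0, 1.5)
```

Under this convention, a positive relative improvement means the model has a lower RMSE than the reference method.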
Conventional soil maps generally contain one or more soil types within a single soil polygon, but their geographic locations within the polygon are not specified. This restricts current applications of the maps in site-specific agricultural management and environmental modelling. We examined the utility of legacy pedon data for disaggregating soil polygons and the effectiveness of similarity-based prediction for making use of the under- or over-sampled legacy pedon data for the disaggregation. The method consisted of three steps. First, environmental similarities between the pedon sites and each location were computed based on soil formative environmental factors. Second, according to the soil types of the pedon sites, the similarities were aggregated to derive a similarity distribution for each soil type. Third, a hardening process was performed on the maps to allocate candidate soil types within the polygons. The study was conducted at the soil subgroup level in a semi-arid area situated in Manitoba, Canada. Based on 186 independent pedon sites, the evaluation of the disaggregated map of soil subgroups showed an overall accuracy of 67% and a Kappa statistic of 0.62. The map represented a better spatial pattern of soil subgroups in both detail and accuracy compared to a dominant soil subgroup map, which was commonly used in practice. Incorrect predictions mainly occurred in the agricultural plain area and in soil subgroups that are very similar in taxonomy, indicating that new environmental covariates need to be developed. We concluded that the combination of legacy pedon data with similarity-based prediction is an effective solution for soil polygon disaggregation.
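The three steps of the soil polygon disaggregation abstract above (similarity computation, aggregation by soil type, hardening) can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything here is an assumption made for illustration: the Gaussian per-covariate similarity, the minimum over covariates, the maximum as the aggregation rule, and all function and parameter names.

```python
import math

def env_similarity(site, location, scales):
    """Step 1: similarity in [0, 1] between two covariate vectors.

    Assumed form: Gaussian similarity per covariate, with the minimum over
    covariates taken as the overall similarity.
    """
    return min(math.exp(-0.5 * ((a - b) / s) ** 2)
               for a, b, s in zip(site, location, scales))

def similarity_by_type(location, pedons, scales):
    """Step 2: aggregate pedon similarities into one value per soil type.

    `pedons` is a list of (covariates, soil_type) pairs; the maximum over
    pedons of the same type is an assumed aggregation rule.
    """
    result = {}
    for covariates, soil_type in pedons:
        s = env_similarity(covariates, location, scales)
        result[soil_type] = max(result.get(soil_type, 0.0), s)
    return result

def harden(type_similarity, candidate_types):
    """Step 3: allocate the polygon's candidate type with the highest similarity."""
    return max(candidate_types, key=lambda t: type_similarity.get(t, 0.0))

# Hypothetical covariates: (elevation in m, slope in degrees).
pedons = [((100.0, 2.0), "A"), ((300.0, 10.0), "B")]
sims = similarity_by_type((110.0, 2.5), pedons, scales=(50.0, 3.0))
print(harden(sims, ["A", "B"]))
```

Restricting the final choice to `candidate_types` reflects the idea that only the soil types listed for a polygon are eligible at locations inside it.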
[Objective] The objective of this project was to evaluate and compare the spatial estimation accuracy of ordinary kriging and of regression kriging with MODIS data when predicting SOM contents from limited available data in Shimen County, Hunan Province, China. [Method] Terrain parameters (derived from a DEM) together with the normalized difference vegetation index (NDVI) and land surface temperature (LST) (both derived from MODIS data) were used as auxiliary data to predict the SOM spatial distribution. The mean error (ME) and root mean square error (RMSE) were adopted to validate the SOM prediction accuracy. The descriptive statistics and data transformation were conducted by using computer technology. [Result] Regression kriging with terrain and remotely sensed data was superior to ordinary kriging in the case of limited available samples, even though the linear relationship between environmental variables and SOM content was only moderate. The accuracy assessment showed that the regression kriging method combined with environmental factors obtained a lower mean prediction error and root mean square prediction error. The relative improvement was 6.03% compared with ordinary kriging. [Conclusion] Remotely sensed data such as MODIS images have potential as useful auxiliary variables for improving the precision and reliability of SOM prediction in hilly regions.
Funding: Supported by the National Natural Science Foundation of China (51304130), the Natural Science Foundation of Shanxi Province, China (2015021125), the Shanxi Provincial People's Government Major Decision Consulting Project (ZB20211703), the Program for the Soft Science Research of Shanxi (2018041060-2), the Program for the Philosophy and Social Sciences Research of Higher Learning Institutions of Shanxi (201803010), the Philosophy and Social Sciences Planning Project of Shanxi Province (2020YJ052), and the Basic Research Program of Shanxi Province (20210302123403).
Abstract: Quantitative research on the effect of coal mining on the soil organic carbon (SOC) pool at the regional scale is beneficial to the scientific management of SOC pools in coal mining areas and to the realization of low-carbon coal mining. Moreover, a spatial prediction model of SOC content suitable for coal mining subsidence areas is a scientific problem that must be solved. Taking the Changhe River Basin of Jincheng City, Shanxi Province, China, as the study area, this paper proposed a radial basis function neural network model combined with the ordinary kriging method. The model includes topography and vegetation factors, which have a large influence on soil properties in mining areas, as input parameters to predict the spatial distribution of SOC in the 0-20 and 20-40 cm soil layers of the study area. Compared with the direct kriging method, the mean error, the mean absolute error and the root mean square error between the predicted and measured values of SOC content are lower for the radial basis function neural network. Based on the fitting effect of the predicted and measured values, the R^(2) values obtained by the radial basis artificial neural network are 0.81 and 0.70, respectively, higher than the values of 0.44 and 0.36 obtained by the direct kriging method. Therefore, the model combining the artificial neural network and kriging, and considering environmental factors, can improve the prediction accuracy of the SOC content in mining areas.
Funding: Financially supported by the National Natural Science Foundation of China (Grant Nos. 42161065 and 41461038).
Abstract: Understanding the mechanisms and risks of forest fires by building a spatial prediction model is an important means of controlling forest fires. Non-fire point data are important training data for constructing a model, and their quality significantly impacts the prediction performance of the model. However, non-fire point data obtained using existing sampling methods generally suffer from low representativeness. Therefore, this study proposes a non-fire point data sampling method based on geographical similarity to improve the quality of non-fire point samples. The method is based on the idea that the less similar the geographical environment between a sample point and an already occurred fire point, the greater the confidence in its being a non-fire point sample. Yunnan Province, China, with a high frequency of forest fires, was used as the study area. We compared the prediction performance of traditional sampling methods and the proposed method using three commonly used forest fire risk prediction models: logistic regression (LR), support vector machine (SVM), and random forest (RF). The results show that the modeling and prediction accuracies of the forest fire prediction models established based on the proposed sampling method are significantly improved compared with those of the traditional sampling method. Specifically, in 2010, the modeling and prediction accuracies improved by 19.1% and 32.8%, respectively, and in 2020, they improved by 13.1% and 24.3%, respectively. Therefore, we believe that collecting non-fire point samples based on the principle of geographical similarity is an effective way to improve the quality of forest fire samples, and thus enhance the prediction of forest fire risk.
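The sampling idea in the forest fire abstract above (less environmental similarity to known fire points means more confidence in a non-fire label) can be sketched as below. The abstract does not state the similarity function or the threshold, so the Gaussian similarity, the use of the maximum similarity over fire points, the 0.5 threshold, and all names are assumptions for illustration only.

```python
import math

def similarity(a, b, scales):
    """Environmental similarity in [0, 1] between two covariate vectors.

    Assumed form: Gaussian similarity per covariate; the minimum over
    covariates is used as the overall similarity.
    """
    return min(math.exp(-((x - y) / s) ** 2 / 2.0)
               for x, y, s in zip(a, b, scales))

def non_fire_confidence(candidate, fire_points, scales):
    """Confidence that a candidate is a non-fire sample.

    Defined here as 1 minus the highest similarity to any fire point.
    """
    return 1.0 - max(similarity(candidate, f, scales) for f in fire_points)

def select_non_fire(candidates, fire_points, scales, threshold=0.5):
    """Keep candidates whose non-fire confidence reaches the threshold."""
    return [c for c in candidates
            if non_fire_confidence(c, fire_points, scales) >= threshold]

# Hypothetical covariates: (temperature in deg C, vegetation index).
fires = [(10.0, 0.5)]
scales = (5.0, 0.2)
print(select_non_fire([(10.0, 0.5), (30.0, 0.5)], fires, scales))
```

A candidate identical to a fire point gets confidence 0 and is rejected, while a candidate far away in covariate space is retained as a non-fire sample.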
Funding: Under the auspices of the Knowledge Innovation Frontier Project of the Institute of Soil Science, Chinese Academy of Sciences (No. ISSASIP0716), and the National Natural Science Foundation of China (Nos. 40701070, 40571065).
Abstract: The Hue-Saturation-Intensity (HSI) color model, a psychologically appealing color model, was employed to visualize uncertainty represented by relative prediction error, based on the case of spatial prediction of topsoil pH in peri-urban Beijing. A two-dimensional legend was designed to accompany the visualization: the vertical axis (hues) for visualizing the predicted values and the horizontal axis (whiteness) for visualizing the prediction error. Moreover, different ways of visualizing uncertainty were briefly reviewed in this paper. This case study indicated that visualization of both predictions and prediction uncertainty offered a possibility to enhance visual exploration of the data uncertainty and to compare different prediction methods or predictions of totally different variables. The whitish regions of the visualization map can be simply interpreted as unsatisfactory prediction results, where additional samples or more suitable prediction models may be needed for better predictions.
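The two-dimensional legend described above (hue for the predicted value, whiteness for the prediction error) can be approximated with a short sketch. Note the substitution: Python's standard `colorsys` module provides HSV rather than HSI, so HSV is used here instead, with whiteness produced by lowering saturation at full brightness. The blue-to-red hue range and the function name are assumptions, not taken from the paper.

```python
import colorsys

def uncertainty_color(value, vmin, vmax, rel_error, max_error=1.0):
    """Map a prediction and its relative error to an (r, g, b) tuple in [0, 1].

    Hue encodes the predicted value; a larger relative error lowers the
    saturation, which makes the colour whiter.
    """
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    # Position of the value in its range, clamped to [0, 1].
    t = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    # Hue runs from blue (240 degrees, low values) to red (0 degrees, high values).
    hue = (1.0 - t) * 240.0 / 360.0
    # Whiteness grows with the relative error, clamped to [0, 1].
    w = min(max(rel_error / max_error, 0.0), 1.0)
    return colorsys.hsv_to_rgb(hue, 1.0 - w, 1.0)

print(uncertainty_color(8.0, 6.0, 8.0, 0.0))  # fully saturated red
print(uncertainty_color(8.0, 6.0, 8.0, 1.0))  # white: maximal error
```

With this mapping, locations with poor predictions fade toward white regardless of the predicted value, which matches the reading of whitish regions given in the abstract.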
Funding: Project supported by the National Natural Science Foundation of China (Nos. 40571065 and 40235054) and the National Key Basic Research Support Foundation of China (No. G1999045707).
Abstract: Fuzzy classification combined with spatial prediction was used to assess the state of soil pollution in the peri-urban Beijing area. Total concentrations of As, Cr, Cd, Hg, and Pb were determined in 220 topsoil samples (0-20 cm) collected using a grid design in a study area of 2 600 km². Heavy metal concentrations were grouped into three classes according to the optimum number of classes and fuzziness exponent using the fuzzy c-means (FCM) algorithm. Membership values were interpolated using ordinary kriging. The polluted soils of the study area induced by the measured heavy metals were concentrated in the northwest corner and eastern part, especially the southeastern part close to the urban zone, whereas the soils free of pollution were mainly distributed in the southwestern part. The soils with potential risk of heavy metal pollution were located in isolated spots mainly in the northern part and southeastern corner of the study region. The FCM algorithm combined with geostatistical techniques, as compared to conventional single geostatistical kriging methods, could produce a prediction with a quantitative uncertainty evaluation and higher reliability. Successful prediction of soil pollution achieved with the FCM algorithm in this study indicated that fuzzy set theory had great potential for use in other areas of soil science.
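The membership values that the abstract above interpolates with ordinary kriging come from the fuzzy c-means membership update. A minimal sketch of that standard formula follows; the class centres, the sample, and the fuzziness exponent `m=2.0` are illustrative values, not those of the study.

```python
import math

def fcm_memberships(sample, centers, m=2.0):
    """Fuzzy c-means memberships of one sample in each class.

    Uses u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), where d_i is the
    Euclidean distance from the sample to centre i and m > 1 is the
    fuzziness exponent.
    """
    dists = [math.dist(sample, c) for c in centers]
    # A sample that coincides with a centre belongs fully to that centre.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]

# One sample and two hypothetical class centres in a 2-D attribute space.
print(fcm_memberships((0.0, 0.0), [(1.0, 0.0), (3.0, 0.0)]))
```

The memberships of a sample always sum to 1, so each class yields a continuous surface between 0 and 1 that can be interpolated like any other regionalized variable.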
Funding: The National 973 Program of China (No. 2007CB714402-5).
Abstract: Geostatistics provides a coherent framework for spatial prediction and uncertainty assessment, whereby spatial dependence, as quantified by variograms, is utilized for best linear unbiased estimation of a regionalized variable at unsampled locations. Geostatistics for prediction of continuous regionalized variables is reviewed, with key methods underlying the derivation of major variants of univariate kriging described in an easy-to-follow manner. This paper will contribute to the demystification and, hence, popularization of geostatistics in geoinformatics communities.
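As a worked illustration of the best linear unbiased estimation mentioned above, the ordinary kriging equations in their usual textbook form are given here. The notation is an assumption, since the abstract defines no symbols: Z(s_i) are the observations, s_0 is the unsampled location, gamma is the variogram, lambda_i are the kriging weights, and mu is the Lagrange multiplier.

```latex
% Estimator and unbiasedness constraint
\hat{Z}(s_0) = \sum_{i=1}^{n} \lambda_i Z(s_i), \qquad \sum_{i=1}^{n} \lambda_i = 1

% Ordinary kriging system
\sum_{j=1}^{n} \lambda_j \, \gamma(s_i - s_j) + \mu = \gamma(s_i - s_0), \qquad i = 1, \dots, n

% Kriging variance
\sigma_{\mathrm{OK}}^{2}(s_0) = \sum_{i=1}^{n} \lambda_i \, \gamma(s_i - s_0) + \mu
```

The constraint on the weights makes the estimator unbiased without knowledge of the mean, and the kriging variance supplies the uncertainty assessment that accompanies each prediction.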
Abstract: Identifying the ecological environment suitable for the growth of Thuja sutchuenensis and predicting other potential distribution areas are essential to protect this endangered species. After selecting 24 environmental factors that could affect the distribution of T. sutchuenensis, including climate, topography, soil and the Normalized Difference Vegetation Index (NDVI), we adopted the Random Forest-MaxEnt integrated model to analyze our data. Based on the Random Forest study, the contribution of the mean temperature of the warmest quarter, mean temperature of the coldest quarter, annual mean temperature and mean temperature of the driest quarter was large. Based on the MaxEnt model prediction outputs, the potential distribution map not only identified areas where T. sutchuenensis was originally recorded, such as Xuanhan County, Kai County and Chengkou County, but also identified highly suitable distribution areas where T. sutchuenensis may exist, including Wanyuan County, Sichuan Province, and the junction of Chongqing and Hubei Province. This provides a more explicit geographic range for ex situ conservation and reintroduction of T. sutchuenensis. Our results also indicate that, in addition to climate factors, topography and soil factors are important environmental factors that affect distribution. This provides a theoretical basis for subsequent laboratory construction to simulate the indoor growth of T. sutchuenensis.
Abstract: By using a landslide risk evaluation model and the advantages of GIS technology in image processing and spatial analysis, a relative landslide hazard and risk evaluation system for the new county site of Badong was built. The system consists mainly of four subsystems: an information management subsystem, a hazard assessment subsystem, a vulnerability evaluation subsystem and a risk prediction subsystem. In the system, landslide hazard assessment, vulnerability evaluation and risk prediction are carried out automatically based on irregular units. Finally, the landslide hazard and risk map of the study area is compiled. Throughout the procedure, the Matter-Element Model, Artificial Neural Network, and Information Model are used as assessment models. This system provides an effective way for landslide hazard information management and risk prediction for each district in the reservoir area of the Three Gorges Project. The results of the assessment can serve as a basis and safeguard for land planning and the emigration project in Badong.
Abstract: Machine learning methods are increasingly used for spatially predicting a categorical target variable when spatially exhaustive predictor variables are available within the study region. Even though these methods exhibit competitive spatial prediction performance, they do not, by construction, exactly honor the categorical target variable's observed values at sampling locations. On the other hand, competitor geostatistical methods by their nature perfectly match the categorical target variable's observed values at sampling locations. In many geoscience applications, it is often desirable to perfectly match the observed values of the categorical target variable at sampling locations, especially when the categorical target variable's measurements can be reasonably considered error-free. This paper addresses the problem of exact conditioning of machine learning methods for the spatial prediction of categorical variables. It introduces a classification random forest-based approach in which the categorical target variable is exactly conditioned to the data, thus having the exact conditioning property like competitor geostatistical methods. The proposed method extends a previous work dedicated to continuous target variables by using an implicit representation of the categorical target variable. The basic idea consists of transforming the ensemble of (categorical) classification tree predictors resulting from the traditional classification random forest into an ensemble of (continuous) signed distances associated with each category of the categorical target variable. Then, an orthogonal representation of the ensemble of signed distances is created through principal component analysis, thus allowing the exact conditioning problem to be reformulated as a system of linear inequalities on principal component scores. Then, the sampling of new principal component scores ensuring the data's exact conditioning is performed via randomized quadratic programming. The resulting conditional signed distances are turned into an ensemble of categorical outputs, which perfectly honor the categorical target variable's observed values at sampling locations. Then, the majority vote is used to aggregate the ensemble of categorical outputs. The effectiveness of the proposed method is illustrated on a simulated dataset for which ground truth is available and showcased on a real-world dataset, including geochemical data. A comparison with geostatistical and traditional machine learning methods shows that the proposed technique can perfectly match the categorical target variable's observed values at sampling locations while maintaining competitive out-of-sample predictive performance.
Abstract: Regression random forest is becoming a widely used machine learning technique for spatial prediction that shows competitive prediction performance in various geoscience fields. Like other popular machine learning methods for spatial prediction, regression random forest does not exactly honor the response variable's measured values at sampled locations. However, competitor methods such as regression-kriging perfectly fit the response variable's observed values at sampled locations by construction. Exactly matching the response variable's measured values at sampled locations is often desirable in many geoscience applications. This paper presents a new approach ensuring that regression random forest perfectly matches the response variable's observed values at sampled locations. The main idea consists of using principal component analysis to create an orthogonal representation of the ensemble of regression tree predictors resulting from the traditional regression random forest. Then, the exact conditioning problem is reformulated as a Bayes-linear-Gauss problem on principal component scores. This problem has an analytical solution, making it easy to perform Monte Carlo sampling of new principal component scores and then reconstruct regression tree predictors that perfectly match the response variable's observed values at sampled locations. The average of the reconstructed regression tree predictors also precisely matches the response variable's measured values at sampled locations by construction. The proposed method's effectiveness is illustrated, on the one hand, using a synthetic dataset where the ground truth is available everywhere within the study region and, on the other hand, using a real dataset comprising southwest England's geochemical concentration data. It is compared with regression-kriging and the traditional regression random forest. It appears that the proposed method can perfectly fit the response variable's measured values at sampled locations while achieving good out-of-sample predictive performance compared with regression-kriging and traditional regression random forest.
Abstract: The spatial prediction of a continuous response variable when spatially exhaustive predictor variables are available within the region under study has become ubiquitous in many geoscience fields. The response variable is often subject to detection limits due to limitations of the measuring instrument or the sampling protocol used. Consequently, the response variable's observations are censored (left-censored, right-censored, or interval-censored). Machine learning methods dedicated to the spatial prediction of uncensored response variables cannot explicitly account for the response variable's censored observations. In such cases, they are routinely applied through ad hoc approaches such as ignoring the response variable's censored observations or replacing them with arbitrary values. Therefore, the response variable's spatial prediction may be inaccurate and sensitive to the assumptions and approximations involved in those arbitrary choices. This paper introduces a random forest-based machine learning method for spatially predicting a censored response variable, in which the response variable's censored observations are explicitly taken into account. The basic idea consists of building an ensemble of regression tree predictors by training the classical regression random forest on the subset of data containing only the response variable's uncensored observations. Then, principal component analysis applied to this ensemble allows the response variable's observations (uncensored and censored) to be translated into a system of linear equalities and inequalities. This system is solved through randomized quadratic programming, which yields an ensemble of reconstructed regression tree predictors that exactly honor the response variable's observations (uncensored and censored). The response variable's spatial prediction is then obtained by averaging this latter ensemble. The effectiveness of the proposed machine learning method is illustrated on simulated data for which ground truth is available and showcased on real-world data, including geochemical data. The results suggest that the proposed machine learning technique allows greater utilization of the response variable's censored observations than ad hoc methods.
Abstract: Geoscientists are increasingly tasked with spatially predicting a target variable in the presence of auxiliary information using supervised machine learning algorithms. Typically, the target variable is observed at only a few sampling locations because obtaining measurements is relatively time-consuming and costly. In contrast, auxiliary variables are often exhaustively observed within the region under study through the increasing development of remote sensing platforms and sensor networks. Supervised machine learning methods do not fully leverage this large amount of auxiliary spatial data. Indeed, in these methods, the training dataset includes only labeled data locations (where both target and auxiliary variables were measured). At the same time, unlabeled data locations (where auxiliary variables were measured but not the target variable) are not considered during the model training phase. Consequently, only a limited amount of auxiliary spatial data is utilized during the model training stage. As an alternative to supervised learning, semi-supervised learning, which learns from labeled as well as unlabeled data, can be used to address this problem. However, conventional semi-supervised learning techniques do not account for the specificities of spatial data. This paper introduces a spatial semi-supervised learning framework where geostatistics and machine learning are combined to harness a large amount of unlabeled spatial data in combination with a typically smaller set of labeled spatial data. The main idea consists of leveraging the target variable's spatial autocorrelation to generate pseudo labels at unlabeled data points that are geographically close to labeled data points. This is achieved through geostatistical conditional simulation, where an ensemble of pseudo labels is generated to account for the uncertainty in the pseudo labeling process. The observed labels are augmented by this ensemble of pseudo labels to create an ensemble of pseudo training datasets. A supervised machine learning model is then trained on each pseudo training dataset, followed by an aggregation of the trained models. The proposed geostatistical semi-supervised learning method is applied to synthetic and real-world spatial datasets. Its predictive performance is compared with some classical supervised and semi-supervised machine learning methods. It appears that it can effectively leverage a large amount of unlabeled spatial data to improve the target variable's spatial prediction.
Funding: Funded by the National Natural Science Foundation of China (U21A2013, 71874165), the Opening Fund of the Key Laboratory of Geological Survey and Evaluation of the Ministry of Education (Grant Nos. GLAB2020ZR02, GLAB2022ZR02), the State Key Laboratory of Biogeology and Environmental Geology (Grant No. GBL12107), the Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan) (CUG2642022006), and the Hunan Provincial Natural Science Foundation of China (2021JC0009).
Abstract: Near real-time spatial prediction of earthquake-induced landslides (EQILs) can rapidly forecast the locations of widespread landslides just after a violent earthquake; thus, EQIL prediction is crucial to the 72-hour 'golden window' for survivors. This work focuses on a series of earthquake events from 2008 to 2022 occurring in the Tibetan Plateau, a well-known seismically active zone, and proposes a novel interpretable self-supervised learning (ISeL) method for the near real-time spatial prediction of EQILs. The new method introduces swap noise in the unsupervised mechanism, which can improve the generalization performance and transferability of the model, and can effectively reduce false alarms and improve accuracy through supervised fine-tuning. An interpretable module is built based on a self-attention mechanism to reveal the importance and contribution of various influencing factors to the EQIL spatial distribution. Experimental results demonstrate that the ISeL model is superior to state-of-the-art machine learning and deep learning methods. Furthermore, according to the interpretable module in the ISeL method, the critical controlling and triggering factors are revealed. The ISeL method can also be applied in other earthquake-prone regions worldwide because of its good generalization and transferability.
Abstract: Spatial prediction of any geographic phenomenon can be an intractable problem. Predicting sparse and uncertain spatial events related to many influencing factors necessitates the integration of multiple data sources. We present an innovative approach that combines data in a Discrete Global Grid System (DGGS) and uses machine learning for analysis. A DGGS provides a structured input for multiple types of spatial data, consistent over multiple scales. This data framework facilitates the training of an Artificial Neural Network (ANN) to map and predict a phenomenon. Spatial lag regression models (SLRM) are used to evaluate and rank the outputs of the ANN. In our case study, we predict hate crimes in the USA. Hate crimes get attention from mass media and the scientific community, but data on such events is sparse. We trained the ANN with data ingested in the DGGS based on a 50% sample of hate crimes as identified by the Southern Poverty Law Center (SPLC). Our spatial prediction is up to 78% accurate and verified at the state level against the independent FBI hate crime statistics with a fit of 80%. The derived risk maps are a guide to action for policy makers and law enforcement.
Funding: Supported by the Special Fund for Agroscientific Research in the Public Interest, China (20110300501-01), and the Special Fund for First-Class University (4572-18101510).
Abstract: Green manure use in China has declined rapidly since the 1980s with the extensive use of chemical fertilizers. The deterioration of field environments and the demand for green agricultural products have drawn more attention to green manure. Human intervention and policy-oriented behaviors likely have large impacts on promoting green manure planting. However, little information is available at the county scale regarding where, at what rates, and in which ways (i.e., intercropping green manure in orchards or rotating green manure in cropland) to develop green manure, and what benefits could be gained by incorporating green manure in fields. This paper applies the conversion of land use and its effects at small region extent (CLUE-S) model, which was originally developed for the simulation of land use changes, to predict the spatial distribution of green manure in cropland and orchards in 2020 in Pinggu District, Beijing, China. Four types of land use for planting or not planting green manure were classified, and the future land use dynamics (mainly croplands and orchards) were considered in the prediction. Two scenarios were used to predict the spatial distribution of green manure based on data from 2011: the promotion of green manure planting in orchards (scenario 1) and the promotion of simultaneous green manure planting in orchards and croplands (scenario 2). The predictions were generally accurate based on the receiver operating characteristic (ROC) and Kappa indices, which validated the effectiveness of the CLUE-S model in the prediction. In addition, the spatial distribution of green manure was obtained, which indicated that under scenario 1 green manure was mainly located in the orchards of the middle and southern regions of Dahuashan, the western and southern regions of Wangxinzhuang, the middle region of Shandongzhuang, the eastern region of Pinggu and the middle region of Xiagezhuang. Green manure planting under scenario 2 occurred in orchards in the middle region of Wangxinzhuang, and in croplands in most regions of Daxingzhuang, southern Pinggu, northern Xiagezhuang and most of Mafang. The spatially explicit results allowed for the assessment of the benefits of these changes based on different economic and ecological indicators. The economic and ecological gains of scenarios 1 and 2 were 175 691 900 and 143 000 300 CNY, respectively, which indicated that the first scenario was more beneficial for promoting the same area of green manure. These results can inform policies for promoting green manure and guide the extensive use of green manure in local agricultural production in suitable ways.
Funding: Financial support for this work was provided by the National Basic Research Program of China (No. 2011CB201204), the National Youth Science Foundation Program (No. 50904068), the Heilongjiang Science & Technology Scientific Research Foundation Program for the Eighth Introduction of Talent (No. 06-26), and the National Engineering Research Center for Coal Gas Control.
Abstract: Based on the evolution of geological dynamics and spatial chaos theory, we proposed an advanced prediction method for the gas desorption index of drill cuttings to predict coal and gas outbursts. We investigated and verified the prediction method using a spatial data series of the gas desorption index of drill cuttings obtained from the 113112 coal roadway at the Shitai Mine. Our experimental results show that the spatial distribution of the gas desorption index of drill cuttings has some chaotic characteristics, which implies that the risk of coal and gas outbursts can be predicted by spatial chaos theory. We also found that a proper amount of sample data needs to be chosen in order to ensure the accuracy and practicality of prediction. The relative prediction error is small when the prediction pace is chosen carefully. In our experiments, the optimum number of sample points was 80 and the optimum prediction pace was 30. The corresponding advanced prediction pace basically meets the requirements of engineering applications.
Funding: National Key Research and Development Program of China (2022YFB3903302 and 2021YFC1809104).
Abstract: Rapid and accurate acquisition of soil organic matter (SOM) information in cultivated land is important for sustainable agricultural development and carbon balance management. This study proposed a novel approach to predict SOM with high accuracy using multiyear synthetic remote sensing variables on a monthly scale. We obtained 12 monthly synthetic Sentinel-2 images covering the study area from 2016 to 2021 through the Google Earth Engine (GEE) platform, and reflectance bands and vegetation indices were extracted from these composite images. Then the random forest (RF), support vector machine (SVM) and gradient boosting regression tree (GBRT) models were tested to investigate the difference in SOM prediction accuracy under different combinations of monthly synthetic variables. The results showed that, firstly, all monthly synthetic spectral bands of Sentinel-2 showed a significant correlation with SOM (P<0.05) for the months of January, March, April, October, and November. Secondly, in terms of single-month composite variables, the prediction accuracy was relatively poor, with the highest R^(2) value of 0.36 being observed in January. When monthly synthetic environmental variables were grouped in accordance with the four quarters of the year, the first quarter and the fourth quarter showed good performance, and any combination of three quarters was similar in estimation accuracy. The overall best performance was observed when all monthly synthetic variables were incorporated into the models. Thirdly, among the three models compared, the RF model was consistently more accurate than the SVM and GBRT models, achieving an R^(2) value of 0.56. Except for band 12 in December, the importance of the remaining bands did not exhibit significant differences. This research offers a new attempt to map SOM with high accuracy and fine spatial resolution based on monthly synthetic Sentinel-2 images.
Funding: Supported by the National Natural Science Foundation of China (40971125) and the Science and Technology Planning Project of Guangdong Province, China (2012A020200006, 2012B091100220).
Abstract: With Zengcheng City, Guangdong Province, as the study area, 200 soil sampling points were collected for the spatial interpolation prediction of soil properties using the Kriging method and the BP neural network method. The interpolation results were compared with the measured values to obtain the root mean square error of the predictions. The results showed that, under the same circumstances, the interpolation accuracy of the BP neural network was higher than that of the Kriging method, and that the BP neural network method did not produce a smoothing effect when there were few sample points. In addition, because it places no requirement on the distribution of the sample data, the BP neural network method had stronger generalization ability than the traditional interpolation method, making it an alternative interpolation method.
Funding: This article is supported by the National Key Research and Development Projects of P.R. China (No. 2018YFD0600100).
Abstract: A general regression neural network model combined with an iterative algorithm (GRNNI), using sparsely distributed samples and auxiliary environmental variables, was proposed to predict both the spatial distribution and the variability of soil organic matter (SOM) in a bamboo forest. The auxiliary environmental variables were elevation, slope, mean annual temperature, mean annual precipitation, and normalized difference vegetation index. The prediction accuracy of this model was assessed at validation sampling sites via three accuracy indices: mean error (ME), mean absolute error (MAE), and root mean squared error (RMSE). Both the prediction accuracy and the reliability of this model were compared with those of regression kriging (RK) and ordinary kriging (OK). The results show that the prediction accuracy of the GRNNI model was higher than that of both RK and OK: its three accuracy indices (ME, MAE, and RMSE) were lower than those of RK and OK, and its RMSE improved on RK and OK by 13.6% and 17.5%, respectively. In addition, the model produced a more realistic spatial pattern of SOM, because the GRNNI model was more suitable than multiple linear regression for capturing the nonlinear relationship between SOM and the auxiliary environmental variables. Therefore, the GRNNI model can improve both prediction accuracy and reliability for determining the spatial distribution and variability of SOM.
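The three accuracy indices named in this abstract, and the relative improvement in RMSE, can be illustrated with a minimal sketch. The function names and the sample values are hypothetical, the error is assumed to be taken as predicted minus measured, and the relative improvement is assumed to follow the common definition (reference RMSE minus model RMSE, divided by reference RMSE); the abstract does not state which conventions the authors used.

```python
import math


def accuracy_indices(measured, predicted):
    """Return (ME, MAE, RMSE) for paired measured and predicted values.

    Assumption: error = predicted - measured (sign convention not given
    in the abstract).
    """
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - m for p, m in zip(predicted, measured)]
    n = len(errors)
    me = sum(errors) / n                              # mean error (bias)
    mae = sum(abs(e) for e in errors) / n             # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)  # root mean squared error
    return me, mae, rmse


def relative_improvement(rmse_reference, rmse_model):
    """Percentage reduction in RMSE of a model relative to a reference method."""
    return (rmse_reference - rmse_model) / rmse_reference * 100.0


# Hypothetical validation data, for illustration only.
measured = [1.0, 2.0, 3.0]
predicted = [2.0, 2.0, 1.0]
me, mae, rmse = accuracy_indices(measured, predicted)
```

An ME close to zero indicates little systematic bias, while MAE and RMSE summarize the size of the errors, with RMSE weighting large errors more heavily.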
Funding: Supported by the National Natural Science Foundation of China (41130530, 91325301, 41431177, 41571212, 41401237); the Project of "One-Three-Five" Strategic Planning & Frontier Sciences of the Institute of Soil Science, Chinese Academy of Sciences (ISSASIP1622); the Government Interest Related Program between Canadian Space Agency and Agriculture and Agri-Food Canada (13MOA01002); and the Natural Science Research Program of Jiangsu Province (14KJA170001).
Abstract: Conventional soil maps generally contain one or more soil types within a single soil polygon, but the geographic locations of those types within the polygon are not specified. This restricts current applications of the maps in site-specific agricultural management and environmental modelling. We examined the utility of legacy pedon data for disaggregating soil polygons, and the effectiveness of similarity-based prediction for making use of under- or over-sampled legacy pedon data in the disaggregation. The method consisted of three steps. First, environmental similarities between the pedon sites and each location were computed based on soil formative environmental factors. Second, according to the soil types of the pedon sites, the similarities were aggregated to derive a similarity distribution for each soil type. Third, a hardening process was performed on the maps to allocate candidate soil types within the polygons. The study was conducted at the soil subgroup level in a semi-arid area in Manitoba, Canada. Based on 186 independent pedon sites, the evaluation of the disaggregated map of soil subgroups showed an overall accuracy of 67% and a Kappa statistic of 0.62. The map represented the spatial pattern of soil subgroups better, in both detail and accuracy, than the dominant soil subgroup map commonly used in practice. Incorrect predictions occurred mainly in the agricultural plain area and among soil subgroups that are very similar in taxonomy, indicating that new environmental covariates need to be developed. We concluded that combining legacy pedon data with similarity-based prediction is an effective solution for soil polygon disaggregation.
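The three steps described in this abstract can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the per-covariate Gaussian similarity, the minimum (limiting-factor) combination across covariates, the maximum aggregation per soil type, and all names, covariates, and values are hypothetical choices that the abstract does not specify.

```python
import math


def similarity(env_a, env_b, widths):
    """Step 1: environmental similarity in [0, 1] between two locations.

    Assumption: Gaussian kernel per covariate, combined by the minimum.
    """
    return min(
        math.exp(-0.5 * ((a - b) / w) ** 2)
        for a, b, w in zip(env_a, env_b, widths)
    )


def similarity_by_type(location_env, pedons, widths):
    """Step 2: aggregate similarities per soil type (assumed: take the maximum).

    pedons is a list of (soil_type, covariate_vector) pairs.
    """
    result = {}
    for soil_type, env in pedons:
        s = similarity(location_env, env, widths)
        result[soil_type] = max(result.get(soil_type, 0.0), s)
    return result


def harden(type_similarity, candidates):
    """Step 3: allocate the polygon's candidate type with the highest similarity.

    Returns None when no candidate type is represented by any pedon.
    """
    scored = {t: type_similarity[t] for t in candidates if t in type_similarity}
    if not scored:
        return None
    return max(scored, key=scored.get)


# Hypothetical pedons: (soil type, (elevation in m, slope in degrees)).
pedons = [("A", (100.0, 5.0)), ("B", (200.0, 15.0)), ("A", (120.0, 6.0))]
widths = (50.0, 5.0)  # hypothetical kernel widths per covariate
sims = similarity_by_type((110.0, 5.5), pedons, widths)
allocated = harden(sims, ["A", "B"])  # candidate types listed for the polygon
```

Restricting the hardening step to the candidate types recorded for a polygon keeps the disaggregated map consistent with the legend of the original soil map.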
Funding: Supported by the National Natural Science Foundation of China (41071204) and the Hunan Provincial Innovation Foundation for Postgraduate (CX2011B310).
Abstract: [Objective] The objective of this project was to evaluate and compare the spatial estimation accuracy of ordinary kriging and regression kriging with MODIS data when predicting SOM contents from limited available data in Shimen County, Hunan Province, China. [Method] Terrain parameters (derived from a DEM), together with the normalized difference vegetation index (NDVI) and land surface temperature (LST) (both derived from MODIS data), were used as auxiliary data to predict the spatial distribution of SOM. The mean error (ME) and root mean square error (RMSE) were adopted to validate the SOM prediction accuracy. Descriptive statistics and data transformations were carried out computationally. [Result] Regression kriging with terrain and remotely sensed data was superior to ordinary kriging in the case of limited available samples, even though the linear relationship between the environmental variables and SOM content was only moderate. The accuracy assessment showed that the regression kriging method combined with environmental factors obtained a lower mean prediction error and a lower root mean square prediction error. The relative improvement over ordinary kriging was 6.03%. [Conclusion] Remotely sensed data such as MODIS images have potential as useful auxiliary variables for improving the precision and reliability of SOM prediction in hilly regions.