Abstract: Choosing an appropriate statistical test is crucial, but deciding which test to use can be challenging. Different tests suit different types of data and research questions, and a wide variety of tests is available. Using an incorrect test can produce invalid results and misleading conclusions, whereas a well-chosen test leads to more accurate results. To avoid these problems, it is essential to understand the nature of the data, the research question, and the assumptions of each candidate test before selecting one. This paper provides a step-by-step approach to selecting the right statistical test for a study, explaining when each test is appropriate and giving relevant examples. It also provides a comprehensive overview of the assumptions of each test and what to do if these assumptions are violated.
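The selection logic this abstract describes (data type, number of groups, pairing, distributional assumptions) can be illustrated with a small lookup. This is a minimal sketch of widely taught textbook defaults, not the paper's own procedure; the function name `suggest_test` and its parameters are hypothetical, and a real analysis would also check sample size, expected cell counts, and variance assumptions.

```python
def suggest_test(outcome, groups, paired=False, normal=True):
    """Return a commonly recommended test name for a simple comparison design.

    outcome: "continuous" or "categorical" (hypothetical labels for this sketch)
    groups:  number of groups or conditions being compared (>= 2)
    paired:  True if the same subjects are measured in every condition
    normal:  True if a normality assumption is judged reasonable
             (ignored for categorical outcomes)
    """
    if groups < 2:
        raise ValueError("need at least two groups or conditions")
    if outcome == "categorical":
        if paired:
            # Both of these paired tests apply to binary outcomes.
            return "McNemar test" if groups == 2 else "Cochran's Q test"
        return "chi-square test of independence"
    if outcome != "continuous":
        raise ValueError("outcome must be 'continuous' or 'categorical'")
    if groups == 2:
        if paired:
            return "paired t test" if normal else "Wilcoxon signed-rank test"
        return "independent-samples t test" if normal else "Mann-Whitney U test"
    if paired:
        return "repeated-measures ANOVA" if normal else "Friedman test"
    return "one-way ANOVA" if normal else "Kruskal-Wallis test"


if __name__ == "__main__":
    # Two independent groups, skewed continuous outcome.
    print(suggest_test("continuous", 2, paired=False, normal=False))
```

The lookup only names a starting point; checking the chosen test's assumptions remains a separate step.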
Abstract: We propose a new nonparametric test, based on the rank differences between paired samples, for testing the equality of the marginal distributions of a bivariate distribution. We also consider a modification of this test based on the test proposed by Baumgartner, Weiß, and Schindler (1998). An extensive numerical power comparison of various parametric and nonparametric tests was conducted under a wide range of bivariate distributions for small sample sizes. The two new nonparametric tests have power comparable to the paired t test for data simulated from bivariate normal distributions, and are generally more powerful than the paired t test and other commonly used nonparametric tests under several important bivariate distributions.
Abstract: The objectives of this paper are to demonstrate the algorithms employed by three statistical software programs (R, Real Statistics using Excel, and SPSS) for calculating the exact two-tailed probability of the Wald-Wolfowitz one-sample runs test for randomness, to present a novel approach for computing this probability, and to compare the four procedures by generating samples of 10 and 11 data points, varying the parameters n0 (number of zeros) and n1 (number of ones) as well as the number of runs. Fifty-nine samples are created to replicate the behavior of the distribution of the number of runs with 10 and 11 data points. The exact two-tailed probabilities from the four procedures were compared using Friedman's test. Given the significant difference in central tendency, post-hoc comparisons were conducted using Conover's test with the Benjamini-Yekutieli correction. It is concluded that the procedures of Real Statistics using Excel and R exhibit some inadequacies in the calculation of the exact two-tailed probability, whereas the new proposal and the SPSS procedure are deemed more suitable. The proposed robust algorithm has a more transparent rationale than the SPSS one, although it is somewhat more conservative. We recommend its implementation for this test and its application to others, such as the binomial and sign tests.
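To make the quantity under discussion concrete, the following sketch computes the exact null distribution of the number of runs for given n0 and n1, and then one possible two-tailed probability. The two-tailed rule used here (summing all outcomes that are no more probable than the observed one) is an assumption chosen for illustration; it is only one of several conventions and is not claimed to match R, Real Statistics, SPSS, or the authors' proposal. The function names are hypothetical.

```python
from math import comb


def runs_pmf(n0, n1):
    """Exact null distribution of the number of runs R in a binary sequence
    with n0 zeros and n1 ones (both >= 1), all arrangements equally likely."""
    if n0 < 1 or n1 < 1:
        raise ValueError("n0 and n1 must both be at least 1")
    total = comb(n0 + n1, n0)
    pmf = {}
    for r in range(2, n0 + n1 + 1):
        k = r // 2
        if r % 2 == 0:
            # r = 2k: k runs of zeros and k runs of ones, starting with either symbol
            ways = 2 * comb(n0 - 1, k - 1) * comb(n1 - 1, k - 1)
        else:
            # r = 2k + 1: one symbol forms k + 1 runs, the other forms k runs
            ways = (comb(n0 - 1, k) * comb(n1 - 1, k - 1)
                    + comb(n0 - 1, k - 1) * comb(n1 - 1, k))
        if ways:
            pmf[r] = ways / total
    return pmf


def two_tailed_p(n0, n1, r_obs):
    """Two-tailed probability under the 'no more probable than observed' rule
    (an assumed convention for this sketch)."""
    pmf = runs_pmf(n0, n1)
    if r_obs not in pmf:
        raise ValueError("observed number of runs is impossible for these n0, n1")
    p_obs = pmf[r_obs]
    # Small relative tolerance so equal probabilities are not lost to rounding.
    return min(1.0, sum(p for p in pmf.values() if p <= p_obs * (1 + 1e-12)))


if __name__ == "__main__":
    print(two_tailed_p(5, 5, 2))
```

Because the run-count distribution is discrete and often asymmetric when n0 differs from n1, different two-tailed rules can give different probabilities for the same sample, which is the kind of discrepancy the abstract compares across programs.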
Funding: National Natural Science Foundation of China (No. 12271261); Postgraduate Research and Practice Innovation Program of Jiangsu Province, China (Grant No. SJCX230368).
Abstract: Normality testing is a fundamental hypothesis test in the statistical analysis of key biological indicators of diabetes. If the normality assumption is violated, test results may deviate from the true value, leading to incorrect inferences and conclusions and ultimately affecting the validity and accuracy of statistical inference. Considering this, the study designs a unified analysis scheme for different data types based on parametric and non-parametric statistical test methods. The data were grouped according to sample type and divided into discrete and continuous data. To account for differences among subgroups, the conventional chi-squared test was used for discrete data. The normal distribution is the basis of many statistical methods; if the data do not follow a normal distribution, many of these methods will fail or produce incorrect results. Therefore, before data analysis and modeling, the data were divided into normal and non-normal groups through normality testing. For normally distributed data, parametric statistical methods were used to judge the differences between groups. For non-normal data, non-parametric tests were employed to improve the accuracy of the analysis. Statistically significant indicators were retained according to the P-value of the statistical test or the corresponding statistics. These indicators were then combined with relevant medical background to further explore the etiology leading to the occurrence or transformation of diabetes status.
Funding: Supported by the Natural Science Foundation of China under Grant Nos. 11271014 and 11971045.
Abstract: This paper focuses on the goodness-of-fit test of the functional linear composite quantile regression model. A nonparametric test is proposed using the orthogonality of the residual and its conditional expectation under the null model. The proposed test statistic has an asymptotic standard normal distribution under the null hypothesis and tends to infinity in probability under the alternative hypothesis, which implies the consistency of the test. Furthermore, it is proved that the test statistic converges to a normal distribution with nonzero mean under a local alternative hypothesis. Extensive simulations are reported, and the results show that the proposed test has proper size and is sensitive to the considered model discrepancies. The proposed methods are also applied to two real datasets.
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (2009AA011204).
Abstract: Nonparametric time-of-arrival (TOA) estimators for impulse radio ultra-wideband (IR-UWB) signals are proposed. Nonparametric detection is useful in situations where detailed information about the statistics of the noise is unavailable or inaccurate. The TOA estimators are obtained from conditional statistical tests that assume only a symmetric distribution for the noise probability density function. The nonparametric estimators are attractive choices for low-resolution IR-UWB digital receivers, which can be implemented with fast comparators or high-sampling-rate, low-resolution analog-to-digital converters (ADCs) in place of high-sampling-rate, high-resolution ADCs that may not be available in practice. Simulation results demonstrate that the nonparametric TOA estimators provide more effective and robust performance than typical energy detection (ED) based estimators.
Abstract: Healthcare decisions are based on scientific evidence obtained from medical studies, in which data are gathered and analyzed to obtain the best results. Biostatistics is a powerful tool for analyzing data, but healthcare professionals lack knowledge in this field. This lack of knowledge can show up as choosing the wrong statistical test for a given situation or applying a statistical test without checking its assumptions, leading to inaccurate results and misleading conclusions. This "narrative review" aims to bring biostatistics closer to healthcare professionals by answering four questions: How should the distribution of data be described? How should the normality of data be assessed? How should data be transformed? How should one choose between nonparametric and parametric tests? Our hope is that the reader will be able to choose the right test for the right situation in order to obtain the most accurate results.
Funding: Supported by the National Natural Science Foundation of China (U22A20554), the Natural Science Foundation of Fujian Province (2023J01285), the Public Welfare Scientific Institutions of Fujian Province (2022R1002005), and the Scientific Project from Fujian Provincial Department of Science and Technology (2022Y0007).
Abstract: Detecting changes in surface air temperature in mid- and low-altitude mountainous regions is essential for a comprehensive understanding of how the warming trend varies with altitude. We use daily surface air temperature data from 64 meteorological stations in the Wuyi Mountains and adjacent regions to analyze the spatio-temporal patterns of temperature change. The results show that the Wuyi Mountains experienced significant warming from 1961 to 2018. The warming trend is 0.20 °C/decade for the mean temperature, 0.17 °C/decade for the maximum temperature, and 0.26 °C/decade for the minimum temperature. In 1961-1990, more than 63% of the stations showed a decreasing trend in annual mean temperature, mainly because the maximum temperature decreased during this period. However, in 1971-2000, 1981-2010, and 1991-2018, the maximum, minimum, and mean temperatures all increased. The mean temperature increased fastest in the southeastern coastal plains, the maximum temperature increased fastest in the northwestern mountainous region, and the minimum temperature increased faster in the southeastern coastal and northwestern mountainous regions than in the central area. Meanwhile, this study suggests that elevation does not affect warming in the Wuyi Mountains. These results are beneficial for understanding climate change in humid subtropical mid- and low-altitude mountains.
Abstract: In large-sample studies where distributions may be skewed and not readily transformed to symmetry, it may be of greater interest to compare distributions in terms of percentiles rather than means. For example, it may be more informative to compare two or more populations with respect to their within-population distributions by testing the hypothesis that their respective 10th, 50th, and 90th percentiles are equal. As a generalization of the median test, the proposed test statistic is asymptotically distributed as chi-square, with degrees of freedom that depend on the number of percentiles tested and the constraints of the null hypothesis. Results from simulation studies are used to validate the nominal 0.05 significance level under the null hypothesis and to show asymptotic power properties that are suitable for testing equality of percentile profiles against selected profile discrepancies for a variety of underlying distributions. A practical example illustrates the comparison of the percentile profiles of four body mass index distributions.
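One way to picture a median test generalized to several percentiles is to cut the pooled sample at the chosen pooled percentiles, cross-tabulate group by bin, and compute a Pearson chi-square statistic. The sketch below does exactly that; it is an assumed construction for illustration, not necessarily the statistic proposed in the paper, and the helper names and the nearest-rank percentile rule are hypothetical choices.

```python
import math
from bisect import bisect_left


def nearest_rank_percentile(sorted_values, p):
    """Nearest-rank percentile (0 < p < 100) of an already sorted list."""
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def percentile_profile_chi2(groups, percents=(10, 50, 90)):
    """Pearson chi-square statistic and degrees of freedom for a
    group-by-bin table whose bins are cut at pooled-sample percentiles."""
    pooled = sorted(x for g in groups for x in g)
    cuts = [nearest_rank_percentile(pooled, p) for p in sorted(percents)]
    n_bins = len(cuts) + 1
    # Bin j holds values with cuts[j-1] < x <= cuts[j]; the last bin is x > cuts[-1].
    table = []
    for g in groups:
        row = [0] * n_bins
        for x in g:
            row[bisect_left(cuts, x)] += 1
        table.append(row)
    n = len(pooled)
    col_totals = [sum(row[j] for row in table) for j in range(n_bins)]
    stat = 0.0
    for row in table:
        row_total = sum(row)
        for j, observed in enumerate(row):
            expected = row_total * col_totals[j] / n
            if expected > 0:  # skip bins left empty by ties among the cut points
                stat += (observed - expected) ** 2 / expected
    nonempty_bins = sum(1 for c in col_totals if c > 0)
    df = (len(groups) - 1) * (nonempty_bins - 1)
    return stat, df


if __name__ == "__main__":
    a = list(range(1, 11))
    b = list(range(11, 21))
    print(percentile_profile_chi2([a, b], percents=(50,)))
```

With a single cut at the 50th percentile this reduces to a median-test style 2 x 2 comparison; the statistic would then be referred to a chi-square distribution with the returned degrees of freedom.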
Abstract: Zero-inflated distributions are common in statistical problems where there is interest in testing the homogeneity of two or more independent groups. Often, the underlying distribution with an inflated number of zero-valued observations is asymmetric, and its functional form may not be known or easily characterized. In this case, comparing the groups in terms of their respective percentiles may be appropriate, as these estimates are nonparametric and more robust to outliers and other irregularities. The median test is often used to compare distributions with similar but asymmetric shapes, but it may be uninformative when there are excess zeros or dissimilar shapes. For zero-inflated distributions, it is useful to compare the distributions with respect to their proportion of zeros, coupled with a comparison of percentile profiles for the observed non-zero values. A simple chi-square test for simultaneous testing of these two components is proposed, applicable to both continuous and discrete data. Results of simulation studies are reported to summarize empirical power under several scenarios. We give recommendations for the minimum sample size needed to achieve suitable test performance in specific examples.
Abstract: This paper, a comparison of two-sample tests, is motivated by the fact that many methods can be used to test for a significant difference between two independent samples, and each may lead to significantly different results; a wrong choice of test statistic could therefore lead to an erroneous conclusion. To prevent misleading conclusions, selected methods for testing for a significant difference between independent samples need to be investigated properly. The paper examines the efficiency and sensitivity of four test statistics to ascertain which performs better. In terms of relative efficiency, the median test is more efficient than the Modified Median (MMED) test for both symmetric and asymmetric distributions. In terms of power, the median test is more sensitive than the MMED test, having higher power irrespective of sample size for both symmetric and asymmetric distributions. In terms of relative efficiency for asymmetric distributions, the Modified Mann-Whitney U (MMWU) test is more efficient than the Mann-Whitney U test; for symmetric distributions, the Mann-Whitney U test is more efficient than the MMWU test at a sample size of 5, but for the other sample sizes considered the MMWU test is better. In terms of power, for both symmetric and asymmetric distributions, the Mann-Whitney U test is more sensitive than the MMWU test because it has higher power.
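For readers unfamiliar with the baseline being compared, the classical Mann-Whitney U statistic can be computed directly from pairwise comparisons. This sketch shows only the standard statistic; the abstract does not define the modified variants (MMWU, MMED), so they are not reproduced here. The function name is hypothetical.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples.

    U_x counts the pairs (x_i, y_j) with x_i > y_j, with ties counted as 0.5;
    U_y is the complementary count, so U_x + U_y = len(x) * len(y).
    """
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    return u_x, len(x) * len(y) - u_x


if __name__ == "__main__":
    print(mann_whitney_u([1, 4], [2, 3]))
```

The pairwise loop is quadratic in the sample sizes, which is acceptable for the small samples (such as n = 5) discussed in the abstract; rank-based formulas give the same value more efficiently for larger samples.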
Abstract: Testing the equality of percentiles (quantiles) between populations is an effective method for robust, nonparametric comparison, especially when the distributions are asymmetric or irregularly shaped. Unlike global nonparametric tests for homogeneity such as the Kolmogorov-Smirnov test, testing the equality of a set of percentiles (i.e., a percentile profile) yields an estimate of the location and extent of the differences between the populations along the entire domain. The Wald test using bootstrap estimates of the variance of the order statistics provides a unified method for hypothesis testing of functions of the population percentiles. Simulation studies are conducted to show the performance of the method under various scenarios and to give suggestions on its use. Several examples are given to illustrate some useful applications to real data.
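The idea of a Wald test with bootstrap variance estimates can be sketched for the simplest case: one percentile compared between two independent groups. This is a simplified illustration under stated assumptions (a single percentile, a nearest-rank percentile rule, independent resampling within each group, and a chi-square reference distribution with 1 degree of freedom); testing a full percentile profile as in the abstract would need a covariance matrix rather than a single variance. All function names and defaults are hypothetical.

```python
import math
import random


def percentile(values, p):
    """Nearest-rank percentile (0 < p < 100) of a sample."""
    s = sorted(values)
    rank = max(1, math.ceil(p * len(s) / 100))
    return s[rank - 1]


def bootstrap_variance(values, p, n_boot, rng):
    """Bootstrap estimate of the variance of the sample p-th percentile."""
    n = len(values)
    estimates = [percentile(rng.choices(values, k=n), p) for _ in range(n_boot)]
    mean = sum(estimates) / n_boot
    return sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)


def wald_percentile_test(x, y, p=50, n_boot=2000, seed=0):
    """Wald statistic and p-value for H0: the p-th percentiles of x and y are equal."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diff = percentile(x, p) - percentile(y, p)
    var = bootstrap_variance(x, p, n_boot, rng) + bootstrap_variance(y, p, n_boot, rng)
    if var == 0:
        raise ValueError("bootstrap variance is zero; the statistic is undefined")
    stat = diff ** 2 / var
    # Survival function of chi-square with 1 df: P(X > w) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


if __name__ == "__main__":
    base = list(range(1, 31))
    shifted = [v + 100 for v in base]
    print(wald_percentile_test(base, shifted, p=50))
```

Repeating the test at several percentiles shows where along the distribution the groups differ, which is the advantage over a single global test that the abstract describes, though separate tests would need a multiplicity adjustment that this sketch omits.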