dask.array.stats.ttest_1samp

dask.array.stats.ttest_1samp(a, popmean, axis=0, nan_policy='propagate')

Calculate the T-test for the mean of ONE group of scores.

This docstring was copied from scipy.stats.ttest_1samp.

Some inconsistencies with the Dask version may exist.

This is a test for the null hypothesis that the expected value (mean) of a sample of independent observations a is equal to the given population mean, popmean.

Parameters
a : array_like

Sample observation.

popmean : float or array_like

Expected value in null hypothesis. If array_like, then it must have the same shape as a excluding the axis dimension.

axis : int or None, optional

Axis along which to compute test; default is 0. If None, compute over the whole array a.

nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional

Defines how to handle input that contains nan. The following options are available (default is ‘propagate’):

  • ‘propagate’: returns nan

  • ‘raise’: throws an error

  • ‘omit’: performs the calculations ignoring nan values

alternative : {‘two-sided’, ‘less’, ‘greater’}, optional (Not supported in Dask)

Defines the alternative hypothesis. The following options are available (default is ‘two-sided’):

  • ‘two-sided’: the mean of the underlying distribution of the sample is different from the given population mean (popmean)

  • ‘less’: the mean of the underlying distribution of the sample is less than the given population mean (popmean)

  • ‘greater’: the mean of the underlying distribution of the sample is greater than the given population mean (popmean)

New in SciPy version 1.6.0.

Returns
statistic : float or array

t-statistic.

pvalue : float or array

Two-sided p-value.

Notes

The statistic is calculated as (np.mean(a) - popmean)/se, where se is the standard error. Therefore, the statistic will be positive when the sample mean is greater than the population mean and negative when the sample mean is less than the population mean.
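The formula above can be sketched by hand with NumPy. This is a minimal illustration rather than the library's implementation: the helper name `t_statistic` is hypothetical, and it assumes that se is the sample standard deviation (with `ddof=1`) divided by the square root of the sample size, which the text does not spell out.

```python
import numpy as np


def t_statistic(a, popmean):
    """Hypothetical helper: one-sample t statistic, (mean(a) - popmean) / se."""
    a = np.asarray(a, dtype=float)
    n = a.size
    # Assumed definition of the standard error of the mean:
    # sample standard deviation (ddof=1) divided by sqrt(n).
    se = a.std(ddof=1) / np.sqrt(n)
    return (a.mean() - popmean) / se


sample = [0.6, 0.7, 0.8, 0.9, 1.0]  # sample mean is 0.8

t_above = t_statistic(sample, popmean=0.5)  # sample mean above popmean
t_below = t_statistic(sample, popmean=1.0)  # sample mean below popmean

print(t_above > 0, t_below < 0)
```

The same sample gives a positive statistic against a population mean of 0.5 and a negative one against 1.0, matching the sign convention described in the note.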

Examples

Suppose we wish to test the null hypothesis that the mean of a population is equal to 0.5. We choose a confidence level of 99%; that is, we will reject the null hypothesis in favor of the alternative if the p-value is less than 0.01.

When testing random variates from the standard uniform distribution, which has a mean of 0.5, we expect the data to be consistent with the null hypothesis most of the time.

>>> import numpy as np
>>> from scipy import stats
>>> rng = np.random.default_rng()
>>> rvs = stats.uniform.rvs(size=50, random_state=rng)
>>> stats.ttest_1samp(rvs, popmean=0.5)
Ttest_1sampResult(statistic=2.456308468440, pvalue=0.017628209047638)

As expected, the p-value of 0.017 is not below our threshold of 0.01, so we cannot reject the null hypothesis.

When testing data from the standard normal distribution, which has a mean of 0, we would expect the null hypothesis to be rejected.

>>> rvs = stats.norm.rvs(size=50, random_state=rng)  
>>> stats.ttest_1samp(rvs, popmean=0.5)  
Ttest_1sampResult(statistic=-7.433605518875, pvalue=1.416760157221e-09)

Indeed, the p-value is lower than our threshold of 0.01, so we reject the null hypothesis in favor of the default “two-sided” alternative: the mean of the population is not equal to 0.5.

However, suppose we were to test the null hypothesis against the one-sided alternative that the mean of the population is greater than 0.5. Since the mean of the standard normal is less than 0.5, we would not expect the null hypothesis to be rejected.

>>> stats.ttest_1samp(rvs, popmean=0.5, alternative='greater')  
Ttest_1sampResult(statistic=-7.433605518875, pvalue=0.99999999929)

Unsurprisingly, with a p-value greater than our threshold, we would not reject the null hypothesis.
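Because the parameter list marks `alternative` as not supported in Dask, one possible workaround is to derive a one-sided p-value from the two-sided result. The sketch below relies on the symmetry of the t distribution; the helper name `one_sided_pvalue` is hypothetical and is not part of SciPy or Dask.

```python
def one_sided_pvalue(statistic, two_sided_pvalue, alternative):
    """Hypothetical helper: convert a two-sided t-test p-value to one-sided.

    Uses the symmetry of the t distribution: the two-sided p-value is
    twice the tail area beyond |statistic|.
    """
    half = two_sided_pvalue / 2.0
    if alternative == "greater":
        # Upper-tail area: small only when the statistic is positive.
        return half if statistic > 0 else 1.0 - half
    if alternative == "less":
        # Lower-tail area: small only when the statistic is negative.
        return half if statistic < 0 else 1.0 - half
    raise ValueError("alternative must be 'greater' or 'less'")


# Values taken from the two-sided example output shown earlier.
p_greater = one_sided_pvalue(-7.433605518875, 1.416760157221e-09, "greater")
p_less = one_sided_pvalue(-7.433605518875, 1.416760157221e-09, "less")
print(p_greater, p_less)
```

Applied to the two-sided result for the normal sample, the 'greater' value is close to 1, in line with the one-sided output above, while the 'less' value is half the two-sided p-value.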

Note that when working with a confidence level of 99%, a true null hypothesis will be rejected approximately 1% of the time.

>>> rvs = stats.uniform.rvs(size=(100, 50), random_state=rng)  
>>> res = stats.ttest_1samp(rvs, popmean=0.5, axis=1)  
>>> np.sum(res.pvalue < 0.01)  
1

Indeed, even though all 100 samples above were drawn from the standard uniform distribution, which does have a population mean of 0.5, we would mistakenly reject the null hypothesis for one of them.