The OpenD Programming Language

dstats.tests

Hypothesis testing beyond simple CDFs. All functions work with input ranges with elements implicitly convertible to double unless otherwise noted.

Author: David Simcha

Members

Enums

Alt
enum Alt

Alternative hypotheses. Exact meaning varies with test used.

Dependency
enum Dependency

For falseDiscoveryRate.

Expected
enum Expected

For chiSquareFit and gTestFit, is expected value range counts or proportions?

Functions

binomialTest
double binomialTest(ulong k, ulong n, double p)

Two-sided binomial test for whether P(success) == p. The one-sided alternatives are covered by dstats.distrib.binomialCDF and binomialCDFR. k is the number of successes observed, n is the number of trials, p is the probability of success under the null.
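
The page does not say how the two-sided P-value is assembled. One common convention (used, for example, by R's binom.test) sums the probabilities of every outcome that is no more likely than the observed k. The sketch below follows that convention using only Phobos; the names logBinomPMF and twoSidedBinomSketch are hypothetical, and dstats may compute its result differently.

```d
import std.math : exp, log;
import std.mathspecial : logGamma;

/// Log of P(X = k) for X ~ Binomial(n, p), with 0 < p < 1.
double logBinomPMF(ulong k, ulong n, double p)
{
    return logGamma(n + 1.0) - logGamma(k + 1.0) - logGamma(n - k + 1.0)
        + k * log(p) + (n - k) * log(1 - p);
}

/// Two-sided P-value: total probability of outcomes no more likely than k.
/// Illustrative only; not taken from the dstats source.
double twoSidedBinomSketch(ulong k, ulong n, double p)
{
    assert(p > 0 && p < 1 && k <= n);
    immutable observed = logBinomPMF(k, n, p);
    double total = 0;
    foreach (i; 0 .. n + 1)
    {
        immutable lp = logBinomPMF(i, n, p);
        // Small tolerance so that symmetric outcomes count as ties.
        if (lp <= observed + 1e-7)
            total += exp(lp);
    }
    return total > 1 ? 1 : total;
}

unittest
{
    // 0 successes in 10 fair trials: only k = 0 and k = 10 qualify, so P = 2/1024.
    immutable p = twoSidedBinomSketch(0, 10, 0.5);
    assert(p > 0.00195 && p < 0.00196);
}
```

The unittest block runs when the file is compiled with `dmd -unittest -main`.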

chiSquareContingency
TestRes chiSquareContingency(T inputData)

Performs a Pearson's chi-square test on a contingency table of arbitrary dimensions. When the chi-square test is mentioned, this is usually the one being referred to. Takes a set of finite forward ranges, one for each column in the contingency table. These can be expressed either as a tuple of ranges or a range of ranges. Returns a P-value for the alternative hypothesis that frequencies in each row of the contingency table depend on the column against the null that they don't.

chiSquareFit
TestRes chiSquareFit(T observed, U expected, Expected countProp)

Performs a one-way Pearson's chi-square goodness of fit test between a range of observed and a range of expected values. This is a useful statistical test for testing whether a set of observations fits a discrete distribution.
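
As a rough illustration of what a goodness of fit test of this kind computes, the following Phobos-only sketch rescales the expected values to the observed total (so proportions or counts both work), accumulates the Pearson statistic, and converts it to a P-value with k - 1 degrees of freedom. The function names are hypothetical and the code is not the dstats implementation.

```d
import std.algorithm : sum;
import std.mathspecial : gammaIncompleteCompl;

/// Pearson statistic: sum of (observed - expected)^2 / expected.
double chiSquareStatSketch(const(double)[] observed, const(double)[] expected)
{
    assert(observed.length == expected.length);
    immutable n = observed.sum;
    immutable eTotal = expected.sum;
    double stat = 0;
    foreach (i, o; observed)
    {
        // Rescale so the expected values sum to the observed total.
        immutable e = n * expected[i] / eTotal;
        stat += (o - e) * (o - e) / e;
    }
    return stat;
}

/// Upper tail of the chi-square distribution with df degrees of freedom.
double chiSquareUpperTail(double stat, double df)
{
    return gammaIncompleteCompl(df / 2, stat / 2);
}

unittest
{
    // Expected count is 20 per cell, so the statistic is (100 + 0 + 100) / 20 = 10.
    immutable stat = chiSquareStatSketch([10.0, 20, 30], [1.0, 1, 1]);
    assert(stat > 9.999 && stat < 10.001);
    // With 2 degrees of freedom the upper tail is exp(-stat / 2).
    immutable p = chiSquareUpperTail(stat, 2);
    assert(p > 0.00673 && p < 0.00675);
}
```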

chiSquareObs
TestRes chiSquareObs(T x, U y)

Given two vectors of observations of jointly distributed variables x, y, tests the null hypothesis that values in x are independent of the corresponding values in y. This is done using Pearson's Chi-Square Test. For a similar test that assumes the data has already been tabulated into a contingency table, see chiSquareContingency.

correlatedAnova
TestRes correlatedAnova(T dataIn)

Performs a correlated sample (within-subjects) ANOVA. This is a generalization of the paired T-test to 3 or more treatments. This function accepts data as either a tuple of ranges (1 for each treatment, such that a given index represents the same subject in each range) or similarly as a range of ranges.

dAgostinoK
TestRes dAgostinoK(T range)

A test for normality of the distribution of a range of values. Based on the assumption that normally distributed values will have a sample skewness and sample excess kurtosis very close to zero.

fTest
TestRes fTest(T data)

The F-test is a one-way ANOVA extension of the T-test to >2 groups. It's useful when you have 3 or more groups with equal variance and want to test whether their means are equal. Data can be input as either a tuple or a range. This may contain any combination of ranges of numeric types, MeanSD structs and Summary structs.

falseDiscoveryRate
float[] falseDiscoveryRate(T pVals, Dependency dep)

Computes the false discovery rate statistic given a list of p-values, according to Benjamini and Hochberg (1995) (independent) or Benjamini and Yekutieli (2001) (dependent). The Dependency parameter controls whether hypotheses are assumed to be independent, or whether the more conservative assumption that they are correlated must be made.
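
For the independent case, the Benjamini and Hochberg adjustment can be sketched as a step-up pass over the sorted P-values. The code below is a minimal Phobos-only version under that assumption; bhSketch is a hypothetical name, it returns double[] rather than float[], and it does not cover the Benjamini and Yekutieli variant.

```d
import std.algorithm : min, sort;
import std.array : array;
import std.range : iota;

/// Benjamini-Hochberg adjusted values, returned in the input order.
/// Illustrative only; not taken from the dstats source.
double[] bhSketch(const(double)[] pVals)
{
    immutable m = pVals.length;
    // Indices of pVals ordered from smallest to largest P-value.
    auto order = iota(pVals.length).array;
    order.sort!((a, b) => pVals[a] < pVals[b])();

    auto q = new double[m];
    double running = 1;
    // Walk from the largest P-value down, enforcing monotonicity.
    foreach_reverse (rank, idx; order)
    {
        running = min(running, pVals[idx] * m / (rank + 1));
        q[idx] = running;
    }
    return q;
}

unittest
{
    auto q = bhSketch([0.01, 0.04, 0.03]);
    // Sorted: 0.01, 0.03, 0.04 -> raw 0.03, 0.045, 0.04 -> monotone 0.03, 0.04, 0.04.
    assert(q[0] > 0.0299 && q[0] < 0.0301);
    assert(q[1] > 0.0399 && q[1] < 0.0401);
    assert(q[2] > 0.0399 && q[2] < 0.0401);
}
```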

fisherExact
TestRes fisherExact(T[2][2] contingencyTable, Alt alt)

Fisher's Exact test for difference in odds between rows/columns in a 2x2 contingency table. Specifically, this function tests the odds ratio, which is defined, for a contingency table c, as (c[0][0] * c[1][1]) / (c[1][0] * c[0][1]). Alternatives are Alt.less, meaning true odds ratio < 1, Alt.greater, meaning true odds ratio > 1, and Alt.twoSided, meaning true odds ratio != 1.
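
The odds ratio formula above translates directly into code. This small sketch only evaluates the sample odds ratio for a 2x2 table and does not perform the exact test; oddsRatioSketch is a hypothetical helper, not part of dstats.

```d
/// Sample odds ratio (c[0][0] * c[1][1]) / (c[1][0] * c[0][1]).
double oddsRatioSketch(const uint[2][2] c)
{
    // Cast before multiplying so large counts cannot overflow uint.
    return (cast(double) c[0][0] * c[1][1]) / (cast(double) c[1][0] * c[0][1]);
}

unittest
{
    uint[2][2] table = [[3, 1], [1, 3]];
    // (3 * 3) / (1 * 1) = 9, the direction that Alt.greater looks for.
    assert(oddsRatioSketch(table) == 9);
}
```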

fisherExact
TestRes fisherExact(T[][] contingencyTable, Alt alt)

Convenience function. Converts a dynamic array to a static one, then calls the static-array overload.

fishersMethod
TestRes fishersMethod(R pVals)

Fisher's method of meta-analyzing a set of P-values to determine whether there are more significant results than would be expected by chance. Based on a chi-square statistic for the sum of the logs of the P-values.
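
In its textbook form, the statistic is -2 times the sum of the natural logs of the k P-values, referred to a chi-square distribution with 2k degrees of freedom. The sketch below assumes that form and uses only Phobos; CombinedP and fishersMethodSketch are hypothetical names, and the result type differs from the TestRes that dstats returns.

```d
import std.math : log;
import std.mathspecial : gammaIncompleteCompl;

/// Hypothetical result type for this sketch.
struct CombinedP
{
    double testStat;
    double p;
}

/// Combine independent P-values with Fisher's method.
CombinedP fishersMethodSketch(const(double)[] pVals)
{
    double stat = 0;
    foreach (p; pVals)
        stat += -2 * log(p);
    // Chi-square upper tail with 2k degrees of freedom is Q(k, stat / 2).
    immutable double k = pVals.length;
    immutable double tail = gammaIncompleteCompl(k, stat / 2);
    return CombinedP(stat, tail);
}

unittest
{
    // With a single P-value the method returns that P-value unchanged.
    auto one = fishersMethodSketch([0.05]);
    assert(one.p > 0.0499 && one.p < 0.0501);
    // Two results at 0.05 combine to stronger evidence than either alone.
    auto two = fishersMethodSketch([0.05, 0.05]);
    assert(two.p < 0.05);
}
```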

friedmanTest
TestRes friedmanTest(T dataIn)

The Friedman test is a non-parametric within-subject ANOVA. It's useful when parametric assumptions cannot be made. Usage is identical to correlatedAnova().

gTestContingency
GTestRes gTestContingency(T inputData)

The G or likelihood ratio chi-square test for contingency tables. Roughly the same as Pearson's chi-square test (chiSquareContingency), but may be more accurate in certain situations and less accurate in others.

gTestFit
TestRes gTestFit(T observed, U expected, Expected countProp)

The G or likelihood ratio chi-square test for goodness of fit. Roughly the same as Pearson's chi-square test (chiSquareFit), but may be more accurate in certain situations and less accurate in others. However, it is still based on asymptotic distributions, and is not exact. Usage is identical to chiSquareFit.
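
For comparison, the textbook forms of the two statistics are given here, with O_i the observed count and E_i the expected count in category i of k (symbols introduced for this note, not taken from dstats). When no parameters are estimated from the data, both are referred to a chi-square distribution with k - 1 degrees of freedom.

```latex
X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad
G = 2 \sum_{i=1}^{k} O_i \ln\frac{O_i}{E_i}
```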

gTestObs
GTestRes gTestObs(T x, U y)

Given two ranges of observations of jointly distributed variables x, y, tests the null hypothesis that values in x are independent of the corresponding values in y. This is done using the Likelihood Ratio G test. Usage is similar to chiSquareObs. For an otherwise identical test that assumes the data has already been tabulated into a contingency table, see gTestContingency.

hochberg
float[] hochberg(T pVals)

Uses the Hochberg procedure to control the familywise error rate assuming that hypothesis tests are independent. This is more powerful than Holm-Bonferroni correction, but requires the independence assumption.

holmBonferroni
float[] holmBonferroni(T pVals)

Uses the Holm-Bonferroni method to adjust a set of P-values in a way that controls the familywise error rate (the probability of making at least one Type I error). This is a less conservative version of Bonferroni correction that still controls the familywise error rate without assuming independence between tests. Therefore, there aren't too many good reasons to use regular Bonferroni correction instead.
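
The step-down adjustment can be sketched as follows: sort the m P-values, multiply the one at 0-based rank r by (m - r), cap at 1, and take a running maximum so that adjusted values never decrease. This is a minimal Phobos-only illustration; holmSketch is a hypothetical name, and it returns double[] rather than the float[] that holmBonferroni returns.

```d
import std.algorithm : max, min, sort;
import std.array : array;
import std.range : iota;

/// Holm-Bonferroni adjusted P-values, returned in the input order.
/// Illustrative only; not taken from the dstats source.
double[] holmSketch(const(double)[] pVals)
{
    immutable m = pVals.length;
    // Indices of pVals ordered from smallest to largest P-value.
    auto order = iota(pVals.length).array;
    order.sort!((a, b) => pVals[a] < pVals[b])();

    auto adjusted = new double[m];
    double running = 0;
    foreach (rank, idx; order)
    {
        immutable scaled = min(1.0, (m - rank) * pVals[idx]);
        running = max(running, scaled);
        adjusted[idx] = running;
    }
    return adjusted;
}

unittest
{
    auto adj = holmSketch([0.01, 0.04, 0.03]);
    // Sorted: 0.01, 0.03, 0.04 -> scaled 0.03, 0.06, 0.04 -> monotone 0.03, 0.06, 0.06.
    assert(adj[0] > 0.0299 && adj[0] < 0.0301);
    assert(adj[1] > 0.0599 && adj[1] < 0.0601);
    assert(adj[2] > 0.0599 && adj[2] < 0.0601);
}
```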

kendallCorTest
TestRes kendallCorTest(T range1, U range2, Alt alt, uint exactThresh)

Tests the hypothesis that the Kendall Tau-b between two ranges is different from 0. Alternatives are Alt.less (kendallCor(range1, range2) < 0), Alt.greater (kendallCor(range1, range2) > 0) and Alt.twoSided (kendallCor(range1, range2) != 0).

kruskalWallis
TestRes kruskalWallis(T dataIn)

The Kruskal-Wallis rank sum test. Tests the null hypothesis that data in each group is not stochastically ordered with respect to data in each of the other groups. This is a one-way non-parametric ANOVA and can be thought of as either a generalization of the Wilcoxon rank sum test to >2 groups or a non-parametric equivalent to the F-test. Data can be input as either a tuple of ranges (one range for each group) or a range of ranges (one element for each group).

ksTest
TestRes ksTest(T F, U Fprime)

Performs a Kolmogorov-Smirnov (K-S) 2-sample test. The K-S test is a non-parametric test for a difference between two empirical distributions or between an empirical distribution and a reference distribution.
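
The two-sample K-S statistic is the largest absolute gap between the two empirical CDFs. The sketch below computes only that statistic (no P-value) with Phobos; ksStatSketch is a hypothetical name and the code is not the dstats implementation. It copies both inputs before sorting, which is the kind of allocation an in-place variant would avoid.

```d
import std.algorithm : max, sort;
import std.math : abs;

/// Maximum absolute difference between two empirical CDFs.
double ksStatSketch(const(double)[] a, const(double)[] b)
{
    assert(a.length > 0 && b.length > 0);
    // Work on sorted copies so the caller's data is left untouched.
    auto x = a.dup;
    auto y = b.dup;
    x.sort();
    y.sort();

    size_t i = 0, j = 0;
    double d = 0;
    while (i < x.length && j < y.length)
    {
        // Advance past every value equal to the current smallest one,
        // so tied observations are handled together.
        immutable v = x[i] <= y[j] ? x[i] : y[j];
        while (i < x.length && x[i] <= v) ++i;
        while (j < y.length && y[j] <= v) ++j;
        d = max(d, abs(cast(double) i / x.length - cast(double) j / y.length));
    }
    return d;
}

unittest
{
    // Completely separated samples give the maximum possible statistic.
    assert(ksStatSketch([1.0, 2, 3], [4.0, 5, 6]) == 1);
    // Identical samples give zero.
    assert(ksStatSketch([1.0, 2, 3], [3.0, 2, 1]) == 0);
}
```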

ksTest
TestRes ksTest(T Femp, Func F)

One-sample Kolmogorov-Smirnov test against a reference distribution. Takes a callable object for the CDF of the reference distribution.

ksTestDestructive
TestRes ksTestDestructive(T F, U Fprime)
TestRes ksTestDestructive(T Femp, Func F)

Same as ksTest, except sorts in place, avoiding memory allocations.

levenesTest
TestRes levenesTest(T data)

Tests the null hypothesis that the variances of all groups are equal against the alternative that heteroscedasticity exists. data must be either a tuple of ranges or a range of ranges. central is an alias for the measure of central tendency to be used. This can be any function that maps a forward range of numeric types to a numeric type. The commonly used ones are median (default) and mean (less robust). Trimmed mean is sometimes useful, but is currently not implemented in dstats.summary.

multinomialTest
double multinomialTest(U countsIn, F proportions)

The exact multinomial goodness of fit test for whether a set of counts fits a hypothetical distribution. counts is an input range of counts. proportions is an input range of expected proportions. These are normalized automatically, so they can sum to any value.

pairedTTest
ConfInt pairedTTest(T before, U after, double testMean, Alt alt, double confLevel)

Paired T test. Tests the hypothesis that the mean difference between corresponding elements of before and after is testMean. Alternatives are Alt.less, meaning that the true mean difference (before[i] - after[i]) is less than testMean, Alt.greater, meaning the true mean difference is greater than testMean, and Alt.twoSided, meaning the true mean difference is not equal to testMean.
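
The test statistic behind a paired T test is a one-sample t statistic on the element-wise differences before[i] - after[i]. The following Phobos-only sketch computes that statistic under the standard textbook definition; pairedTStatSketch is a hypothetical name, and it returns only the statistic, not the ConfInt that pairedTTest returns.

```d
import std.math : sqrt;

/// t statistic for the mean of before[i] - after[i] against testMean.
/// Illustrative only; not taken from the dstats source.
double pairedTStatSketch(const(double)[] before, const(double)[] after,
                         double testMean = 0)
{
    assert(before.length == after.length && before.length > 1);
    immutable n = before.length;

    double total = 0;
    foreach (i; 0 .. n)
        total += before[i] - after[i];
    immutable meanDiff = total / n;

    double ss = 0;
    foreach (i; 0 .. n)
    {
        immutable dev = before[i] - after[i] - meanDiff;
        ss += dev * dev;
    }
    // Sample standard deviation of the differences (n - 1 denominator).
    immutable sd = sqrt(ss / (n - 1));
    return (meanDiff - testMean) / (sd / sqrt(cast(double) n));
}

unittest
{
    // Differences are 2, 3, 4: mean 3, standard deviation 1, so t = 3 * sqrt(3).
    immutable t = pairedTStatSketch([3.0, 5, 7], [1.0, 2, 3]);
    assert(t > 5.19 && t < 5.20);
}
```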

pairedTTest
ConfInt pairedTTest(T diffSummary, double testMean, Alt alt, double confLevel)

Compute a paired T test directly from summary statistics of the differences between corresponding samples.

pearsonCorTest
ConfInt pearsonCorTest(T range1, U range2, Alt alt, double confLevel)

Tests the hypothesis that the Pearson correlation between two ranges is different from 0. Alternatives are Alt.less (pearsonCor(range1, range2) < 0), Alt.greater (pearsonCor(range1, range2) > 0) and Alt.twoSided (pearsonCor(range1, range2) != 0).

pearsonCorTest
ConfInt pearsonCorTest(double cor, double N, Alt alt, double confLevel)

Same as the range-based overload, but uses a pre-computed correlation coefficient and sample size instead of computing them.

runsTest
double runsTest(T obs, Alt alt)

Wald-Wolfowitz or runs test for randomness of the distribution of elements for which positive() evaluates to true. For example, given a sequence of coin flips [H,H,H,H,H,T,T,T,T,T] and a positive() function of "a == 'H'", this test would determine that the heads are non-randomly distributed, since they are all at the beginning of obs. This is done by counting the number of runs of consecutive elements for which positive() evaluates to true, and the number of runs of consecutive elements for which it evaluates to false. In the example above, we have 2 runs. These are the block of 5 consecutive heads at the beginning and the 5 consecutive tails at the end.
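
The run counting described above can be sketched in a few lines. This version only counts runs and does not compute a P-value; countRunsSketch is a hypothetical helper that takes the predicate as a lambda rather than the string form shown in the RunsTest signature.

```d
/// Count maximal blocks of consecutive elements on which positive() agrees.
size_t countRunsSketch(alias positive, R)(R obs)
{
    size_t runs = 0;
    bool first = true;
    bool last;
    foreach (x; obs)
    {
        immutable bool cur = positive(x);
        // A new run starts at the first element and at every change of sign.
        if (first || cur != last)
            ++runs;
        last = cur;
        first = false;
    }
    return runs;
}

unittest
{
    // Five heads followed by five tails: 2 runs, far fewer than chance suggests.
    assert(countRunsSketch!(c => c == 'H')("HHHHHTTTTT") == 2);
    // Strict alternation gives the maximum of 10 runs.
    assert(countRunsSketch!(c => c == 'H')("HTHTHTHTHT") == 10);
}
```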

signTest
TestRes signTest(T before, U after, Alt alt)

Sign test for differences between paired values. This is a very robust but very low power test. Alternatives are Alt.less, meaning elements of before are typically less than corresponding elements of after, Alt.greater, meaning elements of before are typically greater than elements of after, and Alt.twoSided, meaning that there is a significant difference in either direction.

signTest
TestRes signTest(T data, double mu, Alt alt)

Similar to the overload, but allows testing for a difference between a range and a fixed value mu.

spearmanCorTest
TestRes spearmanCorTest(T range1, U range2, Alt alt)

Tests the hypothesis that the Spearman correlation between two ranges is different from 0. Alternatives are Alt.less (spearmanCor(range1, range2) < 0), Alt.greater (spearmanCor(range1, range2) > 0) and Alt.twoSided (spearmanCor(range1, range2) != 0).

studentsTTest
ConfInt studentsTTest(T data, double testMean, Alt alt, double confLevel)

One-sample Student's T-test for difference between mean of data and a fixed value. Alternatives are Alt.less, meaning mean(data) < testMean, Alt.greater, meaning mean(data) > testMean, and Alt.twoSided, meaning mean(data) != testMean.

studentsTTest
ConfInt studentsTTest(T sample1, U sample2, double testMean, Alt alt, double confLevel)

Two-sample T test for a difference in means. Assumes variances of samples are equal. Alternatives are Alt.less, meaning mean(sample1) - mean(sample2) < testMean, Alt.greater, meaning mean(sample1) - mean(sample2) > testMean, and Alt.twoSided, meaning mean(sample1) - mean(sample2) != testMean.

welchAnova
TestRes welchAnova(T data)

Same as fTest, except that this test does not require the assumption of equal variances. In exchange it's slightly less powerful.

welchTTest
ConfInt welchTTest(T sample1, U sample2, double testMean, Alt alt, double confLevel)

Two-sample T-test for difference in means. Does not assume variances are equal. Alternatives are Alt.less, meaning mean(sample1) - mean(sample2) < testMean, Alt.greater, meaning mean(sample1) - mean(sample2) > testMean, and Alt.twoSided, meaning mean(sample1) - mean(sample2) != testMean.
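
For reference, the standard Welch statistic and its Welch-Satterthwaite degrees of freedom are shown here, with sample means \bar{x}_1 and \bar{x}_2, sample variances s_1^2 and s_2^2, sample sizes n_1 and n_2, and \mu_0 standing for testMean. The symbols are introduced for this note; the page does not confirm that dstats evaluates the test in exactly this form.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2 - \mu_0}{\sqrt{s_1^2/n_1 + s_2^2/n_2}},
\qquad
\nu \approx \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}
{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}
```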

wilcoxonRankSum
TestRes wilcoxonRankSum(T sample1, U sample2, Alt alt, uint exactThresh)

Computes Wilcoxon rank sum test statistic and P-value for a set of observations against another set, using the given alternative. Alt.less means that sample1 is stochastically less than sample2. Alt.greater means sample1 is stochastically greater than sample2. Alt.twoSided means sample1 is stochastically less than or greater than sample2.

wilcoxonSignedRank
TestRes wilcoxonSignedRank(T before, U after, Alt alt, uint exactThresh)

Computes a test statistic and P-value for a Wilcoxon signed rank test against the given alternative. Alt.less means that elements of before are stochastically less than corresponding elements of after. Alt.greater means elements of before are stochastically greater than corresponding elements of after. Alt.twoSided means there is a significant difference in either direction.

wilcoxonSignedRank
TestRes wilcoxonSignedRank(T data, double mu, Alt alt, uint exactThresh)

Same as the overload, but allows testing whether a range is stochastically less than or greater than a fixed value mu rather than paired elements of a second range.

Structs

ConfInt
struct ConfInt

A plain old data struct for returning the results of hypothesis tests that also produce confidence intervals. Contains, and can implicitly convert to, a TestRes.

GTestRes
struct GTestRes

This struct is a subtype of TestRes and is used to return the results of gTestContingency and gTestObs. Due to the information theoretic interpretation of the G test, it contains an extra field to return the mutual information in bits.

RunsTest
struct RunsTest(alias positive = "a > 0", T)

Runs test as in runsTest(), except calculates online instead of from stored array elements.

TestRes
struct TestRes

A plain old data struct for returning the results of hypothesis tests.

Templates

isSummary
template isSummary(T)

Tests whether a struct/class has the necessary information for calculating a T-test. It must have a property .mean (mean), .stdev (standard deviation), .var (variance), and .N (sample size).
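
A compile-time check of this kind can be sketched with is(typeof(...)) expressions. The template below is a hypothetical stand-in that mirrors the four requirements listed above; it is not the definition used by dstats, and both looksLikeSummary and MySummary are names invented for the example.

```d
/// True if T exposes .mean, .stdev, .var and .N, each convertible to double.
/// Hypothetical stand-in for illustration; not the dstats definition.
enum bool looksLikeSummary(T) =
    is(typeof(T.init.mean) : double) &&
    is(typeof(T.init.stdev) : double) &&
    is(typeof(T.init.var) : double) &&
    is(typeof(T.init.N) : double);

/// A user-defined type that carries the required summary statistics.
struct MySummary
{
    double mean;
    double stdev;
    double var;
    double N;
}

static assert(looksLikeSummary!MySummary);
static assert(!looksLikeSummary!int);
```

Because the check runs at compile time, a type that lacks one of the properties is rejected before any test code executes.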

Meta