Class StatisticalTests

java.lang.Object
ch.idsia.utils.statistics.StatisticalTests

public class StatisticalTests
extends java.lang.Object
This class will contain commonly used statistical tests. At present, the only test available is the t-test. This can be used in paired or unpaired modes, and in each case one or two-sided tests may be applied. The other useful feature
  • Constructor Summary

    Constructors 
    Constructor Description
    StatisticalTests()  
  • Method Summary

    Modifier and Type Method Description
    protected static double betacf​(double a, double b, double x)  
    protected static double betai​(double a, double b, double x)  
    static double binRoot​(double x)
    Bin root - a binary search for a square root.
    static double confDiff​(double[] d, double conf)
    This method returns the distance x from the mean m such that the area under the t distribution (estimated from the data d) between (m - x) and (m + x) is equal to conf.
    static void confTest()
    This uses an example from Statistics for Business and Economics (page 293 - 294) to check the calculation of the confidence intervals for the mean of a dataset.
    static double correlation​(double[] x, double[] y)  
    static double findt​(double conf, double nu)
    Finds the value of t that would match the required confidence and nu.
    protected static double gammln​(double xx)  
    static void main​(java.lang.String[] args)
    Runs through some textbook examples to check that the statistical tests are working properly.
    static double sqr​(double x)  
    static double sumProdDiff​(double[] x, double[] y, double mx, double my)  
    static double sumSquare​(double[] x)  
    static double sumSquareDiff​(double[] x, double mean)  
    static void test()  
    static void testT()
    Runs a t-test on some textbook data.
    static double tNotPaired​(double[] s1, double[] s2, boolean twoSided)
    Calculates the probability with which the null hypothesis (that the means are equal) can be rejected in favour of the alternative hypothesis.
    static double tNotPaired​(double m1, double m2, double ss1, double ss2, int n1, int n2, boolean twoSided)  
    static double tNotPairedOneSided​(double[] s1, double[] s2)
    Calculates the probability with which the null hypothesis (that the means are equal) can be rejected in favour of the alternative hypothesis that one is greater than the other (hence uses a one-sided test).
    static double tNotPairedTwoSided​(double[] s1, double[] s2)
    Calculates the probability with which the null hypothesis (that the means are equal) can be rejected in favour of the alternative hypothesis that they are not equal (hence uses a two-sided test).
    static double tPaired​(double[] d, boolean twoSided)  
    static double tPaired​(double[] s1, double[] s2, boolean twoSided)
    Applies a t-test to two arrays of paired samples.
    static double tPairedOneSided​(double[] d)
    Applies the t-test to an array of the differences between paired samples of observations.
    static double tPairedOneSided​(double[] s1, double[] s2)
    Applies a one-sided t-test to two arrays of paired samples.
    static double tPairedTwoSided​(double[] d)  
    static double tPairedTwoSided​(double[] s1, double[] s2)
    Applies a two-sided t-test to two arrays of paired samples.
    static double tSingle​(double t, double nu)
    Applies the single-sided t-test given the value of t and nu.
    static double tTest​(double t, double nu)
    Applies the two-sided t-test given the value of t and nu.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

  • Method Details

    • sqr

      public static double sqr​(double x)
    • sumSquareDiff

      public static double sumSquareDiff​(double[] x, double mean)
    • correlation

      public static double correlation​(double[] x, double[] y)
    • sumSquare

      public static double sumSquare​(double[] x)
    • sumProdDiff

      public static double sumProdDiff​(double[] x, double[] y, double mx, double my)
    • tNotPairedOneSided

      public static double tNotPairedOneSided​(double[] s1, double[] s2)
      Calculates the probability with which the null hypothesis (that the means are equal) can be rejected in favour of the alternative hypothesis that one is greater than the other (hence uses a one-sided test). The samples are not paired and do not have to be of equal size.
    • tNotPairedTwoSided

      public static double tNotPairedTwoSided​(double[] s1, double[] s2)
      Calculates the probability with which the null hypothesis (that the means are equal) can be rejected in favour of the alternative hypothesis that they are not equal (hence uses a two-sided test). The samples are not paired and do not have to be of equal size.
    • tNotPaired

      public static double tNotPaired​(double[] s1, double[] s2, boolean twoSided)
      Calculates the probability with which the null hypothesis (that the means are equal) can be rejected in favour of the alternative hypothesis. Uses a two-sided test if twoSided is true; otherwise uses a one-sided test.
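      As a rough illustration of what an unpaired test computes before any probability is looked up, the sketch below derives a pooled-variance t statistic from two samples of different sizes. The class PooledTSketch and its helpers are hypothetical, the input arrays are made-up illustrative values, and pooling the two sums of squares is an assumption: this page does not state which variant tNotPaired uses.

```java
public class PooledTSketch {
    // Arithmetic mean of a sample.
    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) sum += v;
        return sum / x.length;
    }

    // Sum of squared deviations from the given mean.
    static double sumSquareDiff(double[] x, double mean) {
        double ss = 0.0;
        for (double v : x) ss += (v - mean) * (v - mean);
        return ss;
    }

    // Pooled-variance t statistic (assumed variant) with n1 + n2 - 2 degrees of freedom.
    static double pooledT(double[] s1, double[] s2) {
        double m1 = mean(s1);
        double m2 = mean(s2);
        double ss1 = sumSquareDiff(s1, m1);
        double ss2 = sumSquareDiff(s2, m2);
        int nu = s1.length + s2.length - 2;
        double pooledVar = (ss1 + ss2) / nu;
        double se = Math.sqrt(pooledVar * (1.0 / s1.length + 1.0 / s2.length));
        return (m1 - m2) / se;
    }

    public static void main(String[] args) {
        double[] a = {1.0, 2.0, 3.0};       // illustrative values only
        double[] b = {2.0, 4.0, 6.0, 8.0};  // a different sample size is allowed
        System.out.println("t = " + pooledT(a, b) + ", nu = " + (a.length + b.length - 2));
    }
}
```

      The resulting t and degrees of freedom are the kind of values that a method such as tTest(t, nu) or tSingle(t, nu) would then convert into a probability.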
    • tNotPaired

      public static double tNotPaired​(double m1, double m2, double ss1, double ss2, int n1, int n2, boolean twoSided)
    • tPairedOneSided

      public static double tPairedOneSided​(double[] s1, double[] s2)
      Applies a one-sided t-test to two arrays of paired samples. The arrays must be the same size; failure to ensure this could cause an ArrayIndexOutOfBoundsException.
    • tPairedTwoSided

      public static double tPairedTwoSided​(double[] s1, double[] s2)
      Applies a two-sided t-test to two arrays of paired samples. The arrays must be the same size; failure to ensure this could cause an ArrayIndexOutOfBoundsException.
    • tPaired

      public static double tPaired​(double[] s1, double[] s2, boolean twoSided)
      Applies a t-test to two arrays of paired samples. A one-sided or two-sided test is chosen according to the value of the boolean parameter twoSided. The arrays must be the same size; failure to ensure this could cause an ArrayIndexOutOfBoundsException.
    • tPairedOneSided

      public static double tPairedOneSided​(double[] d)
      Applies the t-test to an array of the differences between paired samples of observations.
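      The relationship between the two-array and difference-array overloads can be sketched as follows: the paired samples are reduced to element-wise differences, and the t statistic is the mean difference divided by its standard error, with n - 1 degrees of freedom. PairedTSketch, differences and tStatistic are hypothetical names, and the explicit length check is an addition for illustration rather than documented behaviour of this class.

```java
public class PairedTSketch {
    // Element-wise differences between two paired samples of equal length.
    static double[] differences(double[] s1, double[] s2) {
        if (s1.length != s2.length) {
            throw new IllegalArgumentException("paired samples must have equal length");
        }
        double[] d = new double[s1.length];
        for (int i = 0; i < d.length; i++) d[i] = s1[i] - s2[i];
        return d;
    }

    // t statistic for the null hypothesis that the mean difference is zero.
    static double tStatistic(double[] d) {
        int n = d.length;
        double mean = 0.0;
        for (double v : d) mean += v;
        mean /= n;
        double ss = 0.0;
        for (double v : d) ss += (v - mean) * (v - mean);
        double sd = Math.sqrt(ss / (n - 1));   // sample standard deviation
        return mean / (sd / Math.sqrt(n));     // mean divided by its standard error
    }

    public static void main(String[] args) {
        double[] before = {3.0, 5.0, 7.0};  // illustrative values only
        double[] after = {2.0, 3.0, 4.0};
        double[] d = differences(before, after);
        System.out.println("t = " + tStatistic(d) + ", nu = " + (d.length - 1));
    }
}
```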
    • tPairedTwoSided

      public static double tPairedTwoSided​(double[] d)
    • tPaired

      public static double tPaired​(double[] d, boolean twoSided)
    • confDiff

      public static double confDiff​(double[] d, double conf)
      This method returns the distance x from the mean m such that the area under the t distribution (estimated from the data d) between (m - x) and (m + x) is equal to conf.

      In other words, it finds the desired confidence interval of the mean of the population from which the data is drawn. For example, if conf = 0.95, then there is a 95% chance that the mean lies between (m - x) and (m + x).

      Parameters:
      d - the array of data
      conf - the desired confidence level (for example, 0.95)
      Returns:
      the half-width x of the confidence interval for the population mean, centred on the sample mean
    • findt

      public static double findt​(double conf, double nu)
      Finds the value of t that would match the required confidence and nu.
    • binRoot

      public static double binRoot​(double x)
      Bin root - a binary search for a square root. I wrote this purely to check my recollection of how this kind of search can be used to invert functions.
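      The idea of inverting a monotonic function by binary search can be sketched as below for the square-root case. This is a minimal illustration, not the body of binRoot; the class name, method name and fixed iteration count are assumptions.

```java
public class BisectionSqrtSketch {
    // Finds sqrt(x) by bisection: searches for y in [0, max(1, x)] with y * y == x.
    static double sqrtByBisection(double x) {
        if (x < 0.0) {
            throw new IllegalArgumentException("x must be non-negative");
        }
        double lo = 0.0;
        double hi = Math.max(1.0, x);  // sqrt(x) never exceeds max(1, x) for x >= 0
        for (int i = 0; i < 200; i++) {
            double mid = 0.5 * (lo + hi);
            if (mid * mid < x) {
                lo = mid;   // root lies in the upper half
            } else {
                hi = mid;   // root lies in the lower half
            }
        }
        return 0.5 * (lo + hi);
    }

    public static void main(String[] args) {
        System.out.println(sqrtByBisection(2.0));
    }
}
```

      The same loop inverts any increasing function once mid * mid is replaced by a call to that function, which is presumably how a routine like findt can search for a t value.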
    • tTest

      public static double tTest​(double t, double nu)
      Applies the two-sided t-test given the value of t and nu. To do this it calls betai.
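      One standard way to obtain a two-sided probability from t and nu is through the regularized incomplete beta function I_x(a, b), using p = I_x(nu/2, 1/2) with x = nu / (nu + t*t). The sketch below follows that textbook formulation with a Lanczos log-gamma and a continued-fraction evaluation. The method names mirror those listed on this page, but the bodies are an independent illustration and are not taken from this class; whether tTest returns this probability or its complement is not stated here.

```java
public class TTestSketch {
    // Natural log of the gamma function (Lanczos approximation), for xx > 0.
    static double gammln(double xx) {
        double[] cof = {76.18009172947146, -86.50532032941677, 24.01409824083091,
                        -1.231739572450155, 0.1208650973866179e-2, -0.5395239384953e-5};
        double y = xx;
        double tmp = xx + 5.5;
        tmp -= (xx + 0.5) * Math.log(tmp);
        double ser = 1.000000000190015;
        for (double c : cof) {
            y += 1.0;
            ser += c / y;
        }
        return -tmp + Math.log(2.5066282746310005 * ser / xx);
    }

    // Continued fraction for the incomplete beta function (modified Lentz method).
    static double betacf(double a, double b, double x) {
        final double EPS = 3.0e-7;
        final double FPMIN = 1.0e-30;
        double qab = a + b, qap = a + 1.0, qam = a - 1.0;
        double c = 1.0;
        double d = 1.0 - qab * x / qap;
        if (Math.abs(d) < FPMIN) d = FPMIN;
        d = 1.0 / d;
        double h = d;
        for (int m = 1; m <= 100; m++) {
            int m2 = 2 * m;
            // Even step of the recurrence.
            double aa = m * (b - m) * x / ((qam + m2) * (a + m2));
            d = 1.0 + aa * d;
            if (Math.abs(d) < FPMIN) d = FPMIN;
            c = 1.0 + aa / c;
            if (Math.abs(c) < FPMIN) c = FPMIN;
            d = 1.0 / d;
            h *= d * c;
            // Odd step of the recurrence.
            aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2));
            d = 1.0 + aa * d;
            if (Math.abs(d) < FPMIN) d = FPMIN;
            c = 1.0 + aa / c;
            if (Math.abs(c) < FPMIN) c = FPMIN;
            d = 1.0 / d;
            double del = d * c;
            h *= del;
            if (Math.abs(del - 1.0) < EPS) break;  // converged
        }
        return h;
    }

    // Regularized incomplete beta function I_x(a, b) for 0 <= x <= 1.
    static double betai(double a, double b, double x) {
        if (x < 0.0 || x > 1.0) throw new IllegalArgumentException("x must be in [0, 1]");
        double bt = (x == 0.0 || x == 1.0) ? 0.0
                : Math.exp(gammln(a + b) - gammln(a) - gammln(b)
                        + a * Math.log(x) + b * Math.log(1.0 - x));
        if (x < (a + 1.0) / (a + b + 2.0)) {
            return bt * betacf(a, b, x) / a;           // use the fraction directly
        }
        return 1.0 - bt * betacf(b, a, 1.0 - x) / b;   // use the symmetry relation
    }

    // Two-sided tail probability P(|T| > |t|) for nu degrees of freedom.
    static double twoSidedP(double t, double nu) {
        return betai(0.5 * nu, 0.5, nu / (nu + t * t));
    }

    public static void main(String[] args) {
        System.out.println(twoSidedP(2.571, 5.0));
    }
}
```

      For nu = 1 and nu = 2 the t distribution has closed-form tails (0.5 at t = 1 for nu = 1, and 1 - t / sqrt(2 + t*t) for nu = 2), which gives a convenient sanity check on a sketch like this.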
    • tSingle

      public static double tSingle​(double t, double nu)
      Applies the single-sided t-test given the value of t and nu. To do this it calls betai.
    • gammln

      protected static double gammln​(double xx)
    • betai

      protected static double betai​(double a, double b, double x)
    • betacf

      protected static double betacf​(double a, double b, double x)
    • test

      public static void test()
    • confTest

      public static void confTest()
      This uses an example from Statistics for Business and Economics (page 293 - 294) to check the calculation of the confidence intervals for the mean of a dataset.

      The data is: double[] mpg = {18.6, 18.4, 19.2, 20.8, 19.4, 20.5};

      Running the program produces the following output:

       At 0.8  : 18.89 < 19.48 < 20.07
       At 0.9  : 18.67 < 19.48 < 20.29
       At 0.95 : 18.45 < 19.48 < 20.51
       At 0.99 : 17.86 < 19.48 < 21.09
       

      This matches the book closely; any differences stem from the book's hand-worked example, which uses only two decimal places.
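      The 0.95 row can be reproduced by hand from the mpg data above. The sketch below computes the sample mean and its standard error, then multiplies the standard error by 2.571, the standard table value of t for 5 degrees of freedom at 95% two-sided confidence. The class and method names are hypothetical, and the critical value is supplied from tables rather than computed as findt would.

```java
public class ConfIntervalSketch {
    // Returns {lower, mean, upper} for the given data and critical t value.
    static double[] interval(double[] data, double tCritical) {
        int n = data.length;
        double mean = 0.0;
        for (double v : data) mean += v;
        mean /= n;
        double ss = 0.0;
        for (double v : data) ss += (v - mean) * (v - mean);
        double se = Math.sqrt(ss / (n - 1) / n);  // standard error of the mean
        double x = tCritical * se;                // half-width of the interval
        return new double[] {mean - x, mean, mean + x};
    }

    public static void main(String[] args) {
        double[] mpg = {18.6, 18.4, 19.2, 20.8, 19.4, 20.5};
        double[] ci = interval(mpg, 2.571);  // table value for nu = 5, conf = 0.95
        System.out.printf("At 0.95 : %.2f < %.2f < %.2f%n", ci[0], ci[1], ci[2]);
    }
}
```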

    • testT

      public static void testT()
      Runs a t-test on some textbook data.

      The data is from page 362 of Statistics for Business and Economics.

    • main

      public static void main​(java.lang.String[] args)
      Runs through some textbook examples to check that the statistical tests are working properly.