Package smile.stat.hypothesis
Class ChiSqTest
java.lang.Object
smile.stat.hypothesis.ChiSqTest
Pearson's chi-square test, also known as the chi-square goodness-of-fit test
or chi-square test for independence. Note that the chi-square distribution
is only approximately valid for large sample sizes. If a significant fraction
of bins have small counts (say, < 10), then the statistic is
not well approximated by a chi-square probability function.
-
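The rule of thumb above can be checked before running a test. The following is a minimal sketch; `SmallCountCheck` and `fractionBelow` are hypothetical names that are not part of Smile, and the threshold of 10 is simply the value suggested in the description.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of Smile) for the small-count rule of thumb. */
final class SmallCountCheck {
    /** Returns the fraction of bins whose count is strictly below the threshold. */
    static double fractionBelow(int[] bins, int threshold) {
        long small = Arrays.stream(bins).filter(count -> count < threshold).count();
        return (double) small / bins.length;
    }

    public static void main(String[] args) {
        int[] bins = {25, 3, 40, 7, 18};
        // Two of the five bins hold fewer than 10 counts.
        System.out.println(fractionBelow(bins, 10)); // prints 0.4
    }
}
```

What counts as "a significant fraction" is left to the caller; the description does not fix a cutoff.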
Field Summary
Modifier and Type | Field | Description
final double | chisq | The chi-square statistic.
final double | CramerV | Cramér's V is a measure of association between two nominal variables, giving a value between 0 and 1 (inclusive).
final double | df | The degrees of freedom of the chi-square statistic.
final String | method | The type of test.
final double | pvalue | The p-value.
-
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
static ChiSqTest | test(int[][] table) | Independence test on a two-dimensional contingency table.
static ChiSqTest | test(int[] bins, double[] prob) | One-sample Pearson's chi-square test.
static ChiSqTest | test(int[] bins, double[] prob, int constraints) | One-sample Pearson's chi-square test.
static ChiSqTest | test(int[] bins1, int[] bins2) | Two-sample Pearson's chi-square test.
static ChiSqTest | test(int[] bins1, int[] bins2, int constraints) | Two-sample Pearson's chi-square test.
toString()
-
Field Details
-
method
The type of test.
-
df
public final double df
The degrees of freedom of the chi-square statistic.
-
chisq
public final double chisq
The chi-square statistic.
-
pvalue
public final double pvalue
The p-value.
-
CramerV
public final double CramerV
Cramér's V is a measure of association between two nominal variables, giving a value between 0 and 1 (inclusive). In the case of a 2 × 2 contingency table, Cramér's V is equal to the Phi coefficient.
-
-
Constructor Details
-
ChiSqTest
Constructor.
Parameters:
method - the type of test.
chisq - the chi-square statistic.
df - the degrees of freedom.
pvalue - the p-value.
-
ChiSqTest
Constructor.
Parameters:
method - the type of test.
chisq - the chi-square statistic.
df - the degrees of freedom.
pvalue - the p-value.
CramerV - Cramér's V measure.
-
-
Method Details
-
toString
-
test
One-sample Pearson's chi-square test. Given the array bins of observed event counts and the array prob of expected event probabilities, and assuming one constraint, a small p-value indicates a significant difference between the observed and expected distributions.
Parameters:
bins - the observed number of events.
prob - the expected probabilities of events.
Returns:
the test results.
-
test
One-sample Pearson's chi-square test. Given the array bins of observed event counts, the array prob of expected event probabilities, and the number of constraints (normally one), a small p-value indicates a significant difference between the observed and expected distributions.
Parameters:
bins - the observed number of events.
prob - the expected probabilities of events.
constraints - the number of constraints on the degrees of freedom.
Returns:
the test results.
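To make the one-sample statistic concrete, here is a minimal sketch of the standard Pearson computation: the sum over bins of (observed - expected)^2 / expected, with the expected count taken as the total count times prob[i]. The class `OneSampleChiSqSketch` and its methods are hypothetical and written for illustration; they are not Smile's implementation, and taking the degrees of freedom as the number of bins minus the number of constraints is an assumption of this sketch. The p-value is omitted because it requires the chi-square distribution function.

```java
/** Hypothetical sketch (not Smile's code) of the one-sample Pearson statistic. */
final class OneSampleChiSqSketch {
    /** Sums (observed - expected)^2 / expected over all bins. */
    static double statistic(int[] bins, double[] prob) {
        if (bins.length != prob.length) {
            throw new IllegalArgumentException("bins and prob must have the same length");
        }
        int n = 0;
        for (int count : bins) {
            n += count;
        }
        double chisq = 0.0;
        for (int i = 0; i < bins.length; i++) {
            double expected = n * prob[i]; // expected count in bin i
            double diff = bins[i] - expected;
            chisq += diff * diff / expected;
        }
        return chisq;
    }

    /** Assumption: degrees of freedom = number of bins - number of constraints. */
    static int degreesOfFreedom(int[] bins, int constraints) {
        return bins.length - constraints;
    }

    public static void main(String[] args) {
        int[] bins = {20, 30, 50};
        double[] prob = {0.25, 0.25, 0.5};
        // Expected counts are 25, 25, 50, so the statistic is 25/25 + 25/25 + 0.
        System.out.println("chisq = " + statistic(bins, prob));    // prints chisq = 2.0
        System.out.println("df = " + degreesOfFreedom(bins, 1));   // prints df = 2
    }
}
```

With Smile itself, the equivalent call would be `ChiSqTest.test(bins, prob)`, whose result exposes the `chisq`, `df`, and `pvalue` fields documented above.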
-
test
Two-sample Pearson's chi-square test. Given the arrays bins1 and bins2 containing two sets of binned data, and assuming one constraint, a small p-value indicates a significant difference between the two distributions.
Parameters:
bins1 - the observed number of events in the first sample.
bins2 - the observed number of events in the second sample.
Returns:
the test results.
-
test
Two-sample Pearson's chi-square test. Given the arrays bins1 and bins2 containing two sets of binned data, and the number of constraints (normally one), a small p-value indicates a significant difference between the two distributions.
Parameters:
bins1 - the observed number of events in the first sample.
bins2 - the observed number of events in the second sample.
constraints - the number of constraints on the degrees of freedom.
Returns:
the test results.
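For the two-sample case, one common textbook form of the statistic is the sum over bins of (bins1[i] - bins2[i])^2 / (bins1[i] + bins2[i]), which assumes both samples have the same total count. The sketch below uses that form. The class `TwoSampleChiSqSketch` is hypothetical, and two of its choices are assumptions rather than facts taken from this page: bins that are empty in both samples are skipped and each one reduces the degrees of freedom by one, and the starting degrees of freedom are the number of bins minus the number of constraints.

```java
/** Hypothetical sketch (not Smile's code) of a two-sample chi-square statistic. */
final class TwoSampleChiSqSketch {
    /** Returns {chi-square statistic, degrees of freedom}; assumes equal sample totals. */
    static double[] statistic(int[] bins1, int[] bins2, int constraints) {
        if (bins1.length != bins2.length) {
            throw new IllegalArgumentException("both samples must use the same bins");
        }
        int df = bins1.length - constraints;
        double chisq = 0.0;
        for (int i = 0; i < bins1.length; i++) {
            if (bins1[i] == 0 && bins2[i] == 0) {
                df--; // assumption: a bin empty in both samples carries no information
            } else {
                double diff = bins1[i] - bins2[i];
                chisq += diff * diff / (bins1[i] + bins2[i]);
            }
        }
        return new double[] {chisq, df};
    }

    public static void main(String[] args) {
        int[] bins1 = {10, 20, 0, 30};
        int[] bins2 = {20, 10, 0, 30};
        double[] result = statistic(bins1, bins2, 1);
        // Contributions: 100/30 + 100/30 + (skipped) + 0.
        System.out.println("chisq = " + result[0]);
        System.out.println("df = " + (int) result[1]); // prints df = 2
    }
}
```

With Smile, the corresponding call would be `ChiSqTest.test(bins1, bins2)` or the three-argument overload with an explicit `constraints` value.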
-
test
Independence test on a two-dimensional contingency table. The rows of the contingency table are the values of one nominal variable, and the columns are the values of the other nominal variable. The entries are the numbers of observed events for each combination of row and column. A continuity correction is applied when computing the test statistic for 2x2 tables: one half is subtracted from all |O-E| differences. The correlation coefficient is calculated as Cramér's V.
Parameters:
table - the contingency table.
Returns:
the test results.
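The independence test can be sketched as follows. Expected counts are computed as (row total × column total) / grand total, and one half is subtracted from every |O-E| for 2x2 tables, as described above. The class `IndependenceChiSqSketch` is hypothetical and is not Smile's implementation. The degrees of freedom, (rows - 1)(columns - 1), and the Cramér's V formula, sqrt(chisq / (n × min(rows - 1, columns - 1))), are the standard textbook definitions; computing V from the corrected statistic in the 2x2 case is an assumption of this sketch.

```java
/** Hypothetical sketch (not Smile's code) of the chi-square independence test statistic. */
final class IndependenceChiSqSketch {
    /** Returns {chi-square statistic, degrees of freedom, Cramér's V}. */
    static double[] analyze(int[][] table) {
        int rows = table.length;
        int cols = table[0].length;
        double[] rowSum = new double[rows];
        double[] colSum = new double[cols];
        double n = 0.0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowSum[i] += table[i][j];
                colSum[j] += table[i][j];
                n += table[i][j];
            }
        }

        boolean correct = rows == 2 && cols == 2; // continuity correction for 2x2 tables only
        double chisq = 0.0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                double expected = rowSum[i] * colSum[j] / n;
                double diff = Math.abs(table[i][j] - expected);
                if (correct) {
                    diff -= 0.5; // one half subtracted from |O - E|
                }
                chisq += diff * diff / expected;
            }
        }

        int df = (rows - 1) * (cols - 1);
        // Assumption: V is derived from the statistic computed above.
        double cramerV = Math.sqrt(chisq / (n * Math.min(rows - 1, cols - 1)));
        return new double[] {chisq, df, cramerV};
    }

    public static void main(String[] args) {
        int[][] table = {
            {10, 20, 30},
            {20, 20, 20}
        };
        double[] result = analyze(table);
        System.out.println("chisq = " + result[0]);
        System.out.println("df = " + (int) result[1]); // prints df = 2
        System.out.println("Cramer's V = " + result[2]);
    }
}
```

With Smile, the corresponding call would be `ChiSqTest.test(table)`, whose result additionally carries the `pvalue` and `CramerV` fields.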
-