Checks whether an observed frequency distribution fits a claimed distribution. The sample has size n, with observations falling into k different categories.
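Hypotheses: H0: every category i has its claimed probability $p_{i}$. HA: at least one category has a probability different from the claimed one.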
$O_{i}$ is the observed frequency count of category i. $E_{i} = n \times p_{i}$ is the expected frequency count, where $p_{i}$ is the claimed probability of category i.
Test statistic is: $\chi^{2} = \sum_{i=1}^{k}\frac{(O_{i} - E_{i})^{2}}{E_{i}}$
and has approximately a chi-square distribution with k − 1 degrees of freedom under the null hypothesis.
Critical value: $\chi^{2}_{k-1, \alpha}$; reject H0 if $\chi^{2} > \chi^{2}_{k-1, \alpha}$
test is right-tailed, since only large values of the test statistic are evidence against the null hypothesis (even if the hypothesis is undirected).
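A minimal sketch of the goodness-of-fit computation in plain Python. The function name `chi_square_gof` and the die-roll counts are hypothetical and only for illustration. The critical value 11.070 is the table value of $\chi^{2}_{5, 0.05}$, hard-coded here because the standard library has no chi-square quantile function.

```python
def chi_square_gof(observed, probs):
    """Return (chi-square statistic, degrees of freedom) for a goodness-of-fit test."""
    n = sum(observed)                      # sample size
    expected = [n * p for p in probs]      # E_i = n * p_i
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1         # k - 1 degrees of freedom


# Hypothetical data: a die rolled 60 times, H0 says every face has probability 1/6.
observed = [5, 8, 9, 8, 10, 20]
probs = [1 / 6] * 6

stat, df = chi_square_gof(observed, probs)

# Table value for df = 5, alpha = 0.05 (assumed significance level).
CRITICAL_DF5_ALPHA05 = 11.070
reject_h0 = stat > CRITICAL_DF5_ALPHA05    # right-tailed: reject only for large values

print(f"chi2 = {stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

With these made-up counts the statistic is about 13.4, which exceeds 11.070, so H0 would be rejected at the 5% level.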
When: two variables in a single sample
You have a contingency table with r row categories and c column categories, and check whether the row and column variables are dependent.
H0: row and column variables are independent
HA: row and column variables are dependent
test statistic:
$\chi^2 = \sum_{cells} \frac{(O-E)^{2}}{E}$
has under H0 approximately a chi-square distribution with (r − 1)(c − 1) degrees of freedom.
reject null hypothesis if $\chi^{2} > \chi^{2}_{(r-1)(c-1), \alpha}$
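A minimal sketch of the contingency-table statistic in plain Python. It uses the standard expected count under independence, E = (row total × column total) / (grand total), which the notes above do not spell out. The function name `chi_square_table` and the 2×2 counts are hypothetical. The critical value 3.841 is the table value of $\chi^{2}_{1, 0.05}$.

```python
def chi_square_table(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)                                # grand total

    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n      # expected count under H0
            stat += (o - e) ** 2 / e

    df = (len(row_totals) - 1) * (len(col_totals) - 1) # (r - 1)(c - 1)
    return stat, df


# Hypothetical 2 x 2 table of observed counts.
table = [[10, 20],
         [30, 40]]

stat, df = chi_square_table(table)

# Table value for df = 1, alpha = 0.05 (assumed significance level).
CRITICAL_DF1_ALPHA05 = 3.841
reject_h0 = stat > CRITICAL_DF1_ALPHA05

print(f"chi2 = {stat:.4f}, df = {df}, reject H0: {reject_h0}")
```

For these made-up counts the statistic is roughly 0.79, well below 3.841, so H0 would not be rejected at the 5% level.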
When: comparing two or more samples to see if they have the same proportions of characteristics.
There are r different populations (rows) and c different categories (columns) of some variable; the test compares the proportions of each category across the populations.
H0: different populations have same proportions of some characteristics
HA: different populations don’t have the same proportions of some characteristics.
test statistic:
$\chi^{2} = \sum_{cells} \frac{(O-E)^2}{E}$
has under H0 approximately a chi-square distribution with (r − 1)(c − 1) degrees of freedom.
reject H0 if observed $\chi^{2} > \chi^{2}_{(r-1)(c-1),\alpha}$
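A short worked example of where the expected counts come from in this setting (the symbols $R_{i}$, $C_{j}$, N and all numbers are introduced here for illustration only). Under H0 every population shares the same proportion for category j, estimated by pooling all samples: $\hat{p}_{j} = C_{j} / N$, where $C_{j}$ is the column total and N the grand total. With $R_{i}$ the sample size of population i, the expected count is
$E_{ij} = R_{i} \times \hat{p}_{j} = \frac{R_{i} C_{j}}{N}$
For instance, with two populations of sizes $R_{1} = 50$ and $R_{2} = 150$ and a category observed $C_{1} = 40$ times in total, $\hat{p}_{1} = 40/200 = 0.2$, so $E_{11} = 50 \times 0.2 = 10$ and $E_{21} = 150 \times 0.2 = 30$.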
either:
or:
test statistic: the frequency count in cell (1,1), which under H0 and given the marginals has a hypergeometric distribution
with parameters n = (first row total), N = (grand total), and k = (first column total)
guess we don’t need to know how to do this manually.
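For reference, a minimal sketch of the hypergeometric calculation using only `math.comb`. The function names and the 2×2 counts are hypothetical, and the choice of a one-sided (upper-tail) p-value is an assumption, since the notes do not say which alternative is tested.

```python
from math import comb


def hypergeom_pmf(x, N, k, n):
    """P(X = x) when drawing n items from N, of which k are 'successes'."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


def fisher_upper_tail(table):
    """One-sided p-value P(X >= observed cell (1,1)) for a 2 x 2 table."""
    (a, b), (c, d) = table
    n = a + b              # first row total
    k = a + c              # first column total
    N = a + b + c + d      # grand total
    largest = min(n, k)    # cell (1,1) cannot exceed either marginal
    return sum(hypergeom_pmf(x, N, k, n) for x in range(a, largest + 1))


# Hypothetical 2 x 2 table of observed counts.
table = [[3, 1],
         [1, 3]]

p_value = fisher_upper_tail(table)
print(f"one-sided p-value = {p_value:.4f}")
```

Here n = 4, k = 4, N = 8, so the p-value is P(X = 3) + P(X = 4) = 16/70 + 1/70 = 17/70, about 0.243.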