Statistical Methods

Table of Contents

Hypothesis testing

If σ is known, use Z scores. If not, use T scores and $s_{n}$ (or if sample size is below 30).

The steps

  1. Choose population parameter

  2. Formulate null and alternative hypotheses. Choose significance level.

    • H0: parameter = some value
    • HA: depends, can be two-tailed or one-tailed
      • one-tailed: param < value or param > value
      • two-tailed: param ≠ value
  3. Collect data.

  4. Choose test statistic (based on parameter) and identify its distribution under H0

  5. Calculate value of test statistic.

  6. Find p-value, or critical region based on significance.

    • watch out for the critical region. if one-tailed test, have to divide significance by 2 first.
  7. Decide whether or not to reject the null hypothesis:

    • p-value:
      • if p-value ≤ significance, reject
      • otherwise, fail to reject
    • critical values:
      • if Z-score or T-score not in critical region, fail to reject
      • otherwise, reject

YOU NEVER ACCEPT HYPOTHESES

Errors in testing

H0 trueH0 false
reject H0Type Ifine
not reject H0finetype II

Proportion test

test statistic:

$Z = \frac{\hat{P}_{n} - p}{\sqrt{\frac{p(1-p)}{n}}}$

Mean test

Test statistic iff σ known:

$Z = \frac{\bar{X}_{n} - \mu}{\frac{\sigma}{\sqrt{n}}}$

has standard normal distribution under null hypothesis.

Test statistic otherwise:

basically just replace σ with its estimator $\frac{s_n}{\sqrt{n}}$

$T = \frac{\bar{X}_{n} - \mu}{\frac{s_n}{\sqrt{n}}}$

has t-distribution with n−1 degrees of freedom under null hypothesis.

Confidence interval (1−α) for μ:

$\text{lower, upper} = \bar{x}_{n} \pm t_{n-1, \alpha/2} \times \frac{s_n}{\sqrt{n}}$

What does $t_{n-1, \alpha / 2}$ mean? Well, we need a t-score, with n−1 degrees of freedom. Divide significance by 2 because α is the full area (both tails) and since we’re adding/subtracting a t-score, we want to find the score corresponding to the area in one tail.

Two samples

Dependent

dependent: values in one sample are related to values in the other sample, or form natural matched pairs

to test, we look at the difference of means.

null hypothesis can be either no difference, or that difference is a certain value. alternative hypothesis can basically be whatever.

calculate the differences for each x, then have a sample mean of differences $\bar{D}$ and standard deviation of differences $s_{d}$.

test statistic:

$T_{d} = \frac{\bar{D} - (\mu_{1} - \mu_{2})}{\frac{s_{d}}{\sqrt{n}}}$

which under null hypothesis has t-distribution with n−1 degrees of freedom.

Independent

independent: no relationship between two samples

Assuming equal σ

if sample randomly drawn from same population, we assume that σ₁ = σ₂.

test statistic:

$T_{2}^{eq} = \frac{(\bar{X}_{1} - \bar{X}_{2}) - (\mu_{1} - \mu_{2})}{\sqrt{\frac{s^{2}_{p}}{n_{1}} + \frac{s^{2}_{p}}{n_{2}}}}$

the pooled sample variance is:

$s_{p}^{2} = \frac{(n_{1} - 1) s_{1}^{2} + (n_{2} - 1) s_{2}^{2}}{n_{1} + n_{2} - 2}$

Not assuming equal σ

test statistic:

$T_{2} = \frac{(\bar{X}_{1} - \bar{X}_{2}) - (\mu_{1} - \mu_{2})}{\sqrt{\frac{s_{1}^{2}}{n_{1}} + \frac{s_{2}^{2}}{n_{2}}}}$

which under null hypothesis has t-distribution with $\bar{n}$ degrees of freedom. $\bar{n}$ at the exam is the smallest of the two sample sizes.

Two proportions

H0: p1 = p2

test statistic:

$z_{p} = \frac{(\hat{p}_{1} - \hat{p}_{2})}{\sqrt{\frac{\bar{p} (1-\bar{p})}{n_{1}} + \frac{\bar{p}(1-\bar{p})}{n_{2}}}}$

(1−α) CI for p1−p2:

$(\hat{p}{1} - \hat{p}{2}) \pm E$ where

$E = z_{\alpha / 2} \times \sqrt{\frac{\hat{p}_{1} (1-\hat{p}_{1})}{n_{1}} + \frac{\hat{p}_{2} (1-\hat{p}_{2})}{n_{2}}}$

P(Type I error) = α (significance level)