# Non-Parametric Tests

Parametric methods

• They are based on means, standard deviations or probabilities.

The Normal distribution is not always appropriate

• Variables with only a few observations,
• Non-symmetrical distributions, or
• Variables that can have more than two values

## Non-parametric methods

They do not rely on the same assumptions as parametric methods, but they still make some assumptions of their own.

## Analysis of Variance (ANOVA)

Generalization of the $t$-test for >2 treatments

Given: $n$ experimental treatments, one dependent variable

Assumes:

1. the variables are normally distributed in each treatment
2. the variances for the treatments are similar
3. the sample sizes for the treatments do not differ hugely

(It is okay to deviate slightly from these assumptions for larger sample sizes.)
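The three assumptions can be checked before running ANOVA. A minimal sketch, assuming SciPy is available: the data are hypothetical, and the choice of the Shapiro-Wilk test (`scipy.stats.shapiro`) and Levene's test (`scipy.stats.levene`) is one common option, not something these notes prescribe.

```python
import numpy as np
from scipy import stats

# Hypothetical data: three treatments drawn from normal populations
rng = np.random.default_rng(0)
groups = [rng.normal(loc=mu, scale=1.0, size=30) for mu in (5.0, 5.5, 6.0)]

# Assumption 1: normality within each treatment (Shapiro-Wilk test).
# Index [1] is the p-value of the (statistic, p-value) result.
shapiro_p = [stats.shapiro(g)[1] for g in groups]

# Assumption 2: similar variances across treatments (Levene's test)
levene_stat, levene_p = stats.levene(*groups)

# Assumption 3: sample sizes that do not differ hugely
sizes = [len(g) for g in groups]

print("Shapiro-Wilk p-values:", np.round(shapiro_p, 3))
print("Levene p-value:", round(float(levene_p), 3))
print("Sample sizes:", sizes)
```

Small p-values in either test suggest the corresponding assumption is doubtful for that data set.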

Works by analyzing how much of the total variance is due to differences within groups, and how much is due to differences across groups.

Procedure:

$H_0$: There is no difference in the population means across all treatments

Compute the F-statistic:

$F=\frac{\textrm{found variation of the group averages}}{\textrm{expected variation of the group averages}}$

(don’t do this by hand!)

If $H_0$ is true, we would expect $F$ to be close to 1; values well above 1 are evidence against $H_0$.
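The F-statistic can be sketched in a few lines. The example below uses hypothetical measurements, computes F from the between-group and within-group sums of squares, and compares the result with `scipy.stats.f_oneway`; the variable names are illustrative only.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements for three treatments
groups = [
    np.array([4.1, 5.0, 4.7, 5.3, 4.9]),
    np.array([5.8, 6.1, 5.5, 6.4, 6.0]),
    np.array([4.9, 5.2, 5.6, 5.1, 5.4]),
]

k = len(groups)                         # number of treatments
n_total = sum(len(g) for g in groups)   # total number of observations
grand_mean = np.concatenate(groups).mean()

# Variation of the group means around the grand mean (across groups)
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Variation of the observations around their own group mean (within groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# Mean squares: sums of squares divided by their degrees of freedom
ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_manual = ms_between / ms_within

# The same test via SciPy, which also returns the p-value
f_scipy, p_value = stats.f_oneway(*groups)
print(f"F (manual) = {f_manual:.3f}, F (scipy) = {f_scipy:.3f}, p = {p_value:.4f}")
```

The two sums of squares add up to the total variation around the grand mean, which is the decomposition ANOVA relies on.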

Note: ANOVA tells you whether there is a significant difference, but does not tell you which treatment(s) are different.
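To find out which treatments differ, a follow-up (post-hoc) comparison is needed. A minimal sketch of one simple option, pairwise $t$-tests with a Bonferroni correction; this choice is an assumption for illustration and is not taken from these notes, and the data and group labels are hypothetical.

```python
from itertools import combinations

import numpy as np
from scipy import stats

# Hypothetical measurements for three treatments
groups = {
    "A": np.array([4.1, 5.0, 4.7, 5.3, 4.9]),
    "B": np.array([5.8, 6.1, 5.5, 6.4, 6.0]),
    "C": np.array([4.9, 5.2, 5.6, 5.1, 5.4]),
}

pairs = list(combinations(groups, 2))
alpha = 0.05
# Bonferroni: divide the significance level by the number of comparisons
alpha_corrected = alpha / len(pairs)

results = {}
for a, b in pairs:
    t_stat, p = stats.ttest_ind(groups[a], groups[b])
    results[(a, b)] = p
    verdict = "differ" if p < alpha_corrected else "no evidence of a difference"
    print(f"{a} vs {b}: p = {p:.4f} ({verdict})")
```

The correction keeps the overall chance of a false positive near `alpha` even though several tests are run.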

## $\chi^2$ Test

“ANOVA for non-interval data”

Given: data in an $n \times m$ frequency table (e.g. $n$ treatments, $m$ variables)

Assumes:

1. Non-parametric, hence no assumption of normality
2. Reasonable sample size (preferably >50, although some say >20)
3. Reasonable numbers in each cell

Calculates whether the data fits a given distribution

Basis: computes the sum of the squared (Observed - Expected) differences, each divided by the Expected value

Calculate an expected value (mean) for each column

Calculate $\chi^2$:

$\chi^2 = \sum_{i=1}^{n}\frac{(O_i-E_i)^2}{E_i}$

where $O_i$ is an observed frequency and $E_i$ is the expected frequency asserted by the null hypothesis

Compare with the lookup value for a given significance level and degrees of freedom
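The procedure can be sketched for a test of independence on a small frequency table. The table below is hypothetical, and taking the expected counts as (row total × column total) / grand total is the usual choice for an independence test, assumed here for illustration. The manual result is compared with `scipy.stats.chi2_contingency`.

```python
import numpy as np
from scipy import stats

# Hypothetical 2 x 3 frequency table: rows are treatments, columns are outcomes
observed = np.array([
    [30, 14, 6],
    [20, 18, 12],
])

# Expected counts if rows and columns were independent
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected = row_totals * col_totals / observed.sum()

# Sum over all cells of (O - E)^2 / E
chi2_manual = ((observed - expected) ** 2 / expected).sum()
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

# "Lookup value" at the 5% significance level, and the p-value
critical = stats.chi2.ppf(0.95, dof)
p_manual = stats.chi2.sf(chi2_manual, dof)

# The same test in one call
chi2_scipy, p_scipy, dof_scipy, expected_scipy = stats.chi2_contingency(
    observed, correction=False
)

print(f"chi2 = {chi2_manual:.3f}, critical value = {critical:.3f}, p = {p_manual:.4f}")
```

The null hypothesis is rejected at the chosen level only if the computed $\chi^2$ exceeds the critical value.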

Get to know these and others: https://docs.scipy.org/doc/scipy/reference/stats.html

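#### Spearman’s Rank Coefficient $\rho$:

$\rho=1-\frac{6\sum{(x_i-y_i)^2}}{n(n^2-1)}$

where $x_i$ and $y_i$ are the ranks of the $i$-th observation in the two variables and $n$ is the number of observations (the formula assumes no tied ranks)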
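A minimal sketch of the rank formula on hypothetical data without ties, compared with `scipy.stats.spearmanr`. Ranks are obtained with `scipy.stats.rankdata`; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Hypothetical paired observations (no tied values)
x = np.array([10, 20, 30, 40, 50, 60])
y = np.array([12, 25, 22, 48, 41, 70])

# Replace each value by its rank within its own variable
rank_x = stats.rankdata(x)
rank_y = stats.rankdata(y)

n = len(x)
d_squared = ((rank_x - rank_y) ** 2).sum()
rho_manual = 1 - 6 * d_squared / (n * (n ** 2 - 1))

# SciPy's implementation, which also handles ties
rho_scipy, p_value = stats.spearmanr(x, y)
print(f"rho (manual) = {rho_manual:.4f}, rho (scipy) = {rho_scipy:.4f}")
```

With tied values the shortcut formula no longer matches exactly, so the library call is the safer route for real data.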
#### Kendall’s $\tau$

$\tau=\frac{(\textrm{num. concordant ranked pairs})-(\textrm{num. discordant ranked pairs})}{\binom{n}{2}}$
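A minimal sketch that counts concordant and discordant pairs directly, on hypothetical data without ties, and compares the result with `scipy.stats.kendalltau`. SciPy computes the tie-corrected variant by default, which coincides with the formula above when there are no ties.

```python
from itertools import combinations
from math import comb

from scipy import stats

# Hypothetical paired observations (no tied values)
x = [10, 20, 30, 40, 50, 60]
y = [12, 25, 22, 48, 41, 70]

concordant = 0
discordant = 0
for i, j in combinations(range(len(x)), 2):
    # A pair is concordant when x and y move in the same direction
    direction = (x[i] - x[j]) * (y[i] - y[j])
    if direction > 0:
        concordant += 1
    elif direction < 0:
        discordant += 1

tau_manual = (concordant - discordant) / comb(len(x), 2)

tau_scipy, p_value = stats.kendalltau(x, y)
print(f"tau (manual) = {tau_manual:.4f}, tau (scipy) = {tau_scipy:.4f}")
```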