## Inferential statistics (testing hypotheses)

### Consequently, null hypothesis 5 is rejected.

A related criticism is that a significant rejection of a null hypothesis might not be biologically meaningful, if the difference is too small to matter. For example, in the chicken-sex experiment, having a treatment that produced 49.9% male chicks might be significantly different from 50%, but it wouldn't be enough to make farmers want to buy your treatment. These critics say you should estimate the effect size and put a confidence interval on it, not estimate a *P* value. So the goal of your chicken-sex experiment should not be to say "Chocolate gives a proportion of males that is significantly less than 50% (*P*=0.015)" but to say "Chocolate produced 36.1% males with a 95% confidence interval of 25.9 to 47.4%." For the chicken-feet experiment, you would say something like "The difference between males and females in mean foot size is 2.45 mm, with a confidence interval on the difference of ±1.98 mm."
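To make the contrast between the two styles of reporting concrete, here is a minimal Python sketch that computes both a one-tailed exact binomial *P* value and an interval estimate for a proportion. The counts (26 males out of 72 chicks) are hypothetical and are not taken from the text, and the Wilson score interval is only one of several ways to put a confidence interval on a proportion, so the printed numbers need not match the figures quoted above.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_lower_tail(k, n, p=0.5):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def wilson_interval(k, n, confidence=0.95):
    """Wilson score interval for the binomial proportion k / n."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    p_hat = k / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical counts, assumed for illustration only.
males, total = 26, 72

p_value = binom_lower_tail(males, total)
low, high = wilson_interval(males, total)

# P-value style of reporting
print(f"P(X <= {males}) if the true proportion is 50%: {p_value:.3f}")
# Effect-size style of reporting
print(f"Observed {males / total:.1%} males, 95% CI {low:.1%} to {high:.1%}")
```

The second line of output is the kind of statement the critics prefer: it tells the reader how big the effect is and how precisely it was measured, not just whether it differs from 50%.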

### the null hypothesis is rejected when it is true b.


A Bayesian would insist that you put in numbers just how likely you think the null hypothesis and various values of the alternative hypothesis are, before you do the experiment, and I'm not sure how that is supposed to work in practice for most experimental biology. But the general concept is a valuable one: as Carl Sagan summarized it, "Extraordinary claims require extraordinary evidence."

## test statistic = _______________.

This criticism only applies to two-tailed tests, where the null hypothesis is "Things are exactly the same" and the alternative is "Things are different." Presumably these critics think it would be okay to do a one-tailed test with a null hypothesis like "Foot length of male chickens is the same as, or less than, that of females," because the null hypothesis that male chickens have smaller feet than females could be true. So if you're worried about this issue, you could think of a two-tailed test, where the null hypothesis is that things are the same, as shorthand for doing two one-tailed tests. A significant rejection of the null hypothesis in a two-tailed test would then be the equivalent of rejecting one of the two one-tailed null hypotheses.
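The idea of a two-tailed test as shorthand for two one-tailed tests can be sketched in a few lines. The example below uses an exact binomial test with a null proportion of 0.5 as a stand-in (the chicken-feet comparison itself would need a test on means), and it assumes each one-tailed test is run at half the overall significance level. The function names are illustrative.

```python
from math import comb


def binom_pmf(i, n, p=0.5):
    """P(X = i) for X ~ Binomial(n, p)."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)


def one_tailed_p_values(k, n, p=0.5):
    """Return (P(X <= k), P(X >= k)) for X ~ Binomial(n, p)."""
    lower = sum(binom_pmf(i, n, p) for i in range(k + 1))
    upper = sum(binom_pmf(i, n, p) for i in range(k, n + 1))
    return lower, upper


def two_tailed_rejects(k, n, alpha=0.05):
    """Treat a two-tailed test as two one-tailed tests, each at alpha / 2.

    The two-tailed null is rejected when either one-tailed null
    ("proportion >= 0.5" or "proportion <= 0.5") is rejected.
    """
    lower, upper = one_tailed_p_values(k, n)
    return lower <= alpha / 2 or upper <= alpha / 2


# Hypothetical outcomes out of 10 trials.
for successes in (0, 5, 10):
    print(successes, two_tailed_rejects(successes, 10))
```

Whichever tail triggers the rejection also tells you the direction of the difference, which is the practical content of rejecting one of the two one-tailed null hypotheses.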

## Does the test statistic (c) fall in the critical region (d)?

Which alternative hypothesis you choose in setting up your hypothesis test depends on what you’re interested in concluding, should you have enough evidence to refute the null hypothesis (the claim). The alternative hypothesis should be decided upon before collecting or looking at any data, so as not to influence the results.
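The effect of that choice on the *P* value can be illustrated with a small sketch. It assumes the test statistic is a z score that has already been computed and that the standard normal distribution is an adequate reference distribution; the function name and the labels `"greater"`, `"less"`, and `"two-sided"` are illustrative choices, not part of the text.

```python
from statistics import NormalDist


def z_test_p_value(z, alternative):
    """P value for a z statistic under the chosen alternative hypothesis."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":
        # Evidence against the null lies in the upper tail only.
        return 1 - std_normal.cdf(z)
    if alternative == "less":
        # Evidence against the null lies in the lower tail only.
        return std_normal.cdf(z)
    if alternative == "two-sided":
        # Evidence against the null lies in both tails.
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


# The same hypothetical statistic gives three different P values.
for alt in ("greater", "less", "two-sided"):
    print(alt, round(z_test_p_value(1.5, alt), 4))
```

Because the same data can look significant under one alternative and not under another, fixing the alternative before looking at the data keeps the choice from being tuned to the result.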

## One can never prove the truth of a statistical (null) hypothesis.

A fairly common criticism of the hypothesis-testing approach to statistics is that the null hypothesis will always be false, if you have a big enough sample size. In the chicken-feet example, critics would argue that if you had an infinite sample size, it is impossible that male chickens would have *exactly* the same average foot size as female chickens. Therefore, since you know before doing the experiment that the null hypothesis is false, there's no point in testing it.
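The critics' point about sample size can be seen numerically. The sketch below holds a tiny observed departure from the null fixed (49.9% versus 50%, a hypothetical value) and shows how the two-sided *P* value from a normal-approximation test of a proportion shrinks as the sample grows. It is an illustration under those assumptions, not an analysis of real data.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(p_hat, n, p_null=0.5):
    """Two-sided P value for an observed proportion (normal approximation)."""
    standard_error = sqrt(p_null * (1 - p_null) / n)
    z = (p_hat - p_null) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same hypothetical effect, increasingly large samples.
for n in (10**4, 10**6, 10**8):
    print(f"n = {n:>11,}  P = {two_sided_p(0.499, n):.4g}")
```

With ten thousand observations the difference is nowhere near significant, and with a hundred million it is overwhelmingly so, even though the effect itself has not changed.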

## failing to reject the null hypothesis when it is false.

If you only want to see whether the time turns out to be greater than what the company claims (that is, whether the company is falsely advertising its quick prep time), you use the greater-than alternative, and your two hypotheses are