## Hypothesis testing

Every null hypothesis includes a condition of equality (=, ≤, or ≥); the alternative hypothesis never does.
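For a population mean μ, for instance, the three standard pairings take the following forms (standard textbook notation; the symbols are ours, not from this excerpt):

```latex
H_0: \mu = \mu_0   \quad \text{vs.} \quad H_A: \mu \neq \mu_0  % two-sided
H_0: \mu \le \mu_0 \quad \text{vs.} \quad H_A: \mu > \mu_0     % one-sided (upper)
H_0: \mu \ge \mu_0 \quad \text{vs.} \quad H_A: \mu < \mu_0     % one-sided (lower)
```

In every case, it is the null hypothesis that carries the equality.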

### Compute the expected frequencies and the test statistic.

We now compute the expected frequencies using the sample size and the proportions specified in the null hypothesis. We then substitute the sample data (observed frequencies) into the formula for the test statistic identified in Step 2, organizing the computations in a table.
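The table and the formula from Step 2 are not reproduced in this excerpt. Assuming the test statistic is the chi-square goodness-of-fit statistic (which matches the use of hypothesized proportions here), the computations take this form, where n is the sample size, p_i is the proportion for category i under the null hypothesis, and O_i and E_i are the observed and expected frequencies:

```latex
E_i = n \, p_i, \qquad \chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}
```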

### (a) Identify the null hypothesis and the alternative hypothesis.

Which alternative hypothesis you choose in setting up your hypothesis test depends on what you’re interested in concluding, should you have enough evidence to refute the null hypothesis (the claim). The alternative hypothesis should be decided upon before collecting or looking at any data, so as not to influence the results.

Whilst there is relatively little justification for why a significance level of 0.05 is used rather than, say, 0.01 or 0.10, it is widely used in academic research. However, if you want to be particularly confident in your results, you can set a more stringent level of 0.01 (a 1 in 100 chance or less).


How you want to "summarize" the exam performances will determine how you write a more specific null and alternative hypothesis. For example, you could compare the **mean** exam performance of each group (i.e., the "seminar" group and the "lectures-only" group). This is what we will demonstrate here, but other options include comparing the **distributions** or the **medians**, amongst other things. As such, we can state:

**Null hypothesis (H₀):** the mean exam performance of the "seminar" group and the "lectures-only" group is equal in the population.

**Alternative hypothesis (Hₐ):** the mean exam performance of the "seminar" group and the "lectures-only" group is not equal in the population.
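As a concrete illustration, here is a minimal sketch of how such a comparison of means might be run in Python, using Welch's two-sample *t*-test from SciPy. The exam scores below are made up for illustration and are not from the original example:

```python
import numpy as np
from scipy import stats

# Hypothetical exam scores for each teaching method
# (made-up data, not from the original example).
seminar = np.array([72, 85, 78, 90, 66, 81, 74, 88])
lectures_only = np.array([70, 75, 68, 80, 64, 72, 69, 77])

# Welch's two-sample t-test: compares the two group means
# without assuming the groups have equal variances.
t_stat, p_value = stats.ttest_ind(seminar, lectures_only, equal_var=False)

print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```

Welch's variant is used here because it does not require the two groups to have equal variances; with `equal_var=True` you would get the classic pooled-variance *t*-test instead.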


Now that you have identified the null and alternative hypotheses, you need to find evidence and develop a strategy for declaring your "support" for either the null or alternative hypothesis. We can do this using some statistical theory and some arbitrary cut-off points. Both these issues are dealt with next.

### Significance levels and the *p*-value

The **level of statistical significance** is often expressed as the so-called *p*-value. Depending on the statistical test you have chosen, you will calculate a probability (i.e., the *p*-value) of observing your sample results (or more extreme) **given that the null hypothesis is true**. Another way of phrasing this is to consider the probability that a difference in a mean score (or other statistic) could have arisen based on the assumption that there really is no difference. Let us consider this statement with respect to our example where we are interested in the difference in mean exam performance between two different teaching methods. If there really is no difference between the two teaching methods in the population (i.e., given that the null hypothesis is true), how likely would it be to see a difference in the mean exam performance between the two teaching methods as large as (or larger than) that which has been observed in your sample?
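In symbols (our notation, not the excerpt's), for a two-sided test based on a statistic T with observed value t_obs, the *p*-value is:

```latex
p = \Pr\!\left( |T| \ge |t_{\text{obs}}| \;\middle|\; H_0 \text{ is true} \right)
```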


So, you might get a *p*-value such as 0.03 (i.e., *p* = .03). This means that there is a 3% chance of finding a difference as large as (or larger than) the one in your study given that the null hypothesis is true. However, you want to know whether this is "statistically significant". Typically, if there was a 5% or less chance (5 times in 100 or less) that the difference in the mean exam performance between the two teaching methods (or whatever statistic you are using) is as different as observed given the null hypothesis is true, you would reject the null hypothesis and accept the alternative hypothesis. Alternatively, if the chance was greater than 5% (more than 5 times in 100), you would fail to reject the null hypothesis and would not accept the alternative hypothesis; note that failing to reject the null hypothesis is not the same as proving it true, since one can never prove the truth of a statistical (null) hypothesis. As such, in this example where *p* = .03, we would reject the null hypothesis and accept the alternative hypothesis. We reject it because, if the null hypothesis were true, a result as extreme as the one we obtained would occur only 3% of the time; this is rare enough (i.e., below the 5% cut-off) for us to be confident that it was the two teaching methods, rather than chance, that had an effect on exam performance.
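The decision rule itself is mechanical once the *p*-value is in hand. Here is a minimal sketch in Python, reusing the *p* = .03 and 5% cut-off from this example:

```python
# Decision rule: reject H0 when the p-value is at or below the chosen
# significance level (alpha). Values mirror the example above:
# p = 0.03 against the conventional alpha = 0.05.
alpha = 0.05
p_value = 0.03

if p_value <= alpha:
    print(f"p = {p_value} <= alpha = {alpha}: reject H0 "
          "in favour of the alternative hypothesis.")
else:
    print(f"p = {p_value} > alpha = {alpha}: fail to reject H0.")
```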