In hypothesis tests, we’ve always set up a null hypothesis and an alternative hypothesis.
The null hypothesis might state that the business system works or it might tell us that nothing has changed in our business system. The alternative hypothesis might tell us that the business system is broken.
Let’s use a special type of disease screening test as an example.
This disease screening would provide a reading based on your blood.
The average reading is 200.
People that get a reading over 225 get a positive test result.
This would indicate they have the disease. If we’re going to equate this to a hypothesis test, we would say the disease screening had two hypotheses.
The null hypothesis would state that everything’s okay. The person being tested does not have the disease.
The alternative hypothesis would state that the person being tested does in fact have the disease. Let’s establish an alpha of 0.02, or 2%, and let’s say the readings are normally distributed. So if we’re going to look at this on a normal distribution, we would say that 200 is the mean. Anything to the right of 225 would be considered a positive result for disease.
Our null hypothesis is no disease, so left of 225, we do not reject the null hypothesis. The test indicates that the patient does not have the disease. But, to the right of 225, we would reject the null hypothesis.
The test indicates the patient does have the disease. Nonetheless, these disease tests are not perfect. Remember, our alpha is 2%, which means about 2% of people who do not have the disease will still get a reading above 225. So a patient with a positive test may not actually have the disease. This is called a false positive result (Type 1 error).
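To make the 2% concrete, here is a minimal sketch that ties together the three numbers used so far: the average reading of 200 (taken as the mean under the null hypothesis), the cutoff of 225, and the alpha of 0.02. The standard deviation is not given in the text, so the code works out the value those three numbers imply; treat it as an illustration, not a property of any real screening test.

```python
from statistics import NormalDist

MEAN_NO_DISEASE = 200.0  # average reading, taken as the mean under the null
CUTOFF = 225.0           # readings above this count as a positive test
ALPHA = 0.02             # chosen Type 1 error rate

# z-score that leaves ALPHA in the upper tail of a standard normal curve
z_cutoff = NormalDist().inv_cdf(1 - ALPHA)

# ASSUMPTION: the text gives no standard deviation, so we derive the one
# that makes the cutoff of 225 correspond exactly to alpha = 0.02.
implied_sd = (CUTOFF - MEAN_NO_DISEASE) / z_cutoff

no_disease = NormalDist(MEAN_NO_DISEASE, implied_sd)

# Probability that a person without the disease still tests positive
false_positive_rate = 1 - no_disease.cdf(CUTOFF)

print(f"z at cutoff: {z_cutoff:.3f}")
print(f"implied standard deviation: {implied_sd:.2f}")
print(f"P(reading > {CUTOFF:.0f} | no disease): {false_positive_rate:.3f}")
```

Under these assumptions the implied standard deviation comes out at roughly 12, and the area to the right of 225 is the 2% of healthy people who would receive a false positive.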
It’s also possible that someone scores less than 225, a negative test, but actually has the disease. This is called a false negative result (Type 2 error).
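The false negative rate depends on how readings are distributed among people who do have the disease, which the example does not specify. The sketch below therefore uses made-up values (a mean of 240 and a standard deviation of 12 for diseased patients) purely to show how the calculation would go; the function name `type_two_error_rate` is also just an illustrative choice.

```python
from statistics import NormalDist

CUTOFF = 225.0  # readings above this count as a positive test

# ASSUMPTION: both numbers below are hypothetical. The text does not say
# how readings are distributed among people who have the disease.
DISEASED_MEAN = 240.0
DISEASED_SD = 12.0


def type_two_error_rate(cutoff: float, mean: float, sd: float) -> float:
    """Probability that a diseased person reads at or below the cutoff."""
    return NormalDist(mean, sd).cdf(cutoff)


beta = type_two_error_rate(CUTOFF, DISEASED_MEAN, DISEASED_SD)
power = 1 - beta  # chance the test correctly flags a diseased person

# Raising the cutoff lets more diseased patients slip through
beta_stricter = type_two_error_rate(235.0, DISEASED_MEAN, DISEASED_SD)

print(f"Type 2 error rate at cutoff 225: {beta:.3f}")
print(f"Power at cutoff 225: {power:.3f}")
print(f"Type 2 error rate at cutoff 235: {beta_stricter:.3f}")
```

With these hypothetical numbers, moving the cutoff higher would reduce false positives but increase false negatives, which is the trade-off between the two error types.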
Often, statisticians lay out the possible outcomes in a two-by-two matrix.
At the top, we see the actual truth. Some of the people tested do not have disease. Some people do have disease. Along the left side, we have two possible outcomes of the test. Positive test indicates disease. Negative test indicates no disease.
Now, let’s look at the possible results. If the test comes back negative and the patient does not have disease, the disease screening test was correct, and thus, our hypothesis test worked. Also, if the test comes back positive and the patient does have disease, again, the screening test worked.
That leaves the two other quadrants. If a person gets a positive result but does not have the disease, this would be a false positive. In statistics, we also call this a type one error.
The opposite is also possible. A person gets a negative result, but they do have disease. This would be called a false negative. In statistics, we also call this a type two error. Type one errors are telling healthy people that they have disease. So, in that case, maybe our test is too sensitive. Type two errors are telling sick people they’re healthy. This could indicate our test is not sensitive enough.
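The four quadrants of the matrix can be written as a small lookup on two yes/no facts: what the test said and what is actually true. The sketch below does that and tallies a handful of invented patients; the readings and disease statuses are made up for illustration, and the helper name `classify` is an assumption of this example.

```python
from collections import Counter

CUTOFF = 225  # readings above this count as a positive test


def classify(test_positive: bool, has_disease: bool) -> str:
    """Return the quadrant of the two-by-two matrix for one patient."""
    if test_positive and has_disease:
        return "true positive"
    if test_positive and not has_disease:
        return "false positive (type one error)"
    if not test_positive and has_disease:
        return "false negative (type two error)"
    return "true negative"


# ASSUMPTION: invented patients as (reading, actually has the disease)
patients = [
    (190, False),
    (231, False),
    (240, True),
    (218, True),
    (205, False),
]

tally = Counter(classify(reading > CUTOFF, sick) for reading, sick in patients)

for outcome, count in tally.items():
    print(f"{outcome}: {count}")
```

In this toy data set, one healthy patient lands in the type one quadrant and one sick patient lands in the type two quadrant, while the other three are classified correctly.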
In both cases, we’re concerned about the quality of the test and the testing kits, but the types of errors and their possible causes may be very different. Hypothesis tests, even when they’re done the right way, can be flawed. So it’s important to understand that a hypothesis test might make a mistake. Knowing the different types of errors, type one and type two, can help you in developing and interpreting your hypothesis tests and their results.