Statistics By Jim
Making statistics intuitive
Statistical Hypothesis Testing Overview
By Jim Frost 59 Comments
In this blog post, I explain why you need to use statistical hypothesis testing and help you navigate the essential terminology. Hypothesis testing is a crucial procedure to perform when you want to make inferences about a population using a random sample. These inferences include estimating population properties such as the mean, differences between means, proportions, and the relationships between variables.
This post provides an overview of statistical hypothesis testing. If you need to perform hypothesis tests, consider getting my book, Hypothesis Testing: An Intuitive Guide.
Why You Should Perform Statistical Hypothesis Testing
Hypothesis testing is a form of inferential statistics that allows us to draw conclusions about an entire population based on a representative sample. You gain tremendous benefits by working with a sample. In most cases, it is simply impossible to observe the entire population to understand its properties. The only alternative is to collect a random sample and then use statistics to analyze it.
While samples are much more practical and less expensive to work with, there are trade-offs. When you estimate the properties of a population from a sample, the sample statistics are unlikely to equal the actual population value exactly. For instance, your sample mean is unlikely to equal the population mean. The difference between the sample statistic and the population value is the sampling error.
Differences that researchers observe in samples might be due to sampling error rather than representing a true effect at the population level. If sampling error causes the observed difference, the next time someone performs the same experiment the results might be different. Hypothesis testing incorporates estimates of the sampling error to help you make the correct decision. Learn more about Sampling Error.
For example, if you are studying the proportion of defects produced by two manufacturing methods, any difference you observe between the two sample proportions might be sample error rather than a true difference. If the difference does not exist at the population level, you won’t obtain the benefits that you expect based on the sample statistics. That can be a costly mistake!
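To make the defect-proportion example concrete, here is a minimal simulation sketch (all numbers are hypothetical) showing that even when two manufacturing methods share the same true defect rate, sampling error alone almost always produces an apparent difference between the two sample proportions:

```python
# Hypothetical illustration: both methods have the SAME true defect rate,
# so any observed difference between sample proportions is pure sampling error.
import random

random.seed(1)
true_defect_rate = 0.05   # identical for both methods: no real effect
n_items = 500             # items inspected per method

def sample_proportion():
    """Proportion of defective items in one random sample of n_items."""
    return sum(random.random() < true_defect_rate for _ in range(n_items)) / n_items

# Compare two independent samples, 1000 times over.
diffs = [abs(sample_proportion() - sample_proportion()) for _ in range(1000)]

# Even with no true difference, the observed gap is rarely exactly zero.
print(f"mean |difference| between sample proportions: {sum(diffs)/len(diffs):.4f}")
```

Hypothesis tests exist precisely to judge whether an observed difference is larger than this kind of random scatter can plausibly explain.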
Let’s cover some basic hypothesis testing terms that you need to know.
Background information: Difference between Descriptive and Inferential Statistics and Populations, Parameters, and Samples in Inferential Statistics
Hypothesis Testing
Hypothesis testing is a statistical analysis that uses sample data to assess two mutually exclusive theories about the properties of a population. Statisticians call these theories the null hypothesis and the alternative hypothesis. A hypothesis test assesses your sample statistic and factors in an estimate of the sampling error to determine which hypothesis the data support.
When you can reject the null hypothesis, the results are statistically significant, and your data support the theory that an effect exists at the population level.
The effect is the difference between the population value and the null hypothesis value. The effect is also known as the population effect or the difference. For example, the mean difference in health outcomes between a treatment group and a control group is the effect.
Typically, you do not know the size of the actual effect. However, you can use a hypothesis test to help you determine whether an effect exists and to estimate its size. Hypothesis tests convert your sample effect into a test statistic, which the test evaluates for statistical significance. Learn more about Test Statistics.
An effect can be statistically significant, but that doesn’t necessarily indicate that it is important in a real-world, practical sense. For more information, read my post about Statistical vs. Practical Significance .
Null Hypothesis
The null hypothesis is one of two mutually exclusive theories about the properties of the population in hypothesis testing. Typically, the null hypothesis states that there is no effect (i.e., the effect size equals zero). The null is often signified by H0.
In all hypothesis testing, the researchers are testing an effect of some sort. The effect can be the effectiveness of a new vaccination, the durability of a new product, the proportion of defects in a manufacturing process, and so on. There is some benefit or difference that the researchers hope to identify.
However, it’s possible that there is no effect or no difference between the experimental groups. In statistics, we call this lack of an effect the null hypothesis. Therefore, if you can reject the null, you can favor the alternative hypothesis, which states that the effect exists (doesn’t equal zero) at the population level.
You can think of the null as the default theory that requires sufficiently strong evidence against it before you can reject it.
For example, in a 2-sample t-test, the null often states that the difference between the two means equals zero.
When you can reject the null hypothesis, your results are statistically significant. Learn more about Statistical Significance: Definition & Meaning.
Related post: Understanding the Null Hypothesis in More Detail
Alternative Hypothesis
The alternative hypothesis is the other theory about the properties of the population in hypothesis testing. Typically, the alternative hypothesis states that a population parameter does not equal the null hypothesis value. In other words, there is a non-zero effect. If your sample contains sufficient evidence, you can reject the null and favor the alternative hypothesis. The alternative is often identified with H1 or HA.
For example, in a 2-sample t-test, the alternative often states that the difference between the two means does not equal zero.
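The 2-sample t-test logic described above can be sketched with only the standard library (the data values here are made up for illustration): the null says the difference between the two means is zero, and the test statistic measures how far the observed difference sits from zero in standard-error units.

```python
# Minimal sketch of a pooled-variance 2-sample t statistic (hypothetical data).
from statistics import mean, stdev
from math import sqrt

group_a = [23.1, 25.4, 24.8, 26.0, 24.2, 25.1]
group_b = [21.9, 23.0, 22.4, 23.8, 22.1, 23.3]

# Null: difference between the two population means equals zero.
# Alternative (two-tailed): the difference does not equal zero.
n1, n2 = len(group_a), len(group_b)
pooled_var = ((n1 - 1) * stdev(group_a) ** 2
              + (n2 - 1) * stdev(group_b) ** 2) / (n1 + n2 - 2)
t_stat = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
df = n1 + n2 - 2

print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

A large |t| relative to the t distribution with these degrees of freedom is what lets the test reject the null.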
You can specify either a one- or two-tailed alternative hypothesis:
- If you perform a two-tailed hypothesis test, the alternative states that the population parameter does not equal the null value. For example, when the alternative hypothesis is HA: μ ≠ 0, the test can detect differences both greater than and less than the null value.
- A one-tailed alternative has more power to detect an effect, but it can test for a difference in only one direction. For example, HA: μ > 0 can only test for differences that are greater than zero.
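The one- versus two-tailed distinction shows up directly in how the p-value is computed. A small sketch using the standard normal distribution (the z value is hypothetical) makes the relationship explicit:

```python
# How the alternative hypothesis shapes the p-value for a z test statistic.
from statistics import NormalDist

z = 1.80                   # hypothetical observed test statistic
std_normal = NormalDist()  # mean 0, standard deviation 1

p_upper_tail = 1 - std_normal.cdf(z)             # HA: mu > mu0 (one-tailed)
p_two_tailed = 2 * (1 - std_normal.cdf(abs(z)))  # HA: mu != mu0 (two-tailed)

# The one-tailed test concentrates its power in one direction, so its
# p-value is half the two-tailed p-value for the same z.
print(f"one-tailed p = {p_upper_tail:.3f}, two-tailed p = {p_two_tailed:.3f}")
```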
Related posts: Understanding T-tests and One-Tailed and Two-Tailed Hypothesis Tests Explained
P-Values

P-values are the probability that you would obtain the effect observed in your sample, or larger, if the null hypothesis is correct. In simpler terms, p-values tell you how strongly your sample data contradict the null. Lower p-values represent stronger evidence against the null. You use p-values in conjunction with the significance level to determine whether your data favor the null or alternative hypothesis.
Related post: Interpreting P-values Correctly
Significance Level (Alpha)
The significance level, also known as alpha or α, is an evidentiary standard that you set before the study: it is the probability of rejecting the null hypothesis when it is actually true. For instance, a significance level of 0.05 signifies a 5% risk of deciding that an effect exists when it does not exist.
Use p-values and significance levels together to help you determine which hypothesis the data support. If the p-value is less than your significance level, you can reject the null and conclude that the effect is statistically significant. In other words, the evidence in your sample is strong enough to be able to reject the null hypothesis at the population level.
Related posts: Graphical Approach to Significance Levels and P-values and Conceptual Approach to Understanding Significance Levels
Types of Errors in Hypothesis Testing
Statistical hypothesis tests are not 100% accurate because they use a random sample to draw conclusions about entire populations. There are two types of errors related to drawing an incorrect conclusion.
- False positives: You reject a null that is true. Statisticians call this a Type I error. The Type I error rate equals your significance level or alpha (α).
- False negatives: You fail to reject a null that is false. Statisticians call this a Type II error. Generally, you do not know the Type II error rate. However, it is a larger risk when you have a small sample size, noisy data, or a small effect size. The Type II error rate is also known as beta (β).
Statistical power is the probability that a hypothesis test correctly infers that a sample effect exists in the population. In other words, the test correctly rejects a false null hypothesis. Consequently, power is inversely related to a Type II error. Power = 1 – β. Learn more about Power in Statistics .
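The Power = 1 − β relationship can be sketched numerically for a one-sample, two-tailed z test. The effect size, standard deviation, and sample size below are hypothetical illustration values, and the tiny opposite-tail probability is ignored for simplicity:

```python
# Approximate power of a two-tailed one-sample z test (hypothetical inputs).
from statistics import NormalDist
from math import sqrt

std_normal = NormalDist()
alpha = 0.05
effect = 2.0   # true population mean difference we hope to detect
sigma = 5.0    # known population standard deviation
n = 50         # sample size

# Two-tailed critical value, then the probability that the test statistic
# lands beyond it when the true effect is present.
z_crit = std_normal.inv_cdf(1 - alpha / 2)
shift = effect / (sigma / sqrt(n))           # how far the effect moves the statistic
power = 1 - std_normal.cdf(z_crit - shift)   # approximately 1 - beta

print(f"approximate power: {power:.2f}")
```

Raising n or the effect size increases the shift, which raises power and lowers the Type II error rate β, exactly as the text describes.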
Related posts: Types of Errors in Hypothesis Testing and Estimating a Good Sample Size for Your Study Using Power Analysis
Which Type of Hypothesis Test is Right for You?
There are many different types of procedures you can use. The correct choice depends on your research goals and the data you collect. Do you need to understand the mean or the differences between means? Or, perhaps you need to assess proportions. You can even use hypothesis testing to determine whether the relationships between variables are statistically significant.
To choose the proper statistical procedure, you’ll need to assess your study objectives and collect the correct type of data . This background research is necessary before you begin a study.
Related post: Hypothesis Tests for Continuous, Binary, and Count Data
Statistical tests are crucial when you want to use sample data to make conclusions about a population because these tests account for sample error. Using significance levels and p-values to determine when to reject the null hypothesis improves the probability that you will draw the correct conclusion.
To see an alternative approach to these traditional hypothesis testing methods, learn about bootstrapping in statistics!
If you want to see examples of hypothesis testing in action, I recommend the following posts that I have written:
- How Effective Are Flu Shots? This example shows how you can use statistics to test proportions.
- Fatality Rates in Star Trek. This example shows how to use hypothesis testing with categorical data.
- Busting Myths About the Battle of the Sexes. A fun example based on a Mythbusters episode that assesses continuous data using several different tests.
- Are Yawns Contagious? Another fun example inspired by a Mythbusters episode.
Reader Interactions
January 14, 2024 at 8:43 am
Hello professor Jim, how are you doing! Pls. What are the properties of a population and their examples? Thanks for your time and understanding.
January 14, 2024 at 12:57 pm
Please read my post about Populations vs. Samples for more information and examples.
Also, please note there is a search bar in the upper-right margin of my website. Use that to search for topics.
July 5, 2023 at 7:05 am
Hello, I have a question as I read your post. You say in p-values section
“P-values are the probability that you would obtain the effect observed in your sample, or larger, if the null hypothesis is correct. In simpler terms, p-values tell you how strongly your sample data contradict the null. Lower p-values represent stronger evidence against the null.”
But according to your definition of effect, the null states that an effect does not exist, correct? So what I assume you want to say is that “P-values are the probability that you would obtain the effect observed in your sample, or larger, if the null hypothesis is **incorrect**.”
July 6, 2023 at 5:18 am
Hi Shrinivas,
The correct definition of p-value is that it is a probability that exists in the context of a true null hypothesis. So, the quotation is correct in stating “if the null hypothesis is correct.”
Essentially, the p-value tells you the likelihood of your observed results (or more extreme) if the null hypothesis is true. It gives you an idea of whether your results are surprising or unusual if there is no effect.
Hence, with sufficiently low p-values, you reject the null hypothesis because it’s telling you that your sample results were unlikely to have occurred if there was no effect in the population.
I hope that helps make it more clear. If not, let me know I’ll attempt to clarify!
May 8, 2023 at 12:47 am
Thanks a lot. My best regards
May 7, 2023 at 11:15 pm
Hi Jim Can you tell me something about size effect? Thanks
May 8, 2023 at 12:29 am
Here’s a post that I’ve written about Effect Sizes that will hopefully tell you what you need to know. Please read that. Then, if you have any more specific questions about effect sizes, please post them there. Thanks!
January 7, 2023 at 4:19 pm
Hi Jim, I have only read two pages so far but I am really amazed because in few paragraphs you made me clearly understand the concepts of months of courses I received in biostatistics! Thanks so much for this work you have done it helps a lot!
January 10, 2023 at 3:25 pm
Thanks so much!
June 17, 2021 at 1:45 pm
Can you help in the following question: Rocinante36 is priced at ₹7 lakh and has been designed to deliver a mileage of 22 km/litre and a top speed of 140 km/hr. Formulate the null and alternative hypotheses for mileage and top speed to check whether the new models are performing as per the desired design specifications.
April 19, 2021 at 1:51 pm
It's indeed great to read your work on statistics.
I have a doubt regarding the one sample t-test. So as per your book on hypothesis testing with reference to page no 45, you have mentioned the difference between “the sample mean and the hypothesised mean is statistically significant”. So as per my understanding it should be quoted like “the difference between the population mean and the hypothesised mean is statistically significant”. The catch here is the hypothesised mean represents the sample mean.
Please help me understand this.
Regards Rajat
April 19, 2021 at 3:46 pm
Thanks for buying my book. I’m so glad it’s been helpful!
The test is performed on the sample but the results apply to the population. Hence, if the difference between the sample mean (observed in your study) and the hypothesized mean is statistically significant, that suggests that the population mean does not equal the hypothesized mean.
For one sample tests, the hypothesized mean is not the sample mean. It is a mean that you want to use for the test value. It usually represents a value that is important to your research. In other words, it’s a value that you pick for some theoretical/practical reasons. You pick it because you want to determine whether the population mean is different from that particular value.
I hope that helps!
November 5, 2020 at 6:24 am
Jim, you are such a magnificent statistician/economist/econometrician/data scientist etc whatever profession. Your work inspires and simplifies the lives of so many researchers around the world. I truly admire you and your work. I will buy a copy of each book you have on statistics or econometrics. Keep doing the good work. Remain ever blessed
November 6, 2020 at 9:47 pm
Hi Renatus,
Thanks so much for your very kind comments. You made my day!! I'm so glad that my website has been helpful. And, thanks so much for supporting my books! 🙂
November 2, 2020 at 9:32 pm
Hi Jim, I hope you are aware of 2019 American Statistical Association’s official statement on Statistical Significance: https://www.tandfonline.com/doi/full/10.1080/00031305.2019.1583913 In case you do not bother reading the full article, may I quote you the core message here: “We conclude, based on our review of the articles in this special issue and the broader literature, that it is time to stop using the term “statistically significant” entirely. Nor should variants such as “significantly different,” “p < 0.05,” and “nonsignificant” survive, whether expressed in words, by asterisks in a table, or in some other way."
With best wishes,
November 3, 2020 at 2:09 am
I’m definitely aware of the debate surrounding how to use p-values most effectively. However, I need to correct you on one point. The link you provide is NOT a statement by the American Statistical Association. It is an editorial by several authors.
There is considerable debate over this issue. There are problems with p-values. However, as the authors state themselves, much of the problem is over people’s mindsets about how to use p-values and their incorrect interpretations about what statistical significance does and does not mean.
If you were to read my website more thoroughly, you’d be aware that I share many of their concerns and I address them in multiple posts. One of the authors’ key points is the need to be thoughtful and conduct thoughtful research and analysis. I emphasize this aspect in multiple posts on this topic. I’ll ask you to read the following three because they all address some of the authors’ concerns and suggestions. But you might run across others to read as well.
- Five Tips for Using P-values to Avoid Being Misled
- How to Interpret P-values Correctly
- P-values and the Reproducibility of Experimental Results
September 24, 2020 at 11:52 pm
HI Jim, i just want you to know that you made explanation for Statistics so simple! I should say lesser and fewer words that reduce the complexity. All the best! 🙂
September 25, 2020 at 1:03 am
Thanks, Rene! Your kind words mean a lot to me! I’m so glad it has been helpful!
September 23, 2020 at 2:21 am
Honestly, I never understood stats during my entire M.Ed course and was another nightmare for me. But how easily you have explained each concept, I have understood stats way beyond my imagination. Thank you so much for helping ignorant research scholars like us. Looking forward to get hardcopy of your book. Kindly tell is it available through flipkart?
September 24, 2020 at 11:14 pm
I’m so happy to hear that my website has been helpful!
I checked on flipkart and it appears like my books are not available there. I’m never exactly sure where they’re available due to the vagaries of different distribution channels. They are available on Amazon in India.
- Introduction to Statistics: An Intuitive Guide (Amazon IN)
- Hypothesis Testing: An Intuitive Guide (Amazon IN)
July 26, 2020 at 11:57 am
Dear Jim I am a teacher from India . I don’t have any background in statistics, and still I should tell that in a single read I can follow your explanations . I take my entire biostatistics class for botany graduates with your explanations. Thanks a lot. May I know how I can avail your books in India
July 28, 2020 at 12:31 am
Right now my books are only available as ebooks from my website. However, soon I’ll have some exciting news about other ways to obtain it. Stay tuned! I’ll announce it on my email list. If you’re not already on it, you can sign up using the form that is in the right margin of my website.
June 22, 2020 at 2:02 pm
Also can you please let me if this book covers topics like EDA and principal component analysis?
June 22, 2020 at 2:07 pm
This book doesn’t cover principal components analysis. Although, I wouldn’t really classify that as a hypothesis test. In the future, I might write a multivariate analysis book that would cover this and others. But, that’s well down the road.
My Introduction to Statistics covers EDA. That’s the largely graphical look at your data that you often do prior to hypothesis testing. The Introduction book perfectly leads right into the Hypothesis Testing book.
June 22, 2020 at 1:45 pm
Thanks for the detailed explanation. It does clear my doubts. I saw that your book related to hypothesis testing has the topics that I am studying currently. I am looking forward to purchasing it.
Regards, Take Care
June 19, 2020 at 1:03 pm
For this particular article I did not understand a couple of statements and it would great if you could help: 1)”If sample error causes the observed difference, the next time someone performs the same experiment the results might be different.” 2)”If the difference does not exist at the population level, you won’t obtain the benefits that you expect based on the sample statistics.”
I discovered your articles by chance and now I keep coming back to read & understand statistical concepts. These articles are very informative & easy to digest. Thanks for the simplifying things.
June 20, 2020 at 9:53 pm
I’m so happy to hear that you’ve found my website to be helpful!
To answer your questions, keep in mind that a central tenet of inferential statistics is that the random sample a study draws is only one of an infinite number of possible samples it could've drawn. Each random sample produces different results. Most results will cluster around the population value, assuming the study used good methodology. However, random sampling error always exists, so population estimates from a sample almost never exactly equal the true population value.
So, imagine that we’re studying a medication and comparing the treatment and control groups. Suppose that the medicine is truly not effect and that the population difference between the treatment and control group is zero (i.e., no difference.) Despite the true difference being zero, most sample estimates will show some degree of either a positive or negative effect thanks to random sampling error. So, just because a study has an observed difference does not mean that a difference exists at the population level. So, on to your questions:
1. If the observed difference is just random error, then it makes sense that if you collected another random sample, the difference could change. It could change from negative to positive, positive to negative, more extreme, less extreme, etc. However, if the difference exists at the population level, most random samples drawn from the population will reflect that difference. If the medicine has an effect, most random samples will reflect that fact and not bounce around on both sides of zero as much.
2. This is closely related to the previous answer. Suppose there is no difference at the population level, but you approve the medicine because of the observed effect in a sample. Even though your random sample showed an effect (which was really random error), that effect doesn't exist. So, when you start using the medicine on a larger scale, people won't benefit from it. That's why it's important to separate out what is easily explained by random error versus what is not easily explained by it.
I think reading my post about how hypothesis tests work will help clarify this process. Also, in about 24 hours (as I write this), I’ll be releasing my new ebook about Hypothesis Testing!
May 29, 2020 at 5:23 am
Hi Jim, I really enjoy your blog. Can you please link me on your blog where you discuss about Subgroup analysis and how it is done? I need to use non parametric and parametric statistical methods for my work and also do subgroup analysis in order to identify potential groups of patients that may benefit more from using a treatment than other groups.
May 29, 2020 at 2:12 pm
Hi, I don’t have a specific article about subgroup analysis. However, subgroup analysis is just the dividing up of a larger sample into subgroups and then analyzing those subgroups separately. You can use the various analyses I write about on the subgroups.
Alternatively, you can include the subgroups in regression analysis as an indicator variable and include that variable as a main effect and an interaction effect to see how the relationships vary by subgroup without needing to subdivide your data. I write about that approach in my article about comparing regression lines . This approach is my preferred approach when possible.
April 19, 2020 at 7:58 am
sir is confidence interval is a part of estimation?
April 17, 2020 at 3:36 pm
Sir can u plz briefly explain alternatives of hypothesis testing? I m unable to find the answer
April 18, 2020 at 1:22 am
Assuming you want to draw conclusions about populations by using samples (i.e., inferential statistics ), you can use confidence intervals and bootstrap methods as alternatives to the traditional hypothesis testing methods.
March 9, 2020 at 10:01 pm
Hi JIm, could you please help with activities that can best teach concepts of hypothesis testing through simulation, Also, do you have any question set that would enhance students intuition why learning hypothesis testing as a topic in introductory statistics. Thanks.
March 5, 2020 at 3:48 pm
Hi Jim, I’m studying multiple hypothesis testing & was wondering if you had any material that would be relevant. I’m more trying to understand how testing multiple samples simultaneously affects your results & more on the Bonferroni Correction
March 5, 2020 at 4:05 pm
I write about multiple comparisons (aka post hoc tests) in the ANOVA context . I don’t talk about Bonferroni Corrections specifically but I cover related types of corrections. I’m not sure if that exactly addresses what you want to know but is probably the closest I have already written. I hope it helps!
January 14, 2020 at 9:03 pm
Thank you! Have a great day/evening.
January 13, 2020 at 7:10 pm
Any help would be greatly appreciated. What is the difference between The Hypothesis Test and The Statistical Test of Hypothesis?
January 14, 2020 at 11:02 am
They sound like the same thing to me. Unless this is specialized terminology for a particular field or the author was intending something specific, I’d guess they’re one and the same.
April 1, 2019 at 10:00 am
so these are the only two forms of Hypothesis used in statistical testing?
April 1, 2019 at 10:02 am
Are you referring to the null and alternative hypothesis? If so, yes, that’s those are the standard hypotheses in a statistical hypothesis test.
April 1, 2019 at 9:57 am
year very insightful post, thanks for the write up
October 27, 2018 at 11:09 pm
hi there, am upcoming statistician, out of all blogs that i have read, i have found this one more useful as long as my problem is concerned. thanks so much
October 27, 2018 at 11:14 pm
Hi Stano, you’re very welcome! Thanks for your kind words. They mean a lot! I’m happy to hear that my posts were able to help you. I’m sure you will be a fantastic statistician. Best of luck with your studies!
October 26, 2018 at 11:39 am
Dear Jim, thank you very much for your explanations! I have a question. Can I use t-test to compare two samples in case each of them have right bias?
October 26, 2018 at 12:00 pm
Hi Tetyana,
You’re very welcome!
The term “right bias” is not a standard term. Do you by chance mean right skewed distributions? In other words, if you plot the distribution for each group on a histogram they have longer right tails? These are not the symmetrical bell-shape curves of the normal distribution.
If that’s the case, yes you can as long as you exceed a specific sample size within each group. I include a table that contains these sample size requirements in my post about nonparametric vs parametric analyses .
Bias in statistics refers to cases where an estimate of a value is systematically higher or lower than the true value. If this is the case, you might be able to use t-tests, but you’d need to be sure to understand the nature of the bias so you would understand what the results are really indicating.
I hope this helps!
April 2, 2018 at 7:28 am
Simple and upto the point 👍 Thank you so much.
April 2, 2018 at 11:11 am
Hi Kalpana, thanks! And I’m glad it was helpful!
March 26, 2018 at 8:41 am
Am I correct if I say: Alpha – Probability of wrongly rejection of null hypothesis P-value – Probability of wrongly acceptance of null hypothesis
March 28, 2018 at 3:14 pm
You’re correct about alpha. Alpha is the probability of rejecting the null hypothesis when the null is true.
Unfortunately, your definition of the p-value is a bit off. The p-value has a fairly convoluted definition. It is the probability of obtaining the effect observed in a sample, or more extreme, if the null hypothesis is true. The p-value does NOT indicate the probability that either the null or alternative is true or false. Although, those are very common misinterpretations. To learn more, read my post about how to interpret p-values correctly .
March 2, 2018 at 6:10 pm
I recently started reading your blog and it is very helpful to understand each concept of statistical tests in easy way with some good examples. Also, I recommend to other people go through all these blogs which you posted. Specially for those people who have not statistical background and they are facing to many problems while studying statistical analysis.
Thank you for your such good blogs.
March 3, 2018 at 10:12 pm
Hi Amit, I’m so glad that my blog posts have been helpful for you! It means a lot to me that you took the time to write such a nice comment! Also, thanks for recommending by blog to others! I try really hard to write posts about statistics that are easy to understand.
January 17, 2018 at 7:03 am
I recently started reading your blog and I find it very interesting. I am learning statistics by my own, and I generally do many google search to understand the concepts. So this blog is quite helpful for me, as it have most of the content which I am looking for.
January 17, 2018 at 3:56 pm
Hi Shashank, thank you! And, I’m very glad to hear that my blog is helpful!
January 2, 2018 at 2:28 pm
thank u very much sir.
January 2, 2018 at 2:36 pm
You’re very welcome, Hiral!
November 21, 2017 at 12:43 pm
Thank u so much sir….your posts always helps me to be a #statistician
November 21, 2017 at 2:40 pm
Hi Sachin, you’re very welcome! I’m happy that you find my posts to be helpful!
November 19, 2017 at 8:22 pm
great post as usual, but it would be nice to see an example.
November 19, 2017 at 8:27 pm
Thank you! At the end of this post, I have links to four other posts that show examples of hypothesis tests in action. You’ll find what you’re looking for in those posts!
Comments and Questions
Hypothesis Testing Calculator
The first step in hypothesis testing is to calculate the test statistic. The formula for the test statistic depends on whether the population standard deviation (σ) is known or unknown. If σ is known, our hypothesis test is known as a z test and we use the z distribution. If σ is unknown, our hypothesis test is known as a t test and we use the t distribution. Use of the t distribution relies on the degrees of freedom, which is equal to the sample size minus one. Furthermore, if the population standard deviation σ is unknown, the sample standard deviation s is used instead. To switch from σ known to σ unknown, click on $\boxed{\sigma}$ and select $\boxed{s}$ in the Hypothesis Testing Calculator.
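The branch described above can be sketched in a few lines: compute a z statistic when σ is known and a t statistic (with df = n − 1) when σ must be estimated by the sample standard deviation s. The sample values and σ below are hypothetical:

```python
# z statistic when sigma is known, t statistic when it is estimated (df = n - 1).
from statistics import mean, stdev
from math import sqrt

sample = [51.2, 49.8, 50.5, 52.1, 48.9, 50.7, 51.5, 49.4]  # hypothetical data
mu0 = 50.0    # hypothesized population mean
sigma = 1.2   # population standard deviation, if known

n = len(sample)
x_bar = mean(sample)

z = (x_bar - mu0) / (sigma / sqrt(n))          # sigma known  -> z test
t = (x_bar - mu0) / (stdev(sample) / sqrt(n))  # sigma unknown -> t test

print(f"z = {z:.3f}, t = {t:.3f}, df = {n - 1}")
```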
| | $\sigma$ Known | $\sigma$ Unknown |
| --- | --- | --- |
| Test Statistic | $z = \dfrac{\bar{x}-\mu_0}{\sigma/\sqrt{n}}$ | $t = \dfrac{\bar{x}-\mu_0}{s/\sqrt{n}}$ |
Next, the test statistic is used to conduct the test using either the p-value approach or the critical value approach. The particular steps taken in each approach largely depend on the form of the hypothesis test: lower tail, upper tail or two-tailed. The form can easily be identified by looking at the alternative hypothesis (Ha). A less than sign in the alternative hypothesis indicates a lower tail test, a greater than sign indicates an upper tail test, and a not-equal sign indicates a two-tailed test. To switch from a lower tail test to an upper tail or two-tailed test, click on $\boxed{\geq}$ and select $\boxed{\leq}$ or $\boxed{=}$, respectively.
| Lower Tail Test | Upper Tail Test | Two-Tailed Test |
| --- | --- | --- |
| $H_0 \colon \mu \geq \mu_0$ | $H_0 \colon \mu \leq \mu_0$ | $H_0 \colon \mu = \mu_0$ |
| $H_a \colon \mu < \mu_0$ | $H_a \colon \mu > \mu_0$ | $H_a \colon \mu \neq \mu_0$ |
In the p-value approach, the test statistic is used to calculate a p-value. If the test is a lower tail test, the p-value is the probability of getting a value for the test statistic at least as small as the value from the sample. If the test is an upper tail test, the p-value is the probability of getting a value for the test statistic at least as large as the value from the sample. In a two-tailed test, the p-value is the probability of getting a value for the test statistic at least as unlikely as the value from the sample.
To test the hypothesis in the p-value approach, compare the p-value to the level of significance. If the p-value is less than or equal to the level of significance, reject the null hypothesis. If the p-value is greater than the level of significance, do not reject the null hypothesis. This method remains unchanged regardless of whether it's a lower tail, upper tail or two-tailed test. To change the level of significance, click on $\boxed{.05}$. Note that if the test statistic is given, you can calculate the p-value from the test statistic by clicking on the switch symbol twice.
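A minimal Python sketch of the p-value rules above, using the standard library's `statistics.NormalDist` for the normal CDF (the z value is a made-up example):

```python
from statistics import NormalDist

def p_value(z, tail):
    """p-value for a z statistic: tail is 'lower', 'upper', or 'two'."""
    cdf = NormalDist().cdf
    if tail == "lower":
        return cdf(z)            # area at least as small as z
    if tail == "upper":
        return 1 - cdf(z)        # area at least as large as z
    return 2 * (1 - cdf(abs(z))) # two-tailed: both extremes

z = 1.25                          # made-up test statistic
p = p_value(z, "two")
print(round(p, 4))                # 0.2113
print(p <= 0.05)                  # False → do not reject H0
```

The decision rule is the same in every case: reject the null hypothesis when p ≤ α.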
In the critical value approach, the level of significance ($\alpha$) is used to calculate the critical value. In a lower tail test, the critical value is the value of the test statistic providing an area of $\alpha$ in the lower tail of the sampling distribution of the test statistic. In an upper tail test, the critical value is the value of the test statistic providing an area of $\alpha$ in the upper tail of the sampling distribution of the test statistic. In a two-tailed test, the critical values are the values of the test statistic providing areas of $\alpha / 2$ in the lower and upper tail of the sampling distribution of the test statistic.
To test the hypothesis in the critical value approach, compare the critical value to the test statistic. Unlike the p-value approach, the method we use to decide whether to reject the null hypothesis depends on the form of the hypothesis test. In a lower tail test, if the test statistic is less than or equal to the critical value, reject the null hypothesis. In an upper tail test, if the test statistic is greater than or equal to the critical value, reject the null hypothesis. In a two-tailed test, if the test statistic is less than or equal to the lower critical value or greater than or equal to the upper critical value, reject the null hypothesis.
| Lower Tail Test | Upper Tail Test | Two-Tailed Test |
| --- | --- | --- |
| If $z \leq -z_\alpha$, reject $H_0$. | If $z \geq z_\alpha$, reject $H_0$. | If $z \leq -z_{\alpha/2}$ or $z \geq z_{\alpha/2}$, reject $H_0$. |
| If $t \leq -t_\alpha$, reject $H_0$. | If $t \geq t_\alpha$, reject $H_0$. | If $t \leq -t_{\alpha/2}$ or $t \geq t_{\alpha/2}$, reject $H_0$. |
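The z critical values in the table can be computed with the standard library's `statistics.NormalDist.inv_cdf` (a sketch for the z case only; t critical values would need a t distribution, e.g. from SciPy):

```python
from statistics import NormalDist

def z_critical(alpha, tail):
    """Critical z value: tail is 'lower', 'upper', or 'two'."""
    inv = NormalDist().inv_cdf
    if tail == "two":
        return inv(1 - alpha / 2)   # reject if |z| >= this value
    return inv(1 - alpha)           # z_alpha for a one-tailed test
                                    # (negate it for a lower tail test)

alpha = 0.05
print(round(z_critical(alpha, "upper"), 3))  # 1.645
print(round(z_critical(alpha, "two"), 3))    # 1.96
```

These match the familiar table values 1.645 (one-tailed, α = 0.05) and 1.96 (two-tailed, α = 0.05).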
When conducting a hypothesis test, there is always a chance that you come to the wrong conclusion. There are two types of errors you can make: Type I Error and Type II Error. A Type I Error is committed if you reject the null hypothesis when the null hypothesis is true. Ideally, we'd like to accept the null hypothesis when the null hypothesis is true. A Type II Error is committed if you accept the null hypothesis when the alternative hypothesis is true. Ideally, we'd like to reject the null hypothesis when the alternative hypothesis is true.
| Conclusion \ Condition | $H_0$ True | $H_a$ True |
| --- | --- | --- |
| Accept $H_0$ | Correct | Type II Error |
| Reject $H_0$ | Type I Error | Correct |
Hypothesis testing is closely related to the statistical area of confidence intervals. If the hypothesized value of the population mean is outside of the confidence interval, we can reject the null hypothesis. Confidence intervals can be found using the Confidence Interval Calculator. The calculator on this page does hypothesis tests for one population mean. Sometimes we're interested in hypothesis tests about two population means. These can be solved using the Two Population Calculator. The probability of a Type II Error can be calculated by clicking on the link at the bottom of the page.
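As a sketch of the connection to confidence intervals, the following snippet builds a 95% z-interval from made-up sample values and checks whether the hypothesized mean falls outside it:

```python
import math
from statistics import NormalDist

# Made-up sample: x̄ = 54, known σ = 8, n = 25, hypothesized mean μ0 = 50
xbar, sigma, n, mu0 = 54, 8, 25, 50

# 95% z-interval for the mean: x̄ ± z_{α/2} · σ/√n
margin = NormalDist().inv_cdf(0.975) * sigma / math.sqrt(n)
lo, hi = xbar - margin, xbar + margin

print(round(lo, 2), round(hi, 2))   # 50.86 57.14
print(not (lo <= mu0 <= hi))        # True → μ0 outside interval → reject H0
```

Rejecting H0 at α = 0.05 in a two-tailed test is equivalent to μ0 lying outside the 95% confidence interval.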
Unit 12: Significance tests (hypothesis testing)
About this unit.
Significance tests give us a formal process for using sample data to evaluate the likelihood of some claim about a population value. Learn how to conduct significance tests and calculate p-values to see how likely a sample result is to occur by random chance. You'll also see how we use p-values to make conclusions about hypotheses.
The idea of significance tests
- Simple hypothesis testing
- Idea behind hypothesis testing
- Examples of null and alternative hypotheses
- P-values and significance tests
- Comparing P-values to different significance levels
- Estimating a P-value from a simulation
- Using P-values to make conclusions
- Writing null and alternative hypotheses
- Estimating P-values from simulations
Error probabilities and power
- Introduction to Type I and Type II errors
- Type 1 errors
- Examples identifying Type I and Type II errors
- Introduction to power in significance tests
- Examples thinking about power in significance tests
- Consequences of errors and significance
- Type I vs Type II error
- Error probabilities and power
Tests about a population proportion
- Constructing hypotheses for a significance test about a proportion
- Conditions for a z test about a proportion
- Reference: Conditions for inference on a proportion
- Calculating a z statistic in a test about a proportion
- Calculating a P-value given a z statistic
- Making conclusions in a test about a proportion
- Writing hypotheses for a test about a proportion
- Calculating the test statistic in a z test for a proportion
- Calculating the P-value in a z test for a proportion
- Making conclusions in a z test for a proportion
Tests about a population mean
- Writing hypotheses for a significance test about a mean
- Conditions for a t test about a mean
- Reference: Conditions for inference on a mean
- When to use z or t statistics in significance tests
- Example calculating t statistic for a test about a mean
- Using TI calculator for P-value from t statistic
- Using a table to estimate P-value from t statistic
- Comparing P-value from t statistic to significance level
- Free response example: Significance test for a mean
- Writing hypotheses for a test about a mean
- Calculating the test statistic in a t test for a mean
- Calculating the P-value in a t test for a mean
- Making conclusions in a t test for a mean
More significance testing videos
- Hypothesis testing and p-values
- One-tailed and two-tailed tests
- Z-statistics vs. T-statistics
- Small sample hypothesis test
- Large sample proportion hypothesis testing
Statistics - Hypothesis Testing
Hypothesis testing is a formal way of checking if a hypothesis about a population is true or not.
Hypothesis Testing
A hypothesis is a claim about a population parameter .
A hypothesis test is a formal procedure to check if a hypothesis is true or not.
Examples of claims that can be checked:
The average height of people in Denmark is more than 170 cm.
The share of left handed people in Australia is not 10%.
The average income of dentists is less than the average income of lawyers.
The Null and Alternative Hypothesis
Hypothesis testing is based on making two different claims about a population parameter.
The null hypothesis (\(H_{0} \)) and the alternative hypothesis (\(H_{1}\)) are the claims.
The two claims need to be mutually exclusive , meaning only one of them can be true.
The alternative hypothesis is typically what we are trying to prove.
For example, we want to check the following claim:
"The average height of people in Denmark is more than 170 cm."
In this case, the parameter is the average height of people in Denmark (\(\mu\)).
The null and alternative hypothesis would be:
Null hypothesis : The average height of people in Denmark is 170 cm.
Alternative hypothesis : The average height of people in Denmark is more than 170 cm.
The claims are often expressed with symbols like this:
\(H_{0}\): \(\mu = 170 \: cm \)
\(H_{1}\): \(\mu > 170 \: cm \)
If the data supports the alternative hypothesis, we reject the null hypothesis and accept the alternative hypothesis.
If the data does not support the alternative hypothesis, we keep the null hypothesis.
Note: The alternative hypothesis is also referred to as (\(H_{A} \)).
The Significance Level
The significance level (\(\alpha\)) is the uncertainty we accept when rejecting the null hypothesis in the hypothesis test.
The significance level is the probability of accidentally drawing the wrong conclusion - that is, of rejecting the null hypothesis when it is actually true.
Typical significance levels are:
- \(\alpha = 0.1\) (10%)
- \(\alpha = 0.05\) (5%)
- \(\alpha = 0.01\) (1%)
A lower significance level means that the evidence in the data needs to be stronger to reject the null hypothesis.
There is no "correct" significance level - it only states the uncertainty of the conclusion.
Note: A 5% significance level means that when we reject a null hypothesis:
We expect to reject a true null hypothesis 5 out of 100 times.
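This 5% error rate can be illustrated with a small simulation: draw many samples from a population where the null hypothesis really is true (the true mean is taken to be 170 cm to match the running example, with an assumed σ of 5 cm) and count how often a two-tailed test at α = 0.05 rejects:

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(1)
crit = NormalDist().inv_cdf(0.975)   # two-tailed critical value at α = 0.05
trials, rejections = 2000, 0

for _ in range(trials):
    # H0 is true: samples come from a population with μ = 170, σ = 5
    sample = [random.gauss(170, 5) for _ in range(30)]
    t = (mean(sample) - 170) / (stdev(sample) / 30 ** 0.5)
    if abs(t) >= crit:
        rejections += 1

rate = rejections / trials
print(rate)   # close to 0.05: a true H0 is rejected about 5% of the time
```

The observed rejection rate hovers near the significance level, which is exactly what a Type I error rate of 5% means.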
The Test Statistic
The test statistic is used to decide the outcome of the hypothesis test.
The test statistic is a standardized value calculated from the sample.
Standardization means converting a statistic to a well known probability distribution .
The type of probability distribution depends on the type of test.
Common examples are:
- Standard Normal Distribution (Z): used for Testing Population Proportions
- Student's T-Distribution (T): used for Testing Population Means
Note: You will learn how to calculate the test statistic for each type of test in the following chapters.
The Critical Value and P-Value Approach
There are two main approaches used for hypothesis tests:
- The critical value approach compares the test statistic with the critical value of the significance level.
- The p-value approach compares the p-value of the test statistic with the significance level.
The Critical Value Approach
The critical value approach checks if the test statistic is in the rejection region .
The rejection region is an area of probability in the tails of the distribution.
The size of the rejection region is decided by the significance level (\(\alpha\)).
The value that separates the rejection region from the rest is called the critical value .
If the test statistic is inside this rejection region, the null hypothesis is rejected .
For example, if the test statistic is 2.3 and the critical value is 2 for a significance level (\(\alpha = 0.05\)):
We reject the null hypothesis (\(H_{0} \)) at 0.05 significance level (\(\alpha\))
The P-Value Approach
The p-value approach checks if the p-value of the test statistic is smaller than the significance level (\(\alpha\)).
The p-value of the test statistic is the area of probability in the tails of the distribution from the value of the test statistic.
If the p-value is smaller than the significance level, the null hypothesis is rejected .
The p-value directly tells us the lowest significance level where we can reject the null hypothesis.
For example, if the p-value is 0.03:
We reject the null hypothesis (\(H_{0} \)) at a 0.05 significance level (\(\alpha\))
We keep the null hypothesis (\(H_{0}\)) at a 0.01 significance level (\(\alpha\))
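A tiny sketch of that comparison in Python (the p-value 0.03 is the example value from the text):

```python
# Compare one p-value against several significance levels
p = 0.03
decisions = {
    alpha: ("reject H0" if p <= alpha else "keep H0")
    for alpha in (0.10, 0.05, 0.01)
}
print(decisions)   # rejected at 0.10 and 0.05, kept at 0.01
```

This shows why the p-value is the lowest significance level at which the null hypothesis can be rejected.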
Note: The two approaches are only different in how they present the conclusion.
Steps for a Hypothesis Test
The following steps are used for a hypothesis test:

- Check the conditions
- Define the claims
- Decide the significance level
- Calculate the test statistic
- Conclusion
One condition is that the sample is randomly selected from the population.
The other conditions depend on what type of parameter you are testing the hypothesis for.
Common parameters to test hypotheses are:
- Proportions (for qualitative data)
- Mean values (for numerical data)
You will learn the steps for both types in the following pages.
7.4.1 - Hypothesis Testing

Five Step Hypothesis Testing Procedure
In the remaining lessons, we will use the following five step hypothesis testing procedure. This is slightly different from the five step procedure that we used when conducting randomization tests.
- Check assumptions and write hypotheses. The assumptions will vary depending on the test. In this lesson we'll be confirming that the sampling distribution is approximately normal by visually examining the randomization distribution. In later lessons you'll learn more objective assumptions. The null and alternative hypotheses will always be written in terms of population parameters; the null hypothesis will always contain the equality (i.e., \(=\)).
- Calculate the test statistic. Here, we'll be using the formula below for the general form of the test statistic.
- Determine the p-value. The p-value is the area under the standard normal distribution that is more extreme than the test statistic in the direction of the alternative hypothesis.
- Make a decision. If \(p \leq \alpha\) reject the null hypothesis. If \(p>\alpha\) fail to reject the null hypothesis.
- State a "real world" conclusion. Based on your decision in step 4, write a conclusion in terms of the original research question.
General Form of a Test Statistic
When using a standard normal distribution (i.e., z distribution), the test statistic is the standardized value that is the boundary of the p-value. Recall the formula for a z score: \(z=\frac{x-\overline x}{s}\). The formula for a test statistic is similar. When conducting a hypothesis test, the sampling distribution is centered on the null parameter and its standard deviation is known as the standard error:

\(\text{test statistic}=\dfrac{\text{sample statistic}-\text{null parameter}}{\text{standard error}}\)
This formula puts our observed sample statistic on a standard scale (e.g., z distribution). A z score tells us where a score lies on a normal distribution in standard deviation units. The test statistic tells us where our sample statistic falls on the sampling distribution in standard error units.
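The general form can be sketched as a one-line function; the sample proportion and null value below are made-up illustration numbers:

```python
import math

def general_test_statistic(sample_stat, null_param, std_error):
    """(sample statistic - null parameter) / standard error."""
    return (sample_stat - null_param) / std_error

# Made-up example: sample proportion 0.56 from n = 200 against H0: p = 0.50;
# for a proportion, the standard error under H0 is sqrt(p0 * (1 - p0) / n)
se = math.sqrt(0.5 * 0.5 / 200)
print(round(general_test_statistic(0.56, 0.50, se), 3))   # 1.697
```

The result says the observed proportion sits about 1.7 standard errors above the null value.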
Null & Alternative Hypotheses | Definitions, Templates & Examples
Published on May 6, 2022 by Shaun Turney. Revised on June 22, 2023.
The null and alternative hypotheses are two competing claims that researchers weigh evidence for and against using a statistical test :
- Null hypothesis ( H 0 ): There’s no effect in the population .
- Alternative hypothesis ( H a or H 1 ) : There’s an effect in the population.
Table of contents
- Answering your research question with hypotheses
- What is a null hypothesis?
- What is an alternative hypothesis?
- Similarities and differences between null and alternative hypotheses
- How to write null and alternative hypotheses
- Other interesting articles
- Frequently asked questions
The null and alternative hypotheses offer competing answers to your research question . When the research question asks “Does the independent variable affect the dependent variable?”:
- The null hypothesis ( H 0 ) answers “No, there’s no effect in the population.”
- The alternative hypothesis ( H a ) answers “Yes, there is an effect in the population.”
The null and alternative are always claims about the population. That’s because the goal of hypothesis testing is to make inferences about a population based on a sample . Often, we infer whether there’s an effect in the population by looking at differences between groups or relationships between variables in the sample. It’s critical for your research to write strong hypotheses .
You can use a statistical test to decide whether the evidence favors the null or alternative hypothesis. Each type of statistical test comes with a specific way of phrasing the null and alternative hypothesis. However, the hypotheses can also be phrased in a general way that applies to any test.
The null hypothesis is the claim that there’s no effect in the population.
If the sample provides enough evidence against the claim that there’s no effect in the population ( p ≤ α), then we can reject the null hypothesis . Otherwise, we fail to reject the null hypothesis.
Although “fail to reject” may sound awkward, it’s the only wording that statisticians accept. Be careful not to say you “prove” or “accept” the null hypothesis.
Null hypotheses often include phrases such as “no effect,” “no difference,” or “no relationship.” When written in mathematical terms, they always include an equality (usually =, but sometimes ≥ or ≤).
You can never know with complete certainty whether there is an effect in the population. Some percentage of the time, your inference about the population will be incorrect. When you incorrectly reject the null hypothesis, it’s called a type I error . When you incorrectly fail to reject it, it’s a type II error.
Examples of null hypotheses
The table below gives examples of research questions and null hypotheses. There’s always more than one way to answer a research question, but these null hypotheses can help you get started.
| Research question | Null hypothesis ( H 0 ) | Test-specific null hypothesis |
| --- | --- | --- |
| Does tooth flossing affect the number of cavities? | Tooth flossing has no effect on the number of cavities. | t test: The mean number of cavities per person does not differ between the flossing group (µ1) and the non-flossing group (µ2) in the population; µ1 = µ2. |
| Does the amount of text highlighted in the textbook affect exam scores? | The amount of text highlighted in the textbook has no effect on exam scores. | Regression test: There is no relationship between the amount of text highlighted and exam scores in the population; β1 = 0. |
| Does daily meditation decrease the incidence of depression? | Daily meditation has no effect on the incidence of depression.* | Two-proportions test: The proportion of people with depression in the daily-meditation group (p1) is greater than or equal to the no-meditation group (p2) in the population; p1 ≥ p2. |
*Note that some researchers prefer to always write the null hypothesis in terms of “no effect” and “=”. It would be fine to say that daily meditation has no effect on the incidence of depression and p 1 = p 2 .
The alternative hypothesis ( H a ) is the other answer to your research question . It claims that there’s an effect in the population.
Often, your alternative hypothesis is the same as your research hypothesis. In other words, it’s the claim that you expect or hope will be true.
The alternative hypothesis is the complement to the null hypothesis. Null and alternative hypotheses are exhaustive, meaning that together they cover every possible outcome. They are also mutually exclusive, meaning that only one can be true at a time.
Alternative hypotheses often include phrases such as “an effect,” “a difference,” or “a relationship.” When alternative hypotheses are written in mathematical terms, they always include an inequality (usually ≠, but sometimes < or >). As with null hypotheses, there are many acceptable ways to phrase an alternative hypothesis.
Examples of alternative hypotheses
The table below gives examples of research questions and alternative hypotheses to help you get started with formulating your own.
| Research question | Alternative hypothesis ( H a ) | Test-specific alternative hypothesis |
| --- | --- | --- |
| Does tooth flossing affect the number of cavities? | Tooth flossing has an effect on the number of cavities. | t test: The mean number of cavities per person differs between the flossing group (µ1) and the non-flossing group (µ2) in the population; µ1 ≠ µ2. |
| Does the amount of text highlighted in a textbook affect exam scores? | The amount of text highlighted in the textbook has an effect on exam scores. | Regression test: There is a relationship between the amount of text highlighted and exam scores in the population; β1 ≠ 0. |
| Does daily meditation decrease the incidence of depression? | Daily meditation decreases the incidence of depression. | Two-proportions test: The proportion of people with depression in the daily-meditation group (p1) is less than the no-meditation group (p2) in the population; p1 < p2. |
Null and alternative hypotheses are similar in some ways:
- They’re both answers to the research question.
- They both make claims about the population.
- They’re both evaluated by statistical tests.
However, there are important differences between the two types of hypotheses, summarized in the following table.
| | Null hypothesis ( H 0 ) | Alternative hypothesis ( H a ) |
| --- | --- | --- |
| Definition | A claim that there is no effect in the population. | A claim that there is an effect in the population. |
| Typical phrases | “No effect,” “no difference,” “no relationship” | “An effect,” “a difference,” “a relationship” |
| Symbols used | Equality symbol (=, ≥, or ≤) | Inequality symbol (≠, <, or >) |
| When p ≤ α | Rejected | Supported |
| When p > α | Failed to reject | Not supported |
To help you write your hypotheses, you can use the template sentences below. If you know which statistical test you’re going to use, you can use the test-specific template sentences. Otherwise, you can use the general template sentences.
General template sentences
The only thing you need to know to use these general template sentences are your dependent and independent variables. To write your research question, null hypothesis, and alternative hypothesis, fill in the following sentences with your variables:
Does independent variable affect dependent variable ?
- Null hypothesis ( H 0 ): Independent variable does not affect dependent variable.
- Alternative hypothesis ( H a ): Independent variable affects dependent variable.
Test-specific template sentences
Once you know the statistical test you’ll be using, you can write your hypotheses in a more precise and mathematical way specific to the test you chose. The table below provides template sentences for common statistical tests.
| Statistical test | Null hypothesis ( H 0 ) | Alternative hypothesis ( H a ) |
| --- | --- | --- |
| t test with two groups | The mean dependent variable does not differ between group 1 (µ1) and group 2 (µ2) in the population; µ1 = µ2. | The mean dependent variable differs between group 1 (µ1) and group 2 (µ2) in the population; µ1 ≠ µ2. |
| ANOVA with three groups | The mean dependent variable does not differ between group 1 (µ1), group 2 (µ2), and group 3 (µ3) in the population; µ1 = µ2 = µ3. | The mean dependent variables of group 1 (µ1), group 2 (µ2), and group 3 (µ3) are not all equal in the population. |
| Correlation test | There is no correlation between the independent variable and the dependent variable in the population; ρ = 0. | There is a correlation between the independent variable and the dependent variable in the population; ρ ≠ 0. |
| Regression test | There is no relationship between the independent variable and the dependent variable in the population; β1 = 0. | There is a relationship between the independent variable and the dependent variable in the population; β1 ≠ 0. |
| Two-proportions test | The dependent variable expressed as a proportion does not differ between group 1 (p1) and group 2 (p2) in the population; p1 = p2. | The dependent variable expressed as a proportion differs between group 1 (p1) and group 2 (p2) in the population; p1 ≠ p2. |
Note: The template sentences above assume that you’re performing two-tailed tests (the alternative hypotheses use ≠). Two-tailed tests are appropriate for most studies.
If you want to know more about statistics , methodology , or research bias , make sure to check out some of our other articles with explanations and examples.
- Normal distribution
- Descriptive statistics
- Measures of central tendency
- Correlation coefficient
Methodology
- Cluster sampling
- Stratified sampling
- Types of interviews
- Cohort study
- Thematic analysis
Research bias
- Implicit bias
- Cognitive bias
- Survivorship bias
- Availability heuristic
- Nonresponse bias
- Regression to the mean
Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.
Null and alternative hypotheses are used in statistical hypothesis testing . The null hypothesis of a test always predicts no effect or no relationship between variables, while the alternative hypothesis states your research prediction of an effect or relationship.
The null hypothesis is often abbreviated as H 0 . When the null hypothesis is written using mathematical symbols, it always includes an equality symbol (usually =, but sometimes ≥ or ≤).
The alternative hypothesis is often abbreviated as H a or H 1 . When the alternative hypothesis is written using mathematical symbols, it always includes an inequality symbol (usually ≠, but sometimes < or >).
A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (“ x affects y because …”).
A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses . In a well-designed study , the statistical hypotheses correspond logically to the research hypothesis.
Turney, S. (2023, June 22). Null & Alternative Hypotheses | Definitions, Templates & Examples. Scribbr. Retrieved August 5, 2024, from https://www.scribbr.com/statistics/null-and-alternative-hypotheses/
Hypothesis Testing Formula
What is the Hypothesis Testing Formula?
Before diving into hypothesis testing, we need to understand what a hypothesis is in the first place. In simple language, a hypothesis is an educated and informed guess about anything around you, which can be tested by experiment or observation.

For example: a new mobile variant will be accepted by people; a new medicine might work or not; etc. A hypothesis test is a statistical tool for testing such a statement. We select a sample from the data set and determine how likely the observed sample statistic is under the hypothesis. If the results of that test are not significant, the hypothesis is not supported.
Formula For Hypothesis Testing:
Hypothesis testing is commonly performed with the z-test. The formula for the Z-Test is given as follows:

Z = (X – U) / (SD / √n)

where:
- X – Sample Mean
- U – Population Mean
- SD – Standard Deviation
- n – Sample size
But this is not as simple as it seems. To correctly perform the hypothesis test, you need to follow specific steps:
Step 1: First and foremost, to perform a hypothesis test, we must define the null and alternative hypotheses. An example of the null and alternate hypothesis is given by:
- H0 (null hypothesis): Mean value ≥ 0
- Alternate hypothesis (Ha): Mean < 0
Step 2: Next thing we have to do is that we need to find out the level of significance. Generally, its value is 0.05 or 0.01
Step 3: Find the z-test value, also called test statistic, as stated in the above formula.
Step 4: Find the critical z score from the z table for the given significance level.
Step 5: Compare the two values. If the test statistic falls beyond the critical z score (i.e., in the rejection region), reject the null hypothesis; otherwise, you cannot reject the null hypothesis.
Examples of Hypothesis Testing Formula (With Excel Template)
Let’s take an example to understand the calculation of the Hypothesis Testing formula in a better manner.
Hypothesis Testing Formula – Example #1
Suppose you have been given the following parameters, and you have to find the Z value and state whether you accept the null hypothesis or not: sample mean X = 27, standard deviation SD = 20, and sample size n = 10.
Null hypothesis H0: Population Mean = 30
Alternate hypothesis Ha: Population Mean ≠ 30
Z – Test is calculated using the formula given below
Z = (X – U) / (SD / √n)
- Z – Test = ( 27 – 30 ) / (20 / SQRT(10))
- Z – Test = -0.474
Level of significance = 0.05
This is a two-tailed test, so the probability lies on both sides of the distribution: 0.025 on each side, and we will look up this value in the z table.
Source: https://www.z-table.com/
Since the significance level is 0.025 on each side, we need to find 0.025 in the z table. Once we see that value from the table, we must extract the z value.
If you see here, on the left side, the values of z are given, and in the top row, decimal places are given. So from that, we can say that 0.025 will give a z value of -1.96
So Z – Score = -1.96
Since |Z – Test| = 0.474 is less than the critical |Z – Score| = 1.96, the test statistic does not fall in the rejection region, so we cannot reject the null hypothesis.
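As a numerical check of this example, the test statistic and the two-tailed critical value can be computed directly; note that |z| = 0.474 is smaller than 1.96, so the statistic does not reach the rejection region:

```python
import math
from statistics import NormalDist

# Example 1: x̄ = 27, μ0 = 30, SD = 20, n = 10
z = (27 - 30) / (20 / math.sqrt(10))
crit = NormalDist().inv_cdf(1 - 0.05 / 2)   # two-tailed critical value ≈ 1.96

print(round(z, 3))      # -0.474
print(abs(z) >= crit)   # False → cannot reject H0
```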
Hypothesis Testing Formula – Example #2
Let’s say you are a school principal claiming that the students in your school are of above-average intelligence. An analyst wants to double-check your claim using hypothesis testing. He takes a sample of 20 students and measures their IQs; the sample mean is 112 and the standard deviation is 15.
- Z – Test = (112 – 100) / (15 / SQRT(20))
- Z – Test = 3.58
Null Hypothesis: Since population mean = 100,
- H0 : Mean = 100
- Ha: Mean > 100
Level of Significance = 0.05
Since the significance level is 0.05, we must find 1 – 0.05 = 0.95 in the z table. Once we find that value from the table, we must extract the z value.
Z – Table:
If you see here, on the left side, the values of z are given, and in the top row, decimal places are given. So from that, we can say that 0.95 lies between 1.64 to 1.65, mid-point of 1.645.
So Z Score = 1.645
Since the Z-test statistic (3.58) is greater than the critical value (1.645), we reject the null hypothesis and conclude that the students’ intelligence is above average.
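This second example can be verified the same way; a short Python sketch (names are illustrative, not from the article):

```python
import math

# Example 2 from the article: right-tailed one-sample z-test on IQ
# H0: mu = 100, Ha: mu > 100, alpha = 0.05
sample_mean = 112
pop_mean = 100     # population mean IQ under H0
sd = 15            # population standard deviation of IQ
n = 20             # sampled students

z = (sample_mean - pop_mean) / (sd / math.sqrt(n))
print(round(z, 2))       # 3.58

# z exceeds the right-tailed critical value 1.645, so H0 is rejected
print(z > 1.645)         # True
```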
Explanation
Remember that no hypothesis test is 100% correct; there is always a chance of making an error. Two types of errors can arise in hypothesis testing: Type I and Type II.
Type I: the null hypothesis is true but is rejected. The level of significance gives the probability of this error, so if the significance level is 0.05, there is a 5% chance of rejecting a true null hypothesis.
Type II: the null hypothesis is false but is not rejected. The probability of avoiding this error is the power of the test (power = 1 − probability of a Type II error). A larger sample size helps reduce the probability of this type of error and provides greater confidence in the conclusion.
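The meaning of the significance level can be made concrete with a quick simulation: if the null hypothesis is true and we test at the 0.05 level, we should wrongly reject it (a Type I error) about 5% of the time. This sketch is mine, not from the article, and all numbers in it are made up:

```python
import math
import random

# Simulate many two-tailed z-tests in a world where H0 is true,
# and count how often H0 is (wrongly) rejected at alpha = 0.05.
random.seed(42)
mu, sd, n, reps = 30, 20, 10, 20000

rejections = 0
for _ in range(reps):
    sample = [random.gauss(mu, sd) for _ in range(n)]
    z = (sum(sample) / n - mu) / (sd / math.sqrt(n))
    rejections += abs(z) > 1.96   # reject H0 even though H0 is true

print(round(rejections / reps, 3))   # close to 0.05
```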
Relevance and Uses of Hypothesis Testing Formula
As discussed above, a hypothesis test helps the analyst test a statistical sample and, in the end, either reject or fail to reject the null hypothesis. The test assists in determining whether the formed hypothesis is supported by the data. If unexpected results occur, a new hypothesis may need to be formulated and tested in turn. Any hypothesis test follows the same steps. The first step is to state the hypotheses: both the null and the alternate hypothesis.
The next step is to determine the relevant parameters (mean, standard deviation, level of significance, etc.) needed to compute the z-test value. The third step is to determine the critical z score from the z-table; for this step, we need to know whether it is a two-tailed or one-tailed test and extract the z score accordingly. The fourth and final step is to compare the test statistic with the critical value and, based on that comparison, either reject or fail to reject the null hypothesis.
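The four steps above can be condensed into a small helper. This is a sketch only; the function name and signature are mine rather than part of the article:

```python
import math

# Minimal sketch of the four-step z-test procedure described above.
def z_test(sample_mean, pop_mean, sd, n, z_critical, two_tailed=True):
    """Return the z statistic and the decision about H0."""
    z = (sample_mean - pop_mean) / (sd / math.sqrt(n))
    extreme = abs(z) > z_critical if two_tailed else z > z_critical
    return z, ("reject H0" if extreme else "fail to reject H0")

# Example 1: two-tailed, critical value 1.96
z1, d1 = z_test(27, 30, 20, 10, 1.96)
print(round(z1, 3), d1)   # -0.474 fail to reject H0

# Example 2: right-tailed, critical value 1.645
z2, d2 = z_test(112, 100, 15, 20, 1.645, two_tailed=False)
print(round(z2, 2), d2)   # 3.58 reject H0
```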
Hypothesis Testing Formula Calculator
You can use the following Hypothesis Testing Calculator
Recommended Articles
This has been a guide to the Hypothesis Testing Formula. Here we discuss how to calculate hypothesis testing along with practical examples. We also provide a Hypothesis Testing calculator with a downloadable Excel template. You may also look at the following articles to learn more –
- Examples of T Distribution Formula
- Calculator For Consumer Surplus Formula
- How To Calculate Equity Multiplier Formula
- Guide To Net Realizable Value Formula
- Altman Z Score (With Excel Template)
Stats and R
Hypothesis test by hand

Descriptive versus inferential statistics

Remember that descriptive statistics is the branch of statistics that aims at describing and summarizing a set of data in the best possible manner, that is, by reducing it down to a few meaningful key measures and visualizations, with as little loss of information as possible. In other words, descriptive statistics helps us gain a better understanding and a clear image of a set of observations through summary statistics and graphics. With descriptive statistics, there is no uncertainty, because we describe only the group of observations that we decided to work on, and no attempt is made to generalize the observed characteristics to another or to a larger group of observations.

Inferential statistics, on the other hand, is the branch of statistics that uses a random sample of data taken from a population to make inferences, i.e., to draw conclusions about the population of interest (see the difference between population and sample if you need a refresher on the two concepts). In other words, information from the sample is used to make generalizations about the parameter of interest in the population.

The two most important tools used in the domain of inferential statistics are:
- hypothesis test (which is the main subject of the present article), and
- confidence interval (which is briefly discussed in this section )
Through my teaching, I realized that many students (especially in introductory statistics classes) struggle to perform hypothesis tests and interpret the results. It seems to me that these students often encounter difficulties mainly because hypothesis testing is rather unclear and abstract to them. One of the reasons it looks abstract to them is that they do not understand its final goal, the "why" behind this tool. They often do inferential statistics without understanding the reasoning behind it, as if they were following a cooking recipe that does not require any thinking. However, as soon as they understand the principle underlying hypothesis testing, it is much easier for them to apply the concepts and solve the exercises.

For this reason, I thought it would be useful to write an article on the goal of hypothesis tests (the "why?"), the context in which they should be used (the "when?"), how they work (the "how?") and how to interpret the results (the "so what?"). Like anything else in statistics, it becomes much easier to apply a concept in practice when we understand what we are testing or what we are trying to demonstrate beforehand.

In this article, I present, as comprehensibly as possible, the different steps required to perform and conclude a hypothesis test by hand. These steps are illustrated with a basic example. This will build the theoretical foundations of hypothesis testing, which will in turn be of great help for the understanding of most statistical tests.

Hypothesis tests come in many forms and can be used for many parameters or research questions. The steps I present in this article are unfortunately not applicable to all hypothesis tests. They are, however, appropriate for at least the most common hypothesis tests, the tests on:
- One mean: \(\mu\)
- Two means, independent samples: \(\mu_1\) and \(\mu_2\)
- Two means, paired samples: \(\mu_D\)
- One proportion: \(p\)
- Two proportions: \(p_1\) and \(p_2\)
- One variance: \(\sigma^2\)
- Two variances: \(\sigma^2_1\) and \(\sigma^2_2\)
The good news is that the principles behind these 6 statistical tests (and many more) are exactly the same. So if you understand the intuition and the process for one of them, all the others pretty much follow.

Unlike descriptive statistics, where we only describe the data at hand, hypothesis tests use a subset of observations, referred to as a sample, to draw conclusions about a population. One may wonder why we would try to "guess" or make inferences about a parameter of a population based on a sample, instead of simply collecting data for the entire population, computing the statistics we are interested in, and taking decisions based upon that. The main reason we use a sample instead of the entire population is that, most of the time, collecting data on the entire population is practically impossible, too complex, too expensive, would take too long, or a combination of any of these. 1

So the overall objective of a hypothesis test is to draw conclusions in order to confirm or refute a belief about a population, based on a smaller group of observations. In practice, we take some measurements of the variable of interest, representing the sample(s), and we check whether our measurements are likely or not given our assumption (our belief). Based on the probability of observing the sample(s) we have, we decide whether we can trust our belief or not.

Hypothesis tests have many practical applications. Here are different situations illustrating when the 6 tests mentioned above would be appropriate:
- One mean: suppose that a health professional would like to test whether the mean weight of Belgian adults is different than 80 kg (176.4 lbs).
- Independent samples: suppose that a physiotherapist would like to test the effectiveness of a new treatment by measuring the mean response time (in seconds) for patients in a control group and patients in a treatment group, where patients in the two groups are different.
- Paired samples: suppose that a physiotherapist would like to test the effectiveness of a new treatment by measuring the mean response time (in seconds) before and after a treatment, where patients are measured twice—before and after treatment, so patients are the same in the 2 samples.
- One proportion: suppose that a political pundit would like to test whether the proportion of citizens who are going to vote for a specific candidate is smaller than 30%.
- Two proportions: suppose that a doctor would like to test whether the proportion of smokers is different between professional and amateur athletes.
- One variance: suppose that an engineer would like to test whether a voltmeter has a lower variability than what is imposed by the safety standards.
- Two variances: suppose that, in a factory, two production lines work independently from each other. The financial manager would like to test whether the costs of the weekly maintenance of these two machines have the same variance. Note that a test on two variances is also often performed to verify the assumption of equal variances, which is required for several other statistical tests, such as the Student’s t-test for instance.
Of course, this is a non-exhaustive list of potential applications, and many research questions can be answered with a hypothesis test. One important point to remember is that in hypothesis testing we are always interested in the population and not in the sample. The sample is used for the aim of drawing conclusions about the population, so we always test in terms of the population.

Usually, hypothesis tests are used to answer research questions in confirmatory analyses. Confirmatory analyses refer to statistical analyses where hypotheses, deducted from theory, are defined beforehand (preferably before data collection). In this approach, the researcher has a specific idea about the variables under consideration and she is trying to see if her idea, specified as hypotheses, is supported by the data.

On the other hand, hypothesis tests are rarely used in exploratory analyses. 2 Exploratory analyses aim to uncover possible relationships between the variables under investigation. In this approach, the researcher does not have any clear theory-driven assumptions or ideas in mind before data collection. This is the reason exploratory analyses are sometimes referred to as hypothesis-generating analyses: they are used to create some hypotheses, which in turn may be tested via confirmatory analyses at a later stage.

There are, to my knowledge, 3 different methods to perform a hypothesis test:
- Method A: comparing the test statistic with the critical value
- Method B: comparing the p-value with the significance level \(\alpha\)
- Method C: comparing the target parameter with the confidence interval

Although the process for these 3 approaches may differ slightly, they all lead to the exact same conclusions. Using one method or another is, therefore, more often than not a matter of personal choice or a matter of context. See this section to know which method I use depending on the context.
I present the 3 methods in the following sections, starting with, in my opinion, the most comprehensive one when it comes to doing it by hand: comparing the test statistic with the critical value.

For the three methods, I will explain the required steps to perform a hypothesis test from a general point of view and illustrate them with the following situation: 3

Suppose a health professional would like to test whether the mean weight of Belgian adults is different than 80 kg.

Note that, as for most hypothesis tests, the test we are going to use as an example below requires some assumptions. Since the aim of the present article is to explain hypothesis testing, we assume that all assumptions are met. For the interested reader, see the assumptions (and how to verify them) for this type of hypothesis test in the article presenting the one-sample t-test.

Method A: comparing the test statistic with the critical value

Method A, which consists in comparing the test statistic with the critical value, boils down to the following 4 steps:
- Stating the null and alternative hypothesis
- Computing the test statistic
- Finding the critical value
- Concluding and interpreting the results
Each step is detailed below.

Step #1: stating the null and alternative hypothesis

As discussed before, a hypothesis test first requires an idea, that is, an assumption about a phenomenon. This assumption, referred to as a hypothesis, is derived from the theory and/or the research question. Since a hypothesis test is used to confirm or refute a prior belief, we need to formulate our belief so that there is a null and an alternative hypothesis. Those hypotheses must be mutually exclusive, which means that they cannot be true at the same time. This is step #1. In the context of our scenario, the null and alternative hypotheses are:
- Null hypothesis \(H_0: \mu = 80\)
- Alternative hypothesis \(H_1: \mu \ne 80\)
When stating the null and alternative hypotheses, bear in mind the following three points:
- We are always interested in the population and not in the sample. This is the reason \(H_0\) and \(H_1\) will always be written in terms of the population and not in terms of the sample (in this case, \(\mu\) and not \(\bar{x}\)).
- The assumption we would like to test is often the alternative hypothesis. If the researcher wanted to test whether the mean weight of Belgian adults was less than 80 kg, she would have stated \(H_0: \mu = 80\) (or equivalently, \(H_0: \mu \ge 80\) ) and \(H_1: \mu < 80\) . 4 Do not mix the null with the alternative hypothesis, or the conclusions will be diametrically opposed!
- The null hypothesis is often the status quo. For instance, suppose that a doctor wants to test whether the new treatment A is more efficient than the old treatment B. The status quo is that the new and old treatments are equally efficient. Assuming a larger value is better, she will then write \(H_0: \mu_A = \mu_B\) (or equivalently, \(H_0: \mu_A - \mu_B = 0\) ) and \(H_1: \mu_A > \mu_B\) (or equivalently, \(H_1: \mu_A - \mu_B > 0\) ). Conversely, if a lower value is better, she would have written \(H_0: \mu_A = \mu_B\) (or equivalently, \(H_0: \mu_A - \mu_B = 0\) ) and \(H_1: \mu_A < \mu_B\) (or equivalently, \(H_1: \mu_A - \mu_B < 0\) ).
Step #2: computing the test statistic

The test statistic (often called the t-stat) is, in some sense, a metric indicating how extreme the observations are compared to the null hypothesis. The higher the t-stat (in absolute value), the more extreme the observations are.

There are several formulas to compute the t-stat, with one formula for each type of hypothesis test: one or two means, one or two proportions, one or two variances. This means that there is a formula to compute the t-stat for a hypothesis test on one mean, another formula for a test on two means, another for a test on one proportion, etc. 5 The only difficulty in this second step is to choose the appropriate formula. As soon as you know which formula to use based on the type of test, you simply have to apply it to the data. For the interested reader, see the different formulas to compute the t-stat for the most common tests in this Shiny app.

Luckily, the formulas for hypothesis tests on one and two means, and on one and two proportions, follow the same structure. Computing the test statistic for these tests is similar to scaling a random variable (a process also known as "standardization" or "normalization"), which consists in subtracting the mean from that random variable and dividing the result by the standard deviation:

\[Z = \frac{X - \mu}{\sigma}\]

For these 4 hypothesis tests (one/two means and one/two proportions), computing the test statistic is like scaling the estimator (computed from the sample) corresponding to the parameter of interest (in the population). So we basically subtract the target parameter from the point estimator and then divide the result by the standard error (which is equivalent to the standard deviation, but for an estimator). If this is unclear, here is how the test statistic (denoted \(t_{obs}\) ) is computed in our scenario (assuming that the variance of the population is unknown):

\[t_{obs} = \frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}}\]

where:
- \(\bar{x}\) is the sample mean (i.e., the estimator)
- \(\mu\) is the mean under the null hypothesis (i.e., the target parameter)
- \(s\) is the sample standard deviation
- \(n\) is the sample size
- ( \(\frac{s}{\sqrt{n}}\) is the standard error)
Notice the similarity between the formula of this test statistic and the formula used to standardize a random variable. This structure is the same for a test on two means, one proportion and two proportions, except that the estimator, the parameter and the standard error are, of course, slightly different for each type of test. Suppose that in our case we have a sample mean of 71 kg ( \(\bar{x}\) = 71), a sample standard deviation of 13 kg ( \(s\) = 13) and a sample size of 10 adults ( \(n\) = 10). Remember that the population mean (the mean under the null hypothesis) is 80 kg ( \(\mu\) = 80). The t-stat is thus: \[t_{obs} = \frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}} = \frac{71 - 80}{\frac{13}{\sqrt{10}}} = -2.189\] Although formulas are different depending on which parameter you are testing, the value found for the test statistic gives us an indication on how extreme our observations are. We keep this value of -2.189 in mind because it will be used again in step #4. Although the t-stat gives us an indication of how extreme our observations are, we cannot tell whether this “score of extremity” is too extreme or not based on its value only. So, at this point, we cannot yet tell whether our data are too extreme or not. For this, we need to compare our t-stat with a threshold—referred as critical value —given by the probability distribution tables (and which can, of course, also be found with R). In the same way that the formula to compute the t-stat is different for each parameter of interest, the underlying probability distribution—and thus the statistical table—on which the critical value is based is also different for each target parameter. This means that, in addition to choosing the appropriate formula to compute the t-stat, we also need to select the appropriate probability distribution depending on the parameter we are testing. 
Luckily, there are only 4 different probability distributions for the 6 hypothesis tests covered in this article (one/two means, one/two proportions and one/two variances):

The standard Normal distribution, for a:
- test on one and two means with known population variance(s)
- test on two paired samples where the variance of the difference between the 2 samples \(\sigma^2_D\) is known
- test on one and two proportions (given that some assumptions are met)

The Student distribution, for a:
- test on one and two means with unknown population variance(s)
- test on two paired samples where the variance of the difference between the 2 samples \(\sigma^2_D\) is unknown

The Chi-square distribution, for a:
- test on one variance

The Fisher distribution, for a:
- test on two variances
Each probability distribution also has its own parameters (up to two for the 4 distributions considered here), defining its shape and/or location. The parameter(s) of a probability distribution can be seen as its DNA, meaning that the distribution is entirely defined by its parameter(s).

Take our initial scenario, a health professional who would like to test whether the mean weight of Belgian adults is different than 80 kg, as an example. The underlying probability distribution for a test on one mean is either the standard Normal or the Student distribution, depending on whether the variance of the population (not the sample variance!) is known or unknown: 6
- If the population variance is known \(\rightarrow\) the standard Normal distribution is used
- If the population variance is unknown \(\rightarrow\) the Student distribution is used
If no population variance is explicitly given, you can assume that it is unknown, since you cannot compute it based on a sample. If you could compute it, that would mean you have access to the entire population, and there would be no point in performing a hypothesis test (you could simply use some descriptive statistics to confirm or refute your belief). In our example, no population variance is specified, so it is assumed to be unknown. We therefore use the Student distribution.

The Student distribution has one parameter which defines it: the number of degrees of freedom. The number of degrees of freedom depends on the type of hypothesis test. For instance, the number of degrees of freedom for a test on one mean is equal to the number of observations minus one ( \(n\) - 1). Without going too far into the details, the - 1 comes from the fact that one quantity is estimated (i.e., the mean). 7 The sample size being equal to 10 in our example, the degrees of freedom are equal to \(n\) - 1 = 10 - 1 = 9.

There is only one last element missing to find the critical value: the significance level. The significance level, denoted \(\alpha\) , is the probability of wrongly rejecting the null hypothesis, that is, the probability of rejecting the null hypothesis although it is in reality true. In this sense, it is an error (a type I error, as opposed to the type II error 8 ) that we accept to deal with, in order to be able to draw conclusions about a population based on a subset of it.

As you may have read in many statistical textbooks, the significance level is very often set to 5%. 9 In some fields (such as medicine or engineering, among others), the significance level is also sometimes set to 1% to decrease the error rate.
It is best to specify the significance level before performing a hypothesis test, to avoid the temptation of setting it in accordance with the results (a temptation that is even bigger when the results are on the edge of being significant). As I always tell my students, you cannot "guess" nor compute the significance level. Therefore, if it is not explicitly specified, you can safely assume it is 5%. In our case, we did not indicate it, so we take \(\alpha\) = 5% = 0.05.

Furthermore, in our example, we want to test whether the mean weight of Belgian adults is different than 80 kg. Since we do not specify the direction of the test, it is a two-sided test. If we wanted to test that the mean weight was less than 80 kg ( \(H_1: \mu <\) 80) or greater than 80 kg ( \(H_1: \mu >\) 80), we would have done a one-sided test. Make sure that you perform the correct test (two-sided or one-sided), because it has an impact on how the critical value is found (see more in the following paragraphs).

So now that we know the appropriate distribution (the Student distribution), its parameter (degrees of freedom (df) = 9), the significance level ( \(\alpha\) = 0.05) and the direction (two-sided), we have all we need to find the critical value in the statistical tables. By looking at the row df = 9 and the column \(t_{.025}\) in the Student's distribution table, we find a critical value of:

\[t_{n-1; \alpha / 2} = t_{9; 0.025} = 2.262\]

One may wonder why we take \(t_{\alpha/2} = t_{.025}\) and not \(t_\alpha = t_{.05}\) , since the significance level is 0.05. The reason is that we are doing a two-sided test ( \(H_1: \mu \ne\) 80), so the error rate of 0.05 must be divided by 2 to find the critical value to the right of the distribution. Since the Student's distribution is symmetric, the critical value to the left of the distribution is simply -2.262. Visually, the error rate of 0.05 is partitioned into two parts:
- 0.025 to the left of -2.262 and
- 0.025 to the right of 2.262
We keep in mind these critical values of -2.262 and 2.262 for the fourth and last step. Note that the red shaded areas in the previous plot are also known as the rejection regions. More on that in the following section. These critical values can also be found in R, thanks to the qt() function: The qt() function is used for the Student’s distribution ( q stands for quantile and t for Student). There are other functions accompanying the different distributions: - qnorm() for the Normal distribution
- qchisq() for the Chi-square distribution
- qf() for the Fisher distribution
In this fourth and last step, all we have to do is to compare the test statistic (computed in step #2) with the critical values (found in step #3) in order to conclude the hypothesis test . The only two possibilities when concluding a hypothesis test are: - Rejection of the null hypothesis
- Non-rejection of the null hypothesis
In our example of adult weight, remember that: - the t-stat is -2.189
- the critical values are -2.262 and 2.262
Also remember that: - the t-stat gives an indication on how extreme our sample is compared to the null hypothesis
- the critical values are the threshold from which the t-stat is considered as too extreme
To compare the t-stat with the critical values, I always recommend to plot them: These two critical values form the rejection regions (the red shaded areas): - from \(- \infty\) to -2.262, and
- from 2.262 to \(\infty\)
If the t-stat lies within one of the rejection region, we reject the null hypothesis . On the contrary, if the t-stat does not lie within any of the rejection region, we do not reject the null hypothesis . As we can see from the above plot, the t-stat is less extreme than the critical value and therefore does not lie within any of the rejection region. In conclusion, we do not reject the null hypothesis that \(\mu = 80\) . This is the conclusion in statistical terms but they are meaningless without proper interpretation. So it is a good practice to also interpret the result in the context of the problem: At the 5% significance level, we do not reject the hypothesis that the mean weight of Belgian adults is 80 kg. From a more philosophical (but still very important) perspective, note that we wrote “we do not reject the null hypothesis” and “we do not reject the hypothesis that the mean weight of Belgian adults is equal to 80 kg”. We did not write “we accept the null hypothesis” nor “the mean weight of Belgian adults is 80 kg”. The reason is due to the fact that, in hypothesis testing, we conclude something about the population based on a sample. There is, therefore, always some uncertainty and we cannot be 100% sure that our conclusion is correct. Perhaps it is the case that the mean weight of Belgian adults is in reality different than 80 kg, but we failed to prove it based on the data at hand. It may be the case that if we had more observations, we would have rejected the null hypothesis (since all else being equal, a larger sample size implies a more extreme t-stat). Or, it may be the case that even with more observations, we would not have rejected the null hypothesis because the mean weight of Belgian adults is in reality close to 80 kg. We cannot distinguish between the two. So we can just say that we did not find enough evidence against the hypothesis that the mean weight of Belgian adults is 80 kg, but we do not conclude that the mean is equal to 80 kg. 
If the difference is still not clear to you, the following example may help. Suppose a person is suspected of having committed a crime. This person is either innocent (the null hypothesis) or guilty (the alternative hypothesis). In the attempt to find out whether the suspect committed the crime, the police collect as much information and proof as possible. This is similar to the researcher collecting data to form a sample. Then the judge, based on the collected evidence, decides whether the suspect is considered innocent or guilty.

If there is enough evidence that the suspect committed the crime, the judge will conclude that the suspect is guilty. In other words, she will reject the null hypothesis of the suspect being innocent, because there is enough evidence that the suspect committed the crime. This is similar to the t-stat being more extreme than the critical value: we have enough information (based on the sample) to say that the null hypothesis is unlikely, because our data would be too extreme if the null hypothesis were true. Since the sample cannot be "wrong" (it corresponds to the collected data), the only remaining possibility is that the null hypothesis is in fact wrong. This is the reason we write "we reject the null hypothesis".

On the other hand, if there is not enough evidence that the suspect committed the crime (or no evidence at all), the judge will conclude that the suspect is considered not guilty. In other words, she will not reject the null hypothesis of the suspect being innocent. But even if she concludes that the suspect is considered not guilty, she will never be 100% sure that he is really innocent. It may be the case that:
- the suspect did not commit the crime, or
- the suspect committed the crime but the police was not able to collect enough information against the suspect.
In the former case the suspect is really innocent, whereas in the latter case the suspect is guilty but the police and the judge failed to prove it because they could not find enough evidence against him. Similar to hypothesis testing, the judge has to conclude the case by considering the suspect not guilty, without being able to distinguish between the two. This is the main reason we write "we do not reject the null hypothesis" or "we fail to reject the null hypothesis" (you may even read in some textbooks conclusions such as "there is not sufficient evidence in the data to reject the null hypothesis"), and we do not write "we accept the null hypothesis". I hope this metaphor helped you to understand the reason why we reject the null hypothesis instead of accepting it.

In the following sections, we present two other methods used in hypothesis testing. These methods will result in the exact same conclusion: non-rejection of the null hypothesis, that is, we do not reject the hypothesis that the mean weight of Belgian adults is 80 kg. They are thus presented only in case you prefer to use these methods over the first one.

Method B: comparing the p-value with the significance level \(\alpha\)

Method B, which consists in computing the p-value and comparing this p-value with the significance level \(\alpha\) , boils down to the following 4 steps: stating the null and alternative hypothesis, computing the test statistic, computing the p-value, and concluding and interpreting the results.

In this second method, which uses the p-value, the first and second steps are similar to those in the first method. The null and alternative hypotheses remain the same:
- \(H_0: \mu = 80\)
- \(H_1: \mu \ne 80\)
Remember that the formula for the t-stat is different depending on the type of hypothesis test (one or two means, one or two proportions, one or two variances). In our case of one mean with unknown variance, we have: \[ t_{obs} = \frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}} \] The p-value is the probability (so it goes from 0 to 1) of observing a sample at least as extreme as the one we observed if the null hypothesis were true. In some sense, it gives you an indication of how likely your null hypothesis is. It is also defined as the smallest level of significance for which the data indicate rejection of the null hypothesis. For more information about the p-value, I recommend reading this note about the p-value and the significance level \(\alpha\). Formally, the p-value is the area beyond the test statistic. Since we are doing a two-sided test, the p-value is thus the sum of the area above 2.189 and the area below -2.189. Visually, the p-value is the sum of the two blue shaded areas in the following plot: The p-value can be computed with precision in R with the pt() function: The p-value is 0.0563, which indicates that there is a 5.63% chance of observing a sample at least as extreme as the one observed if the null hypothesis were true. This already gives us a hint on whether our t-stat is too extreme or not (and thus whether our null hypothesis is likely or not), but we formally conclude in step #4. Like the qt() function to find the critical value, we use pt() to find the p-value because the underlying distribution is the Student’s distribution. Use pnorm(), pchisq() and pf() for the Normal, Chi-square and Fisher distributions, respectively. See also this Shiny app to compute the p-value given a certain t-stat for most probability distributions. If you do not have access to a computer (during exams, for example), you will not be able to compute the p-value precisely, but you can bound it using the statistical table corresponding to your test.
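If you prefer Python over R, the same p-value can be obtained with scipy.stats; here is a minimal sketch using the summary statistics of the running example (x̄ = 71, s = 13, n = 10, and μ = 80 under the null hypothesis):

```python
import math
from scipy import stats

# Summary statistics from the running example
x_bar, mu0, s, n = 71, 80, 13, 10

# Test statistic for one mean with unknown variance
t_stat = (x_bar - mu0) / (s / math.sqrt(n))  # approximately -2.189

# Two-sided p-value: area beyond |t_stat| in both tails of the
# Student distribution with df = n - 1 degrees of freedom
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)  # approximately 0.0563

print(round(t_stat, 3), round(p_value, 4))
```

The `sf` (survival function) call gives the upper-tail area, doubled here because the test is two-sided.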
In our case, we use the Student distribution and we look at the row df = 9 (since df = n - 1): - The test statistic is -2.189
- We take the absolute value, which gives 2.189
- The value 2.189 is between 1.833 and 2.262 (highlighted in blue in the above table)
- the area to the right of 1.833 is 0.05
- the area to the right of 2.262 is 0.025
- So we know that the area to the right of 2.189 must be between 0.025 and 0.05
- Since the Student distribution is symmetric, we know that the area to the left of -2.189 must also be between 0.025 and 0.05
- Therefore, the sum of the two areas must be between 0.05 and 0.10
- In other words, the p -value is between 0.05 and 0.10 (i.e., 0.05 < p -value < 0.10)
Although we could not compute the p-value precisely, this bound is enough to conclude our hypothesis test in the last step. The final step is now simply to compare the p-value (computed in step #3) with the significance level \(\alpha\). As for all statistical tests: - If the p-value is smaller than \(\alpha\) (p-value < 0.05) \(\rightarrow H_0\) is unlikely \(\rightarrow\) we reject the null hypothesis
- If the p -value is greater than or equal to \(\alpha\) ( p -value \(\ge\) 0.05) \(\rightarrow H_0\) is likely \(\rightarrow\) we do not reject the null hypothesis
No matter whether we consider the exact p-value (i.e., 0.0563) or the bounded one (0.05 < p-value < 0.10), it is larger than 0.05, so we do not reject the null hypothesis. 10 In the context of the problem, we do not reject the null hypothesis that the mean weight of Belgian adults is 80 kg. Remember that rejecting (or not rejecting) a null hypothesis at the significance level \(\alpha\) using the critical value method (method A) is equivalent to rejecting (or not rejecting) the null hypothesis when the p-value is lower than (respectively, equal to or greater than) \(\alpha\) (method B). This is the reason we find the exact same conclusion as with method A, and why you will too if you use both methods on the same data and with the same significance level. Method C, which consists in computing the confidence interval and comparing this confidence interval with the target parameter (the parameter under the null hypothesis), boils down to the following 3 steps: - Stating the null and alternative hypotheses
- Computing the confidence interval
- Comparing the confidence interval with the hypothesized value
In this last method, which uses the confidence interval, the first step is similar to that in the first two methods. Like hypothesis testing, confidence intervals are a well-known tool in inferential statistics. A confidence interval is an estimation procedure which produces an interval (i.e., a range of values) containing the true parameter with a certain (usually high) probability. In the same way that there is a formula for each type of hypothesis test when computing the test statistic, there exists a formula for each type of confidence interval. Formulas for the different types of confidence intervals can be found in this Shiny app. Here is the formula for a confidence interval on one mean \(\mu\) (with unknown population variance): \[ (1-\alpha)\text{% CI for } \mu = \bar{x} \pm t_{\alpha/2, n - 1} \frac{s}{\sqrt{n}} \] where \(t_{\alpha/2, n - 1}\) is found in the Student distribution table (and is similar to the critical value found in step #3 of method A). Given our data and with \(\alpha\) = 0.05, we have: \[ \begin{aligned} 95\text{% CI for } \mu &= \bar{x} \pm t_{\alpha/2, n - 1} \frac{s}{\sqrt{n}} \\ &= 71 \pm 2.262 \frac{13}{\sqrt{10}} \\ &= [61.70; 80.30] \end{aligned} \] The 95% confidence interval for \(\mu\) is [61.70; 80.30] kg. But what does a 95% confidence interval mean? We know that this estimation procedure has a 95% probability of producing an interval containing the true mean \(\mu\). In other words, if we construct many confidence intervals (with different samples of the same size), 95% of them will, on average, include the mean of the population (the true parameter). So, on average, 5% of these confidence intervals will not cover the true mean. If you wish to decrease this last percentage, you can decrease the significance level (set \(\alpha\) = 0.01 or 0.02, for instance). All else being equal, this will increase the range of the confidence interval and thus increase the probability that it includes the true parameter.
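As a cross-check, the interval above can be reproduced programmatically; here is a minimal sketch in Python with scipy, using the same summary statistics (x̄ = 71, s = 13, n = 10, α = 0.05):

```python
import math
from scipy import stats

# Summary statistics from the running example
x_bar, s, n, alpha = 71, 13, 10, 0.05

# Quantile t_{alpha/2, n-1} of the Student distribution (approximately 2.262 for df = 9)
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)

# CI = x_bar +/- t_crit * s / sqrt(n)
margin = t_crit * s / math.sqrt(n)
ci_low, ci_high = x_bar - margin, x_bar + margin

print(round(ci_low, 2), round(ci_high, 2))  # approximately 61.70 and 80.30
```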
The final step is simply to compare the confidence interval (constructed in step #2) with the value of the target parameter (the value under the null hypothesis, mentioned in step #1): - If the confidence interval does not include the hypothesized value \(\rightarrow H_0\) is unlikely \(\rightarrow\) we reject the null hypothesis
- If the confidence interval includes the hypothesized value \(\rightarrow H_0\) is likely \(\rightarrow\) we do not reject the null hypothesis
In our example: - the hypothesized value is 80 (since \(H_0: \mu\) = 80)
- 80 is included in the 95% confidence interval since it goes from 61.70 to 80.30 kg
- So we do not reject the null hypothesis
In the terms of the problem, we do not reject the hypothesis that the mean weight of Belgian adults is 80 kg. As you can see, the conclusion is equivalent to that reached with the critical value method (method A) and the p-value method (method B). Again, this must be the case since we use the same data and the same significance level \(\alpha\) for all three methods. All three methods give the same conclusion. However, each method has its own advantage, so I usually select the most convenient one depending on the situation: - Method A (critical value): it is, in my opinion, the easiest and most straightforward method of the three when I do not have access to R.
- Method B (p-value): in addition to knowing whether the null hypothesis is rejected or not, computing the exact p-value can be very convenient, so I tend to use this method when I have access to R.
- Method C (confidence interval): if I need to test several hypothesized values, I tend to choose this method because I can construct one single confidence interval and compare it to as many values as I want. For example, with our 95% confidence interval [61.70; 80.30], I know that any value below 61.70 kg or above 80.30 kg will be rejected, without running a test for each value.
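This last point can be sketched in a few lines: once the interval is computed, any number of hypothesized values can be checked against it (the candidate values below are made up for illustration):

```python
# 95% CI on the mean weight, from the running example
ci_low, ci_high = 61.70, 80.30

# Hypothetical values of mu to test against the one interval (illustrative only)
decisions = {}
for mu0 in (55, 65, 75, 80, 85):
    inside = ci_low <= mu0 <= ci_high
    decisions[mu0] = "do not reject H0" if inside else "reject H0"

print(decisions)
```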
In this article, we reviewed the goals of hypothesis testing and when it is used. We then showed how to do a hypothesis test by hand through three different methods (A. critical value, B. p-value and C. confidence interval). We also showed how to interpret the results in the context of the initial problem. Although all three methods give the exact same conclusion when using the same data and the same significance level (otherwise there is a mistake somewhere), I also presented my personal preferences when it comes to choosing one method over the other two. Thanks for reading. I hope this article helped you to understand the structure of a hypothesis test done by hand. As a reminder, at least for the 6 hypothesis tests covered in this article, the formulas differ but the structure and the reasoning behind them remain the same. So you basically have to know which formulas to use, and then simply follow the steps mentioned in this article. For the interested reader, I created two accompanying Shiny apps: - Hypothesis testing and confidence intervals: after entering your data, the app illustrates all the steps in order to conclude the test and compute a confidence interval. See more information in this article.
- How to read statistical tables : the app helps you to compute the p -value given a t-stat for most probability distributions. See more information in this article .
As always, if you have a question or a suggestion related to the topic covered in this article, please add it as a comment so other readers can benefit from the discussion. Suppose a researcher wants to test whether Belgian women are taller than French women. Suppose a health professional would like to know whether the proportion of smokers is different among athletes and non-athletes. It would take way too long to measure the height of all Belgian and French women and to ask all athletes and non-athletes about their smoking habits. So most of the time, decisions are based on a representative sample of the population and not on the whole population. If we could measure the entire population in a reasonable time frame, we would not do any inferential statistics. ↩︎ Don’t get me wrong, this does not mean that hypothesis tests are never used in exploratory analyses. It is just much less frequent in exploratory research than in confirmatory research. ↩︎ You may see more or fewer steps in other articles or textbooks, depending on whether these steps are detailed or concise. Hypothesis testing should, however, follow the same process regardless of the number of steps. ↩︎ For one-sided tests, writing \(H_0: \mu = 80\) or \(H_0: \mu \ge 80\) are both correct. The point is that the null and alternative hypotheses must be mutually exclusive, since you are testing one hypothesis against the other, so both cannot be true at the same time. ↩︎ To be complete, there are even different formulas within each type of test, depending on whether some assumptions are met or not. For the interested reader, see all the different scenarios and thus the different formulas for a test on one mean and on two means. ↩︎ There is more uncertainty if the population variance is unknown than if it is known, and this greater uncertainty is taken into account by using the Student distribution instead of the standard Normal distribution.
Also note that as the sample size increases, the degrees of freedom of the Student distribution increase and the two distributions become more and more similar. For large sample sizes (usually from \(n >\) 30), the Student distribution becomes so close to the standard Normal distribution that, even if the population variance is unknown, the standard Normal distribution can be used. ↩︎ For a test on two independent samples, the number of degrees of freedom is \(n_1 + n_2 - 2\), where \(n_1\) and \(n_2\) are the sizes of the first and second sample, respectively. Note the - 2, due to the fact that in this case two quantities are estimated. ↩︎ The type II error is the probability of not rejecting the null hypothesis although it is in reality false. ↩︎ Whether this is a good or a bad standard is a question that comes up often and is debatable. This is, however, beyond the scope of the article. ↩︎ Again, p-values found via a statistical table or via R must be coherent. ↩︎
Hypothesis testing involves formulating assumptions about population parameters based on sample statistics and rigorously evaluating these assumptions against empirical evidence. This article sheds light on the significance of hypothesis testing and the critical steps involved in the process. What is Hypothesis Testing? A hypothesis is an assumption or idea, specifically a statistical claim about an unknown population parameter. For example, a judge assumes a person is innocent and verifies this by reviewing evidence and hearing testimony before reaching a verdict. Hypothesis testing is a statistical method used to make a decision about a population parameter using experimental data. It evaluates two mutually exclusive statements about a population to determine which statement is best supported by the sample data. To test the validity of the claim or assumption about the population parameter: - A sample is drawn from the population and analyzed.
- The results of the analysis are used to decide whether the claim is true or not.
Example: you claim that the average height in the class is 30, or that a boy is taller than a girl. These are assumptions we are making, and we need a statistical way to test them; we need a mathematical conclusion about whether what we are assuming is true. Defining Hypotheses - Null hypothesis (H0): In statistics, the null hypothesis is a general statement or default position that there is no relationship between two measured cases or no relationship among groups. In other words, it is a basic assumption made based on knowledge of the problem. Example: a company’s mean production is 50 units per day, i.e. H0: μ = 50.
- Alternative hypothesis (H1): The alternative hypothesis is the hypothesis used in hypothesis testing that is contrary to the null hypothesis. Example: a company’s production is not equal to 50 units per day, i.e. H1: μ ≠ 50.
Key Terms of Hypothesis Testing - Level of significance: It refers to the threshold at which we accept or reject the null hypothesis. Since 100% certainty is not possible when accepting a hypothesis, we select a level of significance, usually 5%. It is normally denoted by α and is generally 0.05 or 5%, which means you should be 95% confident of obtaining a similar result in each comparable sample.
- P-value: The P-value, or calculated probability, is the probability of finding the observed (or more extreme) results when the null hypothesis (H0) of a given study problem is true. If the P-value is less than the chosen significance level, you reject the null hypothesis, i.e. you accept that your sample supports the alternative hypothesis.
- Test Statistic: The test statistic is a numerical value calculated from sample data during a hypothesis test, used to determine whether to reject the null hypothesis. It is compared to a critical value or p-value to make decisions about the statistical significance of the observed results.
- Critical value : The critical value in statistics is a threshold or cutoff point used to determine whether to reject the null hypothesis in a hypothesis test.
- Degrees of freedom: Degrees of freedom reflect the variability, or freedom, one has in estimating a parameter. They are related to the sample size and determine the shape of the test statistic’s distribution.
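The terms above can be tied together in a few lines; here is a minimal sketch in Python with scipy for a two-tailed t-test (the test statistic below is a made-up number for illustration):

```python
from scipy import stats

alpha = 0.05   # level of significance
df = 9         # degrees of freedom (n - 1 for a one-sample t-test)

# Two-tailed critical value: cuts off alpha/2 probability in each tail
critical_value = stats.t.ppf(1 - alpha / 2, df)  # approximately 2.262

# Illustrative test statistic (would normally be computed from a sample)
test_statistic = 2.8

# Decision rule based on the critical value
reject_h0 = abs(test_statistic) > critical_value
print(round(critical_value, 3), reject_h0)
```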
Why do we use Hypothesis Testing? Hypothesis testing is an important procedure in statistics. It evaluates two mutually exclusive population statements to determine which statement is most supported by the sample data. When we say that findings are statistically significant, it is thanks to hypothesis testing. One-Tailed and Two-Tailed Test A one-tailed test focuses on one direction, either greater than or less than a specified value. We use a one-tailed test when there is a clear directional expectation based on prior knowledge or theory. The critical region is located on only one side of the distribution curve. If the sample falls into this critical region, the null hypothesis is rejected in favor of the alternative hypothesis. One-Tailed Test There are two types of one-tailed test: - Left-Tailed (Left-Sided) Test: The alternative hypothesis asserts that the true parameter value is less than the value under the null hypothesis. Example: H0: μ ≥ 50 and H1: μ < 50
- Right-Tailed (Right-Sided) Test: The alternative hypothesis asserts that the true parameter value is greater than the value under the null hypothesis. Example: H0: μ ≤ 50 and H1: μ > 50
Two-Tailed Test A two-tailed test considers both directions, greater than and less than a specified value. We use a two-tailed test when there is no specific directional expectation and we want to detect any significant difference. Example: H0: μ = 50 and H1: μ ≠ 50. What are Type 1 and Type 2 errors in Hypothesis Testing? In hypothesis testing, Type I and Type II errors are two possible errors that researchers can make when drawing conclusions about a population based on a sample of data. These errors are associated with the decisions made regarding the null hypothesis and the alternative hypothesis. - Type I error: when we reject the null hypothesis although it was true. The Type I error rate is denoted by alpha (α).
- Type II error: when we fail to reject the null hypothesis although it is false. The Type II error rate is denoted by beta (β).
| Decision | Null Hypothesis is True | Null Hypothesis is False |
|---|---|---|
| Accept Null Hypothesis | Correct Decision | Type II Error (False Negative) |
| Reject Null Hypothesis | Type I Error (False Positive) | Correct Decision |
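The meaning of the Type I error rate can be illustrated by simulation: when the null hypothesis is actually true, a test run at α = 0.05 should falsely reject in about 5% of samples. A minimal sketch, assuming normally distributed data and using scipy's one-sample t-test (all numbers are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha, n_sims, n = 0.05, 5000, 30
true_mu = 50  # H0 is true: the data really come from a population with mean 50

false_rejections = 0
for _ in range(n_sims):
    sample = rng.normal(loc=true_mu, scale=5, size=n)
    _, p = stats.ttest_1samp(sample, popmean=true_mu)
    if p < alpha:
        false_rejections += 1  # a Type I error: rejecting a true H0

type_i_rate = false_rejections / n_sims
print(round(type_i_rate, 3))  # close to alpha = 0.05
```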
How does Hypothesis Testing work? Step 1: Define null and alternative hypotheses. State the null hypothesis (H0), representing no effect, and the alternative hypothesis (H1), suggesting an effect or difference. We first identify the problem about which we want to make an assumption, keeping in mind that the two hypotheses must be contradictory to one another, and assuming normally distributed data. Step 2: Choose the significance level. Select a significance level (α), typically 0.05, to determine the threshold for rejecting the null hypothesis. It provides validity to the hypothesis test, ensuring that we have sufficient evidence to back up our claims. Usually, the significance level is determined before running the test; the p-value is then compared against it. Step 3: Collect and analyze data. Gather relevant data through observation or experimentation. Analyze the data using appropriate statistical methods to obtain a test statistic. Step 4: Calculate the test statistic. The data are evaluated in this step, and the choice of the test statistic depends on the type of hypothesis test being conducted. There are various hypothesis tests, each appropriate for a particular goal. This could be a Z-test, Chi-square test, T-test, and so on. - Z-test: if the population mean and standard deviation are known, the Z-statistic is commonly used.
- t-test: if the population standard deviation is unknown and the sample size is small, the t-test statistic is more appropriate.
- Chi-square test: used for categorical data or for testing independence in contingency tables.
- F-test: often used in analysis of variance (ANOVA) to compare variances or test the equality of means across multiple groups.
Since we have a small dataset, the T-test is more appropriate to test our hypothesis. The T-statistic is a measure of the difference between the means of two groups relative to the variability within each group. It is calculated as the difference between the sample means divided by the standard error of the difference. It is also known as the t-value or t-score. Step 5: Compare the test statistic. In this stage, we decide whether we should reject the null hypothesis or fail to reject it. There are two ways to make this decision. Method A: Using critical values. Comparing the test statistic and the tabulated critical value, we have: - If |Test Statistic| > Critical Value: reject the null hypothesis.
- If |Test Statistic| ≤ Critical Value: fail to reject the null hypothesis.
Note: Critical values are predetermined threshold values used to make a decision in hypothesis testing. To determine critical values, we typically refer to a statistical distribution table, such as the normal distribution or t-distribution table. Method B: Using p-values. We can also come to a conclusion using the p-value: - If the p-value is less than or equal to the significance level, i.e. p ≤ α, you reject the null hypothesis. This indicates that the observed results are unlikely to have occurred by chance alone, providing evidence in favor of the alternative hypothesis.
- If the p-value is greater than the significance level, i.e. p > α, you fail to reject the null hypothesis. This suggests that the observed results are consistent with what would be expected under the null hypothesis.
Note: The p-value is the probability of obtaining a test statistic as extreme as, or more extreme than, the one observed in the sample, assuming the null hypothesis is true. To determine the p-value, we typically refer to a statistical distribution table, such as the normal distribution or t-distribution table. Step 6: Interpret the results. At last, we can conclude our experiment using method A or B. Calculating the test statistic. To validate our hypothesis about a population parameter, we use statistical functions. We use the z-score, p-value, and level of significance (alpha) to build evidence for our hypothesis for normally distributed data. 1. Z-statistic: when the population mean and standard deviation are known, z = (x̄ − μ) / (σ/√n), where - x̄ is the sample mean,
- μ represents the population mean,
- σ is the standard deviation
- and n is the size of the sample.
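A minimal sketch of the Z-statistic formula above (all numbers are made up for illustration):

```python
import math

x_bar = 52.0   # sample mean (hypothetical)
mu = 50.0      # population mean under H0
sigma = 5.0    # known population standard deviation
n = 25         # sample size

# z = (sample mean - population mean) / (sigma / sqrt(n))
z = (x_bar - mu) / (sigma / math.sqrt(n))
print(z)  # 2.0
```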
2. T-statistic: the t-test is used when n < 30; the t-statistic is given by t = (x̄ − μ) / (s/√n), where - t = t-score,
- x̄ = sample mean
- μ = population mean,
- s = standard deviation of the sample,
- n = sample size
3. Chi-Square Test: the Chi-square test for independence is used for categorical (non-normally distributed) data: χ² = Σ (O_ij − E_ij)² / E_ij, where - O_ij is the observed frequency in cell (i, j),
- i and j are the row and column indices, respectively,
- E_ij is the expected frequency in cell (i, j), calculated as (row total × column total) / total observations.
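The chi-square formula can be sketched on a small, hypothetical 2×2 contingency table; the manual computation is cross-checked against scipy.stats.chi2_contingency (with the continuity correction disabled to match the raw formula):

```python
import numpy as np
from scipy import stats

# Hypothetical 2x2 table of observed counts (illustrative only)
observed = np.array([[20, 30],
                     [30, 20]])

# Expected frequencies: row total * column total / grand total
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected = row_totals * col_totals / observed.sum()  # every cell is 25 here

# Manual chi-square statistic: sum of (O - E)^2 / E over all cells
chi2_manual = ((observed - expected) ** 2 / expected).sum()

# Cross-check with scipy; correction=False disables Yates' continuity correction
chi2_scipy, p, dof, expected_scipy = stats.chi2_contingency(observed, correction=False)
print(chi2_manual, chi2_scipy, dof)
```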
Real-life Examples of Hypothesis Testing. Let’s examine hypothesis testing using two real-life situations. Case A: Does a New Drug Affect Blood Pressure? Imagine a pharmaceutical company has developed a new drug that they believe can effectively lower blood pressure in patients with hypertension. Before bringing the drug to market, they need to conduct a study to assess its impact on blood pressure. - Before Treatment: 120, 122, 118, 130, 125, 128, 115, 121, 123, 119
- After Treatment: 115, 120, 112, 128, 122, 125, 110, 117, 119, 114
Step 1: Define the hypotheses. - Null hypothesis (H0): the new drug has no effect on blood pressure.
- Alternate hypothesis (H1): the new drug has an effect on blood pressure.
Step 2: Define the significance level. Let’s set the significance level at 0.05: we will reject the null hypothesis if the evidence suggests less than a 5% chance of observing the results due to random variation alone. Step 3: Compute the test statistic. Using a paired T-test, analyze the data to obtain a test statistic and a p-value. The test statistic (here, the T-statistic) is calculated based on the differences between blood pressure measurements before and after treatment: t = m / (s/√n), where - m = mean of the differences d_i = X_after,i − X_before,i,
- s = standard deviation of the differences,
- n = sample size.
With m = −3.9, s ≈ 1.37 and n = 10, we calculate T-statistic = −9 based on the formula for the paired t-test. Step 4: Find the p-value. With the calculated t-statistic of −9 and degrees of freedom df = 9, you can find the p-value using statistical software or a t-distribution table: p-value = 8.538051223166285e-06. Step 5: Result. - If the p-value is less than or equal to 0.05, the researchers reject the null hypothesis.
- If the p-value is greater than 0.05, they fail to reject the null hypothesis.
Conclusion: Since the p-value (8.538051223166285e-06) is less than the significance level (0.05), the researchers reject the null hypothesis. There is statistically significant evidence that the average blood pressure before and after treatment with the new drug is different. Python Implementation of Case A. Let’s carry out this hypothesis test in Python, testing whether the new drug affects blood pressure with a paired T-test. We’ll use the scipy.stats library; SciPy is a scientific library in Python that is mostly used for mathematical and statistical computations. We will implement our first real-life problem in Python:

import numpy as np
from scipy import stats

# Data
before_treatment = np.array([120, 122, 118, 130, 125, 128, 115, 121, 123, 119])
after_treatment = np.array([115, 120, 112, 128, 122, 125, 110, 117, 119, 114])

# Step 1: Null and alternate hypotheses
# Null hypothesis: the new drug has no effect on blood pressure.
# Alternate hypothesis: the new drug has an effect on blood pressure.
null_hypothesis = "The new drug has no effect on blood pressure."
alternate_hypothesis = "The new drug has an effect on blood pressure."

# Step 2: Significance level
alpha = 0.05

# Step 3: Paired T-test
t_statistic, p_value = stats.ttest_rel(after_treatment, before_treatment)

# Step 4: Calculate the T-statistic manually
m = np.mean(after_treatment - before_treatment)
s = np.std(after_treatment - before_treatment, ddof=1)  # ddof=1 for the sample standard deviation
n = len(before_treatment)
t_statistic_manual = m / (s / np.sqrt(n))

# Step 5: Decision
if p_value <= alpha:
    decision = "Reject"
else:
    decision = "Fail to reject"

# Conclusion
if decision == "Reject":
    conclusion = ("There is statistically significant evidence that the average blood "
                  "pressure before and after treatment with the new drug is different.")
else:
    conclusion = ("There is insufficient evidence to claim a significant difference in "
                  "average blood pressure before and after treatment with the new drug.")

# Display results
print("T-statistic (from scipy):", t_statistic)
print("P-value (from scipy):", p_value)
print("T-statistic (calculated manually):", t_statistic_manual)
print(f"Decision: {decision} the null hypothesis at alpha={alpha}.")
print("Conclusion:", conclusion)

Output:
T-statistic (from scipy): -9.0
P-value (from scipy): 8.538051223166285e-06
T-statistic (calculated manually): -9.0
Decision: Reject the null hypothesis at alpha=0.05.
Conclusion: There is statistically significant evidence that the average blood pressure before and after treatment with the new drug is different.

In the above example, given the T-statistic of approximately -9 and an extremely small p-value, the results indicate a strong case to reject the null hypothesis at a significance level of 0.05. - The results suggest that the new drug, treatment, or intervention has a significant effect on lowering blood pressure.
- The negative T-statistic indicates that the mean blood pressure after treatment is significantly lower than the mean before treatment.
Case B: Cholesterol Level in a Population. Data: a sample of 25 individuals is taken, and their cholesterol levels are measured. Cholesterol Levels (mg/dL): 205, 198, 210, 190, 215, 205, 200, 192, 198, 205, 198, 202, 208, 200, 205, 198, 205, 210, 192, 205, 198, 205, 210, 192, 205. Population Mean = 200. Population Standard Deviation (σ): 5 mg/dL (given for this problem). Step 1: Define the hypotheses. - Null hypothesis (H0): the average cholesterol level in the population is 200 mg/dL.
- Alternate Hypothesis (H 1 ): The average cholesterol level in a population is different from 200 mg/dL.
As the direction of deviation is not given , we assume a two-tailed test, and based on a normal distribution table, the critical values for a significance level of 0.05 (two-tailed) can be calculated through the z-table and are approximately -1.96 and 1.96. The test statistic is calculated by using the z formula Z = [Tex](203.8 – 200) / (5 \div \sqrt{25}) [/Tex] and we get accordingly , Z =2.039999999999992. Step 4: Result Since the absolute value of the test statistic (2.04) is greater than the critical value (1.96), we reject the null hypothesis. And conclude that, there is statistically significant evidence that the average cholesterol level in the population is different from 200 mg/dL Python Implementation of Case Bimport scipy.stats as stats import math import numpy as np # Given data sample_data = np . array ( [ 205 , 198 , 210 , 190 , 215 , 205 , 200 , 192 , 198 , 205 , 198 , 202 , 208 , 200 , 205 , 198 , 205 , 210 , 192 , 205 , 198 , 205 , 210 , 192 , 205 ]) population_std_dev = 5 population_mean = 200 sample_size = len ( sample_data ) # Step 1: Define the Hypotheses # Null Hypothesis (H0): The average cholesterol level in a population is 200 mg/dL. # Alternate Hypothesis (H1): The average cholesterol level in a population is different from 200 mg/dL. # Step 2: Define the Significance Level alpha = 0.05 # Two-tailed test # Critical values for a significance level of 0.05 (two-tailed) critical_value_left = stats . norm . ppf ( alpha / 2 ) critical_value_right = - critical_value_left # Step 3: Compute the test statistic sample_mean = sample_data . mean () z_score = ( sample_mean - population_mean ) / \ ( population_std_dev / math . sqrt ( sample_size )) # Step 4: Result # Check if the absolute value of the test statistic is greater than the critical values if abs ( z_score ) > max ( abs ( critical_value_left ), abs ( critical_value_right )): print ( "Reject the null hypothesis." 
) print ( "There is statistically significant evidence that the average cholesterol level in the population is different from 200 mg/dL." ) else : print ( "Fail to reject the null hypothesis." ) print ( "There is not enough evidence to conclude that the average cholesterol level in the population is different from 200 mg/dL." ) Reject the null hypothesis. There is statistically significant evidence that the average cholesterol level in the population is different from 200 mg/dL. Limitations of Hypothesis Testing- Although a useful technique, hypothesis testing does not offer a comprehensive grasp of the topic being studied. Without fully reflecting the intricacy or whole context of the phenomena, it concentrates on certain hypotheses and statistical significance.
- The accuracy of hypothesis testing results is contingent on the quality of available data and the appropriateness of statistical methods used. Inaccurate data or poorly formulated hypotheses can lead to incorrect conclusions.
- Relying solely on hypothesis testing may cause analysts to overlook significant patterns or relationships in the data that are not captured by the specific hypotheses being tested. This limitation underscores the importance of complementing hypothesis testing with other analytical approaches.
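One way to soften the binary reject/fail-to-reject framing noted above is to report the p-value itself alongside the decision. As a small sketch (my addition, standard library only), the two-sided p-value for a z-test such as Case B can be computed from the normal survival function via `math.erfc`:

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a z-statistic under a standard normal null:
    p = 2 * P(Z >= |z|) = erfc(|z| / sqrt(2))."""
    return math.erfc(abs(z) / math.sqrt(2))

z = 2.04  # z-statistic from Case B
print(f"z = {z}, two-sided p-value = {two_sided_p(z):.4f}")
```

For z = 2.04 this gives a p-value of about 0.041, below 0.05, which matches the critical-value decision in Case B while also conveying how close the result is to the threshold.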
Hypothesis testing stands as a cornerstone of statistical analysis, enabling data scientists to navigate uncertainty and draw credible inferences from sample data. By systematically defining null and alternative hypotheses, choosing significance levels, and leveraging statistical tests, researchers can assess the validity of their assumptions. The article also elucidates the critical distinction between Type I and Type II errors, providing a comprehensive understanding of the nuanced decision-making process inherent in hypothesis testing. The real-life example of testing a new drug's effect on blood pressure using a paired T-test showcases the practical application of these principles, underscoring the importance of statistical rigor in data-driven decision-making.

Frequently Asked Questions (FAQs)

1. What are the 3 types of hypothesis tests?

There are three types of hypothesis tests: right-tailed, left-tailed, and two-tailed. Right-tailed tests assess whether a parameter is greater, left-tailed whether it is lesser. Two-tailed tests check for non-directional differences, greater or lesser.

2. What are the 4 components of hypothesis testing?

- Null hypothesis (H0): No effect or difference exists.
- Alternative hypothesis (H1): An effect or difference exists.
- Significance level (α): The risk of rejecting the null hypothesis when it is true (Type I error).
- Test statistic: A numerical value representing the observed evidence against the null hypothesis.

3. What is hypothesis testing in ML?

A statistical method to evaluate the performance and validity of machine learning models. It tests specific hypotheses about model behavior, such as whether features influence predictions or whether a model generalizes well to unseen data.
4. What is the difference between Pytest and Hypothesis in Python?

Pytest is a general-purpose testing framework for Python code, while Hypothesis is a property-based testing framework for Python that generates test cases based on specified properties of the code.
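The three tail choices from FAQ 1 can be illustrated directly with SciPy. The sketch below is an illustration I've added (it assumes SciPy >= 1.6, where `scipy.stats.ttest_1samp` gained the `alternative` parameter) and reuses the Case B cholesterol data in a one-sample t-test against 200 mg/dL:

```python
import numpy as np
from scipy import stats

sample = np.array([205, 198, 210, 190, 215, 205, 200, 192, 198, 205,
                   198, 202, 208, 200, 205, 198, 205, 210, 192, 205,
                   198, 205, 210, 192, 205])

# Run the same test once for each of the three alternative hypotheses
p_values = {}
for alt in ("two-sided", "greater", "less"):
    t_stat, p = stats.ttest_1samp(sample, popmean=200, alternative=alt)
    p_values[alt] = p
    print(f"{alt:>9}: t = {t_stat:.3f}, p = {p:.4f}")
```

The t-statistic is identical in all three runs; only the p-value changes. Since the sample mean (202.04) lies above 200, the right-tailed ("greater") p-value is half the two-sided one, while the left-tailed ("less") p-value is close to 1.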