Hypothesis Testing Calculator

Run Z-tests and T-tests (one-sample and two-sample) with one- or two-tailed alternatives, p-values, critical values, and a clear reject/fail-to-reject decision

About the Hypothesis Testing Calculator

Hypothesis testing is a fundamental statistical procedure used to determine if there is enough evidence in a sample of data to infer that a certain condition is true for the entire population. This calculator allows researchers, students, and data analysts to perform various frequentist tests, including one-sample and two-sample Z-tests and T-tests. By comparing a sample statistic to a hypothesized population parameter, the tool helps decide whether to reject or fail to reject the null hypothesis based on a chosen significance level, typically denoted as alpha.

Whether you are testing if a new manufacturing process increases strength or if a marketing campaign improved conversion rates, this tool automates the probability calculations involved. It calculates the test statistic, identifies the critical value from the relevant distribution (Normal or Student's T), and provides a precise p-value. It is designed to handle both one-tailed tests, where you predict a specific direction of change, and two-tailed tests, where you are investigating any significant difference regardless of direction. Users enter the hypothesized mean, sample mean, sample size, standard deviation, and significance level, and the tool reports the resulting decision.

Formula

t = (x̄ - μ) / (s / √n) OR z = (x̄ - μ) / (σ / √n)

For a one-sample test, x̄ is the sample mean, μ is the hypothesized population mean, s is the sample standard deviation (σ is the population standard deviation, used in the Z-test when it is known), and n is the sample size. The denominator represents the standard error of the mean. In two-sample tests, the numerator becomes the difference between the two sample means (x̄1 - x̄2), and the denominator becomes the standard error of that difference, which is a pooled standard error when the two populations are assumed to have equal variances.
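The formulas above can be sketched in a few lines of Python. The function names are illustrative and are not part of the calculator, and the two-sample version assumes equal population variances (the pooled form); other two-sample variants exist.

```python
import math

def one_sample_statistic(sample_mean, hypothesized_mean, sd, n):
    """(x̄ - μ) / (sd / √n): a t statistic if sd is s, a z statistic if sd is σ."""
    standard_error = sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

def pooled_two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with a pooled variance (assumes equal variances).

    Returns the statistic and its degrees of freedom (n1 + n2 - 2).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

# One-sample statistic for a sample of 25 with mean 51,200 and s = 2,450
# against a hypothesized mean of 50,000.
print(round(one_sample_statistic(51200, 50000, 2450, 25), 3))
```

The same one-sample function serves both tests because the arithmetic is identical; only the reference distribution (Normal or Student's T) used afterwards differs.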

Worked examples

Example 1: A lightbulb manufacturer claims their new LED lasts 50,000 hours. A test of 25 bulbs shows a mean of 51,200 hours with a standard deviation of 2,450 hours. We test at alpha = 0.05.

Hypothesized Mean (μ): 50,000
Sample Mean (x̄): 51,200
Standard Deviation (s): 2,450
Sample Size (n): 25
Standard Error = 2450 / sqrt(25) = 490
t = (51200 - 50000) / 490 = 2.4490
Degrees of Freedom = 24
One-tailed p-value from the T-distribution with 24 degrees of freedom: P(T > 2.449) ≈ 0.011.

Result: t = 2.45, one-tailed p-value ≈ 0.011. Reject the Null Hypothesis (p < 0.05). The sample provides evidence that the new bulb lasts longer than 50,000 hours.
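Example 1 can be checked numerically with a short standard-library sketch. Instead of a table lookup, it integrates the Student's T density with Simpson's rule; this is a swapped-in technique for illustration, not necessarily how the calculator works, and the helper names are hypothetical. In practice a statistics library such as SciPy would normally be used.

```python
import math

def t_pdf(x, df):
    """Density of Student's T distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t), using composite Simpson's rule on [0, |t|] (steps must be even)."""
    if t < 0:
        return 1.0 - t_upper_tail(-t, df, steps)
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half the probability mass lies above 0; subtract the area between 0 and t.
    return 0.5 - total * h / 3

mu0, sample_mean, s, n, alpha = 50000, 51200, 2450, 25, 0.05
t_stat = (sample_mean - mu0) / (s / math.sqrt(n))
p_one_tailed = t_upper_tail(t_stat, n - 1)   # Ha: mean > 50,000
p_two_tailed = 2 * t_upper_tail(abs(t_stat), n - 1)

print(f"t = {t_stat:.3f}")
print(f"one-tailed p = {p_one_tailed:.3f}, two-tailed p = {p_two_tailed:.3f}")
print("reject H0" if p_one_tailed < alpha else "fail to reject H0")
```

With these inputs the one-tailed p-value comes out at roughly 0.011 and the two-tailed value at roughly twice that, so the null hypothesis is rejected at alpha = 0.05 under either alternative.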

Example 2: A school wants to see if a new teaching method results in different test scores than the city average of 75. They test 100 students and find a mean score of 73.5 with a known population standard deviation of 9.5. Alpha is 0.05.

Hypothesized Mean (μ): 75
Sample Mean (x̄): 73.5
Population SD (σ): 9.5
Sample Size (n): 100
Standard Error = 9.5 / sqrt(100) = 0.95
z = (73.5 - 75) / 0.95 = -1.5789
Two-tailed p-value = 2 * P(Z < -1.5789) = 0.1144.

Result: z = -1.58, p-value = 0.114. Fail to Reject the Null Hypothesis (p > 0.05). The sample does not provide sufficient evidence that scores under the new method differ from the city average.
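The arithmetic in Example 2 can be reproduced with Python's standard library. This is a minimal sketch of the two-tailed Z-test as described above; the variable names are illustrative only.

```python
import math
from statistics import NormalDist

mu0, sample_mean, sigma, n, alpha = 75, 73.5, 9.5, 100, 0.05

standard_error = sigma / math.sqrt(n)
z = (sample_mean - mu0) / standard_error

# Two-tailed p-value: probability of a statistic at least this far from 0
# in either direction under the standard normal distribution.
p_two_tailed = 2 * NormalDist().cdf(-abs(z))

decision = "reject H0" if p_two_tailed < alpha else "fail to reject H0"
print(f"z = {z:.4f}, p = {p_two_tailed:.4f}, decision: {decision}")
```

Taking the absolute value of z before looking up the lower tail keeps the calculation correct whether the sample mean falls above or below the hypothesized mean.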

Common use cases

A pharmaceutical company testing if a new drug lowers blood pressure more effectively than a placebo.
An e-commerce manager checking if a website redesign resulted in a statistically significant increase in average order value.
A quality control engineer determining if the weight of cereal boxes deviates from the 500g label requirement.
A social scientist comparing the average income levels between two different geographic regions to see if a significant wealth gap exists.

Pitfalls and limitations

Using a Z-test for small sample sizes when the population standard deviation is unknown can lead to an underestimation of the p-value.
A 'fail to reject' decision does not prove the null hypothesis is true; it simply means there is insufficient evidence to support the alternative.
Always check for data outliers or non-normal distributions, as these can violate the underlying assumptions of parametric T-tests.
Multiple comparisons without adjusting the significance level can lead to an inflated risk of Type I errors.
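One common adjustment for the multiple-comparisons pitfall is the Bonferroni correction, which divides alpha by the number of tests. The sketch below is illustrative only: the function name and the p-values are made up for the example, and other corrections exist.

```python
def bonferroni_decisions(p_values, alpha=0.05):
    """Compare each p-value with alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    return threshold, [p < threshold for p in p_values]

# Hypothetical p-values from three separate tests on the same data set.
threshold, rejected = bonferroni_decisions([0.010, 0.030, 0.040])
print(f"per-test threshold = {threshold:.4f}")
print(rejected)
```

All three hypothetical p-values fall below 0.05, but only the first falls below the adjusted threshold of about 0.0167, so only that null hypothesis would be rejected.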

Frequently asked questions

When should I use a Z-test versus a T-test for hypothesis testing?

Use a Z-test when the population standard deviation is known; it is also commonly used as an approximation when the sample is large (typically n > 30). Use a T-test when the population standard deviation is unknown and must be estimated from the sample, particularly when the sample is small. This is the most common scenario in real-world research.

What does the p-value actually mean in hypothesis testing?

The p-value represents the probability of obtaining test results at least as extreme as the observed results, assuming the null hypothesis is true. If the p-value is less than your significance level (alpha), you reject the null hypothesis.

What is the difference between a one-tailed and a two-tailed test?

A one-tailed test is used when you are looking for a change in a specific direction (e.g., is Group A better than Group B). A two-tailed test is used when you want to determine if there is any difference at all, regardless of direction.
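The choice of tail changes the critical value that the test statistic is compared against. The following sketch shows this for a Z-test; the function names and the "left", "right", and "two" labels are assumptions made for this example, not the calculator's own options.

```python
from statistics import NormalDist

def z_critical(alpha, tail):
    """Critical z value for a 'left', 'right', or 'two' tailed test."""
    std_normal = NormalDist()
    if tail == "two":
        return std_normal.inv_cdf(1 - alpha / 2)  # alpha split across both tails
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'left', 'right', or 'two'")

def rejects(z, alpha, tail):
    """True if z falls in the rejection region for the chosen alternative."""
    critical = z_critical(alpha, tail)
    if tail == "two":
        return abs(z) > critical
    if tail == "right":
        return z > critical
    return z < critical

for tail in ("two", "right", "left"):
    print(tail, round(z_critical(0.05, tail), 3))

# The same statistic can lead to different decisions depending on the tail.
print(rejects(-1.7, 0.05, "left"), rejects(-1.7, 0.05, "two"))
```

At alpha = 0.05 the two-tailed critical value is about 1.960 and the one-tailed value is about 1.645, so a statistic of -1.7 is rejected by a left-tailed test but not by a two-tailed one. This is why the direction should be chosen before looking at the data.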

What are Type I and Type II errors in hypothesis testing?

A Type I error occurs when you reject the null hypothesis when it is actually true (a false positive). A Type II error occurs when you fail to reject a null hypothesis that is actually false (a false negative).
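A small simulation can make the Type I error rate concrete. The sketch below repeatedly draws samples from a population where the null hypothesis is true (mean 0, known standard deviation 1) and counts how often a two-tailed Z-test rejects anyway. The function name, sample size, trial count, and seed are arbitrary choices for illustration.

```python
import math
import random
from statistics import NormalDist, fmean

def type_one_error_rate(trials=2000, n=30, alpha=0.05, seed=0):
    """Fraction of two-tailed Z-tests that reject a null hypothesis that is true."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    false_positives = 0
    for _ in range(trials):
        # The null hypothesis (mean = 0) is true for every simulated sample.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = fmean(sample) / (1.0 / math.sqrt(n))
        if abs(z) > critical:
            false_positives += 1
    return false_positives / trials

print(type_one_error_rate())
```

The printed rate should land close to alpha, since alpha is by definition the probability of a Type I error when the test's assumptions hold.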

How do I choose the null and alternative hypotheses?

The null hypothesis (H0) is the default claim that there is no effect or no difference. Your alternative hypothesis (Ha) is the claim you are trying to find evidence for, suggesting a significant effect exists.