Hypergeometric Distribution Calculator

Calculate probabilities for sampling without replacement from a finite population

About the Hypergeometric Distribution Calculator

The Hypergeometric Distribution Calculator is a specialized statistical tool designed to determine the probability of a specific number of successes in a sequence of draws from a finite population. Unlike the binomial distribution, which assumes that the probability of success remains constant (sampling with replacement), the hypergeometric distribution accounts for 'sampling without replacement.' This means that every time an item is removed from the population, the probability of drawing a similar item in the next step changes. This tool is essential for scenarios where the population size is small enough that individual draws significantly impact the remaining pool.

Quality control engineers, ecologists, and card players frequently use this calculator to model real-world dependencies. For instance, it can determine the likelihood of finding a specific number of defective units in a small batch or the probability of being dealt a specific hand in a card game. By inputting the total population, the known number of successes within that population, the sample size, and the desired number of successes, users can instantly calculate the probability of an exact match, as well as cumulative probabilities like 'at most' or 'at least' a certain number of successes.

Formula

P(X = k) = [ (K choose k) * (N - K choose n - k) ] / (N choose n)

In this formula, N represents the total population size, K is the total number of successes available in that population, n is the number of items drawn in the sample, and k is the specific number of observed successes you are calculating the probability for. The 'choose' notation refers to the binomial coefficient, which determines the number of ways to pick a subset of items regardless of order.

The numerator calculates the number of ways to choose exactly k successes from the K available and the remaining required items from the non-success portion of the population. The denominator represents the total possible ways to draw a sample of size n from population N. Dividing these yields the probability of that specific outcome occurring.
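The formula translates almost directly into code. The following is a minimal Python sketch, not the calculator's own implementation; the function name `hypergeom_pmf` and its argument checks are illustrative assumptions. It relies on `math.comb` from the standard library (Python 3.8 or later) for the binomial coefficients.

```python
from math import comb

def hypergeom_pmf(N, K, n, k):
    """P(X = k): exactly k successes in n draws without replacement.

    N: population size, K: successes in the population,
    n: sample size, k: successes observed in the sample.
    (Hypothetical helper name; not taken from the calculator.)
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("require 0 <= K <= N and 0 <= n <= N")
    # Outside the feasible range of k, the outcome is impossible.
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    # Ways to pick k successes and n - k non-successes,
    # divided by all ways to pick n items from N.
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

if __name__ == "__main__":
    print(hypergeom_pmf(20, 8, 5, 3))
```

Because integer binomial coefficients are computed exactly, the only rounding happens in the final division.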

Worked examples

Example 1: A jar contains 20 marbles: 8 are red and 12 are blue. If you pick 5 marbles at random without putting them back, what is the probability that exactly 3 are red?

N = 20, K = 8, n = 5, k = 3
1. Calculate (K choose k): (8 choose 3) = 56
2. Calculate (N - K choose n - k): (12 choose 2) = 66
3. Calculate the numerator: 56 * 66 = 3,696
4. Calculate the denominator (N choose n): (20 choose 5) = 15,504
5. Divide: 3,696 / 15,504 ≈ 0.2384

Result: 0.2384 (23.84%) chance of drawing exactly 3 red marbles.
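The same jar can illustrate the cumulative probabilities ('at most' and 'at least') mentioned earlier. The sketch below is an illustration under assumed names (`ways`, `at_most`, `at_least` are hypothetical, not part of the calculator); it sums the exact-match counts over the relevant values of k and uses `fractions.Fraction` to keep the arithmetic exact.

```python
from fractions import Fraction
from math import comb

N, K, n = 20, 8, 5  # jar: 20 marbles, 8 red, draw 5

def ways(k):
    """Number of samples of size n containing exactly k red marbles."""
    return comb(K, k) * comb(N - K, n - k)

total = comb(N, n)  # all possible samples of size n

# 'At most 3 red' sums k = 0..3; 'at least 3 red' sums k = 3..5.
at_most = Fraction(sum(ways(k) for k in range(0, 4)), total)
at_least = Fraction(sum(ways(k) for k in range(3, n + 1)), total)

if __name__ == "__main__":
    print(f"P(X <= 3) = {float(at_most):.4f}")
    print(f"P(X >= 3) = {float(at_least):.4f}")
```

Note that the two cumulative results overlap at k = 3, so they add up to more than 1 by exactly P(X = 3).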

Common use cases

Pitfalls and limitations

Frequently asked questions

What is the difference between the binomial and hypergeometric distributions?

The binomial distribution assumes sampling with replacement (independence), while the hypergeometric distribution is used for sampling without replacement. In a hypergeometric setup, each draw changes the probability of the next outcome because the population's composition shifts.

Can I use the binomial distribution instead of the hypergeometric for large populations?

Yes. When the sample is small relative to the population (a common rule of thumb is less than 5%), the change in probability from one draw to the next becomes negligible. In these cases, the binomial distribution with success probability K/N is often used as a simpler approximation.
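A quick numerical comparison can make this concrete. The sketch below uses hypothetical helper names and example populations chosen only for illustration; it measures the gap between the hypergeometric probability and its binomial approximation (with p = K/N) for a small and a large sampling fraction.

```python
from math import comb

def hypergeom_pmf(N, K, n, k):
    """Exact probability when sampling without replacement."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(n, p, k):
    """Binomial probability: n independent draws with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def approximation_gap(N, K, n, k):
    """Absolute difference between the exact value and the approximation."""
    return abs(hypergeom_pmf(N, K, n, k) - binom_pmf(n, K / N, k))

# Same 5% success rate and sample of 50 in both cases (assumed values).
gap_large_pop = approximation_gap(10_000, 500, 50, 2)  # sample is 0.5% of N
gap_small_pop = approximation_gap(100, 5, 50, 2)       # sample is 50% of N

if __name__ == "__main__":
    print(f"gap, sample 0.5% of population: {gap_large_pop:.5f}")
    print(f"gap, sample 50% of population:  {gap_small_pop:.5f}")
```

The gap shrinks as the sampling fraction n/N falls, which is why the approximation is reserved for samples that are a small share of the population.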

Can the hypergeometric distribution be used for non-integer values?

The hypergeometric distribution is discrete, meaning it only applies to whole numbers. You cannot have 2.5 successes in a sample, so the probability for a non-integer value is always zero.

What defines a success in hypergeometric probability?

A 'success' is simply the label for the specific characteristic you are tracking, such as a defective part, a specific card suit, or a person with a particular trait. It does not imply a positive or 'good' outcome in a general sense.

Is the hypergeometric distribution symmetric?

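The formula has a built-in symmetry: checking for k successes in a sample of n is the same as checking for (n - k) failures. If you swap the roles of success and failure (replace K with N - K and k with n - k), the probability is identical. The shape of the distribution itself is generally skewed, however; it is symmetric about its mean only in special cases, such as when exactly half of the population are successes (K = N/2).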
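The relabeling symmetry can be checked numerically. This is a small sketch with an assumed helper name (`pmf`), using exact fractions so the comparisons are not affected by rounding. It also checks a second, well-known identity: swapping the sample size n with the success count K leaves the probability unchanged.

```python
from fractions import Fraction
from math import comb

def pmf(N, K, n, k):
    """Exact hypergeometric probability as a fraction (hypothetical helper)."""
    return Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))

N, K, n, k = 20, 8, 5, 3

# Counting k successes equals counting n - k failures.
relabeled = pmf(N, N - K, n, n - k)

# Swapping the roles of sample size and success count.
swapped = pmf(N, n, K, k)

if __name__ == "__main__":
    print(pmf(N, K, n, k) == relabeled == swapped)
```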

Related calculators

5 Number Summary Calculator
Calculate the five-number summary (min, Q1, median, Q3, max) and visualize with a box plot
Absolute Uncertainty Calculator
Calculate absolute and relative uncertainty for measurements and experimental data
Average Rating Calculator
Calculate the weighted average star rating from individual vote counts for reviews and feedback
Accuracy Calculator
Calculate accuracy, precision, and error rates for statistical analysis
Adjusted R-Squared Calculator
Calculate adjusted R² to account for the number of predictors in regression models
AIC/BIC Calculator
Compare statistical models using Akaike and Bayesian Information Criteria for model selection
ANOVA Calculator
Perform one-way Analysis of Variance to test if group means differ significantly