Bernoulli Distribution | Random Variables & Probability Distributions

What is the Bernoulli Distribution?

The Bernoulli distribution is the simplest discrete probability distribution in statistics. It models a single trial (or experiment) that can result in exactly one of two possible outcomes: "success" or "failure".

Definition: A random variable X follows a Bernoulli distribution with parameter p if it takes only the values 1 (success) and 0 (failure), with P(X = 1) = p and P(X = 0) = 1 - p.

Mathematical Formulation

Let X be a Bernoulli random variable with parameter p (where 0 ≤ p ≤ 1). Then:

Probability Mass Function (PMF)

P(X = k) = \begin{cases} p, & \text{if}\ k = 1 \\ 1-p, & \text{if}\ k = 0 \\ 0, & \text{otherwise} \end{cases}
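The piecewise PMF above translates almost line for line into code. The following is a minimal sketch; the function name bernoulli_pmf and the choice to raise an error for an invalid p are illustrative assumptions, not part of the original definition.

```python
def bernoulli_pmf(k, p):
    """Return P(X = k) for X ~ Bernoulli(p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if k == 1:
        return p          # success
    if k == 0:
        return 1.0 - p    # failure
    return 0.0            # any other value has probability zero


print(bernoulli_pmf(1, 0.25))  # 0.25
print(bernoulli_pmf(0, 0.25))  # 0.75
print(bernoulli_pmf(2, 0.25))  # 0.0
```

The two non-zero probabilities always sum to 1, which is a quick sanity check for any implementation.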

Cumulative Distribution Function (CDF)

F(x) = P(X \leq x) = \begin{cases} 0, & \text{if}\ x < 0 \\ 1-p, & \text{if}\ 0 \leq x < 1 \\ 1, & \text{if}\ x \geq 1 \end{cases}
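The CDF is a step function with jumps at x = 0 and x = 1. A small sketch of the three cases, assuming a hypothetical helper named bernoulli_cdf that accepts any real x:

```python
def bernoulli_cdf(x, p):
    """Return F(x) = P(X <= x) for X ~ Bernoulli(p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if x < 0:
        return 0.0        # neither outcome is <= x
    if x < 1:
        return 1.0 - p    # only the outcome 0 is <= x
    return 1.0            # both outcomes are <= x


for x in (-0.5, 0, 0.5, 1, 2):
    print(x, bernoulli_cdf(x, 0.25))
```

With p = 0.25 the printed values are 0.0, 0.75, 0.75, 1.0 and 1.0, matching the three branches of the formula.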

The Bernoulli distribution is often written as X ~ Bernoulli(p) or X ~ Bern(p).

Key Properties

Mean (Expected Value)

E[X] = p

The average outcome over many trials is simply the probability of success.

Variance

Var(X) = p(1-p)

The variance is maximized at p = 0.5, where Var(X) = 0.25.
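The mean and variance formulas can be checked empirically. The sketch below draws simulated Bernoulli trials using Python's standard random module; the helper name sample_bernoulli, the value p = 0.3, the sample size and the seed are all arbitrary choices made for illustration.

```python
import random


def sample_bernoulli(p, n, seed=0):
    """Draw n independent Bernoulli(p) values as a list of 0s and 1s."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]


p = 0.3
draws = sample_bernoulli(p, 100_000)

sample_mean = sum(draws) / len(draws)
sample_var = sum((x - sample_mean) ** 2 for x in draws) / len(draws)

print(f"sample mean:     {sample_mean:.4f}  (theory p = {p})")
print(f"sample variance: {sample_var:.4f}  (theory p(1-p) = {p * (1 - p):.4f})")
```

The sample values should land close to 0.3 and 0.21, with the gap shrinking as the number of draws grows.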

Skewness

Skew(X) = \frac{1-2p}{\sqrt{p(1-p)}}

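The distribution is symmetric when p = 0.5, positively skewed when p < 0.5, and negatively skewed when p > 0.5. The formula requires 0 < p < 1: at p = 0 or p = 1 the variance is zero and the skewness is undefined.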

Kurtosis

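Kurt(X) = \frac{1-6p(1-p)}{p(1-p)}

This expression is the excess kurtosis, that is, the fourth standardized moment minus 3 (the value for a normal distribution). Kurtosis measures the "tailedness" of a distribution, though this is less intuitive for the two-point Bernoulli distribution. At p = 0.5 the excess kurtosis equals -2, and it grows without bound as p approaches 0 or 1.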
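Both closed forms follow from the central moments of the two-point distribution. As a rough check, the sketch below computes the standardized moments directly from the PMF and compares them with the formulas above; note that the kurtosis formula on this page is in excess form (fourth standardized moment minus 3). The helper names and the value p = 0.2 are illustrative assumptions.

```python
import math


def central_moment(p, order):
    """E[(X - p)^order], summed over the two outcomes 0 and 1."""
    return (1 - p) * (0 - p) ** order + p * (1 - p) ** order


def skewness(p):
    """Third standardized moment."""
    return central_moment(p, 3) / central_moment(p, 2) ** 1.5


def excess_kurtosis(p):
    """Fourth standardized moment minus 3."""
    return central_moment(p, 4) / central_moment(p, 2) ** 2 - 3


p = 0.2
q = 1 - p
print(skewness(p), (1 - 2 * p) / math.sqrt(p * q))   # both approximately 1.5
print(excess_kurtosis(p), (1 - 6 * p * q) / (p * q))  # both approximately 0.25
```

The two columns agree up to floating-point rounding, which confirms that the formulas are just the general moment definitions specialized to two outcomes.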

Moment Generating Function

The moment generating function (MGF) of a Bernoulli distribution is:

M_X(t) = E[e^{tX}] = (1-p) + pe^t

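Differentiating the MGF and evaluating at t = 0 yields the moments of the distribution: M_X'(0) = E[X] = p and M_X''(0) = E[X^2] = p, so Var(X) = E[X^2] - (E[X])^2 = p - p^2 = p(1-p).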
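To make the link between the MGF and the moments concrete: the first derivative of M_X at t = 0 gives E[X], and the second derivative gives E[X^2]. The sketch below approximates both derivatives with central finite differences, a numerical stand-in chosen purely for illustration; the helper names, the step size h and the value p = 0.3 are assumptions.

```python
import math


def mgf(t, p):
    """Moment generating function of Bernoulli(p): (1 - p) + p * e^t."""
    return (1 - p) + p * math.exp(t)


def first_derivative(f, t, h=1e-4):
    """Central-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)


def second_derivative(f, t, h=1e-4):
    """Central-difference approximation of f''(t)."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / h ** 2


p = 0.3
first_moment = first_derivative(lambda t: mgf(t, p), 0.0)    # approximates E[X]
second_moment = second_derivative(lambda t: mgf(t, p), 0.0)  # approximates E[X^2]
variance = second_moment - first_moment ** 2

print(f"E[X]   ~ {first_moment:.6f}")
print(f"E[X^2] ~ {second_moment:.6f}")
print(f"Var(X) ~ {variance:.6f}")
```

Both moments come out near p = 0.3 (since X^2 = X for a 0/1 variable), and the variance near p(1-p) = 0.21.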

Visual Representation

Probability Mass Function for Different Values of p

[Figure: PMF bar charts in three panels: p = 0.2 (low probability of success), p = 0.5 (equal probability of success and failure), and p = 0.8 (high probability of success).]

Cumulative Distribution Function for Different Values of p

[Figure: CDF step plots in three panels: p = 0.2, p = 0.5, and p = 0.8.]

Variance as a Function of p

The variance of a Bernoulli distribution, p(1-p), is a downward-opening parabola in p. It reaches its maximum value of 0.25 when p = 0.5, and approaches 0 as p approaches either 0 or 1.
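The shape of the variance curve is easy to tabulate. The sketch below evaluates p(1-p) on an evenly spaced grid and locates the maximum; the grid resolution of 0.01 is an arbitrary choice.

```python
def bernoulli_variance(p):
    """Variance of Bernoulli(p)."""
    return p * (1 - p)


# A few points along the parabola.
for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(p, bernoulli_variance(p))

# Search a grid of 101 points in [0, 1] for the largest variance.
grid = [i / 100 for i in range(101)]
best_p = max(grid, key=bernoulli_variance)
print(best_p, bernoulli_variance(best_p))  # 0.5 0.25
```

The table is symmetric around p = 0.5, reflecting that swapping the labels "success" and "failure" leaves the variance unchanged.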

Real-World Applications

The Bernoulli distribution appears in many real-world scenarios where we're interested in the outcome of a single trial with two possible results:

Coin Flips

A fair coin has p = 0.5, but biased coins can be modeled with different values of p.

Example: In a weighted coin, p = 0.6 means there's a 60% chance of getting heads on a single flip.

Surveys

Each respondent's answer to a yes/no survey question can be modeled as a Bernoulli trial.

Example: When asking "Do you own a car?", p might be 0.7, meaning a randomly chosen respondent answers "yes" with probability 0.7 (for instance, because 70% of the population owns a car).

Medical Testing

Each individual test for a disease can be modeled as a Bernoulli trial.

Example: A test for a rare disease might have p = 0.02, representing the 2% probability of a positive result.

Sports

Each shot, swing, or attempt in many sports can be modeled as a Bernoulli trial.

Example: A basketball player with a 75% free throw success rate has p = 0.75 for each free throw attempt.

Email

The classification of an email as spam or not spam follows a Bernoulli distribution.

Example: If 30% of all emails are spam, then p = 0.3 for each incoming email being classified as spam.

Quality Control

Each manufactured item can be classified as defective or non-defective.

Example: A manufacturing process with a 1% defect rate has p = 0.01 for each item being defective.

Relationship to Other Distributions

The Bernoulli distribution is the foundation for several other important probability distributions:

Binomial Distribution

The sum of n independent and identically distributed Bernoulli random variables follows a Binomial distribution with parameters n and p.

X_1, X_2, \ldots, X_n \sim \text{Bernoulli}(p) \\ Y = \sum_{i=1}^n X_i \sim \text{Binomial}(n, p)
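The relationship can be illustrated by simulation: add up n simulated Bernoulli trials many times and compare the observed frequencies of each total with the Binomial PMF. This is a sketch only; the parameters n = 5 and p = 0.4, the seed, and the helper names are assumptions made for the example.

```python
import math
import random
from collections import Counter


def binomial_pmf(k, n, p):
    """P(Y = k) for Y ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def sum_of_bernoullis(n, p, rng):
    """Sum of n independent Bernoulli(p) draws."""
    return sum(1 if rng.random() < p else 0 for _ in range(n))


rng = random.Random(1)
n, p, repeats = 5, 0.4, 50_000
counts = Counter(sum_of_bernoullis(n, p, rng) for _ in range(repeats))

for k in range(n + 1):
    observed = counts[k] / repeats
    print(f"k={k}  simulated={observed:.4f}  binomial={binomial_pmf(k, n, p):.4f}")
```

The simulated and theoretical columns should agree to roughly two decimal places at this sample size.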


Geometric Distribution

If we count the number of independent Bernoulli trials up to and including the first success, we get a Geometric distribution with parameter p. Under this convention the count takes the values 1, 2, 3, and so on.

X_i \sim \text{Bernoulli}(p) \\ Y = \min\{n : X_n = 1\} \sim \text{Geometric}(p)
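A small simulation shows the construction: keep drawing Bernoulli trials until the first success and record how many were needed. Under the convention used here (counting the successful trial itself), the expected count is 1/p and P(Y = 1) = p. The function name, p = 0.25 and the seed are illustrative assumptions, and the loop requires p > 0 to terminate.

```python
import random


def trials_until_first_success(p, rng):
    """Count Bernoulli(p) trials up to and including the first success."""
    count = 0
    while True:
        count += 1
        if rng.random() < p:
            return count


rng = random.Random(2)
p = 0.25
samples = [trials_until_first_success(p, rng) for _ in range(50_000)]

sample_mean = sum(samples) / len(samples)
share_first_try = samples.count(1) / len(samples)

print(f"sample mean: {sample_mean:.3f}  (theory 1 / p = {1 / p})")
print(f"share succeeding on trial 1: {share_first_try:.3f}  (theory p = {p})")
```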


Negative Binomial Distribution

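If we count the number of independent Bernoulli trials up to and including the r-th success, we get a Negative Binomial distribution with parameters r and p. Some texts and software instead count only the failures before the r-th success, which shifts the variable by r.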

X_i \sim \text{Bernoulli}(p) \\ Y = \min\{n : \sum_{i=1}^n X_i = r\} \sim \text{NegativeBinomial}(r, p)
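The same idea extends to the r-th success. The sketch below counts trials (not failures), matching the definition of Y above, so the expected value is r/p and Y can never be smaller than r. The function name, r = 3, p = 0.5 and the seed are assumptions for illustration.

```python
import random


def trials_until_rth_success(r, p, rng):
    """Count Bernoulli(p) trials up to and including the r-th success."""
    trials = 0
    successes = 0
    while successes < r:
        trials += 1
        if rng.random() < p:
            successes += 1
    return trials


rng = random.Random(4)
r, p = 3, 0.5
samples = [trials_until_rth_success(r, p, rng) for _ in range(50_000)]

sample_mean = sum(samples) / len(samples)
print(f"sample mean: {sample_mean:.3f}  (theory r / p = {r / p})")
print(f"smallest value observed: {min(samples)}  (cannot be below r = {r})")
```

Setting r = 1 reduces this to the Geometric case.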

Indicator Random Variables

Bernoulli random variables are often used as indicator variables in probability and statistics to simplify calculations involving events.

I_A(x) = \begin{cases} 1, & \text{if}\ x \in A \\ 0, & \text{if}\ x \notin A \end{cases}

Here A is an event and I_A is the indicator random variable for A. Because I_A equals 1 exactly when A occurs, I_A ~ Bernoulli(P(A)) and E[I_A] = P(A).
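Because an indicator is a Bernoulli variable with success probability P(A), averaging indicator values over repeated experiments estimates P(A). The sketch below uses a hypothetical example that is not from the text above: the event "a fair six-sided die shows an even number", for which P(A) = 0.5.

```python
import random


def indicator(event, outcome):
    """Return 1 if the outcome belongs to the event, else 0."""
    return 1 if event(outcome) else 0


def is_even(roll):
    """The event A: the die shows 2, 4 or 6."""
    return roll % 2 == 0


rng = random.Random(3)
rolls = [rng.randint(1, 6) for _ in range(60_000)]   # simulated fair die
values = [indicator(is_even, roll) for roll in rolls]

estimate = sum(values) / len(values)  # sample mean of the indicator
print(f"estimated P(A): {estimate:.3f}  (theory 0.5)")
```

The same pattern underlies many counting arguments: write a count as a sum of indicators, then use linearity of expectation.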

Examples

Example 1: Coin Flip

Suppose you flip a fair coin once. Let X = 1 if the outcome is heads and X = 0 if the outcome is tails.

Question: What is the probability mass function (PMF) of X? What are the mean and variance of X?

Solution:

Since the coin is fair, the probability of heads is p = 0.5.

The PMF of X is:

P(X = 1) = p = 0.5 \\ P(X = 0) = 1-p = 1 - 0.5 = 0.5

The mean (expected value) of X is:

E[X] = p = 0.5

The variance of X is:

Var(X) = p(1-p) = 0.5 \times 0.5 = 0.25

Example 2: Medical Test

A medical test for a certain disease gives the correct result 99% of the time. Let X = 1 if the test gives the correct result and X = 0 if it gives the incorrect result.

Question: What is the cumulative distribution function (CDF) of X? What is P(X = 1)?

Solution:

The probability of a correct result is p = 0.99.

The CDF of X is:

F(x) = \begin{cases} 0, & \text{if}\ x < 0 \\ 1-p = 1-0.99 = 0.01, & \text{if}\ 0 \leq x < 1 \\ 1, & \text{if}\ x \geq 1 \end{cases}

The probability P(X = 1) is simply:

P(X = 1) = p = 0.99

This means there's a 99% chance of getting a correct test result.

Example 3: Email Classification

An email spam filter has a probability of 0.95 of correctly classifying an email. Let X = 1 if an email is correctly classified and X = 0 otherwise.

Question: What are the skewness and kurtosis of this distribution?

Solution:

The probability of a correct classification is p = 0.95.

The skewness of a Bernoulli distribution is:

Skew(X) = \frac{1-2p}{\sqrt{p(1-p)}}

Substituting p = 0.95:

Skew(X) = \frac{1-2 \times 0.95}{\sqrt{0.95 \times 0.05}} = \frac{1-1.9}{\sqrt{0.0475}} = \frac{-0.9}{0.218} \approx -4.13

The negative skewness indicates that the distribution is skewed to the left, which makes sense since the probability mass is concentrated at X = 1.

The kurtosis of a Bernoulli distribution is:

Kurt(X) = \frac{1-6p(1-p)}{p(1-p)}

Substituting p = 0.95:

Kurt(X) = \frac{1-6 \times 0.95 \times 0.05}{0.95 \times 0.05} = \frac{1-0.285}{0.0475} \approx 15.05

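This large positive value (an excess kurtosis, measured relative to the normal distribution's kurtosis of 3) reflects that almost all of the probability mass sits at X = 1, while the rare outcome X = 0 lies far from the mean in standard-deviation units. This is characteristic of Bernoulli distributions with p close to 0 or 1.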
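The arithmetic in this example can be reproduced in a few lines. This is only a check of the two substitutions above, using the same formulas and p = 0.95; the variable names are illustrative.

```python
import math

p = 0.95
q = 1 - p  # probability of an incorrect classification

skew = (1 - 2 * p) / math.sqrt(p * q)
kurt = (1 - 6 * p * q) / (p * q)  # excess-kurtosis form used on this page

print(round(skew, 2))  # -4.13
print(round(kurt, 2))  # 15.05
```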

