Bernoulli Random Variable
1 Bernoulli Distribution
The Bernoulli distribution is a discrete probability distribution that models the outcome of a single trial with two possible outcomes: success (1) and failure (0).
1.1 📌 1. Definition
A Bernoulli random variable \(X\) takes the value:
- \(X = 1\) with probability \(p\) (success),
- \(X = 0\) with probability \(1 - p\) (failure).
Mathematically, the probability mass function (PMF) is given by: \[ P(X = x) = p^x (1 - p)^{1 - x}, \quad x \in \{0, 1\}, \ 0 \leq p \leq 1. \]
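As a minimal sketch, the PMF above can be written as a small Python function. The name `bernoulli_pmf` and the value `p = 0.7` are illustrative assumptions, not part of the original text.

```python
def bernoulli_pmf(x: int, p: float) -> float:
    """Return P(X = x) for X ~ Bernoulli(p), using p^x (1 - p)^(1 - x)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if x not in (0, 1):
        return 0.0  # zero probability outside the support {0, 1}
    return p**x * (1 - p) ** (1 - x)


p = 0.7  # illustrative success probability
probs = {x: bernoulli_pmf(x, p) for x in (0, 1)}
print(probs)
```

The two probabilities sum to 1 (up to floating-point rounding), as required of a PMF.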
1.2 ⚡ 2. Properties of the Bernoulli Distribution
1.2.1 🎯 Mean (Expected Value)
The mean represents the expected outcome of the random variable: \[ E[X] = p. \]
The expected value \(E(X)\) of a discrete random variable \(X\) is given by: \[ E(X) = \sum_{x} x \cdot P(X = x). \] For a Bernoulli random variable: \[ E(X) = 1 \cdot p + 0 \cdot (1 - p) = p. \]
1.2.2 🎲 Variance
The variance measures how much the outcomes deviate from the mean: \[ \operatorname{Var}(X) = p(1 - p). \] To derive this, start from the definition of the variance of a random variable \(X\): \[ \operatorname{Var}(X) = E[(X - E(X))^2]. \] Expanding the square with \(E(X) = p\): \[ \operatorname{Var}(X) = E(X^2 - 2pX + p^2). \] Since \(p^2\) is constant and expectation is linear, we have: \[ \operatorname{Var}(X) = E(X^2) - 2pE(X) + p^2. \]
For a Bernoulli variable, \(X^2 = X\) (because \(1^2 = 1\) and \(0^2 = 0\)): \[ E(X^2) = E(X) = p. \] Substituting back, \[ \operatorname{Var}(X) = p - 2p^2 + p^2 = p - p^2 = p(1 - p). \]
📝 Standard Deviation
The standard deviation is the square root of the variance: \[ \sigma = \sqrt{p(1 - p)}. \]
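The closed forms \(E(X) = p\) and \(\operatorname{Var}(X) = p(1 - p)\) can be checked by summing directly over the two support points. The sketch below does this; the helper name `bernoulli_moments` and the value `p = 0.3` are assumptions chosen for illustration.

```python
import math


def bernoulli_moments(p: float) -> tuple[float, float, float]:
    """Compute mean, variance, and standard deviation by summing over {0, 1}."""
    pmf = {0: 1 - p, 1: p}
    mean = sum(x * prob for x, prob in pmf.items())
    variance = sum((x - mean) ** 2 * prob for x, prob in pmf.items())
    return mean, variance, math.sqrt(variance)


p = 0.3  # illustrative success probability
mean, variance, std = bernoulli_moments(p)
print(f"mean={mean:.4f}, variance={variance:.4f}, std={std:.4f}")
```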
1.2.3 📏 Skewness
Skewness measures the asymmetry of the distribution: \[ \gamma_1 = \frac{1 - 2p}{\sqrt{p(1 - p)}}. \]
1.2.4 🛡️ Kurtosis
The excess kurtosis (the fourth standardized moment minus 3) of the Bernoulli distribution is: \[ \gamma_2 = \frac{1 - 6p(1 - p)}{p(1 - p)}. \]
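Both formulas can be verified from the definition of standardized moments. Note that the expression for \(\gamma_2\) corresponds to the excess kurtosis, i.e. the fourth standardized moment minus 3. The following is a rough check under the assumption \(0 < p < 1\); the helper `standardized_moment` and the value `p = 0.2` are hypothetical.

```python
import math


def standardized_moment(p: float, k: int) -> float:
    """k-th standardized central moment of Bernoulli(p), summed over {0, 1}."""
    pmf = {0: 1 - p, 1: p}
    mean = p
    std = math.sqrt(p * (1 - p))  # requires 0 < p < 1
    return sum(((x - mean) / std) ** k * prob for x, prob in pmf.items())


p = 0.2  # illustrative success probability
skewness = standardized_moment(p, 3)
excess_kurtosis = standardized_moment(p, 4) - 3  # subtract the normal baseline of 3

# Closed-form expressions from the text, for comparison
skewness_formula = (1 - 2 * p) / math.sqrt(p * (1 - p))
kurtosis_formula = (1 - 6 * p * (1 - p)) / (p * (1 - p))
print(skewness, skewness_formula)
print(excess_kurtosis, kurtosis_formula)
```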
1.2.5 Variance of the Estimator \(\hat{p}\)
The estimator for \(p\) based on \(n\) independent observations \(X_1, X_2, \dots, X_n\) is the sample mean:
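\[ \hat{p}_n = \frac{1}{n}\sum_{i=1}^{n} X_i. \]
Because the observations are independent and each has variance \(p(1 - p)\), the variance of the estimator is: \[ \operatorname{Var}(\hat{p}_n) = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i) = \frac{p(1 - p)}{n}. \]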
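The variance of \(\hat{p}_n\) for independent trials, \(p(1 - p)/n\), can be checked approximately by simulation. This is a sketch using only the standard library; the helper `sample_proportion`, the seed, and the values of `p`, `n`, and `repetitions` are all illustrative assumptions.

```python
import random
import statistics


def sample_proportion(p: float, n: int, rng: random.Random) -> float:
    """Draw n Bernoulli(p) trials and return their sample mean."""
    return sum(1 if rng.random() < p else 0 for _ in range(n)) / n


rng = random.Random(0)  # fixed seed so the run is reproducible
p, n, repetitions = 0.7, 100, 5000  # illustrative values

# Repeat the experiment many times to see how the estimator itself varies
estimates = [sample_proportion(p, n, rng) for _ in range(repetitions)]
empirical_var = statistics.pvariance(estimates)
theoretical_var = p * (1 - p) / n

print(f"empirical variance of the estimator: {empirical_var:.5f}")
print(f"theoretical p(1 - p) / n:            {theoretical_var:.5f}")
```

The two values should agree closely, and the gap typically shrinks as `repetitions` grows.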
1.3 📚 3. Key Characteristics
- Domain: \( x \in \{0, 1\} \).
- Parameter: Single parameter \( p \), where \( 0 \leq p \leq 1 \).
- Support: The distribution is defined on two points: 0 and 1.
- Memoryless: The Bernoulli distribution is not memoryless.
- Special Case:
  - If \( p = 0.5 \), the distribution is symmetric.
  - If \( p \neq 0.5 \), the distribution is skewed.
1.4 🔗 4. Relationship to Other Distributions
Binomial Distribution:
The Bernoulli distribution is a special case of the Binomial distribution with \( n = 1 \): \[ \text{Bernoulli}(p) = \text{Binomial}(n=1, p). \]
Geometric Distribution:
A geometric random variable models the number of independent Bernoulli trials needed to obtain the first success.
Beta Distribution (Conjugate Prior):
In Bayesian statistics, the Beta distribution is the conjugate prior for the Bernoulli likelihood.
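Two of these relationships can be illustrated in a few lines. The sketch below checks that a Binomial PMF with \(n = 1\) reproduces the Bernoulli probabilities, and applies the standard conjugate update in which a Beta(\(\alpha, \beta\)) prior combined with \(s\) successes and \(f\) failures gives a Beta(\(\alpha + s, \beta + f\)) posterior. The function names, the uniform Beta(1, 1) prior, and the observations are hypothetical choices for illustration.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(K = k) for K ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def beta_posterior(alpha: float, beta: float, data: list[int]) -> tuple[float, float]:
    """Update a Beta(alpha, beta) prior with 0/1 observations."""
    successes = sum(data)
    failures = len(data) - successes
    return alpha + successes, beta + failures


p = 0.7  # illustrative success probability

# Binomial with n = 1 gives P(K = 1) = p and P(K = 0) = 1 - p
print(binomial_pmf(1, 1, p), binomial_pmf(0, 1, p))

# Uniform Beta(1, 1) prior updated with made-up 0/1 observations
observations = [1, 0, 1, 1, 0, 1, 1, 1]
a_post, b_post = beta_posterior(1.0, 1.0, observations)
posterior_mean = a_post / (a_post + b_post)
print(a_post, b_post, posterior_mean)
```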
1.5 🌐 5. Applications of the Bernoulli Distribution
- Modeling Binary Outcomes:
  - Coin flips (Heads/Tails)
  - Pass/Fail tests
  - Yes/No survey responses
  - On/Off states in systems
- Machine Learning:
  - Logistic regression for binary classification.
  - Bernoulli Naive Bayes classifiers.
- Statistical Inference:
  - Estimating proportions (e.g., percentage of people supporting a policy).
1.6 💻 6. Python Example: Simulating a Bernoulli Random Variable
import numpy as np
# Parameters
p = 0.7 # Probability of success
n = 1000 # Number of trials
# Simulate Bernoulli trials
data = np.random.binomial(n=1, p=p, size=n)
# Estimating p
p_estimate = np.mean(data)
print(f"True probability p: {p}")
print(f"Estimated probability p̂: {p_estimate:.4f}")