What is statistics? – Binomial Distribution

Discrete Distribution: Theoretical Definition
A Discrete Distribution is a probability distribution of a random variable that can take only separate, countable values, typically integers such as 0, 1, 2. That is, it describes random variables that are not continuous.

Probability Mass Function (pmf): Theoretical Definition
A Probability Mass Function (pmf) defines a discrete probability distribution: it assigns a probability to each value the random variable can take, and these probabilities sum to 1. The values must be discrete, such as the number of tosses of a die: 2 tosses are possible, 2.5 tosses are not.

The Graph Figure below shows three events at distinct values of a random variable. The first event has a probability of 0.4 (40%), the second 0.2 (20%), and the third 0.4 (40%). Together they sum to 0.4+0.2+0.4=1, or 100%.

[Figure: Probability mass function (pmf)]
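The requirement that the probabilities sum to 1 can be checked with a few lines of Python. This is only a minimal sketch: the values 1, 2 and 3 assigned to the three events and the helper name `is_valid_pmf` are assumptions made for illustration, since the text gives only the probabilities.

```python
# Hypothetical pmf: the keys 1, 2, 3 are assumed labels for the three events;
# the probabilities 0.4, 0.2, 0.4 are the ones described in the text.
pmf = {1: 0.4, 2: 0.2, 3: 0.4}


def is_valid_pmf(pmf, tol=1e-9):
    """Return True if every probability lies in [0, 1] and they sum to 1."""
    in_range = all(0.0 <= prob <= 1.0 for prob in pmf.values())
    # Compare with a tolerance because floating-point sums are rarely exact.
    return in_range and abs(sum(pmf.values()) - 1.0) <= tol


print(is_valid_pmf(pmf))
```

A table of probabilities that does not sum to 1, such as 0.5 and 0.2, would fail this check and so would not be a pmf.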

Binomial Distribution: Theoretical Definition
A Binomial Distribution (“Bi” refers to the two possible outcomes of each trial: yes/no, success/failure) is a Discrete Probability Distribution. It describes the number of successes in a fixed number of independent trials that all have the same probability of success. The probabilities of the two outcomes of each trial sum to 1, and the number of successes can only be an integer. In the special case of a single trial (n=1), it is called the Bernoulli Distribution.

Binomial Distribution: Statistical Definition
If a random variable X follows the Binomial Distribution (B) with n trials and success probability p, this is written as X\sim B(n,p). It is statistically defined by the following Probability Mass Function (pmf):

f(k;n,p)=Pr(X=k)=\binom{n}{k}p^k(1-p)^{(n-k)}

Explanation of Statistical Symbols
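The n is the total number of trials.
The p is the probability of success in each single trial, so 1-p is the probability of failure.
The X=k means that exactly k of the n trials are successes, with k=0,1,2,…,n.
The \binom{n}{k}=\frac{n!}{k!(n-k)!} is the binomial coefficient, the number of different ways to choose which k of the n trials are the successes.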

Example: The expression “\binom{3}{1}” equals 3, because there are three ways to choose which one of the three trials is the success: the first, the second, or the third.
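The pmf formula translates almost directly into code. The sketch below uses `math.comb` from the Python standard library (available from Python 3.8) for the binomial coefficient; the function name `binomial_pmf` is a name chosen here for illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Pr(X = k) for X ~ B(n, p): exactly k successes in n trials."""
    # Outside the range 0..n the event is impossible.
    if not 0 <= k <= n:
        return 0.0
    # Number of ways to place the k successes, times the probability
    # of any one particular arrangement.
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# The binomial coefficient from the example: 3 ways to place 1 success in 3 trials.
print(comb(3, 1))  # 3
```

Calling `binomial_pmf(k, n, p)` with integer k and n then gives the probability of exactly k successes.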

Binomial Distribution: Example
Let’s say that the probability of someone being ill in any given year is p=0.4, according to his/her age, and that the years are independent of each other. What is the probability that this person is ill in exactly k of the following three years, for X=k=1,2,3? Note that n=3.

By replacing the formula symbols with the corresponding numbers for k=1, 2 and 3, we get:

Pr(X=1)=f(1)=\binom{3}{1}0.4^1(1-0.4)^{(3-1)} and thus:
f(1)=3*0.4*(0.6^2)=3*0.4*0.36=0.432 or 43.2% for being ill in exactly 1 of the 3 years

Pr(X=2)=f(2)=\binom{3}{2}0.4^2(1-0.4)^{(3-2)} and thus:
f(2)=3*0.16*(0.6^1)=3*0.16*0.6=0.288 or 28.8% for being ill in exactly 2 of the 3 years

Pr(X=3)=f(3)=\binom{3}{3}0.4^3(1-0.4)^{(3-3)} and thus:
f(3)=1*0.064*(0.6^0)=0.064*1=0.064 or 6.4% for being ill in all 3 years

The remaining case, k=0 (no illness in any of the three years), has f(0)=0.6^3=0.216 or 21.6%, so the four probabilities sum to 1.
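The arithmetic of this example can be checked numerically. The following sketch evaluates the pmf, including the binomial coefficient \binom{n}{k}, for every k from 0 to 3 with n=3 and p=0.4; the variable names are chosen here for illustration.

```python
from math import comb

n, p = 3, 0.4  # three years, probability 0.4 of being ill in each year

# Pr(X = k) for every possible number of ill years k = 0, 1, 2, 3.
probs = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

for k, prob in probs.items():
    print(f"Pr(X={k}) = {prob:.3f}")
print(f"total = {sum(probs.values()):.3f}")

# Output:
# Pr(X=0) = 0.216
# Pr(X=1) = 0.432
# Pr(X=2) = 0.288
# Pr(X=3) = 0.064
# total = 1.000
```

The total of 1.000 confirms that the four outcomes cover every possibility.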

Binomial Distribution: Cumulative Distribution Function
The Cumulative Distribution Function (cdf) of the Binomial Distribution is the following one:

F(k;n,p)=Pr(X\leq k)=\sum_{i=0}^{\left\lfloor k \right \rfloor }\binom{n}{i}p^i(1-p)^{n-i}

Explanation of Statistical Symbols
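The \left\lfloor k \right \rfloor is the floor of k, that is, the greatest integer less than or equal to k.
The i is the summation index; it takes the values 0,1,2,3… up to \left\lfloor k \right \rfloor.

That is, the cdf is the sum of the Probability Mass Function (pmf) values for all outcomes from 0 up to k.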
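As a minimal sketch, the cdf can be computed by summing pmf terms up to the floor of k. The function name `binomial_cdf` is chosen here for illustration, and capping the sum at n is an implementation choice that makes the result equal to 1 for any k at or above n.

```python
from math import comb, floor


def binomial_cdf(k, n, p):
    """Pr(X <= k) for X ~ B(n, p); k may be any real number."""
    # Sum only up to floor(k), and never beyond n, the largest possible value.
    upper = min(floor(k), n)
    # For k < 0 the range is empty and the sum is 0.
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(upper + 1))


# Because of the floor, non-integer k gives the same result as the integer below it.
print(binomial_cdf(1.7, 3, 0.4) == binomial_cdf(1, 3, 0.4))  # True
```

With n=3 and p=0.4, `binomial_cdf(1, 3, 0.4)` adds the pmf values for k=0 and k=1, which is 0.216+0.432=0.648.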



This Graph figure shows the pmf and cdf of the Binomial Distribution for p=0.40 or 40% of 100 events. Note that it is a Discrete Distribution; therefore, the lines were drawn for illustration only. Strictly, only dots should be drawn at the integer values. That is, only values such as 1 and 2 exist, not values such as 1.3 or 1.4.
[Figure: Probability mass function (pmf) and cumulative distribution function (cdf) of the Binomial Distribution]