Biostatistics with R

Exponential distribution

When we studied the Poisson distribution, we computed the discrete probability of observing \(\small{x}\) events per unit time interval as,

\(~~~~~\small{P(x,\lambda) = \dfrac{\lambda^x e^{-\lambda}}{x!}~~~~}\) where \(\small{\lambda}\) is the mean number of events in the same unit interval.


If the number of events occurring in a time interval is a random variable, then the time gap (waiting time) between successive events is also a random variable. This time gap is a continuous variable, since it can take any positive real value. We are especially interested in the distribution of the waiting time until the first event occurs in a Poisson process. For example, if on average we expect to observe 4 events per unit time in a Poisson process, how long do we wait until we observe the next event? We will show that this waiting time follows an exponential probability density, which is a distribution of continuous type.



Let \(\small{\lambda}\) be the mean number of events per unit time interval while observing a Poisson process, and let \(\small{W}\) be the wait time until the first event occurs. We can derive an expression for the probability distribution of the continuous variable \(\small{W}\) as follows:


Let us compute the cumulative probability of observing a waiting time \(\small{W \leq w }\) for the first event:

The probability that the first event occurs within a waiting time \(\small{w}\) is given by,
\(\small{P(W \leq w)~=~1 - P(W \gt w) }\)
\(\small{~~~~~~~~~~~~~~~~=~1 - P( no~events~in~the~time~interval~[0,w] )}\)
Since the mean number of events in a unit interval is \(\small{\lambda}\), the mean number of events in an interval of length \(\small{w}\) is \(\small{\lambda w}\). According to the Poisson distribution, the probability of no event occurring in this interval is \(\small{\dfrac{(\lambda w)^0 e^{-\lambda w}}{0!} = e^{-\lambda w}}\). Therefore we write,
\(~~~~~~~~~~~~\small{ P(W \leq w )~=~1 - e^{-\lambda w} }\)
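
This step can be checked numerically in R. The short sketch below assumes illustrative values of \(\small{\lambda = 4}\) events per unit time and \(\small{w = 0.5}\) (these numbers are not fixed by the derivation). It relies on the fact that the R function pexp() takes the rate \(\small{\lambda}\) as its second argument.

```r
lambda <- 4    # assumed mean number of events per unit time
w <- 0.5       # assumed waiting time, in the same time units

# Poisson probability of zero events in an interval of length w (mean = lambda * w)
p_no_event <- dpois(0, lambda * w)

# cumulative probability P(W <= w): from the derivation and from pexp()
cdf_manual  <- 1 - p_no_event
cdf_builtin <- pexp(w, rate = lambda)

print(c(p_no_event, exp(-lambda * w)))   # the two values should agree
print(c(cdf_manual, cdf_builtin))        # the two values should agree
```

Both printed pairs should agree, which is what the expression \(\small{1 - e^{-\lambda w}}\) predicts.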

The cumulative distribution function is obtained by integrating the probability density function (pdf). Therefore, we can get the pdf of the distribution of waiting time by differentiating the cumulative function with respect to \(\small{w}\).
\(\small{ P_e(w,\lambda) = \dfrac{d}{dw}(P(W \leq w ) ) = \lambda e^{-w \lambda} }\)

Thus the pdf of the waiting time \(\small{w}\) until the first event follows an exponential distribution \( \small{ \lambda e^{-w \lambda}}\), where \(\small{\lambda}\) is the mean number of events occurring in a unit time interval.
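
As a rough numerical check of the differentiation, we can compare a finite-difference slope of the cumulative function with the density. The values of lambda, w and the step size h below are arbitrary choices made only for illustration.

```r
lambda <- 4      # assumed rate (events per unit time)
w <- 0.3         # assumed point at which the density is evaluated
h <- 1e-6        # small step for the central difference

# numerical derivative of the cumulative function P(W <= w)
slope <- (pexp(w + h, rate = lambda) - pexp(w - h, rate = lambda)) / (2 * h)

pdf_formula <- lambda * exp(-lambda * w)   # density from the derivation
pdf_builtin <- dexp(w, rate = lambda)      # density from R

print(c(slope, pdf_formula, pdf_builtin))  # all three should be nearly equal
```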

We define a parameter \(\small{\theta} = \dfrac{1}{\lambda}\) to write the exponential distribution as,

\( ~~~~~~~~~\small{P_e(x,\theta) = \dfrac{1}{\theta} \large{e}^{-\frac{x}{\theta}}~~~~~~~ }\) where \(\small{~~\theta \gt 0 }\)




Thus, in a Poisson process with \(\small{\lambda}\) events per unit time, the waiting time \(\small{x}\) for the first event follows an exponential distribution with parameter \(\small{\theta = \dfrac{1}{\lambda} }\).



Important Note : Since the exponential distribution is continuous, \(\small{P_e(w,\lambda)dw}\) gives the probability of having a waiting time within a small interval \(\small{dw }\) around \(\small{w}\).
Thus the probability of observing a waiting time between \(\small{t_1}\) and \(\small{t_2}\) is given by \(\small{\displaystyle \int_{t_1}^{t_2} P_e(w,\lambda)dw }\)
Therefore, it is not meaningful to ask for the probability of a "waiting time of exactly 5 minutes until the first event occurs", since the probability of any single exact value is zero. We always speak of "waiting less than 5 minutes until the first event occurs" or "waiting more than 5 minutes until the first event occurs".
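
The interval probability can be computed in R in two equivalent ways, either by integrating the density or by taking a difference of cumulative probabilities. The sketch below assumes a rate of 0.2 events per minute and the limits t1 = 2 and t2 = 5 minutes, chosen only for illustration.

```r
lambda <- 0.2    # assumed rate: events per minute
t1 <- 2          # assumed lower limit of the waiting time (minutes)
t2 <- 5          # assumed upper limit of the waiting time (minutes)

# probability of a waiting time between t1 and t2 by numerical integration of the pdf
p_integral <- integrate(dexp, lower = t1, upper = t2, rate = lambda)$value

# the same probability as a difference of cumulative probabilities
p_cdf <- pexp(t2, rate = lambda) - pexp(t1, rate = lambda)

print(c(p_integral, p_cdf))   # the two values should agree

# an interval of zero width (waiting "exactly" 5 minutes) has zero probability
p_exact <- pexp(5, rate = lambda) - pexp(5, rate = lambda)
print(p_exact)
```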



The mean and variance of the exponential distribution are obtained by,

\(~~~~~~~~~~~~~\small{mean = \mu = \displaystyle \int_{0}^{\infty} x P(x) dx = \displaystyle \int_{0}^{\infty} x \dfrac{1}{\theta} {\large{e}^{-\frac{x}{\theta}}} dx = \theta }\)

\(~~~~~~~~~~~~~\small{variance = \sigma^2 = \displaystyle \int_{0}^{\infty} (x-\mu)^2 \dfrac{1}{\theta} {\large{e}^{-\frac{x}{\theta}}} dx = \theta^2 }\)
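
These two integrals can be evaluated numerically as a check. The sketch below assumes \(\small{\theta = 5}\) purely for illustration; the helper function p_e() is defined here and is not part of R.

```r
theta <- 5    # assumed mean waiting time

# exponential pdf written in terms of theta (helper defined for this example)
p_e <- function(x) (1 / theta) * exp(-x / theta)

# mean: integral of x * pdf over [0, infinity)
mean_num <- integrate(function(x) x * p_e(x), lower = 0, upper = Inf)$value

# variance: integral of (x - mean)^2 * pdf over [0, infinity)
var_num <- integrate(function(x) (x - mean_num)^2 * p_e(x), lower = 0, upper = Inf)$value

print(c(mean_num, theta))      # numerical mean compared with theta
print(c(var_num, theta^2))     # numerical variance compared with theta^2
```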









So, if \(\small{\lambda}\) is the mean number of events in a unit interval, then \(\small{\theta = \dfrac{1}{\lambda}}\) is the mean waiting time for the first event. Thus if 10 events occur per minute on average, the mean waiting time until the first event is one tenth of a minute.
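
A quick simulation illustrates this. The sketch below draws a large number of waiting times with rexp() for a rate of 10 events per minute; the sample size and the random seed are arbitrary choices.

```r
set.seed(1)       # arbitrary seed, for reproducibility only
lambda <- 10      # events per minute
n <- 100000       # assumed number of simulated waiting times

# simulate n waiting times (in minutes) until the first event
waits <- rexp(n, rate = lambda)

print(mean(waits))   # should be close to theta = 1/lambda = 0.1
print(var(waits))    # should be close to theta^2 = 0.01
```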

The plot of the exponential distribution



R scripts

R provides functions for computing the exponential distribution with probability density \(\small{P_e(x) = \lambda {\large e^{-\lambda x}} = \dfrac{1}{\theta} {\large e^{-\frac{x}{\theta}}} }\). Note that these functions take the rate \(\small{\lambda = \dfrac{1}{\theta}}\) as their second argument. This is the mean rate of the Poisson process, and not the mean waiting time \(\small{\theta}\).


 dexp(x, rate) --------------> returns the exponential probability density for a given x value

 pexp(x, rate) --------------> returns the cumulative probability from 0 up to x

 qexp(p, rate) --------------> returns the x value at which the cumulative probability is p

 rexp(n, rate) --------------> returns n random numbers in the range [0 , infinity) from an
                                   exponential distribution with the given rate parameter.



The R script below demonstrates the usage of the above mentioned functions:


##### Using R library functions for the exponential distribution

## the second argument of these functions is the rate (lambda = 1/theta)

yx = dexp(3, 0.5)
yx = round(yx, digits=4)
print(paste("Probability density at x=3 for rate = 1/2 is :", yx))

px = pexp(3, 0.5)
px = round(px, digits=4)
print(paste("Cumulative probability upto x=3 for rate = 1/2 is :", px))

qx = qexp(0.935, 0.5)
qx = round(qx, digits=4)
print(paste("Value of x upto which cumulative probability is 0.935, for rate = 1/2 is :", qx))

rx = rexp(4, 0.5)
rx = round(rx, digits=4)
print("Four random deviates from exponential distribution with rate = 1/2 :")
print(rx)


Executing the above code prints the following lines on the screen (the random deviates will differ from run to run):


[1] "Probability density at x=3 for rate = 1/2 is : 0.1116"
[1] "Cumulative probability upto x=3 for rate = 1/2 is : 0.7769"
[1] "Value of x upto which cumulative probability is 0.935, for rate = 1/2 is : 5.4667"
[1] "Four random deviates from exponential distribution with rate = 1/2 :"
[1] 0.0360 1.4116 2.3654 1.9593

Example-1 : Telephone calls arriving at the enquiry counter of a hospital can be assumed to follow a Poisson process with a mean rate of 12 per hour. What is the probability that the person at the counter waits more than 10 minutes for the next call?

We will convert the call arrival rate into minutes. 12 calls per hour translates into,

calls per minute = \(\small{\lambda = \dfrac{12}{60} = 0.2/min }\)

mean waiting time for next call = \(\small{\theta = \dfrac{1}{\lambda} = \dfrac{1}{0.2} = 5 }\) min

Therefore, the probability for waiting more than 10 min for next call is given by,

\(\small{P_e(x \gt 10,\lambda=0.2) = \displaystyle \int_{10}^\infty P_e(x,\theta=5)dx = \displaystyle \int_{10}^\infty \dfrac{1}{\theta} {\large e}^{\frac{-x}{\theta}}dx = {\large e}^{\frac{-10}{5}} = 0.1353 }\)
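
The same result can be obtained in R. In the sketch below, pexp() is called with the rate \(\small{\lambda = 0.2}\) per minute and with lower.tail = FALSE, so that it returns the upper tail probability \(\small{P(x \gt 10)}\) instead of the cumulative probability.

```r
lambda <- 12 / 60      # call arrival rate per minute
theta <- 1 / lambda    # mean waiting time in minutes

# probability of waiting more than 10 minutes for the next call
p_wait <- pexp(10, rate = lambda, lower.tail = FALSE)

# the same probability from the closed-form expression exp(-x/theta)
p_formula <- exp(-10 / theta)

print(round(c(p_wait, p_formula), 4))   # both values round to 0.1353
```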