# ch4: Discrete random variables

## Random variable

A random variable is called **discrete** if it can take only a finite or countably infinite number of values; in other words, $$X$$ is **discrete** if its range $$S_{X}$$ is denumerable.

Range

The range of a variable (not necessarily random) can be:

- Finite
- Countably infinite
- Not countably infinite

## The probability function of a discrete random variable

If $$X$$ is a discrete random variable with range $$S_{X}$$, then the **probability function** of $$X$$ is the function $$x \mapsto P(X=x)$$, defined for all $$x \in S_{X}$$.

Properties Probability function

- $$P(X=x) \geq 0$$ for all $$x \in S_{X}$$
- $$\sum_{x \in S_{X}} P(X=x)=1$$

The above also means that a function which satisfies both properties is a probability function.

The probabilities $$P(X \in B)$$ for each $$B \subset S_{X}$$ are, all together, called the **(probability) distribution** of the random variable $$X$$. If all values in the range have the same probability, $$X$$ has a **homogeneous** (discrete uniform) distribution.

Geometric series

For any $$a$$ and $$|r|<1$$:

$$\sum_{k=0}^{\infty} a r^{k}=\frac{a}{1-r}$$
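The geometric series identity $$\sum_{k=0}^{\infty} a r^{k}=\frac{a}{1-r}$$ (for $$|r|<1$$) can be checked numerically; the values `a = 3, r = 0.5` below are purely illustrative:

```python
# Sketch: partial sums of a geometric series approach a / (1 - r) for |r| < 1.
a, r = 3.0, 0.5  # illustrative values
partial = sum(a * r**k for k in range(100))
limit = a / (1 - r)
print(partial, limit)  # both are approximately 6.0
```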

## The expectation of a discrete random variable

The **expectation** or **expected value** of a discrete random variable $$X$$ is given by:

$$E(X)=\sum_{x \in S_{X}} x \cdot P(X=x)$$

provided that this summation is absolutely convergent (that is: $$\sum_{x \in S_{X}}|x| \cdot P(X=x)<\infty$$).

TIP

Expectation

If the summation converges (absolutely), then the expected value exists; if the summation does not converge (absolutely), then the expected value does not exist.
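As a small worked example, the expectation of a fair six-sided die (a hypothetical choice for illustration) can be computed directly from the definition $$E(X)=\sum_{x \in S_{X}} x \cdot P(X=x)$$:

```python
from fractions import Fraction

# Illustrative example: a fair six-sided die.
# S_X = {1, ..., 6}, each value with probability 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# E(X) = sum over the range of x * P(X = x)
expectation = sum(x * p for x, p in pmf.items())
print(expectation)  # 7/2
```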

letter | description |
---|---|
$$\mu$$ | (Greek letter m, for mean) sometimes is used instead of $$E(X)$$ |
$$\bar{x}$$ | standing for the sample mean |

## Functions of a discrete random variable; variance

Building further on the expectation we can define multiple important properties:

Functions

If $$Y=g(X)$$ is a function of the discrete random variable $$X$$, then:

$$E(Y)=E(g(X))=\sum_{x \in S_{X}} g(x) \cdot P(X=x)$$

So if $$Y$$ is a linear function of $$X$$, that is $$Y=aX+b$$, then:

$$E(aX+b)=a \cdot E(X)+b$$

The average can be considered a **measure for the center** of a distribution

Variance and standard deviation

Notation | Name | Definition |
---|---|---|
$$\text{var}(X)$$ (or $$\sigma^{2}$$) | The variance of $$X$$ | $$\text{var}(X)=E(X-\mu)^{2}$$ |
$$\sigma_{X}$$ (or $$\sigma$$) | The standard deviation of $$X$$ | is the square root of the variance: $$\sigma_{X}=\sqrt{\text{var}(X)}$$ |

Properties of variance and standard deviation

- $$\text{var}(X)=E(X^{2})-\mu^{2}$$ (the computational formula)
- if $$Y=aX+b$$, then $$\text{var}(aX+b)=a^{2} \cdot \text{var}(X)$$ and $$\sigma_{aX+b}=|a| \cdot \sigma_{X}$$
- $$\text{var}(X) \geq 0$$, so if $$X$$ is not degenerate, we have $$\text{var}(X)>0$$ and $$\sigma_{X}>0$$
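The computational formula $$\text{var}(X)=E(X^{2})-\mu^{2}$$ can be verified against the definition $$E(X-\mu)^{2}$$ on a small hand-made pmf (the probabilities below are illustrative):

```python
from fractions import Fraction

# Illustrative pmf; probabilities sum to 1.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

mu = sum(x * p for x, p in pmf.items())                     # E(X)
var_def = sum((x - mu) ** 2 * p for x, p in pmf.items())    # E(X - mu)^2
var_comp = sum(x * x * p for x, p in pmf.items()) - mu**2   # E(X^2) - mu^2

print(mu, var_def, var_comp)  # 1 1/2 1/2
```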

### Chebyshev's inequality and the Empirical rule

Formula

For any real number $$c>0$$:

$$P(|X-\mu| \geq c \sigma) \leq \frac{1}{c^{2}}$$

Valid for **any** random variable $$X$$ with finite $$\mu$$ and $$\sigma$$; it gives an **upper bound of the probability** of values **outside the interval** $$(\mu-c\sigma, \mu+c\sigma)$$.

In essence Chebyshev's inequality guarantees that, for a wide class of probability distributions, no more than a certain fraction of values can be more than a certain distance from the mean. Using the inequality and the standard deviation, a standard interval can be built:

Empirical rule

If the graph of the distribution of $$X$$ shows a bell shape, then the approximate probability of $$X$$ having a value within the interval

- $$(\mu-\sigma, \mu+\sigma)$$ is 68%
- $$(\mu-2\sigma, \mu+2\sigma)$$ is 95%
- $$(\mu-3\sigma, \mu+3\sigma)$$ is 99.7%

Chebyshev's rule is valid for any distribution, but the so-called Empirical rule is only valid for distributions that are (approximately) symmetric and bell (hill) shaped.
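As a quick numerical sanity check, Chebyshev's bound $$P(|X-\mu| \geq c\sigma) \leq 1/c^{2}$$ can be verified on a small hand-made distribution (the pmf and the choice $$c = 1.5$$ below are illustrative):

```python
import math

# Illustrative pmf; probabilities sum to 1.
pmf = {0: 0.1, 1: 0.2, 2: 0.4, 3: 0.2, 4: 0.1}

mu = sum(x * p for x, p in pmf.items())
var = sum((x - mu) ** 2 * p for x, p in pmf.items())
sigma = math.sqrt(var)

c = 1.5
# P(|X - mu| >= c * sigma): total probability outside the interval
outside = sum(p for x, p in pmf.items() if abs(x - mu) >= c * sigma)
print(outside, 1 / c**2)  # the actual probability is below the bound
```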

## The binomial, hypergeometric, geometric and Poisson distribution

### The Binomial distribution

Definition

If the probability function of $$X$$ is given by:

$$P(X=k)=\binom{n}{k} p^{k}(1-p)^{n-k}, \quad k=0,1, \ldots, n$$

then $$X$$ has a **binomial distribution** with parameters $$n$$ and $$p$$.

Short notations: $$X \sim B(n, p)$$

One can apply the binomial distribution as a probability model of real-life situations, whenever there is a series of $$n$$ similar experiments for which the conditions of Bernoulli trials hold, i.e.:

- A phenomenon occurs (or does not occur) at a fixed success rate $$p$$ (or failure rate $$1-p$$)
- Independence of the trials.

If $$X \sim B(n, p)$$, then $$E(X)=np$$ and $$\text{var}(X)=np(1-p)$$.

Special values of n and p, the parameters of the B(n,p)-distribution

- If $$p=1$$ ("success guaranteed"), then $$P(X=n)=1$$ and $$X$$ has a degenerate distribution in $$n$$. Similarly, if $$p=0$$, then $$P(X=0)=1$$ and $$X$$ is degenerate in $$0$$.
- If $$n=1$$, that is, if only one trial is conducted (one shot on the basket, the quality of one product is assessed, etc.), $$X$$ is said to have an **alternative distribution with success probability** $$p$$, which is a $$B(1, p)$$-distribution.

It follows that:

$$E(X)=0 \cdot (1-p)+1 \cdot p=p$$

so:

$$E(X^{2})=0^{2} \cdot (1-p)+1^{2} \cdot p=p$$

And:

$$\text{var}(X)=E(X^{2})-(E(X))^{2}=p-p^{2}=p(1-p)$$

We find: the expectation of a $$B(1, p)$$-distribution is $$p$$ and the variance of a $$B(1, p)$$-distribution is $$p(1-p)$$.
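The formulas $$E(X)=np$$ and $$\text{var}(X)=np(1-p)$$ can be checked by brute force against the binomial probability function; the parameter values `n = 10, p = 0.3` below are illustrative:

```python
import math

def binom_pmf(n, p, k):
    # P(X = k) for X ~ B(n, p)
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3
mean = sum(k * binom_pmf(n, p, k) for k in range(n + 1))
# variance via the computational formula E(X^2) - mu^2
var = sum(k**2 * binom_pmf(n, p, k) for k in range(n + 1)) - mean**2
print(mean, var)  # approximately 3.0 and 2.1, i.e. np and np(1-p)
```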

### The Hypergeometric distribution

Definition

If the probability function of the random variable $$X$$ is given by:

$$P(X=k)=\frac{\binom{R}{k}\binom{N-R}{n-k}}{\binom{N}{n}}$$

then $$X$$ has a **hypergeometric distribution**. It models the number of successes in $$n$$ **random draws without replacement** from a so-called **dichotomous** population: $$N$$ elements which do ($$R$$ of them) or do not ($$N-R$$ of them) have a specific property.

Random draws from a dichotomous population lead to the hypergeometric distribution of the number of “successes” if we draw without replacement; if, on the other hand, the draws are with replacement, we can use the binomial distribution, since in that case the draws are independent.

Other properties

For relatively large $$N$$ (compared to the number of draws $$n$$) the hypergeometric distribution can be approximated by the binomial distribution with success probability $$p=\frac{R}{N}$$, where:

$$E(X)=n \cdot \frac{R}{N} \quad \text{and} \quad \text{var}(X)=n \cdot \frac{R}{N}\left(1-\frac{R}{N}\right) \cdot \frac{N-n}{N-1}$$

Note that the variances of the hypergeometric and binomial distributions under these conditions are almost equal: the correction factor $$\frac{N-n}{N-1}$$ is close to 1 when $$n$$ is small compared to $$N$$.

A (quite strict) rule of thumb for approximating by the binomial distribution is that the sample fraction $$\frac{n}{N}$$ is small.
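The quality of the binomial approximation can be inspected numerically; the population and sample sizes below (`N = 1000, R = 300, n = 10`, so $$n \ll N$$) are illustrative:

```python
import math

def hypergeom_pmf(N, R, n, k):
    # P(X = k): k "successes" in n draws without replacement
    # from N elements, R of which have the property
    return math.comb(R, k) * math.comb(N - R, n - k) / math.comb(N, n)

N, R, n = 1000, 300, 10  # illustrative values; n is small compared to N
p = R / N
# largest pointwise difference with the binomial approximation B(n, R/N)
max_diff = max(
    abs(hypergeom_pmf(N, R, n, k) - math.comb(n, k) * p**k * (1 - p)**(n - k))
    for k in range(n + 1)
)
print(max_diff)  # small: the two probability functions nearly coincide
```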

### The Geometric distribution

Definition

If $$X$$ is the number of independent trials needed until the first success occurs, each trial having success probability $$p$$, then:

$$P(X=k)=(1-p)^{k-1} p, \quad k=1,2,3, \ldots$$

and $$X$$ is said to have a **geometric distribution** with parameter $$p$$.

The following formula is convenient whenever we have to compute a summation of geometric probabilities:

$$P(X>k)=(1-p)^{k}$$

The reasoning is as follows: the probability that we need more than $$k$$ trials equals the probability that the first $$k$$ trials are all failures.
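The tail formula $$P(X>k)=(1-p)^{k}$$ can be confirmed by summing geometric probabilities directly (the success probability `p = 0.25` below is illustrative):

```python
# Sketch: P(X > k) = (1 - p)^k for a geometric distribution.
p = 0.25  # illustrative success probability

def geom_pmf(k):
    # P(X = k): k - 1 failures followed by the first success
    return (1 - p) ** (k - 1) * p

k = 5
tail = sum(geom_pmf(j) for j in range(k + 1, 300))  # P(X > k), truncated sum
closed_form = (1 - p) ** k                           # k failures in a row
print(tail, closed_form)  # both approximately 0.2373
```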

### The Poisson distribution

Definition

If the probability function of $$X$$ is given by:

$$P(X=k)=\frac{\mu^{k}}{k !} e^{-\mu}, \quad k=0,1,2, \ldots$$

then $$X$$ has a **Poisson distribution** with parameter $$\mu$$.

This is a probability function: all probabilities are at least 0 and the sum of all probabilities is 1 (since $$\sum_{k=0}^{\infty} \frac{\mu^{k}}{k !}=e^{\mu}$$).

Poisson probabilities are given in (cumulative) probability tables for $$P(X \leq c)$$.

Other properties

If $$X \sim B(n, p)$$ with large $$n$$ and small $$p$$, then the binomial probabilities can be approximated by Poisson probabilities with parameter $$\mu=np$$:

$$P(X=k)=\binom{n}{k} p^{k}(1-p)^{n-k} \approx \frac{(np)^{k}}{k !} e^{-np}$$

A rule of thumb for applying this approximation is: $$n$$ is large and $$p$$ is small.

These approximations are also applicable in case of "large $$n$$, small $$1-p$$": in that case one approximates the number of failures $$n-X \sim B(n, 1-p)$$ instead.
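Both facts about the Poisson distribution can be checked numerically: that the probabilities sum to 1, and that for large $$n$$ and small $$p$$ the Poisson probabilities with $$\mu=np$$ are close to the binomial ones. The parameter values `n = 200, p = 0.02` below are illustrative:

```python
import math

def poisson_pmf(mu, k):
    # P(X = k) for a Poisson distribution with parameter mu
    return mu**k / math.factorial(k) * math.exp(-mu)

# 1) the Poisson probabilities sum to 1 (truncated sum, tail is negligible)
mu = 4.0
total = sum(poisson_pmf(mu, k) for k in range(100))

# 2) Poisson approximation of B(n, p) with large n, small p
n, p = 200, 0.02  # illustrative values; mu = np = 4
max_diff = max(
    abs(math.comb(n, k) * p**k * (1 - p)**(n - k) - poisson_pmf(n * p, k))
    for k in range(15)
)
print(total, max_diff)  # total is approximately 1; max_diff is small
```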