2022-12-01

Normal distribution

What is the Normal Distribution

The normal distribution (Gaussian distribution) is one of the most widely used probability distributions and describes many natural and social phenomena. It has the following basic properties:

The mean, median, and mode coincide.
The curve is symmetric about the mean, where it reaches its peak.
The standard deviation determines the height of the peak and the width of the distribution.
The x-axis is an asymptote.
The area bounded by the curve and the x-axis is 1.

A typical example is the height of adult males (or of adult females), which is approximately normally distributed.

Probability density function (PDF)

When a univariate random variable $X$ follows a normal distribution with mean $\mu$ and variance $\sigma^2$ , its probability density function (PDF) is given by

f(x) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}}\quad(x \in \mathbb{R})

A normal distribution with mean $\mu$ and variance $\sigma^2$ is written $N(\mu, \sigma^2)$ . The total area under its probability density function is 1. In other words, integrating the probability density function over the entire real line yields 1.
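To make the formula concrete, here is a minimal sketch that writes the density out by hand and compares it with `statistics.NormalDist` from the Python standard library. The function name `normal_pdf` and the parameter values are illustrative assumptions, not part of the original article.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of N(mu, sigma^2) at x, written directly from the formula."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)


# Arbitrary example parameters, chosen only for illustration.
mu, sigma = 170.0, 6.0

# The hand-written formula should agree with the standard library.
reference = NormalDist(mu, sigma)
for x in (150.0, 170.0, 181.5):
    assert math.isclose(normal_pdf(x, mu, sigma), reference.pdf(x), rel_tol=1e-9)

# Symmetry about the mean: f(mu - d) equals f(mu + d).
assert math.isclose(normal_pdf(mu - 4.0, mu, sigma), normal_pdf(mu + 4.0, mu, sigma))
```

The density is largest at $x = \mu$, where it equals $\frac{1}{\sqrt{2\pi}\sigma}$, so a larger $\sigma$ gives a lower and wider curve.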

How to derive the probability density function

Many real-world phenomena have a peak at the mean value, and the probability of occurrence decreases as one moves away from the mean. This bell shape can be expressed by the following function.

f(x) = e^{-x^2}

Figure: graph of $y = e^{-x^2}$

We now generalize this function step by step. First, we make it possible to set an arbitrary mean: the curve shifts to the left or right depending on the value of $\mu$ as follows.

f(x) = e^{-(x - \mu)^2}

Next, to allow the width of the distribution to be set arbitrarily, we transform the formula into the following:

f(x) = e^{-\frac{(x - \mu)^2}{2\sigma^2}}

The width of the distribution can now be controlled by the value of $\sigma$ . Here, $\sigma$ is squared so that the denominator $2\sigma^2$ is always positive regardless of the sign of $\sigma$ . The factor of 2 is included to simplify the results of later integration.

A probability density function must integrate to 1 over the entire real line. Therefore, a constant $c$ is placed in front of the expression to normalize it.

\int^{\infty}_{-\infty} ce^{-\frac{(x - \mu)^2}{2\sigma^2}}dx= 1

Solving the above equation for $c$ gives the following value:

c = \frac{1}{\sqrt{2\pi}\sigma}
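The value of $c$ can be checked numerically. The sketch below integrates the unnormalized function with a simple midpoint rule and compares the reciprocal of the area with $\frac{1}{\sqrt{2\pi}\sigma}$. The helper names, the example parameters, and the truncation of the integral to $\mu \pm 10\sigma$ are assumptions made for illustration.

```python
import math


def unnormalized(x: float, mu: float, sigma: float) -> float:
    """The bell-shaped function before the constant c is applied."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


def integrate_midpoint(f, a: float, b: float, n: int = 20_000) -> float:
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# Arbitrary example parameters.
mu, sigma = 1.5, 2.0

# The tails beyond mu +/- 10*sigma are negligibly small, so a finite range suffices.
area = integrate_midpoint(
    lambda x: unnormalized(x, mu, sigma), mu - 10.0 * sigma, mu + 10.0 * sigma
)

c_numeric = 1.0 / area                               # constant that makes the area 1
c_closed = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)  # closed form from the text

assert math.isclose(c_numeric, c_closed, rel_tol=1e-9)
```

The closed form itself follows from the Gaussian integral $\int^{\infty}_{-\infty} e^{-t^2}dt = \sqrt{\pi}$ after substituting $t = \frac{x-\mu}{\sqrt{2}\sigma}$ , which turns the area into $\sqrt{2}\sigma\sqrt{\pi} = \sqrt{2\pi}\sigma$ .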

Thus, the probability density function of the normal distribution is the following equation:

f(x) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}}

Probability of a normal distribution

For a normal distribution, if we know the mean $\mu$ and standard deviation $\sigma$ , we can compute the probability that the random variable $X$ falls within any given range.

The graph of the normal distribution below shows the range of standard deviations (± $\sigma$ , ±1.96 $\sigma$ , ±2 $\sigma$ ).

Figure: normal distribution curve with the ranges ± $\sigma$ , ±1.96 $\sigma$ , and ±2 $\sigma$ marked

The range of the random variable $X$ and its probability of occurrence are as follows.

| Range of the random variable $X$ | Probability that $X$ falls in the range |
| --- | --- |
| $\mu - \sigma \le X \le \mu + \sigma$ | about 68.3% |
| $\mu - 1.96\sigma \le X \le \mu + 1.96\sigma$ | about 95.0% |
| $\mu - 2\sigma \le X \le \mu + 2\sigma$ | about 95.4% |
| $\mu - 3\sigma \le X \le \mu + 3\sigma$ | about 99.7% |

The value 1.96 $\sigma$ is commonly used in hypothesis testing: it corresponds to a two-sided 5% significance level (equivalently, a 95% confidence level).
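These probabilities can be reproduced with the cumulative distribution function of the standard normal distribution. The following is a small sketch using `statistics.NormalDist` from the Python standard library. The helper name `coverage` is an assumption introduced for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def coverage(k: float) -> float:
    """P(mu - k*sigma <= X <= mu + k*sigma) for a normally distributed X."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1.0, 1.96, 2.0, 3.0):
    print(f"k = {k:<4}  coverage = {coverage(k):.4f}")

# Going the other way: the multiplier that leaves 2.5% in each tail.
z_crit = std_normal.inv_cdf(0.975)
print(f"two-sided 5% critical value = {z_crit:.3f}")
```

Because the coverage depends only on the multiplier $k$ and not on $\mu$ or $\sigma$ , the same figures apply to every normal distribution.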

Standard normal distribution

When the random variable $X$ follows a normal distribution $N(\mu,\sigma^2)$ , $aX+b$ follows a normal distribution $N(a\mu+b,a^2\sigma^2)$ .

Using this property with the transformation $Z=\frac{X-\mu}{\sigma}$ , $Z$ follows a normal distribution with mean 0 and variance 1. This transformation is called standardization of the normal distribution, and the normal distribution with mean 0 and variance 1 is called the standard normal distribution.
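As a quick illustration of standardization, the sketch below converts a value $x$ to its z-score and checks that the probability computed from $N(\mu,\sigma^2)$ matches the one computed from the standard normal distribution. The function name `standardize` and the numbers used are hypothetical examples.

```python
import math
from statistics import NormalDist


def standardize(x: float, mu: float, sigma: float) -> float:
    """Return the z-score (x - mu) / sigma."""
    return (x - mu) / sigma


# Hypothetical example: X follows N(50, 10^2) and we look at x = 65.
mu, sigma = 50.0, 10.0
x = 65.0

z = standardize(x, mu, sigma)            # 1.5 standard deviations above the mean
p_direct = NormalDist(mu, sigma).cdf(x)  # P(X <= 65) from N(50, 10^2)
p_standard = NormalDist().cdf(z)         # P(Z <= 1.5) from N(0, 1)

assert math.isclose(p_direct, p_standard, rel_tol=1e-12)
```

This is why a single table of the standard normal distribution is enough to look up probabilities for any normal distribution.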

Reproductive property of the normal distribution

The reproductive property of the normal distribution states that when random variables $X$ and $Y$ independently follow normal distributions $N(\mu_1,\sigma^2_1)$ and $N(\mu_2,\sigma^2_2)$ respectively, their sum $X + Y$ follows the normal distribution $N(\mu_1+\mu_2,\sigma^2_1+\sigma^2_2)$ .

As an example, assume that the mutually independent random variables $X$ and $Y$ follow $N(2, 2^2)$ and $N(5, 3^2)$ , respectively, and find the probability distribution that the random variable $3X + 2Y$ follows.
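One way to work through this example: by the linear transformation rule, $3X$ follows $N(6, 6^2)$ and $2Y$ follows $N(10, 6^2)$ . By the reproductive property, their sum $3X + 2Y$ follows $N(16, 72)$ . The sketch below checks this by hand and against `statistics.NormalDist`, which supports scaling by a constant and adding independent normal variables. The variable names are illustrative assumptions.

```python
import math
from statistics import NormalDist

# NormalDist takes the standard deviation, not the variance.
X = NormalDist(mu=2, sigma=2)  # N(2, 2^2)
Y = NormalDist(mu=5, sigma=3)  # N(5, 3^2)

# By hand: means scale linearly, variances scale with the square of the coefficient.
mean_by_hand = 3 * 2 + 2 * 5                     # 16
variance_by_hand = 3 ** 2 * 2 ** 2 + 2 ** 2 * 3 ** 2  # 36 + 36 = 72

# Addition of two NormalDist objects assumes the variables are independent.
W = 3 * X + 2 * Y

assert math.isclose(W.mean, mean_by_hand)
assert math.isclose(W.variance, variance_by_hand)
```

Note that the variances add, not the standard deviations: the standard deviation of $3X + 2Y$ is $\sqrt{72} \approx 8.49$ , not $6 + 6 = 12$ .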

Ryusei Kakujo