What is the Normal Distribution
The normal distribution (Gaussian distribution) is one of the most widely used probability distributions and describes many natural and social phenomena. It has the following basic properties:
- The mean, median, and mode coincide.
- The curve is symmetric about the mean, where it reaches its peak.
- The standard deviation determines the height of the peak and the width of the distribution: a larger standard deviation gives a lower, wider curve.
- The x-axis is an asymptote.
- The area bounded by the curve and the x-axis is 1.
An example of a quantity that approximately follows a normal distribution is the height of adult males (or of adult females).
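The properties listed above can be checked numerically. The following is a minimal sketch using `scipy.stats.norm`; the values of `mu` and `sigma` are illustrative assumptions and are not taken from real height data.

```python
import numpy as np
from scipy import integrate, stats

# Illustrative parameters (assumed values, not measured data)
mu, sigma = 170.0, 6.0
dist = stats.norm(loc=mu, scale=sigma)

# Mean and median coincide; the mode is the x value where the density peaks
x = np.linspace(mu - 5 * sigma, mu + 5 * sigma, 10001)
mode = x[np.argmax(dist.pdf(x))]
print(dist.mean(), dist.median(), mode)

# Symmetry: the density is the same at equal distances from the mean
symmetric = np.isclose(dist.pdf(mu - 4.0), dist.pdf(mu + 4.0))
print(symmetric)

# Area under the curve, integrated numerically over mu ± 10 sigma
area, _ = integrate.quad(dist.pdf, mu - 10 * sigma, mu + 10 * sigma)
print(area)
```

The integration range is truncated at ten standard deviations because the density beyond that point is negligibly small.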
Probability density function (PDF)
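When a univariate random variable $X$ follows a normal distribution with mean $\mu$ and variance $\sigma^2$, its probability density function is

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

A normal distribution is expressed as follows:

$$X \sim N(\mu, \sigma^2)$$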
How to derive the probability density function
Many phenomena have a peak at the mean value, and the probability of occurrence decreases as one moves away from the mean value. Such phenomena can be expressed by the following function.

$$y = e^{-x^2}$$

We will now generalize this function. First, we make it possible to set an arbitrary mean value. The peak can be translated to the left or right depending on the value of $\mu$:

$$y = e^{-(x-\mu)^2}$$

Next, to allow the width of the distribution to be set arbitrarily, we transform the formula into the following:

$$y = e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

The width of the distribution can now be controlled by the value of $\sigma$.
A probability density function must integrate to 1 over the whole real line. Therefore, a constant $c$ is introduced so that the following holds:

$$\int_{-\infty}^{\infty} c\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx = 1$$

Computing the above equation, the constant $c$ is

$$c = \frac{1}{\sqrt{2\pi\sigma^2}}$$

Thus, the probability density function of the normal distribution is the following equation:

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
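The normalizing constant can also be checked numerically. The sketch below integrates the unnormalized bell curve $e^{-(x-\mu)^2/(2\sigma^2)}$, takes the reciprocal of the area, and compares it with the closed form $1/\sqrt{2\pi\sigma^2}$. The function names `unnormalized` and `normal_pdf` and the parameter values are assumptions made for illustration.

```python
import numpy as np
from scipy import integrate, stats

mu, sigma = 1.0, 2.0  # illustrative values (assumed)

def unnormalized(x):
    """Bell curve before normalization: exp(-(x - mu)^2 / (2 sigma^2))."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2))

# The constant c is the reciprocal of the area under the unnormalized curve
area, _ = integrate.quad(unnormalized, -np.inf, np.inf)
c_numeric = 1.0 / area
c_closed = 1.0 / np.sqrt(2 * np.pi * sigma ** 2)
print(c_numeric, c_closed)

def normal_pdf(x):
    """Normalized density built from the derived constant."""
    return c_closed * unnormalized(x)

# Compare the hand-built density with scipy's implementation
xs = np.linspace(-5, 7, 13)
matches = np.allclose(normal_pdf(xs), stats.norm.pdf(xs, loc=mu, scale=sigma))
print(matches)
```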
Probability of a normal distribution
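For a normal distribution, if we know the mean $\mu$ and the standard deviation $\sigma$, we can calculate the probability that the random variable $X$ falls within a given range.
The graph of the normal distribution below shows the range of standard deviations ($\pm\sigma$, $\pm 1.96\sigma$, $\pm 2\sigma$).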
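| The range of random variable $X$ | Probability of occurrence of $X$ |
|---|---|
| $\mu - \sigma \le X \le \mu + \sigma$ | 68% of total |
| $\mu - 1.96\sigma \le X \le \mu + 1.96\sigma$ | 95% of total |
| $\mu - 2\sigma \le X \le \mu + 2\sigma$ | 95.5% of total |
| $\mu - 3\sigma \le X \le \mu + 3\sigma$ | 99.7% of total |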
The commonly used $1.96\sigma$ is the half-width of the range around the mean that contains 95% of the total probability.
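These coverage probabilities can be reproduced from the cumulative distribution function. The following is a small sketch using `scipy.stats.norm`; the helper name `prob_within` is an assumption made for illustration. Because the result depends only on the multiplier of $\sigma$, the standard normal distribution is enough.

```python
from scipy import stats

def prob_within(k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for a normally distributed X."""
    return stats.norm.cdf(k) - stats.norm.cdf(-k)

for k in (1, 1.96, 2, 3):
    print(f"±{k}σ: {prob_within(k):.4f}")
# ±1σ: 0.6827
# ±1.96σ: 0.9500
# ±2σ: 0.9545
# ±3σ: 0.9973

# Multiplier that leaves 2.5% in each tail, i.e. 95% two-sided coverage
z_95 = stats.norm.ppf(0.975)
print(round(z_95, 4))
```

The last line shows where the value 1.96 comes from: it is the 97.5th percentile of the standard normal distribution, rounded to two decimal places.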
Standard normal distribution
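When the random variable $X$ follows the normal distribution $N(\mu, \sigma^2)$, the linearly transformed variable $aX + b$ follows the normal distribution $N(a\mu + b, a^2\sigma^2)$.
Using this property and transforming $X$ as $Z = \frac{X - \mu}{\sigma}$, the random variable $Z$ follows the standard normal distribution $N(0, 1)$, which has mean 0 and variance 1.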
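As a rough illustration of standardization, the sketch below converts a value of $X$ to $Z = (X - \mu)/\sigma$ and shows that the probability computed from $N(\mu, \sigma^2)$ matches the one computed from $N(0, 1)$. The parameter values, the evaluation point `x`, and the random seed are assumptions chosen for the example.

```python
import numpy as np
from scipy import stats

mu, sigma = 50.0, 10.0  # illustrative values (assumed)
x = 65.0

# Standardize: z measures how many standard deviations x lies from the mean
z = (x - mu) / sigma

# P(X <= x) under N(mu, sigma^2) equals P(Z <= z) under N(0, 1)
p_direct = stats.norm.cdf(x, loc=mu, scale=sigma)
p_standard = stats.norm.cdf(z)
print(z, p_direct, p_standard)

# Standardized samples should have mean close to 0 and std close to 1
rng = np.random.default_rng(0)
samples = rng.normal(mu, sigma, size=100_000)
z_samples = (samples - mu) / sigma
print(z_samples.mean(), z_samples.std())
```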
Reproductive property of the normal distribution
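The reproductive property of the normal distribution means that when random variables $X$ and $Y$ are mutually independent and each follows a normal distribution, their sum $X + Y$ also follows a normal distribution.
As an example, assume that the mutually independent random variables $X$ and $Y$ follow the normal distributions below.
The probability distribution that the random variable $X$ follows:

$$X \sim N(\mu_X, \sigma_X^2)$$

The probability distribution that the random variable $Y$ follows:

$$Y \sim N(\mu_Y, \sigma_Y^2)$$

From the reproductive property of the normal distribution, the probability distribution that the random variable $X + Y$ follows is as follows.
The probability distribution that the random variable $X + Y$ follows:

$$X + Y \sim N(\mu_X + \mu_Y, \sigma_X^2 + \sigma_Y^2)$$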
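The reproductive property can be illustrated with a quick simulation. The sketch below draws independent samples of $X$ and $Y$, adds them, and compares the result with a normal distribution whose mean is the sum of the means and whose variance is the sum of the variances. All parameter values, the sample size, and the seed are assumptions chosen for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Illustrative parameters (assumed): X ~ N(2, 3^2), Y ~ N(-1, 4^2)
mu_x, sigma_x = 2.0, 3.0
mu_y, sigma_y = -1.0, 4.0
n = 200_000

x = rng.normal(mu_x, sigma_x, n)
y = rng.normal(mu_y, sigma_y, n)  # drawn independently of x
s = x + y

# Predicted parameters of the sum: means add, variances add
mu_sum = mu_x + mu_y
var_sum = sigma_x ** 2 + sigma_y ** 2
print(s.mean(), mu_sum)
print(s.var(), var_sum)

# Kolmogorov-Smirnov distance between the samples and the predicted normal
ks = stats.kstest(s, 'norm', args=(mu_sum, np.sqrt(var_sum)))
print(ks.statistic)
```

Note that it is the variances that add, not the standard deviations: the standard deviation of the sum here is $\sqrt{3^2 + 4^2} = 5$, not $7$.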
Python code
The Python code used in this article is as follows.
Draw $y=e^{-x^2}$
```python
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from matplotlib import rcParams
rcParams['figure.figsize'] = 10, 5
# %matplotlib inline
sns.set()
sns.set_context(rc = {'patch.linewidth': 0.2})
sns.set_style('dark')
x = np.linspace(-3, 3, 100)
y = np.exp(-x**2)  # bell-shaped curve y = e^{-x^2}
plt.figure()
plt.plot(x, y)
plt.xlabel('$x$')
plt.ylabel(r'$\exp(-x^2)$')
plt.show()
```
Draw normal distribution
```python
from scipy import stats
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from matplotlib import rcParams
rcParams['figure.figsize'] = 10, 5
# %matplotlib inline
sns.set()
sns.set_context(rc = {'patch.linewidth': 0.2})
sns.set_style('dark')
# normal distribution setting
mean = 0
std = 1
# set random variable
X = np.arange(-3,3,0.01)
# calculate PDF
Y = stats.norm.pdf(X,mean,std)
# draw normal distribution
plt.plot(X,Y,label="N(0,1)", linewidth=5)
# draw vertical lines at ±σ, ±1.96σ, ±2σ (ymax is a fraction of the axes height, between 0 and 1)
plt.axvline(x=std, color="pink", ymax=1.5*Y.max(), label="±σ")
plt.axvline(x=-std, color="pink", ymax=1.5*Y.max())
plt.axvline(x=1.96*std, color="orange", ymax=0.4*Y.max(), label="±1.96σ")
plt.axvline(x=-1.96*std, color="orange", ymax=0.4*Y.max())
plt.axvline(x=2*std, color="skyblue", ymax=0.4*Y.max(), label="±2σ")
plt.axvline(x=-2*std, color="skyblue", ymax=0.4*Y.max())
# graph setting
plt.xlabel("Random variable: X")
plt.ylabel("PDF: f(x)")
plt.legend(loc="upper left")
plt.show()
```