What is the t distribution?
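The t distribution is the distribution followed by the statistic obtained when the sample mean $\bar{x}$ is standardized with the sample standard deviation in place of the population standard deviation.
When the population mean is $\mu$ and the population standard deviation is $\sigma$, the mean of $n$ observations drawn from a normal population can be standardized as
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
Hypothesis tests are then carried out using the standard normal distribution, which is the distribution that $z$ above follows. However, the population standard deviation $\sigma$ in the expression above is usually unknown. Replacing it with the unbiased sample standard deviation $s$ gives
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
which follows a t distribution with $n - 1$ degrees of freedom.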
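As a minimal sketch of this standardization, the snippet below computes the t statistic by hand and compares it with SciPy's one-sample t-test. The data, the seed, and the hypothesised mean `mu` are made-up values for illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
mu = 50.0                        # hypothesised population mean (illustrative)
sample = rng.normal(loc=mu, scale=10.0, size=12)  # made-up data, n = 12

n = sample.size
x_bar = sample.mean()
s = sample.std(ddof=1)           # unbiased sample standard deviation

# t statistic: sample mean standardized with s instead of sigma
t_manual = (x_bar - mu) / (s / np.sqrt(n))

# scipy's one-sample t-test computes the same statistic
t_scipy = stats.ttest_1samp(sample, popmean=mu).statistic

print(t_manual, t_scipy)
```

The two values should agree up to floating-point error, since `ttest_1samp` uses the same formula with $n - 1$ degrees of freedom.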
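The probability density function of the t distribution is given by the following expression.
$$f(t) = \frac{\Gamma\left(\frac{\nu + 1}{2}\right)}{\sqrt{\nu \pi}\, \Gamma\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu + 1}{2}}$$
Here $\nu$ is the degrees of freedom and $\Gamma$ is the gamma function.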
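The density can also be checked numerically. The sketch below writes out the usual closed form with the gamma function and compares it with `scipy.stats.t.pdf`. The helper name `t_pdf` is hypothetical and not part of any library.

```python
import numpy as np
from scipy import stats
from scipy.special import gammaln


def t_pdf(t, nu):
    """Density of Student's t with nu degrees of freedom (closed form)."""
    t = np.asarray(t, dtype=float)
    # log of the constant Gamma((nu+1)/2) / (sqrt(nu*pi) * Gamma(nu/2))
    log_c = gammaln((nu + 1) / 2) - gammaln(nu / 2) - 0.5 * np.log(nu * np.pi)
    # log of the kernel (1 + t^2/nu)^(-(nu+1)/2)
    return np.exp(log_c - (nu + 1) / 2 * np.log1p(t**2 / nu))


x = np.linspace(-4, 4, 9)
max_abs_diff = np.max(np.abs(t_pdf(x, 5) - stats.t.pdf(x, 5)))
print(max_abs_diff)
```

Working in log space with `gammaln` is a design choice to avoid overflow in the gamma function for large degrees of freedom.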
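The probability density function of the t distribution depends only on the degrees of freedom ($\nu$), which determine its shape.
The smaller the degrees of freedom, the heavier the tails; as the degrees of freedom increase, the t distribution approaches the standard normal distribution.
Therefore, for large samples the t distribution and the standard normal distribution give nearly the same results.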
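To make the dependence on the degrees of freedom concrete, the following sketch measures the largest pointwise gap between the t density and the standard normal density on a grid. The grid and the chosen degrees of freedom are arbitrary illustrative values.

```python
import numpy as np
from scipy import stats

x = np.linspace(-4, 4, 801)
z = stats.norm.pdf(x)  # standard normal density on the same grid

# Largest pointwise gap between the t density and the standard normal density
gaps = {
    df: float(np.max(np.abs(stats.t.pdf(x, df) - z)))
    for df in (1, 5, 10, 30, 100)
}

for df, gap in gaps.items():
    print(f"df={df:>3}: max |t pdf - normal pdf| = {gap:.4f}")
```

The printed gap shrinks as the degrees of freedom grow, which is the numerical counterpart of the curves moving toward the standard normal.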
Expected value and variance of the t distribution
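The expected value and variance of the t distribution with $\nu$ degrees of freedom are as follows.
$$E[T] = 0 \quad (\nu > 1)$$
$$V[T] = \frac{\nu}{\nu - 2} \quad (\nu > 2)$$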
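These moments can be cross-checked against SciPy. The sketch below asks `scipy.stats.t.stats` for the mean and variance and prints them next to the closed-form variance. The degrees of freedom are illustrative and chosen above 2 so that the variance is finite.

```python
from scipy import stats

for nu in (3, 5, 10, 30):
    # moments="mv" returns the mean and the variance
    mean, var = stats.t.stats(nu, moments="mv")
    closed_form = nu / (nu - 2)  # variance formula, valid for nu > 2
    print(f"nu={nu:>2}: mean={float(mean):.3f}, var={float(var):.3f}, nu/(nu-2)={closed_form:.3f}")
```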
t distribution table (upper tail)
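Because the only parameter of the t distribution is the degrees of freedom, its upper-tail percentage points can be tabulated by degrees of freedom and upper-tail probability $\alpha$, as in the table below.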
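Degrees of freedom | $\alpha = 0.10$ | $\alpha = 0.05$ | $\alpha = 0.025$ | $\alpha = 0.01$ | $\alpha = 0.005$ |
---|---|---|---|---|---|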
1 | 3.078 | 6.314 | 12.706 | 31.821 | 63.657 |
2 | 1.886 | 2.920 | 4.303 | 6.965 | 9.925 |
3 | 1.638 | 2.353 | 3.182 | 4.541 | 5.841 |
4 | 1.533 | 2.132 | 2.776 | 3.747 | 4.604 |
5 | 1.476 | 2.015 | 2.571 | 3.365 | 4.032 |
6 | 1.440 | 1.943 | 2.447 | 3.143 | 3.707 |
7 | 1.415 | 1.895 | 2.365 | 2.998 | 3.499 |
8 | 1.397 | 1.860 | 2.306 | 2.896 | 3.355 |
9 | 1.383 | 1.833 | 2.262 | 2.821 | 3.250 |
10 | 1.372 | 1.812 | 2.228 | 2.764 | 3.169 |
11 | 1.363 | 1.796 | 2.201 | 2.718 | 3.106 |
12 | 1.356 | 1.782 | 2.179 | 2.681 | 3.055 |
13 | 1.350 | 1.771 | 2.160 | 2.650 | 3.012 |
14 | 1.345 | 1.761 | 2.145 | 2.624 | 2.977 |
15 | 1.341 | 1.753 | 2.131 | 2.602 | 2.947 |
16 | 1.337 | 1.746 | 2.120 | 2.583 | 2.921 |
17 | 1.333 | 1.740 | 2.110 | 2.567 | 2.898 |
18 | 1.330 | 1.734 | 2.101 | 2.552 | 2.878 |
19 | 1.328 | 1.729 | 2.093 | 2.539 | 2.861 |
20 | 1.325 | 1.725 | 2.086 | 2.528 | 2.845 |
21 | 1.323 | 1.721 | 2.080 | 2.518 | 2.831 |
22 | 1.321 | 1.717 | 2.074 | 2.508 | 2.819 |
23 | 1.319 | 1.714 | 2.069 | 2.500 | 2.807 |
24 | 1.318 | 1.711 | 2.064 | 2.492 | 2.797 |
25 | 1.316 | 1.708 | 2.060 | 2.485 | 2.787 |
26 | 1.315 | 1.706 | 2.056 | 2.479 | 2.779 |
27 | 1.314 | 1.703 | 2.052 | 2.473 | 2.771 |
28 | 1.313 | 1.701 | 2.048 | 2.467 | 2.763 |
29 | 1.311 | 1.699 | 2.045 | 2.462 | 2.756 |
30 | 1.310 | 1.697 | 2.042 | 2.457 | 2.750 |
40 | 1.303 | 1.684 | 2.021 | 2.423 | 2.704 |
60 | 1.296 | 1.671 | 2.000 | 2.390 | 2.660 |
80 | 1.292 | 1.664 | 1.990 | 2.374 | 2.639 |
120 | 1.289 | 1.658 | 1.980 | 2.358 | 2.617 |
180 | 1.286 | 1.653 | 1.973 | 2.347 | 2.603 |
240 | 1.285 | 1.651 | 1.970 | 2.342 | 2.596 |
∞ | 1.282 | 1.645 | 1.960 | 2.326 | 2.576 |
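For example, to find the upper 5% point of the t distribution with 20 degrees of freedom, read the entry in the row for 20 degrees of freedom under an upper-tail probability of 0.05, which gives $t_{0.05}(20) = 1.725$.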
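The same lookup can be done without a table. As a small sketch, `scipy.stats.t.isf` (the inverse survival function) returns the point with a given upper-tail probability; the variable names here are illustrative.

```python
from scipy import stats

df = 20       # degrees of freedom
alpha = 0.05  # upper-tail probability

# Point whose upper-tail probability equals alpha
t_upper = stats.t.isf(alpha, df)

# Equivalent route via the quantile function
t_upper_ppf = stats.t.ppf(1 - alpha, df)

print(round(float(t_upper), 3))  # 1.725
```

Both calls give the same value; `isf` is slightly more direct for upper-tail tables because it takes $\alpha$ itself rather than $1 - \alpha$.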
Python code
The Python code used to draw the t distribution plots is shown below.
# Figure 1: t distribution densities for 1 to 10 degrees of freedom
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
plt.style.use('ggplot')
fig, ax = plt.subplots(facecolor="w", figsize=(10, 5))
x = np.linspace(-4, 4, 100)
for df in range(1, 11):
    t = stats.t.pdf(x, df)  # density with df degrees of freedom
    ax.plot(x, t, label=f"t dist(df={df})")
ax.legend()
plt.show()
# Figure 2: t distributions (df = 1, 5, 10) against the standard normal
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
plt.style.use('ggplot')
fig, ax = plt.subplots(facecolor="w", figsize=(10, 5))
x = np.linspace(-4, 4, 100)
z = stats.norm.pdf(x, loc=0, scale=1)  # standard normal density
for df in [1, 5, 10]:
    t = stats.t.pdf(x, df)  # density with df degrees of freedom
    ax.plot(x, t, label=f"t dist(df={df})")
ax.plot(x, z, label='Std norm dist', linewidth=4)
ax.legend()
plt.show()