Traffine I/O

2022-12-23

Marginal Probability Distribution

Statistics

Probability Distribution

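What is a marginal probability distribution?

A marginal probability distribution is the distribution obtained from a joint probability distribution by eliminating one of the random variables. For example, the marginal probability of $X$ is the probability that $X$ takes a given value regardless of the value of the other variable.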

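For discrete random variables, the marginal probability distribution is given by the following formula.

P(X=x)=\sum_{y}P(X=x,Y=y)

For continuous random variables, the marginal probability distribution is given by the following formula.

f(x)=\int_{-\infty}^{\infty}f(x,y)\,\mathrm{d}y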

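As an example of marginal distributions for discrete random variables, consider the following distribution of blood types among the boys and girls in an elementary school class.

X\Y	Type A	Type B	Type O	Type AB
Boys	0.25	0.10	0.10	0.05
Girls	0.20	0.20	0.05	0.05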

From this table, the total probability for each sex and for each blood type can be computed as follows.

X\Y	Type A	Type B	Type O	Type AB	Total
Boys	0.25	0.10	0.10	0.05	0.50
Girls	0.20	0.20	0.05	0.05	0.50
Total	0.45	0.30	0.15	0.10	1.00

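These totals are the marginal probabilities. The individual marginal probabilities are as follows.

P(\text{Boy})=0.50 \\ P(\text{Girl})=0.50 \\ P(\text{Type A})=0.45 \\ P(\text{Type B})=0.30 \\ P(\text{Type O})=0.15 \\ P(\text{Type AB})=0.10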
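
The row and column sums above can be reproduced in a few lines of code. The following is a minimal sketch in plain Python; the dictionary layout and the helper names `marginal_x` and `marginal_y` are illustrative choices, not something the table itself prescribes.

```python
# Joint distribution P(X, Y) from the table above:
# outer keys are sex (X), inner keys are blood type (Y).
joint = {
    "Boy":  {"A": 0.25, "B": 0.10, "O": 0.10, "AB": 0.05},
    "Girl": {"A": 0.20, "B": 0.20, "O": 0.05, "AB": 0.05},
}


def marginal_x(joint):
    """Sum each row over Y to obtain P(X)."""
    return {x: sum(row.values()) for x, row in joint.items()}


def marginal_y(joint):
    """Sum each column over X to obtain P(Y)."""
    totals = {}
    for row in joint.values():
        for y, p in row.items():
            totals[y] = totals.get(y, 0.0) + p
    return totals


p_sex = marginal_x(joint)
p_blood = marginal_y(joint)

for name, p in {**p_sex, **p_blood}.items():
    print(f"P({name}) = {p:.2f}")
```

Because the values are floating-point numbers, the sums may differ from the table by a tiny rounding error, which is why the output is formatted to two decimal places.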

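Independence of random variables

When random variables $X$ and $Y$ do not influence each other, $X$ and $Y$ are said to be independent. Independence is determined by whether the joint probability can be written as the product of the marginal probabilities. The discrete and continuous cases are described in turn below.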

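Discrete random variables

Discrete random variables $X$ and $Y$ are independent when the following holds for every pair of values they can take.

P(X,Y) = P(X)P(Y)

For example, suppose random variables $X$ and $Y$ have the following joint probability distribution.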

X\Y	1	2	3
0	0.10	0.10	0.20
1	0.20	0	0
2	0.10	0.10	0.20

The marginal probabilities are as follows.

P(X=0)=0.40 \\ P(X=1)=0.20 \\ P(X=2)=0.40 \\ P(Y=1)=0.40 \\ P(Y=2)=0.20 \\ P(Y=3)=0.40

Here $P(X=0, Y=1) = 0.10$ while $P(X=0)P(Y=1) = 0.16$. Since the joint probability does not equal the product of the marginal probabilities, $X$ and $Y$ are not independent.
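
The same check can be applied to every cell of the table rather than a single one. The sketch below, in plain Python, computes both marginals and lists the cells where the joint probability differs from the product of the marginals. The function name `independence_violations` and the tolerance `tol` are assumptions introduced for illustration.

```python
from itertools import product

# Joint distribution P(X=x, Y=y) from the table above, keyed by (x, y).
joint = {
    (0, 1): 0.10, (0, 2): 0.10, (0, 3): 0.20,
    (1, 1): 0.20, (1, 2): 0.00, (1, 3): 0.00,
    (2, 1): 0.10, (2, 2): 0.10, (2, 3): 0.20,
}

xs = sorted({x for x, _ in joint})
ys = sorted({y for _, y in joint})

# Marginals: sum the joint distribution over the other variable.
p_x = {x: sum(joint[(x, y)] for y in ys) for x in xs}
p_y = {y: sum(joint[(x, y)] for x in xs) for y in ys}


def independence_violations(joint, p_x, p_y, tol=1e-9):
    """Return the (x, y) cells where P(x, y) differs from P(x) * P(y)."""
    return [
        (x, y)
        for x, y in product(p_x, p_y)
        if abs(joint[(x, y)] - p_x[x] * p_y[y]) > tol
    ]


violations = independence_violations(joint, p_x, p_y)
independent = not violations
print("independent:", independent)
print("cells violating P(x, y) = P(x)P(y):", violations)
```

A single violating cell is enough to rule out independence; an empty list would mean the factorization holds everywhere, within the chosen tolerance.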

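Continuous random variables

Continuous random variables $x$ and $y$ are independent when their probability density functions satisfy the following.

f(x,y) = f(x)f(y)

For example, consider the following joint probability density function.

f(x, y) = \left\{ \begin{array}{ll} 4xy & (0 < x < 1, 0 < y < 1) \\ 0 & (\text{otherwise}) \end{array} \right.

For $0 < x < 1$, $f(x)$ is computed as follows.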

\begin{aligned} f(x) &= \int^{1}_{0} f(x,y) \mathrm{d} y \\ &= \int^{1}_{0} 4xy \mathrm{d} y \\ &= 2x \end{aligned}

Therefore $f(x)$ is as follows.

f(x) = \left\{ \begin{array}{ll} 2x & (0 < x < 1) \\ 0 & (\text{otherwise}) \end{array} \right.

Similarly, $f(y)$ is as follows.

f(y) = \left\{ \begin{array}{ll} 2y & (0 < y < 1) \\ 0 & (\text{otherwise}) \end{array} \right.

Therefore, for $0 < x < 1, 0 < y < 1$, the following holds, so $x$ and $y$ are independent.

f(x,y) = f(x)f(y) = 4xy
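
The factorization can also be checked numerically. The sketch below approximates each marginal density with a midpoint-rule integral and compares the product of the marginals with the joint density at a few sample points. The helper `integrate`, the number of subintervals `n`, and the sample points are illustrative assumptions. A numerical check at a handful of points supports the analytical result but does not replace it.

```python
def joint_pdf(x, y):
    """f(x, y) = 4xy on the open unit square, 0 elsewhere."""
    if 0 < x < 1 and 0 < y < 1:
        return 4 * x * y
    return 0.0


def integrate(g, a, b, n=1000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def marginal_x(x):
    """f(x): integrate the joint density over y."""
    return integrate(lambda y: joint_pdf(x, y), 0.0, 1.0)


def marginal_y(y):
    """f(y): integrate the joint density over x."""
    return integrate(lambda x: joint_pdf(x, y), 0.0, 1.0)


# Hypothetical sample points inside the unit square.
points = [(0.2, 0.7), (0.5, 0.5), (0.9, 0.1)]
max_gap = max(
    abs(joint_pdf(x, y) - marginal_x(x) * marginal_y(y)) for x, y in points
)
print("largest |f(x,y) - f(x)f(y)| at the sample points:", max_gap)
```

The midpoint rule is exact for integrands that are linear in the integration variable, as $4xy$ is, so the gap here is limited to floating-point rounding.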

Next, consider the following joint probability density function.

f(x, y) = \left\{ \begin{array}{ll} x+y & (0 < x < 1, 0 < y < 1) \\ 0 & (\text{otherwise}) \end{array} \right.

$f(x)$ and $f(y)$ are as follows.

f(x) = \left\{ \begin{array}{ll} x + \frac{1}{2} & (0 < x < 1) \\ 0 & (\text{otherwise}) \end{array} \right.

f(y) = \left\{ \begin{array}{ll} y + \frac{1}{2} & (0 < y < 1) \\ 0 & (\text{otherwise}) \end{array} \right.

For $0 < x < 1, 0 < y < 1$, the product $f(x)f(y)$ is not identical to $f(x,y)$, so $x$ and $y$ are not independent.
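
The marginal densities for this second example are stated without the intermediate steps. A sketch of the omitted calculation, using only the definitions given earlier, might look like this for $0 < x < 1, 0 < y < 1$:

```latex
\begin{aligned}
f(x) &= \int^{1}_{0} (x + y) \,\mathrm{d}y
      = \left[ xy + \frac{y^{2}}{2} \right]^{1}_{0}
      = x + \frac{1}{2} \\
f(x)f(y) &= \left( x + \frac{1}{2} \right)\left( y + \frac{1}{2} \right)
          = xy + \frac{x + y}{2} + \frac{1}{4} \\
f(x,y) - f(x)f(y) &= \frac{x + y}{2} - xy - \frac{1}{4}
                   = -\left( x - \frac{1}{2} \right)\left( y - \frac{1}{2} \right)
\end{aligned}
```

The difference is zero only on the lines $x = \frac{1}{2}$ and $y = \frac{1}{2}$. At $x = y = \frac{1}{4}$, for instance, $f(x,y) = \frac{1}{2}$ while $f(x)f(y) = \frac{9}{16}$, so the factorization fails.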

Ryusei Kakujo
