2022-11-18

Sampling survey

What is a sampling survey

There are two types of statistical surveys: a complete enumeration, in which the entire population is surveyed, and a sampling survey, in which a sample is drawn from the population and the characteristics of the population are statistically estimated from it. Below are some examples of each.

  • Complete enumeration
    • Census
  • Sampling survey
    • Public opinion poll
    • Social survey

Because a complete enumeration involves surveying the entire population, it is often impractical because of the cost, time, and labor required. For example, psychological research targets the entire human population, so a complete enumeration is not possible. Therefore, in many cases, a sampling survey is used.

Because a sampling survey covers only a selected portion of the population, its results deviate from the population values, i.e., the true values. This deviation is called sampling error. Since it depends on which units happen to be drawn, sampling error is expressed probabilistically, as a range of variation (for example, a standard error or a confidence interval). To obtain accurate estimates from a sampling survey, it is important to minimize the sampling error, i.e., to select a sample that is a faithful miniature of the population.
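To make the idea of sampling error concrete, here is a minimal simulation sketch. The population is synthetic (made-up "heights" drawn from a normal distribution), and the sample sizes of 25 and 400 as well as the helper name `sampling_errors` are assumptions chosen only for illustration.

```python
import random
import statistics

# Hypothetical population: 10,000 synthetic "heights" in cm (made-up data).
rng = random.Random(0)  # fixed seed so the run is reproducible
population = [rng.gauss(170, 6) for _ in range(10_000)]
true_mean = statistics.fmean(population)


def sampling_errors(sample_size, n_repeats=500):
    """Draw repeated samples; return each sample mean minus the true mean."""
    return [
        statistics.fmean(rng.sample(population, sample_size)) - true_mean
        for _ in range(n_repeats)
    ]


# The spread of the errors is one way to express the "range of variation".
spread_small = statistics.pstdev(sampling_errors(25))
spread_large = statistics.pstdev(sampling_errors(400))
print(f"n=25:  spread of sampling error = {spread_small:.2f} cm")
print(f"n=400: spread of sampling error = {spread_large:.2f} cm")
```

In this sketch the spread of the error should be visibly smaller for the larger sample, which is the usual reason for increasing the sample size when the budget allows.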

Sampling methods

There are several sampling methods. In this article, I will introduce the following sampling methods.

  • Simple random sampling
  • Stratified sampling
  • Cluster sampling
  • Multi-stage sampling

Simple random sampling

Simple random sampling is the most basic method of sample selection: every member of the population has the same chance of being chosen, traditionally with the help of a table of random numbers. While the procedure itself is simple, it requires a complete list of the population, and when the population is large, drawing a sample big enough to reflect the nature of the population takes considerable effort.
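As a rough sketch, a pseudo-random number generator can take the place of a random number table. The sampling frame below (ID numbers 1 to 1,000) and the sample size of 50 are assumptions for illustration.

```python
import random

# Hypothetical sampling frame: ID numbers of a 1,000-person population.
population_ids = list(range(1, 1001))

rng = random.Random(42)  # fixed seed so the draw is reproducible

# random.sample draws without replacement, so no ID is selected twice
# and every ID has the same chance of being selected.
sample_ids = rng.sample(population_ids, k=50)

print(sorted(sample_ids)[:10])  # first few selected IDs
```

Note that the code needs the full list `population_ids` up front, which mirrors the practical requirement of having a complete list of the population.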

Stratified sampling

Stratified sampling is a method in which the population is divided in advance into several strata based on certain characteristics, and the survey targets are randomly selected from each stratum. For example, when conducting an attitude survey of 20 students at a high school with a male-to-female ratio of 7:3, 14 boys and 6 girls are randomly selected from their respective groups.

Stratified sampling reflects the characteristics of each stratum and therefore reduces estimation error, but it requires that the composition of the population be known in advance.
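The high school example can be sketched with proportional allocation, in which each stratum contributes to the sample in proportion to its size. The roster of 700 boys and 300 girls is a hypothetical one chosen to match the 7:3 ratio.

```python
import random

rng = random.Random(0)  # fixed seed so the draw is reproducible

# Hypothetical roster with a 7:3 male-to-female ratio.
strata = {
    "boys": [f"B{i:03d}" for i in range(700)],
    "girls": [f"G{i:03d}" for i in range(300)],
}
total = sum(len(members) for members in strata.values())
sample_size = 20

sample = {}
for name, members in strata.items():
    # Proportional allocation: the stratum's share of the population
    # determines its share of the sample.
    n = round(sample_size * len(members) / total)
    sample[name] = rng.sample(members, n)

print({name: len(chosen) for name, chosen in sample.items()})
```

With these numbers the allocation works out to whole numbers. For ratios that do not divide evenly, simple rounding may not add up to the intended sample size, so the allocation would need an extra adjustment step.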

Cluster sampling

Cluster sampling is a method in which the population is divided into several clusters, some clusters are selected by random sampling, and every member of the selected clusters is surveyed. For example, when surveying the average height of high school students, each high school is treated as one cluster, 20 high schools are randomly selected from across the country, and the height of every student attending those 20 schools is measured.

Cluster sampling saves time and effort because the sample can be drawn with only a list of the clusters, without a list of every individual. On the other hand, the survey targets within a cluster tend to have similar characteristics, so the sample is more likely to be biased.
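The school example might look like the following sketch. The 200 schools, their sizes, and the height values are all synthetic, made up only to show the mechanics.

```python
import random

rng = random.Random(1)  # fixed seed so the run is reproducible

# Hypothetical population: 200 schools (clusters), each with 30 to 50
# students whose heights in cm are synthetic values.
schools = {
    f"school_{s}": [rng.gauss(168, 7) for _ in range(rng.randint(30, 50))]
    for s in range(200)
}

# Only the list of clusters is needed to draw the sample.
chosen = rng.sample(sorted(schools), k=20)

# Every student in each chosen school is measured.
heights = [h for name in chosen for h in schools[name]]
mean_height = sum(heights) / len(heights)

print(f"{len(heights)} students measured, mean height {mean_height:.1f} cm")
```

The random draw operates on school names only, which is where the saving in effort comes from. In real data, students within one school tend to resemble each other more than this synthetic example suggests.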

Multi-stage sampling

Multi-stage sampling is a method in which the sample is drawn from the population in several successive stages. For example, when surveying households throughout Japan, first 30 municipalities are randomly selected from the entire country (first stage), then 5 districts are randomly selected within those municipalities (second stage), and then 20 households are randomly selected within those districts (third stage).

Multi-stage sampling makes the sampling process more efficient, but it is more likely to produce a biased sample when the sample size is small.
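The three stages can be sketched as nested random draws. This sketch assumes that 5 districts are drawn within each selected municipality and 20 households within each selected district. The frame of 100 municipalities, 10 districts each, and 50 households each is hypothetical.

```python
import random

rng = random.Random(7)  # fixed seed so the draw is reproducible

# Hypothetical frame: 100 municipalities x 10 districts x 50 households.
frame = {
    f"muni_{m}": {
        f"muni_{m}/dist_{d}": [
            f"muni_{m}/dist_{d}/house_{h}" for h in range(50)
        ]
        for d in range(10)
    }
    for m in range(100)
}

sample = []
for muni in rng.sample(sorted(frame), k=30):              # first stage
    districts = frame[muni]
    for dist in rng.sample(sorted(districts), k=5):       # second stage
        sample.extend(rng.sample(districts[dist], k=20))  # third stage

print(len(sample))  # 30 * 5 * 20 households
```

Under these assumptions, household lists are only needed for the 150 selected districts rather than for the whole country, which is the efficiency gain of sampling in stages.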

Ryusei Kakujo

