Introduction
In the realm of SQL, the techniques of data retrieval play a critical role in the overall performance of data operations. This article introduces an overview of various scanning methods used in SQL: Full Table Scan, Index Unique Scan, and Index Range Scan.
Full Table Scan
A Full Table Scan refers to the operation where each row in a table is inspected to retrieve data. It involves examining every record in the table to either fetch the desired records or ascertain that no such records exist.
Full Table Scan can be time-consuming, particularly when the table contains a large amount of data (around a million or more). This is due to the exhaustive nature of the operation where every record is individually accessed and examined.
Example Use Cases
A common use case for Full Table Scan is when you need to select all records from the table. In SQL, this can be executed using the command:
SELECT * FROM users;
Full Table Scan is also utilized when you are filtering by a column that does not have an index. For example:
SELECT * FROM users WHERE name="hoge";
Another scenario is when indexes cannot be efficiently utilized due to the nature of the query. This can occur when you use wildcard characters in your search pattern, as it prevents the efficient use of indexes. For instance:
SELECT * FROM users WHERE name LIKE "%hoge%";
Index Unique Scan
An Index Unique Scan is a search operation where a unique record within the table is located using an index. It aims to find a single unique record, and once found, the search operation ends.
Due to its operation nature, the Index Unique Scan can complete processing quickly. The search ends as soon as the unique record is found, making it a highly efficient method, especially when dealing with large amounts of data.
Example Use Case
An Index Unique Scan is typically used when filtering records based on the primary key or unique key. Here is an example of such a query:
SELECT * FROM users WHERE id=10;
This operation can be completed quickly due to the unique nature of the primary key, which allows the database to quickly locate and return the requested record.
Index Range Scan
An Index Range Scan is a search operation that uses an index to retrieve multiple rows from a table. Unlike an Index Unique Scan, this method is intended to fetch more than one record.
The time taken for an Index Range Scan to execute may vary based on the number of records it needs to retrieve. This processing time is directly proportional to the number of target records - the more records to fetch, the longer it takes.
Example Use Case
An Index Range Scan is typically used when filtering based on a non-unique key. An example of such a query would be:
SELECT * FROM users WHERE age=50;
In this case, the operation may retrieve multiple records that meet the specified condition (i.e., users of age 50).
When to Use Full Scan vs. Index Scan
Index Scans can often perform more efficiently as they don't require reading the entire dataset. They can directly locate the needed data, hence reducing the processing time.
On the other hand, Full Scans can be more time-consuming, especially with increasing data volume. However, they can maintain a constant performance level, independent of data size.
Determining Factors for Scan Selection
-
Data Volume
The choice between using Full Scans and Index Scans is usually dictated by the data volume. If the data volume is small (around 10,000 records or fewer), the processing time difference between using an Index Scan and a Full Scan can be negligible. -
Proportion of Filterable Data
The proportion of data that can be filtered also influences the choice of scanning method. In cases where the data to be filtered forms a small proportion of the total data, Full Scans may sometimes be faster. The effectiveness of Full Scans versus Index Scans largely depends on the proportion of data that can be filtered in relation to the total data volume.
Guideline for Selecting between Full Scan and Index Scan
The choice between Full Scans and Index Scans can be dependent on the proportion of rows that will be filtered out relative to the total number of rows. The following table provides some general guidance on which scan type may be preferable:
Proportion of Rows Filtered to Total Rows | Recommended Scan Type |
---|---|
20% or more | Full Scan |
Between 1% and 20% | Case by case (some suggest Index Scan is faster for less than 15%) |
Less than 1% | Index Scan |