2023-03-20

Partitioning in SQL

What is Partitioning

Partitioning in SQL is a technique to divide a table's data into smaller, more manageable pieces called partitions. This functionality is especially useful when a table contains a massive volume of records. By dividing the records into specific areas, it improves the efficiency of database operations, making data retrieval processes significantly faster.

Advantages of Using Partitioning

The benefits of using partitioning in SQL are numerous, here are a few:

  • Enhanced SQL Performance
    SQL operations become faster as full table scans can access only the required partitions rather than scanning the entire table. It also enables parallel access to each partition, allowing results to be computed faster.

  • Increased Parallelism
    When updating records within a partition, each partition is locked individually, allowing work to proceed in parallel without hindrance.

  • Optimized Statistics and Maintenance
    With partitioning, it's possible to update statistics or optimize each partition individually. This granular control makes routine maintenance tasks more efficient and less disruptive to overall system performance.

Types of Partitioning in SQL

I will explain the various types of partitioning methods available in SQL.

Range Partitioning

Range partitioning divides a table into partitions based on a range of values. This method is commonly used when dealing with data that has logical ranges into which it can be distributed, such as dates, times, or numeric ranges.

For instance, imagine a table of orders, and we want to partition it based on the order date. We could create partitions for each year, where each partition would hold all orders made within that year.

sql
CREATE TABLE orders (
  order_id INT NOT NULL,
  customer_id INT,
  order_date DATE NOT NULL DEFAULT '1970-01-01',
  order_value DECIMAL(10,2)
)
PARTITION BY RANGE ( YEAR(order_date) ) (
  PARTITION p1970s VALUES LESS THAN (1980),
  PARTITION p1980s VALUES LESS THAN (1990),
  PARTITION p1990s VALUES LESS THAN (2000),
  PARTITION p2000s VALUES LESS THAN MAXVALUE
);

Here, the orders table is partitioned by the range of order_date. We've defined four partitions that group orders from the 1970s, 1980s, 1990s, and 2000s, respectively.

List Partitioning

List partitioning allows us to divide data based on predefined lists of values. This approach is useful when the data to be partitioned does not fit into ranges but can be grouped into specific categories.

Consider an example where a company has stores across various regions, and they want to partition their sales data by these regions.

sql
CREATE TABLE sales (
  sales_id INT NOT NULL,
  product_id INT,
  region_id INT,
  sale_date DATE,
  sale_value DECIMAL(10,2)
)
PARTITION BY LIST (region_id) (
  PARTITION pNorth VALUES IN (1, 2, 3),
  PARTITION pEast VALUES IN (4, 5, 6),
  PARTITION pWest VALUES IN (7, 8, 9),
  PARTITION pSouth VALUES IN (10, 11, 12)
);

In this case, each partition represents a region, identified by a list of region IDs.

Hash Partitioning

Hash partitioning distributes data evenly across a predefined number of partitions using a hash function. This type of partitioning is useful when there is no logical range or list to partition data by, and the goal is to ensure an even distribution of data.

Let's consider partitioning employee records in an organization across multiple partitions.

sql
CREATE TABLE employees (
  emp_id INT NOT NULL,
  emp_name VARCHAR(30),
  emp_position VARCHAR(30),
  emp_salary DECIMAL(10,2)
)
PARTITION BY HASH (emp_id)
PARTITIONS 5;

In this case, the employees table is partitioned into five equal partitions using the employee id (emp_id).

Key Partitioning

Key partitioning is a special case of hash partitioning where the partitioning key is either the whole primary key or part of it. It's primarily used when you want an equal distribution of data and your table has a primary key.

Suppose we have a products table, and we want to partition the data based on the product_id which is the primary key.

sql
CREATE TABLE products (
  product_id INT NOT NULL PRIMARY KEY,
  product_name VARCHAR(30),
  product_price DECIMAL(10,2)
)
PARTITION BY KEY(product_id)
PARTITIONS 3;

In this scenario, the products table is partitioned into three partitions based on the product_id.

Subpartitioning

Subpartitioning is a further division of partitions into smaller units, creating a two-level partitioning scheme. This is especially useful when you want to distribute data more granularly. This method often combines two different partitioning strategies. For example, you might initially partition data by range and then further subpartition those ranges by list or hash.

Let's consider a sales scenario where a company wants to partition their sales records first by year (a range) and then further by region (a list).

sql
CREATE TABLE sales (
  sales_id INT NOT NULL,
  region_id INT,
  sale_date DATE NOT NULL DEFAULT '1970-01-01',
  sale_value DECIMAL(10,2)
)
PARTITION BY RANGE( YEAR(sale_date) )
SUBPARTITION BY LIST(region_id) (
  PARTITION p1970s VALUES LESS THAN (1980) (
    SUBPARTITION sNorth VALUES IN (1, 2, 3),
    SUBPARTITION sEast VALUES IN (4, 5, 6)
  ),
  PARTITION p1980s VALUES LESS THAN (1990) (
    SUBPARTITION sWest VALUES IN (7, 8, 9),
    SUBPARTITION sSouth VALUES IN (10, 11, 12)
  ),
  PARTITION p1990s VALUES LESS THAN (2000) (
    SUBPARTITION sNorth1 VALUES IN (1, 2, 3),
    SUBPARTITION sEast1 VALUES IN (4, 5, 6)
  ),
  PARTITION p2000s VALUES LESS THAN MAXVALUE (
    SUBPARTITION sWest1 VALUES IN (7, 8, 9),
    SUBPARTITION sSouth1 VALUES IN (10, 11, 12)
  )
);

In this example, the sales table is first partitioned by range, dividing the data by the decades of sales. Each of these partitions is then further subpartitioned by list, with each subpartition containing sales from a specific region. This approach offers a granular level of data segregation, making data retrieval and maintenance more efficient.

Ryusei Kakujo

researchgatelinkedingithub

Focusing on data science for mobility

Bench Press 100kg!