advancedPerformanceFree Guide

Database Table Partitioning

Definition

Database table partitioning divides a large table into smaller physical segments (partitions) within a single database instance, improving query performance through partition pruning and enabling efficient data archival.

Introduction to Database Table Partitioning

Partitioning is the first horizontal scaling technique to consider before sharding. Unlike sharding (which spans multiple servers), partitioning divides a table within one database instance. The key benefit is partition pruning — the query optimizer can skip entire partitions that don't match the query's WHERE conditions, dramatically reducing I/O for large tables.

Key Takeaways

Range partitioning: rows partitioned by value ranges (e.g., orders by year)
List partitioning: rows partitioned by explicit value lists (e.g., by country code)
Hash partitioning: rows distributed evenly using a hash function
Partition pruning: optimizer skips irrelevant partitions automatically
Local indexes: each partition has its own index — much smaller and faster
Partitioning enables efficient data archival by dropping old partitions (instant, no DELETE overhead)

Real-World Examples & SQL Schema

Range Partitioning by Year (PostgreSQL)

-- Partitioned parent table
CREATE TABLE orders (
  id          BIGINT,
  customer_id INT,
  created_at  TIMESTAMPTZ NOT NULL,
  total       DECIMAL(12,2)
) PARTITION BY RANGE (created_at);

-- Create annual partitions
CREATE TABLE orders_2023
  PARTITION OF orders
  FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');

CREATE TABLE orders_2024
  PARTITION OF orders
  FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

CREATE TABLE orders_2025
  PARTITION OF orders
  FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');

-- Query only scans orders_2024 partition (pruning):
SELECT * FROM orders
WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31';

PostgreSQL automatically routes INSERTs to the correct partition and prunes irrelevant partitions from queries.

Run code in Playground

Drop Old Partition (Instant Archival)

-- Archiving 2022 data is instant — no row-by-row DELETE:
DETACH PARTITION orders_2022;
-- Or drop permanently:
DROP TABLE orders_2022;

-- Compare to non-partitioned delete:
-- DELETE FROM orders WHERE created_at < '2023-01-01';
-- ↑ Slow, generates lots of WAL, locks table

Dropping a partition is metadata-only — instant regardless of partition size. This makes partitioning ideal for time-series data with retention policies.

Run code in Playground

Primary Use Cases

Time-series data with regular archival requirements (orders, logs, events)

Tables with hundreds of millions of rows

Data with clear range predicates in the majority of queries

Multi-tenant data partitioned by tenant_id

Tables with different storage requirements (hot vs cold data)

Frequently Asked Questions

What is the difference between partitioning and sharding?

Partitioning divides a table's data within a single database server. All partitions are managed by the same database instance. Sharding distributes data across multiple independent database servers. Partitioning improves single-server query performance through partition pruning. Sharding provides horizontal scalability by distributing load across multiple machines. Start with partitioning; only add sharding when a single server can no longer handle the load.

What is partition pruning in databases?

Partition pruning is the query optimizer's ability to skip irrelevant partitions when executing a query. Example: if orders is partitioned by year and you query WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31', the optimizer reads only the orders_2024 partition and skips all others. For a 10-year table with 10 annual partitions, this reduces I/O by up to 90%.

Related Concepts

Sharding Indexing Replication Query Optimization