Database Table Partitioning
Definition
Database table partitioning divides a large table into smaller physical segments (partitions) within a single database instance, improving query performance through partition pruning and enabling efficient data archival.
Introduction to Database Table Partitioning
Partitioning is the first horizontal scaling technique to consider before sharding. Unlike sharding (which spans multiple servers), partitioning divides a table within one database instance. The key benefit is partition pruning — the query optimizer can skip entire partitions that don't match the query's WHERE conditions, dramatically reducing I/O for large tables.
Key Takeaways
- Range partitioning: rows partitioned by value ranges (e.g., orders by year)
- List partitioning: rows partitioned by explicit value lists (e.g., by country code)
- Hash partitioning: rows distributed evenly using a hash function
- Partition pruning: optimizer skips irrelevant partitions automatically
- Local indexes: each partition has its own index — much smaller and faster
- Partitioning enables efficient data archival by dropping old partitions (instant, no DELETE overhead)
Real-World Examples & SQL Schema
Range Partitioning by Year (PostgreSQL)
-- Partitioned parent table
CREATE TABLE orders (
id BIGINT,
customer_id INT,
created_at TIMESTAMPTZ NOT NULL,
total DECIMAL(12,2)
) PARTITION BY RANGE (created_at);
-- Create annual partitions
CREATE TABLE orders_2023
PARTITION OF orders
FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');
CREATE TABLE orders_2024
PARTITION OF orders
FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
CREATE TABLE orders_2025
PARTITION OF orders
FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');
-- Query only scans orders_2024 partition (pruning):
SELECT * FROM orders
WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31';PostgreSQL automatically routes INSERTs to the correct partition and prunes irrelevant partitions from queries.
Run code in PlaygroundDrop Old Partition (Instant Archival)
-- Archiving 2022 data is instant — no row-by-row DELETE: DETACH PARTITION orders_2022; -- Or drop permanently: DROP TABLE orders_2022; -- Compare to non-partitioned delete: -- DELETE FROM orders WHERE created_at < '2023-01-01'; -- ↑ Slow, generates lots of WAL, locks table
Dropping a partition is metadata-only — instant regardless of partition size. This makes partitioning ideal for time-series data with retention policies.
Run code in PlaygroundPrimary Use Cases
Time-series data with regular archival requirements (orders, logs, events)
Tables with hundreds of millions of rows
Data with clear range predicates in the majority of queries
Multi-tenant data partitioned by tenant_id
Tables with different storage requirements (hot vs cold data)