Database Indexing Explained
Definition
A database index is a separate data structure that stores the values of one or more table columns, usually in sorted order, together with pointers to the full rows. It lets the database retrieve matching rows quickly without scanning the whole table.
Introduction to Database Indexing
Indexing is often the most impactful performance optimization in relational databases. Without indexes, every query that searches by a non-primary-key column must read every row in the table, which is called a full table scan. With the right index, the database can locate matching rows in O(log n) time by descending a B-tree instead of reading all n rows.
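To make the O(n) versus O(log n) contrast concrete, here is a minimal sketch in Python. It models an index as a sorted list of (value, row position) pairs searched with binary search. This is a simplification: real databases use B-trees rather than sorted arrays, and the sample rows and function names (full_scan, index_lookup) are illustrative assumptions, not part of any database API.

```python
import bisect

# Hypothetical table rows: (id, email), stored in insertion order.
rows = [
    (1, "carol@example.com"),
    (2, "alice@example.com"),
    (3, "bob@example.com"),
]

def full_scan(rows, email):
    """O(n): inspect every row, as a query without an index would."""
    return [row for row in rows if row[1] == email]

# The "index": emails in sorted order, each paired with its row position.
index = sorted((email, pos) for pos, (_id, email) in enumerate(rows))

def index_lookup(rows, index, email):
    """O(log n) search in the sorted index, then follow pointers to rows."""
    # (email,) sorts before every (email, pos) pair, so bisect_left
    # lands on the first index entry for this email, if one exists.
    i = bisect.bisect_left(index, (email,))
    matches = []
    while i < len(index) and index[i][0] == email:
        matches.append(rows[index[i][1]])
        i += 1
    return matches

print(full_scan(rows, "alice@example.com"))
print(index_lookup(rows, index, "alice@example.com"))
```

Both functions return the same rows. The difference is how many entries each one has to inspect as the table grows.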
Key Takeaways
- Indexes trade write performance for read performance: every INSERT, UPDATE, or DELETE must also maintain the indexes it affects
- B-tree indexes are the default in most databases and support range queries (<, >, BETWEEN)
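- The primary key is indexed automatically in most databases; foreign keys often are not, and should almost always be indexed
- Composite indexes follow the leftmost prefix rule: a query can generally use the index only if it filters on the leading column(s)
- Covering indexes contain every column a query needs, so the database can answer it without reading the actual table rows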
- EXPLAIN/EXPLAIN ANALYZE shows whether your query uses indexes
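The covering-index and EXPLAIN takeaways can be tried out with a small sketch. It uses SQLite through Python's standard sqlite3 module, where the equivalent command is EXPLAIN QUERY PLAN. The table layout, the index name idx_users_email_name, and the query_plan helper are assumptions made for illustration. Plan wording differs between databases and between SQLite versions.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT, bio TEXT)"
)
# Hypothetical index holding both the filter column and the selected column.
conn.execute("CREATE INDEX idx_users_email_name ON users(email, name)")

# Every column the query needs is in the index, so it can be covering.
covered_plan = query_plan(
    conn, "SELECT name FROM users WHERE email = ?", ("alice@example.com",)
)

# SELECT * also needs bio, which is only stored in the table rows.
uncovered_plan = query_plan(
    conn, "SELECT * FROM users WHERE email = ?", ("alice@example.com",)
)

print(covered_plan)
print(uncovered_plan)
```

In SQLite, the first plan should mention a covering index. The second should mention the same index without the word COVERING, which means the table rows are still read.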
Real-World Examples & SQL Schema
Create a Simple Index
-- Without index: 1M row table scan in ~500ms
SELECT * FROM users WHERE email = 'alice@example.com';

-- Create index:
CREATE INDEX idx_users_email ON users(email);

-- Same query with index: ~1ms
SELECT * FROM users WHERE email = 'alice@example.com';
A B-tree index on email reduces lookup from O(n) to O(log n).
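One way to confirm that the index is actually used is to compare the query plan before and after creating it. The sketch below does this with SQLite via Python's sqlite3 module. The table definition and the query_plan helper are assumptions for illustration, and the table is left empty because only the plan is inspected, not the timing.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
# Assumed schema; the original only shows the users table's email column.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

query = "SELECT * FROM users WHERE email = ?"
params = ("alice@example.com",)

# No index on email yet: SQLite has to scan the table.
plan_before = query_plan(conn, query, params)

conn.execute("CREATE INDEX idx_users_email ON users(email)")

# Same query: the planner can now search the index instead.
plan_after = query_plan(conn, query, params)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan text depends on the SQLite version, but the first plan should report a scan and the second should name idx_users_email.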
Composite Index: Leftmost Prefix Rule
CREATE INDEX idx_orders ON orders(customer_id, status, created_at);

-- Uses index (leftmost columns):
SELECT * FROM orders WHERE customer_id = 5;
SELECT * FROM orders WHERE customer_id = 5 AND status = 'pending';

-- Generally does NOT use the index (skips customer_id):
SELECT * FROM orders WHERE status = 'pending';
Composite indexes work left-to-right. Design column order based on query patterns.
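The leftmost prefix rule can be checked by looking at query plans. This sketch uses SQLite through Python's sqlite3 module. The orders schema, including the extra total column, is an assumption, as is the query_plan helper. Other databases may behave differently, for example by using skip-scan strategies when statistics are available.

```python
import sqlite3

def query_plan(conn, sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
# Assumed schema; total is not in the index, so SELECT * must read the table.
conn.execute(
    """CREATE TABLE orders (
           id INTEGER PRIMARY KEY,
           customer_id INTEGER,
           status TEXT,
           created_at TEXT,
           total REAL
       )"""
)
conn.execute("CREATE INDEX idx_orders ON orders(customer_id, status, created_at)")

plans = {
    # Filters on the leading column: index is usable.
    "customer_id": query_plan(
        conn, "SELECT * FROM orders WHERE customer_id = 5"
    ),
    # Filters on the first two columns: index is usable.
    "customer_id + status": query_plan(
        conn,
        "SELECT * FROM orders WHERE customer_id = 5 AND status = 'pending'",
    ),
    # Skips the leading column: SQLite falls back to scanning the table.
    "status only": query_plan(
        conn, "SELECT * FROM orders WHERE status = 'pending'"
    ),
}

for label, plan in plans.items():
    print(f"{label}: {plan}")
```

If status-only lookups are common, a separate index starting with status, or a different column order, is usually the fix.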
Primary Use Cases
- Columns frequently used in WHERE clauses
- Foreign key columns used in JOINs
- Columns used in ORDER BY for sort-heavy queries
- Columns with high cardinality (many unique values)
- Columns in GROUP BY for aggregate queries
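To judge whether a column has high cardinality, one rough approach is to compare its number of distinct values with the total row count. The sketch below does this with SQLite via Python's sqlite3 module. The selectivity helper, the sample data, and the country column are hypothetical. The helper interpolates table and column names directly into the SQL, so it should only be called with trusted identifiers.

```python
import sqlite3

def selectivity(conn, table, column):
    """Return distinct values / total rows; values near 1.0 mean high cardinality."""
    # Identifiers cannot be bound as parameters, so they are interpolated here.
    distinct, total = conn.execute(
        f"SELECT COUNT(DISTINCT {column}), COUNT(*) FROM {table}"
    ).fetchone()
    return distinct / total if total else 0.0

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO users (email, country) VALUES (?, ?)",
    [(f"user{i}@example.com", "US" if i % 2 else "DE") for i in range(1000)],
)

email_selectivity = selectivity(conn, "users", "email")      # every value unique
country_selectivity = selectivity(conn, "users", "country")  # only two values

print(email_selectivity, country_selectivity)
```

A column like email, where almost every value is unique, narrows a search to very few rows. An index on a two-value column such as country filters out far less, so it tends to help less on its own.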