advancedPerformanceFree Guide

SQL Query Optimization Techniques

Definition

SQL query optimization is the process of analyzing and rewriting SQL queries, adjusting index strategies, and tuning database configuration to minimize query execution time and resource consumption.

Introduction to SQL Query Optimization Techniques

A well-written query on a properly indexed table can be 1000x faster than a poorly written query without indexes. Query optimization is the highest-ROI skill for database engineers. It combines understanding the query planner, index internals, data statistics, and SQL rewriting techniques.

Key Takeaways

Always start with EXPLAIN ANALYZE — never guess; measure first
Full table scans on large tables are almost always the bottleneck
SELECT * fetches unnecessary columns — always select only what you need
Functions on indexed columns (WHERE UPPER(name) = 'JOHN') prevent index usage
EXISTS is faster than COUNT(*) > 0 for existence checks
Move filters from HAVING to WHERE when they don't involve aggregates
Correlated subqueries execute once per outer row — convert to JOIN or CTE

Real-World Examples & SQL Schema

Avoid Functions on Indexed Columns

-- BAD: function on indexed column prevents index use
SELECT * FROM users WHERE LOWER(email) = 'alice@example.com';
-- Full table scan: O(n)

-- GOOD: use expression index, or store lowercase
CREATE INDEX idx_lower_email ON users(LOWER(email));
SELECT * FROM users WHERE LOWER(email) = 'alice@example.com';
-- Index scan: O(log n)

-- Better: store email lowercase on INSERT:
SELECT * FROM users WHERE email = 'alice@example.com';

Wrapping a column in a function prevents the optimizer from using the column's index.

Run code in Playground

EXISTS vs COUNT for Existence Check

-- SLOW: must count all matching rows
SELECT * FROM customers
WHERE (SELECT COUNT(*) FROM orders WHERE customer_id = customers.id) > 0;

-- FAST: stops at first match
SELECT * FROM customers
WHERE EXISTS (
  SELECT 1 FROM orders WHERE customer_id = customers.id
);

-- Alternative: JOIN (often fastest)
SELECT DISTINCT c.*
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id;

EXISTS short-circuits at the first matching row. COUNT must scan all matching rows.

Run code in Playground

Primary Use Cases

Slow API endpoints caused by inefficient database queries

Dashboard and reporting queries on large datasets

High-traffic OLTP systems where query latency matters

Database CPU and I/O cost reduction

ORM-generated queries that need optimization

Frequently Asked Questions

How do you read an EXPLAIN plan in SQL?

EXPLAIN shows the query execution plan. Key things to look for: access method (Seq Scan = full table scan, Index Scan = uses index, Index Only Scan = covering index — fastest), rows (estimated row count at each step — high numbers mean more work), cost (arbitrary units but relative costs help compare plan steps), and in EXPLAIN ANALYZE: actual rows vs estimated rows (big discrepancies mean stale statistics — run ANALYZE).

What is the most common cause of slow SQL queries?

Missing indexes on columns used in WHERE, JOIN, and ORDER BY clauses is the most common cause of slow queries. Other frequent causes: SELECT * fetching unnecessary data, correlated subqueries executing N times, functions applied to indexed columns preventing index use, stale table statistics causing the optimizer to choose bad plans, and the N+1 query problem in ORM-based applications.

Related Concepts

Indexing Partitioning Replication Transactions