SQL ROW_NUMBER() Interview Questions
What Interviewers Are Testing
SQL ROW_NUMBER() is a window function that assigns a unique, sequential integer to each row within a partition, starting at 1. It is the cornerstone of advanced SQL tasks like deduplication, pagination, and ranking.
Interview Questions & Model Answers
What is the ROW_NUMBER() window function, and how does its basic syntax work?
The ROW_NUMBER() function assigns a unique, sequential integer to each row in a query's result set, starting at 1. It requires an OVER() clause, which must contain an ORDER BY to determine the sequence. You can optionally include PARTITION BY to restart the numbering for different groups.
-- Assign a unique rank to all employees by their salary SELECT name, department_id, salary, ROW_NUMBER() OVER (ORDER BY salary DESC) AS overall_rank FROM employees;
Potential Follow-up Questions:
- What happens if two rows have the same salary?
- Can you run a query using ROW_NUMBER() without an ORDER BY clause?
How does ROW_NUMBER() differ from RANK() and DENSE_RANK() when handling duplicate values (ties)?
The difference lies in how they handle ties (rows with identical sorting values): - ROW_NUMBER() always assigns unique, sequential numbers (e.g., 1, 2, 3, 4) even if values tie. The tie-breaker is arbitrary unless specified. - RANK() assigns the same rank to ties but skips subsequent numbers (e.g., 1, 2, 2, 4). - DENSE_RANK() assigns the same rank to ties but does NOT skip numbers (e.g., 1, 2, 2, 3).
-- Compare tie handling behaviors side-by-side SELECT name, salary, ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num, RANK() OVER (ORDER BY salary DESC) AS standard_rank, DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank FROM employees;
Potential Follow-up Questions:
- Which function would you use if you want to find the top 3 highest salaries including all ties?
- How do you make the tie-breaker in ROW_NUMBER() deterministic?
How do you find the top 1 or top N highest-paid employees in each department? Why can't you use ROW_NUMBER() directly in the WHERE clause?
To find the top N per group, you partition by the group column (e.g., department_id) and sort by the metric column (e.g., salary DESC). You must wrap this in a Common Table Expression (CTE) or a subquery because window functions are evaluated in the SELECT phase, which runs after the WHERE clause. Attempting to filter on a window function in the same query level results in a syntax/compilation error.
-- Correct pattern using a CTE to find the top 1 earner per department
WITH RankedEmployees AS (
SELECT
name,
department_id,
salary,
ROW_NUMBER() OVER (
PARTITION BY department_id
ORDER BY salary DESC
) AS rn
FROM employees
)
SELECT name, department_id, salary
FROM RankedEmployees
WHERE rn = 1;Potential Follow-up Questions:
- What would happen if we used RANK() instead of ROW_NUMBER() in this query?
- How would you write this query to get the top 3 earners per department?
How do you write a query to identify and delete duplicate rows in a table that lacks a unique primary key?
You can use ROW_NUMBER() to group duplicate records together using PARTITION BY on all identifying columns, and sort them by a timestamp or ID. Any row with a row number greater than 1 is a duplicate. In databases like PostgreSQL or SQL Server, you can delete directly from a CTE. In SQLite, you select the IDs of duplicate rows using a CTE and delete them from the main table.
-- SQLite-compatible deduplication pattern
WITH DuplicateTracker AS (
SELECT
id,
ROW_NUMBER() OVER (
PARTITION BY email, name
ORDER BY id ASC
) AS rn
FROM users
)
DELETE FROM users
WHERE id IN (
SELECT id
FROM DuplicateTracker
WHERE rn > 1
);Potential Follow-up Questions:
- What is the performance implication of this deletion pattern on a very large table?
- How can you prevent duplicates from occurring in the first place?
How do you find consecutive daily active streaks (the 'Islands' problem) using ROW_NUMBER()?
The 'Gaps and Islands' problem can be solved with a double-subtraction trick. If a sequence of dates is contiguous (e.g., Day 1, Day 2, Day 3) and you subtract a sequential ROW_NUMBER() from each date, the resulting date (the anchor or group identifier) will be identical for the entire consecutive streak. You can then group by this anchor date to count the length of each streak.
-- Identify consecutive daily transaction streaks
WITH UniqueDates AS (
-- Isolate unique transaction dates per user
SELECT DISTINCT user_id, DATE(transaction_date) AS t_date
FROM bank_transactions
),
StreakGroups AS (
-- Subtract row number (days) to get a constant anchor date
SELECT
user_id,
t_date,
DATE(t_date, '-' || ROW_NUMBER() OVER (
PARTITION BY user_id
ORDER BY t_date ASC
) || ' days') AS streak_anchor
FROM UniqueDates
)
SELECT
user_id,
MIN(t_date) AS streak_start,
MAX(t_date) AS streak_end,
COUNT(*) AS consecutive_days
FROM StreakGroups
GROUP BY user_id, streak_anchor
ORDER BY consecutive_days DESC;Potential Follow-up Questions:
- Why is it important to run SELECT DISTINCT on dates first?
- How would you modify this to find streaks of at least 5 consecutive days?
How do you calculate the median salary using ROW_NUMBER() in a database that lacks a built-in median function?
You can calculate the median by assigning two row numbers to the dataset: one in ascending order and one in descending order. The median value(s) will be where the difference between the ascending and descending row numbers is at most 1 (i.e., ABS(rn_asc - rn_desc) <= 1). You then average these values to handle both even and odd count datasets.
-- Calculate the exact median salary
WITH RankedSalaries AS (
SELECT
salary,
ROW_NUMBER() OVER (ORDER BY salary ASC) AS rn_asc,
ROW_NUMBER() OVER (ORDER BY salary DESC) AS rn_desc
FROM employees
)
SELECT AVG(salary) AS median_salary
FROM RankedSalaries
WHERE ABS(rn_asc - rn_desc) <= 1;Potential Follow-up Questions:
- How does this handle duplicate salary values in the dataset?
- Why does this work for both odd and even numbers of rows?
Frequently Asked Questions
What is the difference between ROW_NUMBER() and NTILE()?
Can you use ROW_NUMBER() without a PARTITION BY clause?
Is ROW_NUMBER() deterministic?
Related Interview Topics
Practice Answering Live
Use our Interview Arena to practice SQL challenges under real interview conditions. Free.