Mid LevelAdvanced SQL6 Interview QuestionsFREE

SQL ROW_NUMBER() Interview Questions

What Interviewers Are Testing

SQL ROW_NUMBER() is a window function that assigns a unique, sequential integer to each row within a partition, starting at 1. It is the cornerstone of advanced SQL tasks like deduplication, pagination, and ranking.

Interview Questions & Model Answers

Q1

What is the ROW_NUMBER() window function, and how does its basic syntax work?

A

The ROW_NUMBER() function assigns a unique, sequential integer to each row in a query's result set, starting at 1. It requires an OVER() clause, which must contain an ORDER BY to determine the sequence. You can optionally include PARTITION BY to restart the numbering for different groups.

Example
-- Assign a unique rank to all employees by their salary
SELECT
  name,
  department_id,
  salary,
  ROW_NUMBER() OVER (ORDER BY salary DESC) AS overall_rank
FROM employees;
Try in Playground

Potential Follow-up Questions:

  • What happens if two rows have the same salary?
  • Can you run a query using ROW_NUMBER() without an ORDER BY clause?
Q2

How does ROW_NUMBER() differ from RANK() and DENSE_RANK() when handling duplicate values (ties)?

A

The difference lies in how they handle ties (rows with identical sorting values): - ROW_NUMBER() always assigns unique, sequential numbers (e.g., 1, 2, 3, 4) even if values tie. The tie-breaker is arbitrary unless specified. - RANK() assigns the same rank to ties but skips subsequent numbers (e.g., 1, 2, 2, 4). - DENSE_RANK() assigns the same rank to ties but does NOT skip numbers (e.g., 1, 2, 2, 3).

Example
-- Compare tie handling behaviors side-by-side
SELECT
  name,
  salary,
  ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num,
  RANK()       OVER (ORDER BY salary DESC) AS standard_rank,
  DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank
FROM employees;
Try in Playground

Potential Follow-up Questions:

  • Which function would you use if you want to find the top 3 highest salaries including all ties?
  • How do you make the tie-breaker in ROW_NUMBER() deterministic?
Q3

How do you find the top 1 or top N highest-paid employees in each department? Why can't you use ROW_NUMBER() directly in the WHERE clause?

A

To find the top N per group, you partition by the group column (e.g., department_id) and sort by the metric column (e.g., salary DESC). You must wrap this in a Common Table Expression (CTE) or a subquery because window functions are evaluated in the SELECT phase, which runs after the WHERE clause. Attempting to filter on a window function in the same query level results in a syntax/compilation error.

Example
-- Correct pattern using a CTE to find the top 1 earner per department
WITH RankedEmployees AS (
  SELECT
    name,
    department_id,
    salary,
    ROW_NUMBER() OVER (
      PARTITION BY department_id
      ORDER BY salary DESC
    ) AS rn
  FROM employees
)
SELECT name, department_id, salary
FROM RankedEmployees
WHERE rn = 1;
Try in Playground

Potential Follow-up Questions:

  • What would happen if we used RANK() instead of ROW_NUMBER() in this query?
  • How would you write this query to get the top 3 earners per department?
Q4

How do you write a query to identify and delete duplicate rows in a table that lacks a unique primary key?

A

You can use ROW_NUMBER() to group duplicate records together using PARTITION BY on all identifying columns, and sort them by a timestamp or ID. Any row with a row number greater than 1 is a duplicate. In databases like PostgreSQL or SQL Server, you can delete directly from a CTE. In SQLite, you select the IDs of duplicate rows using a CTE and delete them from the main table.

Example
-- SQLite-compatible deduplication pattern
WITH DuplicateTracker AS (
  SELECT
    id,
    ROW_NUMBER() OVER (
      PARTITION BY email, name
      ORDER BY id ASC
    ) AS rn
  FROM users
)
DELETE FROM users
WHERE id IN (
  SELECT id
  FROM DuplicateTracker
  WHERE rn > 1
);
Try in Playground

Potential Follow-up Questions:

  • What is the performance implication of this deletion pattern on a very large table?
  • How can you prevent duplicates from occurring in the first place?
Q5

How do you find consecutive daily active streaks (the 'Islands' problem) using ROW_NUMBER()?

A

The 'Gaps and Islands' problem can be solved with a double-subtraction trick. If a sequence of dates is contiguous (e.g., Day 1, Day 2, Day 3) and you subtract a sequential ROW_NUMBER() from each date, the resulting date (the anchor or group identifier) will be identical for the entire consecutive streak. You can then group by this anchor date to count the length of each streak.

Example
-- Identify consecutive daily transaction streaks
WITH UniqueDates AS (
  -- Isolate unique transaction dates per user
  SELECT DISTINCT user_id, DATE(transaction_date) AS t_date
  FROM bank_transactions
),
StreakGroups AS (
  -- Subtract row number (days) to get a constant anchor date
  SELECT
    user_id,
    t_date,
    DATE(t_date, '-' || ROW_NUMBER() OVER (
      PARTITION BY user_id
      ORDER BY t_date ASC
    ) || ' days') AS streak_anchor
  FROM UniqueDates
)
SELECT
  user_id,
  MIN(t_date) AS streak_start,
  MAX(t_date) AS streak_end,
  COUNT(*) AS consecutive_days
FROM StreakGroups
GROUP BY user_id, streak_anchor
ORDER BY consecutive_days DESC;
Try in Playground

Potential Follow-up Questions:

  • Why is it important to run SELECT DISTINCT on dates first?
  • How would you modify this to find streaks of at least 5 consecutive days?
Q6

How do you calculate the median salary using ROW_NUMBER() in a database that lacks a built-in median function?

A

You can calculate the median by assigning two row numbers to the dataset: one in ascending order and one in descending order. The median value(s) will be where the difference between the ascending and descending row numbers is at most 1 (i.e., ABS(rn_asc - rn_desc) <= 1). You then average these values to handle both even and odd count datasets.

Example
-- Calculate the exact median salary
WITH RankedSalaries AS (
  SELECT
    salary,
    ROW_NUMBER() OVER (ORDER BY salary ASC) AS rn_asc,
    ROW_NUMBER() OVER (ORDER BY salary DESC) AS rn_desc
  FROM employees
)
SELECT AVG(salary) AS median_salary
FROM RankedSalaries
WHERE ABS(rn_asc - rn_desc) <= 1;
Try in Playground

Potential Follow-up Questions:

  • How does this handle duplicate salary values in the dataset?
  • Why does this work for both odd and even numbers of rows?

Frequently Asked Questions

What is the difference between ROW_NUMBER() and NTILE()?
ROW_NUMBER() assigns a unique, ascending integer to every row (1, 2, 3...). NTILE(N) divides the ordered rows into N equal-sized buckets and assigns a bucket number (1 to N) to each row. NTILE is used for percentile or quartile segmentation, whereas ROW_NUMBER is used for row-level ranking and identification.
Can you use ROW_NUMBER() without a PARTITION BY clause?
Yes. If you omit PARTITION BY, the entire result set is treated as a single partition, and the row numbers will be assigned sequentially from 1 to the total number of rows in the query.
Is ROW_NUMBER() deterministic?
No, it is only deterministic if the columns specified in the ORDER BY clause have unique values across all rows. If there are duplicate values in the ORDER BY columns, the database engine can assign row numbers to those duplicates in an arbitrary, non-guaranteed order unless you add a secondary unique column (like a Primary Key) as a tie-breaker.

Related Interview Topics

Practice Answering Live

Use our Interview Arena to practice SQL challenges under real interview conditions. Free.