Advanced SQL Interview Questions for Senior Engineers (2026)
SQL interviews at the senior level are not a basic SELECT exam. They probe your ability to write non-trivial queries, reason about execution plans, and design indexes that survive production load. If you are interviewing for a senior backend, data engineering, or analytics engineering role, this guide is the complete advanced SQL playbook.

Table of Contents
- What Advanced SQL Actually Means
- Window Functions You Must Know
- Common Table Expressions (CTEs)
- Recursive CTEs
- Indexing Fundamentals Interviewers Expect
- Query Execution Plans
- Top 15 Advanced SQL Questions
- Patterns Behind These Questions
- Common Optimization Mistakes
- FAQ
- Conclusion
What Advanced SQL Actually Means
At the senior level, interviewers test four things:
- Expressive SQL. Window functions, CTEs, and lateral joins — for problems that are awkward in base SQL.
- Physical design. Index choice, partitioning strategy, clustering.
- Query tuning. Reading `EXPLAIN` output and identifying the bottleneck.
- Data-modeling intuition. When to normalize vs. denormalize, when to use JSON columns vs. separate tables.
You can get away with junior SQL (basic JOINs and GROUP BYs) at L3. At L5+ the bar is higher, and data engineer roles at any level go well beyond this.
Window Functions You Must Know
Window functions compute a result across a set of rows related to the current row, without collapsing them like GROUP BY does. Every senior SQL interview will test at least one window function problem.
The five window functions you must know cold:
- `ROW_NUMBER()` — assigns a unique sequential integer to each row in a partition.
- `RANK()` — like ROW_NUMBER but gives ties the same rank (with gaps).
- `DENSE_RANK()` — like RANK but without gaps after ties.
- `LAG(col, n)` — returns the value of `col` from n rows before the current one in the partition.
- `LEAD(col, n)` — returns the value from n rows after.
Aggregate window functions:
- `SUM(col) OVER (PARTITION BY x ORDER BY y)` — running total within partition.
- `AVG(col) OVER (PARTITION BY x ROWS BETWEEN N PRECEDING AND CURRENT ROW)` — moving average.
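To see both flavors at once, here is a minimal runnable sketch using Python's `sqlite3` module as a stand-in engine (SQLite 3.25+ ships window functions); the table and values are invented for illustration:

```python
import sqlite3

# Toy revenue table; names and values are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE daily_revenue (region TEXT, day TEXT, revenue INTEGER);
INSERT INTO daily_revenue VALUES
  ('us', '2026-01-01', 100), ('us', '2026-01-02', 120), ('us', '2026-01-03', 90),
  ('eu', '2026-01-01', 80),  ('eu', '2026-01-02', 85);
""")

# LAG fetches the previous day's revenue within each region;
# SUM OVER the same window produces a running total.
rows = conn.execute("""
SELECT region, day, revenue,
       LAG(revenue, 1) OVER (PARTITION BY region ORDER BY day) AS prev_revenue,
       SUM(revenue)    OVER (PARTITION BY region ORDER BY day) AS running_total
FROM daily_revenue
ORDER BY region, day
""").fetchall()

for r in rows:
    print(r)
```

The query itself runs unchanged on Postgres; only the setup harness is SQLite-specific.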
Classic interview prompt: "For each user, return the 3 most recent orders."
```sql
WITH ranked AS (
  SELECT user_id, order_id, created_at,
         ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at DESC) AS rn
  FROM orders
)
SELECT user_id, order_id, created_at
FROM ranked
WHERE rn <= 3;
```
Common Table Expressions (CTEs)
CTEs are not just stylistic. They turn 8-level nested subqueries into readable, testable step-by-step pipelines.
Use CTEs when:
- The query is logically a sequence of transformations.
- You need to reference the same intermediate result multiple times.
- You are writing something a teammate will have to read in 6 months.
Do not use CTEs when:
- A single query is simpler.
- You care about performance and your engine does not inline the CTE.
Postgres before version 12 materialized CTEs by default (often a performance trap). Postgres 12+ inlines them unless AS MATERIALIZED is specified.
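On Postgres 12+ you can make the choice explicit rather than relying on the default; a sketch, with table and column names invented for illustration:

```sql
-- Force the CTE to be computed once and reused (the pre-12 default behavior):
WITH heavy AS MATERIALIZED (
  SELECT user_id, SUM(amount) AS total FROM orders GROUP BY user_id
)
SELECT * FROM heavy WHERE total > 1000;

-- Or ask the planner to inline it so predicates can be pushed down:
WITH heavy AS NOT MATERIALIZED (
  SELECT user_id, SUM(amount) AS total FROM orders GROUP BY user_id
)
SELECT * FROM heavy WHERE user_id = 42;
```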
Recursive CTEs
Recursive CTEs are how you solve graph and tree problems in SQL. They come up in senior interviews as a "level up" question.
Classic prompt: "Given an employee table with (id, name, manager_id), return each employee with their full reporting chain."
```sql
WITH RECURSIVE chain AS (
  SELECT id, name, manager_id, 1 AS depth
  FROM employees
  WHERE manager_id IS NULL
  UNION ALL
  SELECT e.id, e.name, e.manager_id, c.depth + 1
  FROM employees e
  JOIN chain c ON e.manager_id = c.id
)
SELECT * FROM chain ORDER BY depth, id;
```
This walks the org chart top-down. The same pattern handles category hierarchies, bill-of-materials problems, and dependency graphs.
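The query above runs as-is on any engine with recursive CTE support; a quick check via Python's `sqlite3`, with a four-person org chart invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
INSERT INTO employees VALUES
  (1, 'Ada',  NULL),  -- the root of the org chart
  (2, 'Brin', 1),
  (3, 'Cruz', 1),
  (4, 'Dev',  2);
""")

# Anchor: employees with no manager. Recursive step: everyone whose
# manager is already in the chain, one level deeper.
rows = conn.execute("""
WITH RECURSIVE chain AS (
  SELECT id, name, manager_id, 1 AS depth
  FROM employees
  WHERE manager_id IS NULL
  UNION ALL
  SELECT e.id, e.name, e.manager_id, c.depth + 1
  FROM employees e
  JOIN chain c ON e.manager_id = c.id
)
SELECT id, name, depth FROM chain ORDER BY depth, id
""").fetchall()

for r in rows:
    print(r)
```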
Indexing Fundamentals Interviewers Expect
You will be asked something like: "This query is slow. What index would you add?" Know these:
Single-column index: CREATE INDEX ON orders (user_id);
Useful when the query filters on that column alone.
Composite index: CREATE INDEX ON orders (user_id, created_at);
Useful when filtering on user_id AND sorting by created_at. Column order matters — the prefix rule means WHERE user_id = ? can use it but WHERE created_at > ? alone cannot.
Covering index: CREATE INDEX ON orders (user_id, created_at) INCLUDE (amount);
Lets the database answer the query from the index alone without going to the heap. Huge performance win for read-heavy workloads.
Partial index: CREATE INDEX ON orders (created_at) WHERE status = 'pending';
Indexes only rows matching the predicate. Smaller index, faster, useful when a hot subset dominates queries.
Expression index: CREATE INDEX ON users (lower(email));
Lets case-insensitive email lookups use the index.
Interview question: "You have a 500 million row orders table. Queries filter on user_id = ? and sort by created_at DESC LIMIT 20. What index do you add?"
Answer: Composite index on (user_id, created_at DESC). Or (user_id, created_at) — most engines handle the DESC sort via scan direction.
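You can sanity-check that answer by asking the engine for its plan. A sketch using SQLite's EXPLAIN QUERY PLAN (its output format differs from Postgres EXPLAIN, but the reasoning is the same); the index name is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER,
                     created_at TEXT, amount INTEGER);
CREATE INDEX idx_orders_user_created ON orders (user_id, created_at DESC);
""")

# The composite index serves both the equality filter and the sort,
# so the plan should name the index and need no separate sort step.
plan_rows = conn.execute("""
EXPLAIN QUERY PLAN
SELECT order_id, created_at FROM orders
WHERE user_id = 42
ORDER BY created_at DESC
LIMIT 20
""").fetchall()

detail = " ".join(row[-1] for row in plan_rows)
print(detail)
```

If the index could not satisfy the sort, SQLite would report `USE TEMP B-TREE FOR ORDER BY`; its absence is the win you are looking for.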
Query Execution Plans
Every senior SQL interview asks you to interpret an EXPLAIN output. Core concepts:
- Seq Scan: full table scan. Fine when the table is small or a large fraction of rows is being returned; a red flag on big tables with selective filters.
- Index Scan: uses an index to find rows, then fetches them from the heap.
- Index Only Scan: answer is entirely in the index (requires a covering index).
- Bitmap Heap Scan: for queries that match many rows via an index, more efficient than repeatedly chasing heap pointers.
- Nested Loop Join: for each row on one side, scan the other side. Fast for small outer sets.
- Hash Join: builds a hash table of one side, probes with the other. Good for large tables with equality joins.
- Merge Join: both sides pre-sorted; efficient for range joins.
Red flags in a plan:
- Seq Scan on a large table with a selective predicate (missing index).
- Nested Loop with large outer cardinality (should be Hash).
- Multiple sorts that could be avoided with a better index.
- Rows estimate off by 10x or more (statistics need ANALYZE).
Top 15 Advanced SQL Questions
These appear often at Stripe, Uber, Snowflake, Airbnb, and FAANG data-heavy teams.
- Nth highest salary per department. (Window function: RANK or DENSE_RANK partitioned by department.)
- Consecutive login days per user. (LAG or ROW_NUMBER trick to group runs.)
- Sessionize events with a 30-minute inactivity gap. (Window + sum over gap markers.)
- Running 7-day rolling average of revenue. (SUM window with frame.)
- Cumulative distinct users up to each date. (Harder than it looks; requires an "earliest seen" subquery.)
- Top N products per category by sales. (Window + filter.)
- Year-over-year percent change per region. (LAG with a 12-month offset on monthly data.)
- Self-join to find mutual friends. (Classic graph problem in SQL.)
- Gaps and islands — find consecutive unbroken ranges. (Row number difference trick.)
- Percentile latency (p50, p95, p99) over a time window. (PERCENTILE_CONT or NTILE.)
- Find the first purchase per user with attribution. (Self-join or window ROW_NUMBER.)
- Compute the median income per zip code. (PERCENTILE_CONT, or interpolate manually.)
- Pivot a long-format table into wide columns. (CASE WHEN aggregation or PIVOT.)
- Compute funnel conversion rates step by step. (Self-join or sequence of CTEs.)
- Deduplicate a table while preserving the earliest row per group. (ROW_NUMBER + filter OR DISTINCT ON in Postgres.)
If you can solve all 15 in under 20 minutes each, you are senior-SQL-ready.
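Questions 2 and 9 rest on the same gaps-and-islands trick: subtract `ROW_NUMBER()` from the date, and every unbroken run collapses to a constant group key. A runnable sketch via Python's `sqlite3`, with login data invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE logins (user_id INTEGER, login_date TEXT);
INSERT INTO logins VALUES
  (1, '2026-03-01'), (1, '2026-03-02'), (1, '2026-03-03'),
  (1, '2026-03-05'), (1, '2026-03-06'),
  (2, '2026-03-01');
""")

# date minus ROW_NUMBER is constant within each consecutive run ("island"),
# so grouping by that difference isolates the runs.
rows = conn.execute("""
WITH grp AS (
  SELECT user_id, login_date,
         julianday(login_date)
           - ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY login_date) AS island
  FROM logins
)
SELECT user_id, MIN(login_date) AS run_start, COUNT(*) AS run_len
FROM grp
GROUP BY user_id, island
ORDER BY user_id, run_start
""").fetchall()

for r in rows:
    print(r)
```

On Postgres you would write `login_date - ROW_NUMBER() OVER (...) * INTERVAL '1 day'`; `julianday` is the SQLite equivalent for date arithmetic.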
Patterns Behind These Questions
Underneath those 15 questions are only five patterns:
- Window functions for per-group ranking or running aggregates.
- Self-joins for comparing rows within the same table.
- LAG / LEAD for time-series and event-sequencing problems.
- Recursive CTEs for tree or graph traversal.
- Gaps and islands via ROW_NUMBER differences.
Mastering these five patterns lets you recognize the right tool in under 60 seconds on any new problem.
Common Optimization Mistakes
- `SELECT *` in production queries. Prevents index-only scans. Forces bigger result sets across the network.
- `NOT IN` on nullable columns. Null-semantics trap; `NOT EXISTS` is almost always clearer and faster.
- `OR` across columns. Often prevents index use. Consider a `UNION` of two queries instead.
- Functions on indexed columns in WHERE. `WHERE DATE(created_at) = '2026-04-22'` is not sargable; it forces a full scan. Use `WHERE created_at >= '2026-04-22' AND created_at < '2026-04-23'` instead.
- Too many `LEFT JOIN`s when `INNER` is correct. LEFT JOIN prevents the optimizer from reordering joins aggressively.
- No LIMIT on exploratory queries. In production, an accidental full scan can page out the cache for everyone.
- Overly wide indexes. Beyond 3 or 4 columns, write amplification dominates.
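The sargability point is easy to verify empirically. A sketch using SQLite's EXPLAIN QUERY PLAN (table and index names invented): wrapping the column in `DATE()` hides it from the index, while the half-open range uses it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, created_at TEXT, amount INTEGER);
CREATE INDEX idx_orders_created ON orders (created_at);
""")

def plan(sql):
    # Concatenate the detail column of each plan row for easy inspection.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Function on the indexed column: the optimizer cannot use the index.
scan = plan("SELECT * FROM orders WHERE DATE(created_at) = '2026-04-22'")

# Equivalent half-open range on the bare column: sargable, index search.
search = plan("""SELECT * FROM orders
                 WHERE created_at >= '2026-04-22' AND created_at < '2026-04-23'""")

print(scan)    # expect a full scan of orders, no index mentioned
print(search)  # expect a SEARCH using idx_orders_created
```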
Frequently Asked Questions
Which SQL dialect should I study?
PostgreSQL is the most-asked-for dialect in senior interviews. MySQL, Snowflake, and BigQuery dialects are close cousins. T-SQL and Oracle specifics come up only if the job posting says so.
Do senior interviews test ORM code instead of raw SQL?
No. SQL is tested in raw form; ORM familiarity is assumed, not examined.
How deep should I go on EXPLAIN?
Know how to read it. Know the common scan and join node types listed above. You do not need to know every planner tuning parameter.
Is stored-procedure knowledge needed?
Rarely asked at senior IC SQL interviews. Asked occasionally for DB-adjacent roles (DBRE, data platform).
Should I practice on LeetCode SQL or something else?
LeetCode SQL is fine for easy-to-medium, but tops out. StrataScratch, DataLemur, and Interview Query cover more advanced patterns.
Conclusion
Advanced SQL is a superpower that separates senior backend and data engineering candidates from the pool. The 5 patterns and 15 questions above cover almost every problem you will see at a top-tier interview. Pair those with real practice reading execution plans, and you will be in the top tier of SQL interviewees for any loop.