Last Updated: May 3, 2026
Aggregate functions like COUNT and SUM are powerful, but they come with a trade-off: they collapse rows. Once you GROUP BY country, you get one row per country, and the individual streams disappear.
Window functions solve this. They let you run calculations across a set of rows while keeping every row in the result. That combination, aggregate-level computation with row-level detail, is what makes window functions one of the most heavily tested SQL topic in technical interviews.
Insert the following rows to follow along:
To understand what window functions do, start with what GROUP BY does and where it falls short.
Say you want the total listening time per country. With GROUP BY, that's straightforward:
This produces one row per country. The individual streams are gone. You can't see which user contributed what, or what time each stream happened, because GROUP BY collapsed all that detail into summary rows.
Now suppose you need both: each stream's full detail, plus the country total next to it. With GROUP BY, you'd have to write a subquery or join the grouped result back to the original table. Window functions skip that step entirely:
Every row from streams stays in the result. The country_total column shows the sum for that row's country, repeated on every row that shares the same country. No collapsing, no subquery.
The following diagram shows the difference:
GROUP BY produces 2 rows (one per country). The window function produces 5 rows (one per stream), each carrying the country total alongside the original data. That's the core idea: window functions add computed columns without removing rows.
The syntax that turns an aggregate function into a window function is OVER(). Any aggregate you already know (SUM, COUNT, AVG, MIN, MAX) can become a window function by adding OVER() after it.
The simplest form is an empty OVER() with no arguments. This treats the entire result set as one window:
SUM(duration_listened) OVER () computes the sum of duration_listened across all 20 rows and places that same total on every row. It's equivalent to computing SELECT SUM(duration_listened) FROM streams as a scalar subquery, but the syntax is shorter and the intent is clearer.
You can use multiple window functions in the same SELECT:
Each window function independently computes its aggregate across the entire result set. Every row gets the same total, the same count, and the same average. The row-level duration_listened sits right next to these aggregates, which makes it easy to do things like calculate each stream's percentage of total listening time:
Without window functions, that percentage calculation would require a subquery or a cross join to bring the total into each row.
An empty OVER() treats all rows as a single group. PARTITION BY divides rows into groups (called partitions) and applies the window function separately within each one. It works like GROUP BY in terms of how it forms groups, but it doesn't collapse the rows.
To get each stream's country total alongside the stream detail:
PARTITION BY country creates one partition for each distinct value of country. SUM runs independently within each partition. A US stream sees only the US total. A GB stream sees only the GB total. But every individual stream row remains in the output.
You can partition by any column, and by multiple columns. To get the count of streams per user per country:
This creates one partition for each (user_id, country) combination. User 1 in the US gets one count, user 3 in GB gets another. The partitioning logic is identical to what GROUP BY user_id, country would produce, except every row survives.
The mental model is straightforward:
| Aspect | GROUP BY | PARTITION BY |
|---|---|---|
| Groups rows | Yes | Yes |
| Collapses rows into one per group | Yes | No |
| Output row count | One per group | Same as input |
| Can access individual row values | No | Yes |
| Requires all non-aggregated columns in clause | Yes | No |
PARTITION BY doesn't replace GROUP BY. They serve different purposes. Use GROUP BY when you want a summary table. Use PARTITION BY when you want the summary attached to each row.
Different window functions in the same query can use different partitions:
The first window function partitions by country. The second partitions by user_id. Each computes independently against its own partition. This kind of multi-perspective analysis in a single query is difficult to achieve with GROUP BY alone.
PARTITION BY controls which rows belong to each group. ORDER BY within OVER() controls the order of rows inside each partition. This ordering is what enables cumulative calculations, ranking, and row-by-row comparisons.
To compute a running count of streams per user, ordered by time:
Without ORDER BY in the OVER() clause, COUNT(*) would return the same total for every row in the partition. With ORDER BY started_at, the count builds up row by row: 1 for the first stream, 2 for the second, 3 for the third, and so on.
This happens because adding ORDER BY changes the default window frame (more on frames shortly). Instead of looking at all rows in the partition, the function only considers rows from the start of the partition up to the current row.
The same pattern works with SUM to produce a running total:
For user 1, the running total grows with each payment: 9.99, then 19.98. For user 4, it grows separately: 14.99, then 29.98. Each partition has its own independent running total.
You can use ORDER BY without PARTITION BY. This treats the entire result set as one partition and orders all rows together:
This computes a single running total across all users, ordered by payment date. There's only one partition (the whole table), and the running sum accumulates row by row.
Window ORDER BY supports ASC and DESC, just like a regular ORDER BY:
With DESC, the running total starts from the most recent stream and accumulates backward. The first row (latest stream) shows just that stream's duration. The last row (earliest stream) shows the total of all streams.
The full power of window functions emerges when you combine PARTITION BY and ORDER BY. PARTITION BY defines the groups. ORDER BY defines the sequence within each group.
To calculate each user's cumulative listening time, ordered chronologically:
Each user gets their own running total that starts at 0 and accumulates with each stream. When the partition changes (new user_id), the running total resets.
User 1's running total grows from 237 to 875 across four streams. User 2's running total starts fresh at 275 and grows to 653 across three streams. The partitions are independent.
This is also useful for calculating percentages relative to a group. To show each stream as a percentage of its user's total listening time:
The SUM with PARTITION BY (no ORDER BY) gives the total for the entire partition. Each row divides its own duration by that total. No subquery needed.
So far, the window function has been operating on either the entire partition (when there's no ORDER BY) or from the start of the partition to the current row (when ORDER BY is present). Window frames give you explicit control over exactly which rows the function considers relative to the current row.
A frame is specified after ORDER BY using the ROWS, RANGE, or GROUPS keyword:
The frame defines a sliding window of rows that moves as the "current row" advances through the partition.
When the current row is Row 3, a frame of ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING includes Row 2, Row 3, and Row 4. As the current row moves to Row 4, the frame shifts to include Row 3, Row 4, and nothing after (because Row 4 is the last row in the partition).
There are five bound specifications you can use:
| Bound | Meaning |
|---|---|
UNBOUNDED PRECEDING | First row of the partition |
N PRECEDING | N rows before the current row |
CURRENT ROW | The current row |
N FOLLOWING | N rows after the current row |
UNBOUNDED FOLLOWING | Last row of the partition |
The start bound must come before or at the end bound in the ordering. ROWS BETWEEN 2 FOLLOWING AND 1 PRECEDING is invalid because the start (2 rows ahead) is after the end (1 row behind).
Some common frame combinations:
| Frame Specification | Rows Included |
|---|---|
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW | All rows from partition start to current row (running total) |
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING | All rows in the partition (same as no ORDER BY) |
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW | Current row and the 2 rows before it (3-row moving window) |
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING | Current row, the row before, and the row after (centered 3-row window) |
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING | Current row to end of partition |
Suppose you want a moving average of listening duration over each user's last 3 streams:
For user 1's four streams, the moving average works like this:
| Row | duration_listened | Rows in Frame | moving_avg_3 |
|---|---|---|---|
| 1st (237) | 237 | [237] | 237.0 |
| 2nd (200) | 200 | [237, 200] | 218.5 |
| 3rd (90) | 90 | [237, 200, 90] | 175.7 |
| 4th (348) | 348 | [200, 90, 348] | 212.7 |
At the first row, only one row exists in the frame (there are no preceding rows), so the average is the value itself. At the second row, two rows are in the frame. From the third row onward, the frame always contains three rows: the current row and the two before it.
The frame keyword determines how "N PRECEDING" and "N FOLLOWING" are interpreted.
ROWS counts physical rows. ROWS BETWEEN 2 PRECEDING AND CURRENT ROW means "this row and the two rows directly above it in the sorted order." It doesn't care about values, only positions.
This is the most commonly used frame type and the easiest to reason about.
RANGE operates on values rather than positions. RANGE BETWEEN 2 PRECEDING AND CURRENT ROW means "all rows whose ORDER BY value is within 2 of the current row's value." If the ORDER BY column is a date, RANGE BETWEEN INTERVAL '7 days' PRECEDING AND CURRENT ROW includes all rows within the last 7 days.
The important difference shows up when rows have duplicate ORDER BY values. With ROWS, ties are ordered arbitrarily, and each row gets its own position in the frame. With RANGE, all rows with the same value are treated as peers and included together.
With RANGE, "CURRENT ROW" means "all rows with the same ORDER BY value as the current row." If two streams happen on the same date, both are included in the frame for both rows. With ROWS, "CURRENT ROW" means exactly one row.
GROUPS counts groups of peer rows (rows with the same ORDER BY value). GROUPS BETWEEN 1 PRECEDING AND 1 FOLLOWING means "the current group of peers, plus the group before and the group after." This is useful when you want to aggregate over neighboring groups rather than individual rows.
GROUPS is less commonly used than ROWS or RANGE, and not all databases support it. PostgreSQL supports GROUPS starting from version 11. MySQL added it in version 8.0. SQL Server does not support GROUPS.
| Use | When |
|---|---|
| ROWS | You want a fixed number of physical rows in the frame (3-row moving average, last 5 rows) |
| RANGE | You want value-based ranges (last 7 days, values within 100 of current) or need ties handled as a group |
| GROUPS | You want to include neighboring groups of tied values |
ROWS is the default choice for most problems. RANGE is necessary when the frame should be value-based, like a 7-day moving average that must include all rows within 7 calendar days regardless of how many rows fall on each day.
The frame you get when you don't explicitly specify one depends on whether ORDER BY is present:
| OVER() Clause | Default Frame |
|---|---|
| No ORDER BY | RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING (entire partition) |
| With ORDER BY | RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (start to current row) |
This default explains why adding ORDER BY to OVER() changes the behavior of aggregate window functions. Without ORDER BY, SUM gives you the partition total on every row. With ORDER BY, SUM gives you a running total. The aggregate function itself hasn't changed. The frame changed.
This is the single most important thing to understand about window frames. When you write:
The frame is the entire partition, so every row sees the full total.
When you write:
The frame shrinks to "start of partition through current row," producing a running total. If you want the full partition total even with ORDER BY (maybe because you need both ordering and the total), specify the frame explicitly:
When a query uses the same window definition in multiple places, repeating the OVER() clause gets verbose. The WINDOW clause lets you name a window and reference it:
The WINDOW clause goes between WHERE (or GROUP BY/HAVING if present) and ORDER BY. You can define multiple named windows:
This is a readability feature. The query does the same thing as writing out each OVER() clause in full, but it avoids repetition and makes the window definitions easy to scan.
Not all databases support the WINDOW clause. PostgreSQL and MySQL 8.0+ support it. SQL Server does not.
| Feature | PostgreSQL | MySQL | SQL Server |
|---|---|---|---|
| WINDOW clause | Yes | Yes (8.0+) | No |
Window functions are only allowed in two places: the SELECT list and the ORDER BY clause. They cannot appear in WHERE, HAVING, or GROUP BY.
What goes wrong here?
PostgreSQL returns: ERROR: window functions are not allowed in WHERE.
This restriction exists because of SQL's execution order. Window functions run after WHERE, GROUP BY, and HAVING have already filtered and grouped the data. The window function's input is the result of those earlier stages, so it can't be referenced before it exists.
Window functions execute after HAVING but before the final SELECT output. To filter on a window function result, wrap the query in a subquery or CTE:
The subquery computes the window function first. The outer query then filters on the result.