AlgoMaster Logo

Introduction to Window Functions

Last Updated: May 3, 2026

10 min read

Aggregate functions like COUNT and SUM are powerful, but they come with a trade-off: they collapse rows. Once you GROUP BY country, you get one row per country, and the individual streams disappear.

Window functions solve this. They let you run calculations across a set of rows while keeping every row in the result. That combination, aggregate-level computation with row-level detail, is what makes window functions one of the most heavily tested SQL topic in technical interviews.

Sample Data

Insert the following rows to follow along:

SQL

Window Functions vs GROUP BY

To understand what window functions do, start with what GROUP BY does and where it falls short.

Say you want the total listening time per country. With GROUP BY, that's straightforward:

SQL

This produces one row per country. The individual streams are gone. You can't see which user contributed what, or what time each stream happened, because GROUP BY collapsed all that detail into summary rows.

Now suppose you need both: each stream's full detail, plus the country total next to it. With GROUP BY, you'd have to write a subquery or join the grouped result back to the original table. Window functions skip that step entirely:

SQL

Every row from streams stays in the result. The country_total column shows the sum for that row's country, repeated on every row that shares the same country. No collapsing, no subquery.

The following diagram shows the difference:

GROUP BY produces 2 rows (one per country). The window function produces 5 rows (one per stream), each carrying the country total alongside the original data. That's the core idea: window functions add computed columns without removing rows.

The OVER() Clause

The syntax that turns an aggregate function into a window function is OVER(). Any aggregate you already know (SUM, COUNT, AVG, MIN, MAX) can become a window function by adding OVER() after it.

The simplest form is an empty OVER() with no arguments. This treats the entire result set as one window:

SQL

SUM(duration_listened) OVER () computes the sum of duration_listened across all 20 rows and places that same total on every row. It's equivalent to computing SELECT SUM(duration_listened) FROM streams as a scalar subquery, but the syntax is shorter and the intent is clearer.

You can use multiple window functions in the same SELECT:

SQL

Each window function independently computes its aggregate across the entire result set. Every row gets the same total, the same count, and the same average. The row-level duration_listened sits right next to these aggregates, which makes it easy to do things like calculate each stream's percentage of total listening time:

SQL

Without window functions, that percentage calculation would require a subquery or a cross join to bring the total into each row.

PARTITION BY

An empty OVER() treats all rows as a single group. PARTITION BY divides rows into groups (called partitions) and applies the window function separately within each one. It works like GROUP BY in terms of how it forms groups, but it doesn't collapse the rows.

To get each stream's country total alongside the stream detail:

SQL

PARTITION BY country creates one partition for each distinct value of country. SUM runs independently within each partition. A US stream sees only the US total. A GB stream sees only the GB total. But every individual stream row remains in the output.

You can partition by any column, and by multiple columns. To get the count of streams per user per country:

SQL

This creates one partition for each (user_id, country) combination. User 1 in the US gets one count, user 3 in GB gets another. The partitioning logic is identical to what GROUP BY user_id, country would produce, except every row survives.

PARTITION BY vs GROUP BY

The mental model is straightforward:

AspectGROUP BYPARTITION BY
Groups rowsYesYes
Collapses rows into one per groupYesNo
Output row countOne per groupSame as input
Can access individual row valuesNoYes
Requires all non-aggregated columns in clauseYesNo

PARTITION BY doesn't replace GROUP BY. They serve different purposes. Use GROUP BY when you want a summary table. Use PARTITION BY when you want the summary attached to each row.

PARTITION BY with Different Aggregates

Different window functions in the same query can use different partitions:

SQL

The first window function partitions by country. The second partitions by user_id. Each computes independently against its own partition. This kind of multi-perspective analysis in a single query is difficult to achieve with GROUP BY alone.

ORDER BY in Window Functions

PARTITION BY controls which rows belong to each group. ORDER BY within OVER() controls the order of rows inside each partition. This ordering is what enables cumulative calculations, ranking, and row-by-row comparisons.

To compute a running count of streams per user, ordered by time:

SQL

Without ORDER BY in the OVER() clause, COUNT(*) would return the same total for every row in the partition. With ORDER BY started_at, the count builds up row by row: 1 for the first stream, 2 for the second, 3 for the third, and so on.

This happens because adding ORDER BY changes the default window frame (more on frames shortly). Instead of looking at all rows in the partition, the function only considers rows from the start of the partition up to the current row.

The same pattern works with SUM to produce a running total:

SQL

For user 1, the running total grows with each payment: 9.99, then 19.98. For user 4, it grows separately: 14.99, then 29.98. Each partition has its own independent running total.

ORDER BY Without PARTITION BY

You can use ORDER BY without PARTITION BY. This treats the entire result set as one partition and orders all rows together:

SQL

This computes a single running total across all users, ordered by payment date. There's only one partition (the whole table), and the running sum accumulates row by row.

ORDER BY Direction

Window ORDER BY supports ASC and DESC, just like a regular ORDER BY:

SQL

With DESC, the running total starts from the most recent stream and accumulates backward. The first row (latest stream) shows just that stream's duration. The last row (earliest stream) shows the total of all streams.

Combining PARTITION BY and ORDER BY

The full power of window functions emerges when you combine PARTITION BY and ORDER BY. PARTITION BY defines the groups. ORDER BY defines the sequence within each group.

To calculate each user's cumulative listening time, ordered chronologically:

SQL

Each user gets their own running total that starts at 0 and accumulates with each stream. When the partition changes (new user_id), the running total resets.

User 1's running total grows from 237 to 875 across four streams. User 2's running total starts fresh at 275 and grows to 653 across three streams. The partitions are independent.

This is also useful for calculating percentages relative to a group. To show each stream as a percentage of its user's total listening time:

SQL

The SUM with PARTITION BY (no ORDER BY) gives the total for the entire partition. Each row divides its own duration by that total. No subquery needed.

Window Frames

So far, the window function has been operating on either the entire partition (when there's no ORDER BY) or from the start of the partition to the current row (when ORDER BY is present). Window frames give you explicit control over exactly which rows the function considers relative to the current row.

A frame is specified after ORDER BY using the ROWS, RANGE, or GROUPS keyword:

SQL

The frame defines a sliding window of rows that moves as the "current row" advances through the partition.

When the current row is Row 3, a frame of ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING includes Row 2, Row 3, and Row 4. As the current row moves to Row 4, the frame shifts to include Row 3, Row 4, and nothing after (because Row 4 is the last row in the partition).

Frame Bound Options

There are five bound specifications you can use:

BoundMeaning
UNBOUNDED PRECEDINGFirst row of the partition
N PRECEDINGN rows before the current row
CURRENT ROWThe current row
N FOLLOWINGN rows after the current row
UNBOUNDED FOLLOWINGLast row of the partition

The start bound must come before or at the end bound in the ordering. ROWS BETWEEN 2 FOLLOWING AND 1 PRECEDING is invalid because the start (2 rows ahead) is after the end (1 row behind).

Some common frame combinations:

Frame SpecificationRows Included
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROWAll rows from partition start to current row (running total)
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWINGAll rows in the partition (same as no ORDER BY)
ROWS BETWEEN 2 PRECEDING AND CURRENT ROWCurrent row and the 2 rows before it (3-row moving window)
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWINGCurrent row, the row before, and the row after (centered 3-row window)
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWINGCurrent row to end of partition

A Practical Example: 3-Stream Moving Average

Suppose you want a moving average of listening duration over each user's last 3 streams:

SQL

For user 1's four streams, the moving average works like this:

Rowduration_listenedRows in Framemoving_avg_3
1st (237)237[237]237.0
2nd (200)200[237, 200]218.5
3rd (90)90[237, 200, 90]175.7
4th (348)348[200, 90, 348]212.7

At the first row, only one row exists in the frame (there are no preceding rows), so the average is the value itself. At the second row, two rows are in the frame. From the third row onward, the frame always contains three rows: the current row and the two before it.

ROWS vs RANGE vs GROUPS

The frame keyword determines how "N PRECEDING" and "N FOLLOWING" are interpreted.

ROWS

ROWS counts physical rows. ROWS BETWEEN 2 PRECEDING AND CURRENT ROW means "this row and the two rows directly above it in the sorted order." It doesn't care about values, only positions.

This is the most commonly used frame type and the easiest to reason about.

RANGE

RANGE operates on values rather than positions. RANGE BETWEEN 2 PRECEDING AND CURRENT ROW means "all rows whose ORDER BY value is within 2 of the current row's value." If the ORDER BY column is a date, RANGE BETWEEN INTERVAL '7 days' PRECEDING AND CURRENT ROW includes all rows within the last 7 days.

The important difference shows up when rows have duplicate ORDER BY values. With ROWS, ties are ordered arbitrarily, and each row gets its own position in the frame. With RANGE, all rows with the same value are treated as peers and included together.

SQL

With RANGE, "CURRENT ROW" means "all rows with the same ORDER BY value as the current row." If two streams happen on the same date, both are included in the frame for both rows. With ROWS, "CURRENT ROW" means exactly one row.

GROUPS

GROUPS counts groups of peer rows (rows with the same ORDER BY value). GROUPS BETWEEN 1 PRECEDING AND 1 FOLLOWING means "the current group of peers, plus the group before and the group after." This is useful when you want to aggregate over neighboring groups rather than individual rows.

GROUPS is less commonly used than ROWS or RANGE, and not all databases support it. PostgreSQL supports GROUPS starting from version 11. MySQL added it in version 8.0. SQL Server does not support GROUPS.

Choosing Between ROWS, RANGE, and GROUPS

UseWhen
ROWSYou want a fixed number of physical rows in the frame (3-row moving average, last 5 rows)
RANGEYou want value-based ranges (last 7 days, values within 100 of current) or need ties handled as a group
GROUPSYou want to include neighboring groups of tied values

ROWS is the default choice for most problems. RANGE is necessary when the frame should be value-based, like a 7-day moving average that must include all rows within 7 calendar days regardless of how many rows fall on each day.

Default Frame Behavior

The frame you get when you don't explicitly specify one depends on whether ORDER BY is present:

OVER() ClauseDefault Frame
No ORDER BYRANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING (entire partition)
With ORDER BYRANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (start to current row)

This default explains why adding ORDER BY to OVER() changes the behavior of aggregate window functions. Without ORDER BY, SUM gives you the partition total on every row. With ORDER BY, SUM gives you a running total. The aggregate function itself hasn't changed. The frame changed.

This is the single most important thing to understand about window frames. When you write:

SQL

The frame is the entire partition, so every row sees the full total.

When you write:

SQL

The frame shrinks to "start of partition through current row," producing a running total. If you want the full partition total even with ORDER BY (maybe because you need both ordering and the total), specify the frame explicitly:

SQL

The WINDOW Clause

When a query uses the same window definition in multiple places, repeating the OVER() clause gets verbose. The WINDOW clause lets you name a window and reference it:

SQL

The WINDOW clause goes between WHERE (or GROUP BY/HAVING if present) and ORDER BY. You can define multiple named windows:

SQL

This is a readability feature. The query does the same thing as writing out each OVER() clause in full, but it avoids repetition and makes the window definitions easy to scan.

Not all databases support the WINDOW clause. PostgreSQL and MySQL 8.0+ support it. SQL Server does not.

FeaturePostgreSQLMySQLSQL Server
WINDOW clauseYesYes (8.0+)No

Where Window Functions Can Appear

Window functions are only allowed in two places: the SELECT list and the ORDER BY clause. They cannot appear in WHERE, HAVING, or GROUP BY.

What goes wrong here?

SQL

PostgreSQL returns: ERROR: window functions are not allowed in WHERE.

This restriction exists because of SQL's execution order. Window functions run after WHERE, GROUP BY, and HAVING have already filtered and grouped the data. The window function's input is the result of those earlier stages, so it can't be referenced before it exists.

Window functions execute after HAVING but before the final SELECT output. To filter on a window function result, wrap the query in a subquery or CTE:

SQL

The subquery computes the window function first. The outer query then filters on the result.