AlgoMaster Logo

UNION and UNION ALL

Last Updated: May 3, 2026

8 min read

Sometimes the data you need lives in two separate queries. Maybe you want a combined activity feed that pulls from both follows and artist_follows, or you need to merge completed and refunded payments into a single report.

SQL gives you UNION and UNION ALL to stack result sets vertically, and picking the right one matters more than most people realize.

Sample Data

The follow data is designed so that combining user-follows and artist-follows produces interesting overlaps. Insert the following rows to follow along:

SQL

How UNION ALL Works

UNION ALL is the simpler of the two operators. It takes the result set from one query, takes the result set from another, and stacks them together. No filtering, no deduplication, just concatenation.

Say you want a list of all "follow" activity on the platform, combining both user-to-user follows and user-to-artist follows into one feed. These live in separate tables with different structures, but you can reshape them into matching columns:

SQL

This returns every row from the first query followed by every row from the second. If follows has 8 rows and artist_follows has 10 rows, the result has 18 rows. UNION ALL never removes anything.

A few things to notice about the structure:

  • Both queries produce exactly four columns.
  • The column names in the result come from the first query. The second query's aliases (user_id, follow_type, etc.) are ignored for naming purposes, though they still help readability.
  • The data types must be compatible. Both followed_user_id and artist_id are integers, so they can share the target_id column. The literal 'user' and 'artist' strings are both text, so the follow_type column works too.

Adding Context with Literals

The 'user' and 'artist' string literals in the query above are a common pattern. When you combine rows from different sources, you often need a discriminator column so you can tell where each row came from. Without it, you would have a list of IDs with no way to know whether target_id = 3 means user 3 or artist 3.

Column Order Matters, Names Don't

SQL matches columns by position, not by name. The first column of the first query pairs with the first column of the second query, regardless of what they're called. This means you could write:

SQL

This runs without errors because both queries have two integer columns. But the result is nonsense: you've paired followed_user_id with user_id and follower_user_id with artist_id. The database doesn't check whether the pairing makes sense. That's your job.

How UNION Works

UNION does the same stacking as UNION ALL, then adds one extra step: it removes duplicate rows from the combined result.

"Duplicate" here means every column in the row matches exactly. If two rows differ in even one column, they both survive.

SQL

On January 12, alice (user 1) followed carol (in follows) and alice also followed a track's artist in artist_follows. If both tables produce a row with (1, '2024-01-12'), UNION keeps only one copy. UNION ALL would keep both.

The deduplication considers all columns. If you add a source discriminator column (like the 'user' and 'artist' labels from earlier), no two rows would ever match because the source column would always differ. In that case, UNION and UNION ALL produce identical results, but UNION still pays the cost of checking for duplicates.

When Does UNION Actually Remove Rows?

UNION only removes rows when both queries can produce identical output rows. This happens naturally in a few situations:

  • Overlapping date ranges. Querying "streams from this week" and "streams from the last 7 days" might overlap if run on different days.
  • Shared dimension queries. Pulling user IDs from two different tables (like both follows and artist_follows) where the same user appears in both.
  • Self-referencing combinations. Combining a table with a filtered version of itself.

If your two queries pull from completely separate tables and include a source discriminator column, duplicates are structurally impossible. UNION ALL is the right choice in that case.

Column Matching Rules

Every UNION or UNION ALL has three requirements:

Same Number of Columns

Both queries must produce the same number of columns. If one query returns 3 columns and the other returns 4, the database raises an error.

What goes wrong here?

SQL

This fails because the first query has 3 columns and the second has 2.

Fix: Pad the shorter query with a NULL or a literal:

SQL

Now both queries have 3 columns. Since artists also has a country column, this works cleanly. If the second table lacked a matching column, you could use NULL AS country instead.

Compatible Data Types

The database needs to be able to combine each column pair into a single output type. Integers with integers, text with text, timestamps with timestamps. Most databases will implicitly cast compatible types. For example, an INTEGER and a BIGINT will work because integers can widen to bigints. But pairing a text column with an integer column fails in most databases.

PairingWorks?Result Type
INTEGER + INTEGERYesINTEGER
INTEGER + BIGINTYesBIGINT
VARCHAR + VARCHARYesVARCHAR
INTEGER + VARCHARNoError in most databases
DATE + TIMESTAMPYesTIMESTAMP
BOOLEAN + INTEGERDependsPostgreSQL: yes (auto-cast). MySQL: yes. SQL Server: no

Position-Based Matching

As covered earlier, columns match by position. The first column from each query combines into the first result column, the second into the second, and so on. Column names from the second (and any subsequent) query are ignored. Only the first query's column names or aliases appear in the result.

This means you should always verify that column positions align semantically, not just type-wise.

Performance: UNION vs UNION ALL

The performance difference between UNION and UNION ALL comes down to one thing: deduplication costs work.

UNION ALL just concatenates the two result sets. The database does no additional work beyond running both queries.

UNION concatenates and then sorts or hashes the combined result to find and remove duplicates. For small result sets, the difference is negligible. For large ones (millions of rows), it can be significant. Sorting is O(n log n), and the hash-based approach needs memory proportional to the number of distinct rows.

The Rule of Thumb

Use UNION ALL unless you specifically need deduplication. Three common scenarios where UNION ALL is safe:

  1. The queries pull from different tables with a discriminator column. If you include a 'user' vs 'artist' label, no row from one query can match a row from the other.
  2. Each query has a WHERE clause that makes them mutually exclusive. For example, one query filters status = 'completed' and the other filters status = 'refunded'. No row can appear in both.
  3. The underlying tables guarantee uniqueness. If both queries select primary keys and the tables are disjoint, duplicates are structurally impossible.

If duplicates can appear and you want them removed, use UNION. But think about whether the duplicates actually matter for your use case. An analytics query that counts rows might give wrong results with UNION (undercounting), while a reporting query that lists items might show confusing repeats with UNION ALL.

Combining More Than Two Queries

You can chain multiple UNION or UNION ALL operators to combine three or more queries:

SQL

This builds a combined event timeline from three separate tables. Each UNION ALL stacks the next result set onto the previous ones.

Mixing UNION and UNION ALL

You can mix them in the same statement, but the behavior can be surprising. In standard SQL and PostgreSQL, UNION has higher precedence than UNION ALL. Consider:

SQL

The precedence rules mean this is interpreted as:

Query B and Query C get deduplicated first (via UNION), then that result is concatenated with Query A (via UNION ALL). This is rarely the intent. If you need to control the order of operations, use parentheses (in databases that support them) or restructure into CTEs:

SQL

In practice, mixing UNION and UNION ALL in a single statement is confusing and error-prone. Pick one and stick with it, or break complex combinations into CTEs to make the logic explicit.

UNION vs JOIN

UNION and JOIN both combine data, but in fundamentally different directions.

A JOIN matches rows from two tables based on a condition and produces wider rows (more columns). A UNION stacks rows from two queries and produces taller results (more rows). The column count stays the same.

QuestionUse
"Show each user alongside their subscription info"JOIN
"List all user IDs who appear in either follows or artist_follows"UNION
"For each stream, include the track title and artist name"JOIN (multiple)
"Combine completed and refunded payments into one list"UNION ALL
"Find users who have both a stream and a payment"JOIN or INTERSECT
"Build a timeline of all user activity across different tables"UNION ALL

The key distinction: if you need to enrich rows with data from another table, use a JOIN. If you need to stack similar rows from different sources, use UNION.

One common mistake is using UNION when a JOIN is appropriate. For example, "show me each user's name and their most recent stream" requires a JOIN because you need columns from both users and streams in the same row. A UNION can't do this because it doesn't pair rows from different queries.

ORDER BY and LIMIT with UNION

An ORDER BY at the end of a UNION or UNION ALL applies to the entire combined result, not just the last query.

SQL

This returns the 10 most recent follows across both tables, sorted by date. The ORDER BY and LIMIT sit outside both queries and operate on the merged result.

What goes wrong here?

SQL

In PostgreSQL, this is a syntax error. You cannot put ORDER BY on an individual query within a UNION unless you wrap it in a subquery. The database sees ORDER BY followed_at UNION ALL and doesn't know what to do with it.

If you need each component query sorted before combining (which is rare), wrap it:

SQL

This takes the 5 most recent user follows and the 5 most recent artist follows, then stacks them. The subqueries handle their own sorting and limiting independently. Note that the final result's order is not guaranteed unless you add an outer ORDER BY.

DialectORDER BY in individual UNION querySubquery workaround
PostgreSQLSyntax errorYes, with subquery
MySQLAllowed but ignored without LIMITYes
SQL ServerSyntax errorYes, with TOP or subquery