Last Updated: May 2, 2026
The previous chapter used >= and < to build range filters, and chained OR to match multiple values. Both patterns work, but they get verbose fast.
BETWEEN and IN give you cleaner syntax for the same operations.
This chapter uses the users, streams, artists, albums, and tracks tables. Insert the following rows to follow along:
The artists table includes three entries with NULL countries (Chromatic, Silent Echo, Phantom Wave), which we will need to demonstrate the NOT IN NULL trap later in the chapter.
BETWEEN tests whether a value falls within an inclusive range. The syntax reads almost like English:
This is equivalent to:
Both endpoints are included. That "inclusive on both ends" behavior is the single most important thing to remember about BETWEEN.
Find tracks between 180 and 300 seconds long:
This returns tracks with duration 180, 181, ... 299, 300. The equivalent using comparison operators would be duration_seconds >= 180 AND duration_seconds <= 300. BETWEEN is more readable, especially when the column name is long.
A slightly more targeted query. Find tracks between 3 and 4 minutes:
This picks up Sunrise Protocol (237), Voltage (198), Spring Awakening (204), Turntable (185), and Skyline (220). All five fall within the 180-240 range, endpoints included.
BETWEEN works with dates the same way it works with numbers.
Example: Find users who signed up in the first half of 2024
This returns dave_kr (2024-01-05), emma_dev (2024-02-14), frank_j (2024-04-01), and gina_null (2024-06-20). All four signup dates fall within the January 1 to June 30 range, inclusive.
Date BETWEEN is common in reporting queries: "all orders this quarter," "subscriptions started last month," "streams from the past week." It is clean and readable for DATE columns specifically. Timestamps are a different story, which we will cover shortly.
NOT BETWEEN excludes rows within the range.
Example: Find tracks that are either very short or very long
This returns Crackle (156) and Late Night Drive (312) and Afterglow (348). Every track inside the 180-300 range is excluded.
NOT BETWEEN is equivalent to column < low OR column > high. The boundary values (180 and 300) are excluded from the result because they fall inside the range.
BETWEEN also works on strings, using lexicographic (dictionary) ordering:
This returns artists whose names start with A through L (and exactly 'M' itself, since BETWEEN is inclusive). Dusty Vinyl and Lena Park would match. Chromatic would too, since 'C' comes after 'A'.
String BETWEEN is rarely used in practice because lexicographic ranges are unintuitive. Stick to LIKE or pattern matching (covered in the next chapter) for most string filtering.
BETWEEN works perfectly with DATE columns. With TIMESTAMP columns, it introduces a subtle but dangerous bug.
Say you want all streams from March 1, 2024. This looks reasonable:
The string '2024-03-01' is cast to a timestamp as 2024-03-01 00:00:00. So this query only matches streams that started at exactly midnight. Every stream that happened at 08:00, 09:00, or any other time that day is excluded.
The natural instinct is to fix it by extending the upper bound to the next day:
This is better but still wrong. BETWEEN is inclusive on both ends, so this matches everything from 2024-03-01 00:00:00 through 2024-03-02 00:00:00. Any stream that started at exactly midnight on March 2 sneaks in. In practice this might be rare, but "rare" bugs in date logic are the ones that show up in production at the worst possible time.
The following diagram shows the problem:
Green nodes are the rows you want. The red node is the unwanted midnight row that BETWEEN includes. The teal node is excluded.
The safe pattern for timestamp ranges uses >= on the start and < on the end, without BETWEEN:
This is called a half-open interval: inclusive on the left, exclusive on the right. It captures every moment on March 1 (from midnight through 23:59:59.999...) and stops cleanly at the boundary of March 2.
Half-open intervals have another practical advantage: consecutive ranges never overlap and never leave gaps. "March" is >= '2024-03-01' AND < '2024-04-01'. "April" is >= '2024-04-01' AND < '2024-05-01'. No row can appear in both ranges, and no row between the ranges is missed.
For a full week of streams:
The rule is straightforward: use BETWEEN for DATE columns (where there is no time component), and use half-open intervals for TIMESTAMP columns.
| Column Type | Safe Approach | Example |
|---|---|---|
DATE | BETWEEN works fine | WHERE signup_date BETWEEN '2024-01-01' AND '2024-06-30' |
TIMESTAMP | Half-open interval | WHERE started_at >= '2024-03-01' AND started_at < '2024-03-02' |
IN tests whether a value matches any item in a list. It replaces chains of OR conditions.
The previous chapter found US or GB users like this:
IN does the same thing with less noise:
Both queries return the same rows. IN becomes more valuable as the list grows. Matching five countries with OR requires five separate equality checks joined by OR. With IN, it is one clean line:
IN works with any data type.
Example: Find specific tracks by ID
This returns Sunrise Protocol, Spring Awakening, and Skyline. The list can contain any number of values, though very long literal lists (hundreds of items) may have performance implications depending on the database.
Find users who signed up on specific dates:
This returns alice_m, dave_kr, and gina_null. Each signup_date is compared against the three dates in the list.
The real power of IN shows up with subqueries. Instead of a hardcoded list, the list comes from another query.
Find all users who have at least one stream:
The subquery returns every user_id that appears in streams. The outer query then finds users whose user_id matches any of those values.
Find the names of artists who have albums released in 2024:
The subquery finds artist_ids with 2024 albums (Lena Park's "Seasons" and Nova Pulse's "Neon Skies"). The outer query returns those artist names.
NOT IN excludes rows that match any value in the list:
This returns only premium users: alice_m, carol_uk, frank_j, and hiro_jp. NOT IN with a literal list of non-NULL values works exactly as you would expect. The trouble starts when NULLs enter the picture.
Say you want to find users whose country does not appear in the artists table. This seems straightforward:
This query returns zero rows. Not because every user's country appears in the artists table, but because the subquery returns a NULL value (from Chromatic, Silent Echo, or Phantom Wave), and that NULL poisons the entire NOT IN.
To understand the trap, you need to see how the database evaluates NOT IN internally. The subquery SELECT country FROM artists returns: 'US', 'GB', 'KR', NULL, NULL, NULL.
For each user row, NOT IN expands to a chain of <> comparisons joined by AND:
The result is FALSE (because 'KR' <> 'KR' is FALSE), which is expected. Dave's country is in the artists table, so he should be excluded.
But look at hiro_jp (country = 'JP'):
TRUE AND TRUE AND TRUE AND UNKNOWN evaluates to UNKNOWN. WHERE only includes rows where the result is TRUE, so hiro_jp is excluded. The same thing happens for every user row, because every row hits that <> NULL comparison and gets UNKNOWN.
The result: zero rows, regardless of the actual data.
Here is the confusing part. IN with the same NULL-containing subquery works correctly:
This returns alice_m (US), bob_jones (US), carol_uk (GB), dave_kr (KR), and emma_dev (US). It correctly finds users whose country appears in the artists table.
IN expands to OR comparisons:
TRUE OR FALSE OR FALSE OR UNKNOWN evaluates to TRUE. One TRUE is enough for OR to succeed, and the UNKNOWN is harmless.
The asymmetry comes from how AND and OR handle UNKNOWN differently:
| Expression | Result |
|---|---|
| TRUE AND UNKNOWN | UNKNOWN |
| TRUE OR UNKNOWN | TRUE |
| FALSE AND UNKNOWN | FALSE |
| FALSE OR UNKNOWN | UNKNOWN |
With IN (OR logic), one TRUE match is enough to include the row, and NULLs cannot override that. With NOT IN (AND logic), every comparison must be TRUE for the row to pass, and a single UNKNOWN from a NULL comparison blocks the entire result.
The safe alternative to NOT IN with subqueries is NOT EXISTS:
NOT EXISTS checks whether the correlated subquery returns any rows. If the subquery returns zero rows, NOT EXISTS is TRUE and the outer row is included. The comparison a.country = u.country handles NULLs correctly because NULL = NULL is UNKNOWN (not TRUE), so a NULL artist country never "matches" a user country.
This query returns hiro_jp (JP), since no artist has country 'JP'. It also excludes gina_null (NULL country) because NULL = NULL is UNKNOWN, not TRUE, so the subquery returns no matching rows for gina_null, which means NOT EXISTS is TRUE. Whether this behavior is what you want depends on your business logic. The point is that NOT EXISTS gives predictable, debuggable results, while NOT IN with NULLs gives silently wrong results.
If you prefer NOT IN syntax, you can filter out NULLs explicitly:
Adding WHERE country IS NOT NULL to the subquery removes the NULL values that poison NOT IN. This returns hiro_jp (JP) since no artist has JP as their country. gina_null is excluded because NULL NOT IN (...) evaluates to UNKNOWN regardless of the list contents.
Both approaches work. NOT EXISTS is generally preferred because it does not require you to remember to add the NULL filter, and it handles all edge cases consistently.
| Approach | NULL-Safe? | Notes |
|---|---|---|
NOT IN (subquery) | No | Returns zero rows if subquery contains NULL |
NOT IN (subquery WHERE col IS NOT NULL) | Yes | Requires explicit NULL filter |
NOT EXISTS (correlated subquery) | Yes | Handles NULLs correctly by default |