Last Updated: May 3, 2026
Before writing your first SQL query, let’s settle two things: where you’ll run SQL, and what data you’ll use. Both are already handled for you.
Every chapter in this course includes an embedded SQL editor, so you can run queries directly in your browser with no installation, setup, or signup required.
Click "Run" on the query below, or edit it and run it again, to confirm the environment is working:
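If you are reading a static copy of this page, the embedded editor will not appear; a typical smoke-test query against the preloaded data looks like this (the starter query in your editor may differ):

```sql
-- Counts the rows in the users table; any result at all confirms the
-- environment is connected and the StreamFlow data is loaded.
SELECT COUNT(*) AS user_count FROM users;
```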
The StreamFlow database (described below) is preloaded, which means every query in the course can be edited, re-run, and experimented with right where you read it.
Throughout this course, we’ll primarily use PostgreSQL as the default SQL dialect.
PostgreSQL is one of the most widely used relational databases in the world. It is powerful, standards-compliant, feature-rich, and an excellent choice for learning SQL because it balances clean syntax with advanced real-world capabilities.
All examples, exercises, and query outputs in this course are written with PostgreSQL in mind by default.
That said, the core ideas you learn here apply across most relational databases. When syntax or behavior differs between PostgreSQL, MySQL, and SQL Server, the chapter will call it out explicitly so you understand both the PostgreSQL version and the cross-database differences.
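As a small taste of the kind of difference the chapters will flag, consider string concatenation, which each dialect handles differently (each statement below runs in its own dialect):

```sql
-- PostgreSQL (and the SQL standard): || concatenates strings
SELECT 'Stream' || 'Flow';

-- MySQL: || is logical OR by default, so use CONCAT instead
SELECT CONCAT('Stream', 'Flow');

-- SQL Server: + concatenates strings
SELECT 'Stream' + 'Flow';
```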
The goal is not just to teach one database, but to help you build SQL skills that transfer anywhere.
StreamFlow is a fictional music streaming platform inspired by services like Spotify. Users sign up, choose subscription plans, follow artists, create playlists, and stream tracks. Behind the scenes, the platform also runs marketing campaigns, processes payments, serves ads, and tracks user activity.
The dataset is designed to feel like a real production system. It is realistic enough to model real business problems, small enough to understand quickly, and rich enough to demonstrate every major SQL concept in this course.
It also includes the kinds of messy details real data often has: overlapping subscriptions, intentional duplicates, refunded payments, and sparse time-series records.
Every example, exercise, and interview-style problem in this course uses the same dataset.
That means once you understand the schema, you can focus on writing better SQL instead of constantly relearning new tables and relationships.
Most chapters include their own focused sample data through INSERT statements.
The browser environment already has all tables pre-created, so you can start querying immediately. If a chapter needs additional or specially shaped data, such as overlapping subscriptions, intentional duplicates, or sparse time-series records, it includes its own INSERT block at the top of the page.
This keeps every chapter self-contained and ensures the examples behave exactly as intended.
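A chapter-specific data block might look like the following sketch. The values here are illustrative, not the course's actual sample data:

```sql
-- Hypothetical chapter data: two overlapping subscriptions for one user,
-- the kind of intentionally messy shape a chapter might set up.
INSERT INTO subscriptions (user_id, plan, start_date, end_date) VALUES
    (1, 'premium', '2025-01-01', '2025-06-30'),
    (1, 'premium', '2025-06-01', NULL);  -- overlaps the first row on purpose
```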
Five clusters worth recognizing:
- users, subscriptions, payments: who is on the platform and what they pay for.
- artists, albums, tracks, genres: what's available to listen to.
- streams: every play event. This is the largest table and drives most analytical questions.
- playlists, playlist_tracks, follows, artist_follows: how users curate and connect.
- campaigns, ad_impressions: what ads ran and who saw them.

All 14 tables are pre-created in the browser environment. The full schema is below for reference. You don't need to run any of these CREATE TABLE statements; they're listed so you know what columns exist when you write queries.
| Table | Purpose | Key Columns |
|---|---|---|
| users | Account holders | user_id, country, subscription_tier, signup_date |
| subscriptions | Plan history per user, including renewals and cancellations | user_id, plan, start_date, end_date |
| genres | Music genres, with optional parent for hierarchy | genre_id, parent_genre_id |
| artists | Performing artists, optionally managed by another artist | artist_id, verified, manager_artist_id |
| albums | Albums released by artists | album_id, artist_id, genre_id |
| tracks | Individual songs on albums; carries JSONB metadata | track_id, album_id, duration_seconds |
| streams | One row per play event. The largest table, used heavily for analytics | user_id, track_id, started_at, duration_listened |
| playlists | User-created playlists, optionally forked from another | creator_user_id, forked_from_playlist_id |
| playlist_tracks | Many-to-many between playlists and tracks, with ordering | playlist_id, track_id, position |
| follows | User-to-user follow graph | follower_user_id, followed_user_id |
| artist_follows | User-to-artist follow graph | user_id, artist_id |
| payments | Payment transactions, including refunds | user_id, amount, status |
| campaigns | Marketing campaigns across channels | campaign_id, channel, start_date, end_date |
| ad_impressions | Ad views and clicks tied to a campaign | user_id, campaign_id, clicked |
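Once you know the schema, typical analytical questions join across these clusters. For example, using only the columns listed above, a "most-streamed tracks" query might look like this:

```sql
-- Top 5 most-streamed tracks: joins the catalog cluster (tracks)
-- to the activity cluster (streams), then counts play events.
SELECT t.track_id, COUNT(*) AS play_count
FROM streams AS s
JOIN tracks AS t ON t.track_id = s.track_id
GROUP BY t.track_id
ORDER BY play_count DESC
LIMIT 5;
```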
If you want to follow along outside the browser, you can run StreamFlow locally on PostgreSQL, MySQL, or SQL Server. Postgres is the recommended default; the syntax in the course matches what Postgres expects.
PostgreSQL:
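A minimal local setup sketch, assuming you have saved the schema's CREATE TABLE statements to a file (the database name and file path here are placeholders):

```sql
-- Run once against your local server to create the database.
CREATE DATABASE streamflow;

-- Then, from psql connected to the streamflow database,
-- load the schema file containing the CREATE TABLE statements:
-- \i schema.sql
```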
Other databases: a few PostgreSQL-specific types (SERIAL/BIGSERIAL, JSONB, TIMESTAMP WITH TIME ZONE, BOOLEAN) need adjustment. The mappings:
| PostgreSQL | MySQL | SQL Server |
|---|---|---|
| SERIAL | INT AUTO_INCREMENT | INT IDENTITY(1,1) |
| BIGSERIAL | BIGINT AUTO_INCREMENT | BIGINT IDENTITY(1,1) |
| JSONB | JSON | NVARCHAR(MAX) (or JSON in Azure SQL) |
| TIMESTAMP WITH TIME ZONE | DATETIME | DATETIMEOFFSET |
| BOOLEAN | TINYINT(1) | BIT |
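As a sketch of how the mappings apply in practice, here is a cut-down version of the artists table in each dialect (the real table has more columns than shown):

```sql
-- PostgreSQL
CREATE TABLE artists (
    artist_id  SERIAL PRIMARY KEY,
    verified   BOOLEAN NOT NULL DEFAULT FALSE
);

-- MySQL equivalent
CREATE TABLE artists (
    artist_id  INT AUTO_INCREMENT PRIMARY KEY,
    verified   TINYINT(1) NOT NULL DEFAULT 0
);

-- SQL Server equivalent
CREATE TABLE artists (
    artist_id  INT IDENTITY(1,1) PRIMARY KEY,
    verified   BIT NOT NULL DEFAULT 0
);
```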
Re-running the chapter INSERTs locally hits a duplicate-key error if the data is already there. Two ways around it:
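Two common approaches, shown here as hedged sketches (the table and column names are illustrative, and a given chapter's own fix may differ):

```sql
-- Option 1: clear the table before re-inserting.
-- CASCADE also empties tables that reference it via foreign keys,
-- and RESTART IDENTITY resets any auto-increment counters.
TRUNCATE TABLE users RESTART IDENTITY CASCADE;

-- Option 2: make the insert idempotent (PostgreSQL 9.5+),
-- so re-running it silently skips rows that already exist.
INSERT INTO genres (genre_id, parent_genre_id)
VALUES (1, NULL)
ON CONFLICT (genre_id) DO NOTHING;
```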
The browser environment handles this for you automatically on Reset.