Last Updated: May 12, 2026
For this exercise, we will work with below data:
Write a query that finds all email addresses in the users table that appear more than once when compared case-insensitively. Show the normalized lowercase email and how many users share it. Order by normalized_email.
Write a query that detects duplicate stream rows defined by the same user_id, track_id, and started_at. Show those three columns plus the count of occurrences. Order by user_id, track_id, started_at.
Write a query that detects duplicates in the streams table when "duplicate" is defined as the same user listening to the same track on the same calendar date in UTC. Show user_id, track_id, stream_date, and the count of streams in that day. Only return groups with more than one row, ordered by user_id, track_id, stream_date.
Write a query that finds near-duplicate stream pairs where two streams share the same user_id and track_id and their started_at values are within 60 seconds of each other. Show original_id, duplicate_id, and gap_seconds. Order by original_id, duplicate_id.
Write a query that finds users in the users table whose normalized email (case-insensitive) matches another user's email AND whose country differs from at least one other user sharing that email. Show user_id, username, email, and country. Order by user_id.