AlgoMaster Logo

Exercise: Finding and Removing Duplicates

Last Updated: May 12, 2026

1 min read

For this exercise, we will work with below data:

SQL

Exercise 1

Easy

Write a query that finds all email addresses in the users table that appear more than once when compared case-insensitively. Show the normalized lowercase email and how many users share it. Order by normalized_email.

SQL

Expected Query

Expected Output

Exercise 2

Easy

Write a query that detects duplicate stream rows defined by the same user_idtrack_id, and started_at. Show those three columns plus the count of occurrences. Order by user_idtrack_idstarted_at.

SQL

Expected Query

Expected Output

Exercise 3

Medium

Write a query that detects duplicates in the streams table when "duplicate" is defined as the same user listening to the same track on the same calendar date in UTC. Show user_idtrack_idstream_date, and the count of streams in that day. Only return groups with more than one row, ordered by user_idtrack_idstream_date.

SQL

Expected Query

Expected Output

Exercise 4

Hard

Write a query that finds near-duplicate stream pairs where two streams share the same user_id and track_id and their started_at values are within 60 seconds of each other. Show original_idduplicate_id, and gap_seconds. Order by original_idduplicate_id.

SQL

Expected Query

Expected Output

Exercise 5

Hard

Write a query that finds users in the users table whose normalized email (case-insensitive) matches another user's email AND whose country differs from at least one other user sharing that email. Show user_idusernameemail, and country. Order by user_id.

SQL

Expected Query

Expected Output