We have an array that should contain every integer from 1 to n exactly once, but something went wrong. One number appears twice, and because of that, another number is missing entirely. Our job is to find both: the duplicate and the missing number.
The constraint 1 <= nums[i] <= nums.length carries most of the weight. Every value sits in the range [1, n], so each value can double as an index into the array. That mapping is what makes the sub-linear-space solutions possible.
Counting occurrences of every number solves the problem directly. Beyond that, the tight value-to-index relationship opens up sorting, sum and sum-of-squares math, and in-place index marking.
2 <= nums.length <= 10^4: With n up to 10,000, an O(n) or O(n log n) solution runs comfortably, so there is no need to settle for the O(n) extra space of a hash map.1 <= nums[i] <= nums.length: Every value maps to a valid index in [1, n], which is what enables the index-marking and mathematical approaches below.Sorting puts every value next to the value one larger than it. After sorting, the duplicate appears as two equal adjacent elements, and the missing number appears as a gap where two adjacent elements differ by more than one. A single scan over the sorted array finds both.
nums[i] == nums[i-1], we found the duplicate.nums[i] > nums[i-1] + 1, the missing number is nums[i-1] + 1.missing is still not found, the missing number must be n.[duplicate, missing].Input:
The array is already sorted, so [1, 2, 2, 4]. Since nums[0] = 1, the missing number is not 1, and missing stays at -1 for now. We scan adjacent pairs from index 1:
i=1: nums[1]=2, nums[0]=1. Not equal and the gap is exactly 1, so no duplicate and no missing here.i=2: nums[2]=2, nums[1]=2. Equal, so duplicate = 2.i=3: nums[3]=4, nums[2]=2. Since 4 > 2 + 1, there is a gap, so missing = 2 + 1 = 3.After the loop, missing is no longer -1, so the final-fallback check (missing equals n) does not fire. We return [2, 3].
Result:
Sorting does more work than the problem requires, since full order is unnecessary when we only need per-value counts. The next approach drops to linear time by counting occurrences directly.
In a correct array, every number from 1 to n appears exactly once. The error changes exactly two of those counts: the duplicate has count 2, and the missing number has count 0. Building a frequency map in one pass, then scanning the numbers 1 through n, recovers both. The number whose count is 2 is the duplicate; the number whose count is 0 is the missing one.
[duplicate, missing].The hash map uses O(n) extra space. The next approach removes that by reducing the problem to two equations, using no extra storage beyond a few accumulators.
Let dup be the duplicate and miss the missing number. Compare the given array against a correct array containing 1 through n exactly once. Two aggregates capture the difference. The sum of the array exceeds the expected sum 1 + 2 + ... + n by exactly dup - miss, because the duplicate added one extra copy of dup and the missing number removed one copy of miss. The sum of squares exceeds the expected sum of squares by dup^2 - miss^2.
Those are two equations in two unknowns:
Since dup^2 - miss^2 = (dup - miss)(dup + miss), dividing the second equation by d gives dup + miss. With both dup - miss and dup + miss known, dup = ((dup-miss) + (dup+miss)) / 2 and miss = dup - d.
This runs in O(n) time and O(1) extra space, and unlike the next approach it never modifies the input array.
sumExpected = n(n+1)/2 and sqExpected = n(n+1)(2n+1)/6.sumActual (sum of all elements) and sqActual (sum of squares).d = sumActual - sumExpected, which equals dup - miss.s = (sqActual - sqExpected) / d, which equals dup + miss.dup = (d + s) / 2 and miss = dup - d.[dup, miss].The sum of squares grows quickly: for n up to 10,000 it reaches about 3.3 x 10^11, well past the range of a 32-bit integer. The accumulators must use 64-bit integers (long in Java and C#, long long in C++, int64/i64 in Go and Rust). JavaScript and TypeScript numbers are double-precision floats, which represent integers exactly up to 2^53, so no overflow occurs there.
The math approach is O(1) space but relies on 64-bit arithmetic and a division that assumes d is nonzero. The next approach reaches the same O(1) space using only the sign bit of the existing array entries, with no large intermediate values.
Every value in the array is between 1 and n, so each value can index into the array itself. Visiting a number negates the value at the index it points to, marking that index as seen. Reaching an index that is already negative means the number pointing there has been seen before, which identifies the duplicate. After the marking pass, the one index that still holds a positive value corresponds to the missing number.
This uses the input array as a boolean visited array, with the sign bit as the flag. It runs in O(n) time and O(1) extra space.
The sign of nums[i] records whether the number i+1 has appeared. In a correct array, each index 0 to n-1 is negated exactly once, since each number 1 to n appears exactly once. The duplicate appears twice, so its target index is negated on the first occurrence and found already negative on the second, which is when it is reported. The missing number never appears, so its target index is never negated and stays positive. Reading the value with abs throughout means the index computed from a value is unaffected by whether that slot was already marked.
abs(nums[i]) - 1.nums[targetIndex]. If it's already negative, then abs(nums[i]) is the duplicate.nums[targetIndex] to mark it as visited.nums[index] > 0 corresponds to the missing number (which is index + 1).[duplicate, missing].