Last Updated: March 31, 2026
Insertion Sort builds a sorted array one element at a time. It takes each element and inserts it into its correct position within the already sorted portion, much like sorting playing cards in your hand.
At every step, the algorithm shifts larger elements to the right to make space for the current element. This makes it efficient for small datasets and nearly sorted arrays, where only a few shifts are needed.
While it is not suitable for large inputs, Insertion Sort is an important algorithm to understand because of its simplicity, stability, and practical performance in specific scenarios.
In this chapter, you will learn how Insertion Sort works step by step, how to implement it, and when it can outperform more complex algorithms.
Insertion Sort builds a sorted portion of the array one element at a time. It picks the next unsorted element and inserts it into its correct position within the already-sorted region.
At any point during the algorithm, the array is divided into two regions: a sorted region on the left and an unsorted region on the right. Initially, the sorted region contains only the first element.
The algorithm repeatedly takes the first element from the unsorted region and places it where it belongs in the sorted region, shifting larger elements to the right to make room.
Why is this often considered the most intuitive sorting algorithm? Because it does not require any clever tricks or complex data structures. You look at each element, compare it to what you have already sorted, and put it in the right place. That is it.
This simplicity makes it an excellent starting point for understanding sorting algorithms. But do not let the simplicity fool you. Insertion Sort has real practical value, especially in scenarios where data is nearly sorted or arriving in a stream.
Here is the step-by-step process:
1. Start with the second element (index 1). Call it the key.
2. Compare the key with each element in the sorted region, moving from right to left.
3. Shift each element greater than the key one position to the right.
4. Insert the key into the gap created by the shifts.
5. Repeat for each remaining unsorted element.

The key insight here is shifting, not swapping. Many beginners implement Insertion Sort using adjacent swaps, but the standard approach shifts elements to the right in a single pass and then places the key in its final position. This is more efficient because each shift is a single assignment, while a swap requires three assignments (using a temporary variable).
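In code, these steps map to a short function. Here is a minimal Python sketch (the function and variable names are illustrative):

```python
def insertion_sort(a):
    """Sort the list a in place using Insertion Sort and return it."""
    for i in range(1, len(a)):
        key = a[i]          # the element to insert
        j = i - 1
        # Shift elements of the sorted region that are greater
        # than the key one position to the right.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        # Drop the key into the gap.
        a[j + 1] = key
    return a
```

Note that each shift is a single assignment (`a[j + 1] = a[j]`), whereas a swap-based version would need three assignments per step.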
Notice the inner while loop. It keeps shifting elements to the right as long as they are greater than the key. The moment it finds an element that is smaller than or equal to the key, it stops. The key drops into the gap right after that element. This is what makes Insertion Sort stable: equal elements maintain their original relative order because we only shift elements that are strictly greater than the key.
The outer loop picks each unsorted element starting from index 1. The inner while loop shifts larger elements to the right. After the loop, the key is placed into the correct position.
Let us trace through the algorithm with the array [7, 3, 5, 1, 9, 2]. We will go step by step to see exactly what happens during each iteration.
We compare 3 with elements in the sorted region (just 7).
- 7 > 3, so shift 7 one position to the right.
- Place 3 at index 0.

The array is now [3, 7, 5, 1, 9, 2].

Next, we compare 5 with elements in the sorted region [3, 7], moving right to left.
- 7 > 5, so shift 7 to the right.
- 3 < 5, so stop. Place 5 at index 1.

The array is now [3, 5, 7, 1, 9, 2].

Next, we compare 1 with elements in the sorted region [3, 5, 7], moving right to left.
- 7 > 1, shift right.
- 5 > 1, shift right.
- 3 > 1, shift right.
- Place 1 at index 0.

The array is now [1, 3, 5, 7, 9, 2].

This is the worst case for a single insertion. The element 1 had to travel all the way to the front, requiring every sorted element to shift.
We compare 9 with the last element in the sorted region.
- 7 < 9, so stop immediately. Place 9 at index 4 (it stays in place).

This is the best case for a single insertion. When an element is already larger than everything in the sorted region, no shifting is needed. This is why Insertion Sort is fast on nearly sorted data.
We compare 2 with elements in the sorted region [1, 3, 5, 7, 9], moving right to left.
- 9 > 2, shift right.
- 7 > 2, shift right.
- 5 > 2, shift right.
- 3 > 2, shift right.
- 1 < 2, stop. Place 2 at index 1.

The array is now fully sorted: [1, 2, 3, 5, 7, 9].

The algorithm made 5 passes (for elements at indices 1 through 5), performing a total of 12 comparisons and 9 shifts. Notice how the number of shifts varied dramatically per step, from 0 (for 9) to 4 (for 2). This is what gives Insertion Sort its adaptive behavior.
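These totals can be checked with an instrumented version of the algorithm (a sketch; the counter names are mine). A comparison is counted each time the key is tested against a sorted element, and a shift each time an element moves one slot to the right:

```python
def insertion_sort_counted(a):
    """Insertion-sort a in place; return (comparisons, shifts)."""
    comparisons = shifts = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1       # key vs. a[j]
            if a[j] <= key:
                break              # found the insertion point
            a[j + 1] = a[j]        # shift the larger element right
            shifts += 1
            j -= 1
        a[j + 1] = key
    return comparisons, shifts

print(insertion_sort_counted([7, 3, 5, 1, 9, 2]))  # → (12, 9)
```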
| Case | Complexity | Explanation |
|---|---|---|
| Best | O(n) | Array is already sorted. The inner loop never executes. Each element is compared once and stays in place. |
| Average | O(n^2) | On average, each element needs to be compared with half of the sorted region. |
| Worst | O(n^2) | Array is sorted in reverse order. Every element must travel all the way to the front, causing maximum shifts. |
| Space | O(1) | Only a constant amount of extra memory is used (the key variable and loop counters). |
Stability: Yes. Insertion Sort is stable because it only shifts elements that are strictly greater than the key. Equal elements are never reordered relative to each other.
Adaptive: Yes. The running time depends on how "sorted" the input already is. If the input has very few inversions (pairs of elements that are out of order), Insertion Sort runs close to O(n). This adaptive property is what makes it so valuable for nearly sorted data.
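The adaptive behavior is easy to observe by counting comparisons on differently ordered inputs (an illustrative sketch):

```python
def count_comparisons(a):
    """Insertion-sort a copy of a; return the number of key comparisons."""
    a = list(a)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if a[j] <= key:
                break
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return comparisons

n = 100
print(count_comparisons(range(n)))         # already sorted: n - 1 = 99
print(count_comparisons(range(n, 0, -1)))  # reverse sorted: n(n-1)/2 = 4950
```

Scaling these counts to n = 10,000 gives exactly the kind of gap shown in the table below.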
To put the numbers in perspective, consider sorting 10,000 elements:
| Scenario | Approximate Comparisons |
|---|---|
| Already sorted | 9,999 |
| Nearly sorted (10 inversions) | ~10,009 |
| Random order | ~25,000,000 |
| Reverse sorted | ~50,000,000 |
The difference between the best case and worst case is enormous. For nearly sorted data, Insertion Sort is competitive with O(n log n) algorithms. For random data, it is not.
Nearly sorted data. If the input is almost sorted with only a few elements out of place, Insertion Sort's adaptive nature makes it extremely efficient. Each out-of-place element only needs a small number of shifts.
Small arrays. For arrays with fewer than 10-20 elements, the overhead of more complex algorithms like Merge Sort or Quick Sort (function call overhead, partitioning logic) actually makes them slower than Insertion Sort. The simplicity of Insertion Sort shines on small inputs.
Online sorting (sorting as data arrives). When you receive elements one at a time and need to maintain a sorted collection, Insertion Sort is a natural fit. Each new element is simply inserted into the correct position.
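For the online case, Python's standard library exposes exactly this insertion step via `bisect.insort` (a sketch of maintaining a sorted collection as items arrive):

```python
import bisect

stream = [7, 3, 5, 1, 9, 2]   # elements arriving one at a time
sorted_so_far = []
for x in stream:
    # Binary-search for the insertion point, then insert there.
    bisect.insort(sorted_so_far, x)
# sorted_so_far is now [1, 2, 3, 5, 7, 9]
```

`bisect.insort` finds the position in O(log n), but the list insertion still shifts later elements, so each arrival costs O(n) in the worst case, just like Insertion Sort's inner loop.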
As a subroutine in hybrid algorithms. This is perhaps the most important practical use case. Real-world sorting implementations use Insertion Sort as a building block: Timsort (used by Python and Java) sorts small runs with a variant of Insertion Sort, and introsort (used by many C++ standard library implementations of std::sort) switches to Insertion Sort for small partitions.
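A common pattern is a Quick Sort that falls back to Insertion Sort below a size threshold. The sketch below is illustrative; the cutoff value 16 is a typical but arbitrary choice, and the names are mine:

```python
CUTOFF = 16  # below this size, Insertion Sort tends to win

def insertion_sort_range(a, lo, hi):
    """Insertion-sort the slice a[lo..hi] in place."""
    for i in range(lo + 1, hi + 1):
        key = a[i]
        j = i - 1
        while j >= lo and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key

def hybrid_quicksort(a, lo=0, hi=None):
    """Quick Sort that hands small subarrays to Insertion Sort."""
    if hi is None:
        hi = len(a) - 1
    if hi - lo + 1 <= CUTOFF:
        insertion_sort_range(a, lo, hi)
        return
    # Lomuto partition around the last element.
    pivot = a[hi]
    p = lo
    for i in range(lo, hi):
        if a[i] < pivot:
            a[i], a[p] = a[p], a[i]
            p += 1
    a[p], a[hi] = a[hi], a[p]
    hybrid_quicksort(a, lo, p - 1)
    hybrid_quicksort(a, p + 1, hi)
```

The recursion handles the large-scale ordering, while Insertion Sort finishes the small, nearly localized pieces where its low overhead pays off.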
Large random datasets. With O(n^2) average-case complexity, Insertion Sort simply cannot compete with O(n log n) algorithms for large inputs. Sorting 1 million random elements with Insertion Sort would take roughly 250 billion comparisons on average, while Merge Sort would need about 20 million.
Performance-critical paths with unpredictable data. If you cannot guarantee the input is small or nearly sorted, use a general-purpose O(n log n) algorithm instead.
The following table summarizes the trade-offs:
| Factor | Insertion Sort | Merge Sort | Quick Sort |
|---|---|---|---|
| Best case | O(n) | O(n log n) | O(n log n) |
| Worst case | O(n^2) | O(n log n) | O(n^2) |
| Space | O(1) | O(n) | O(log n) |
| Stable | Yes | Yes | No |
| Adaptive | Yes | No | No |
| Small arrays | Excellent | Overhead | Overhead |