{"title":"Bucket Sort","description":"","content":"Bucket Sort is a distribution-based sorting algorithm that divides elements into multiple buckets, sorts each bucket individually, and then combines the results to produce the final sorted array.\n\nThe idea is to spread elements across buckets based on their value range so that each bucket contains a small subset of the data. With a uniform distribution, this produces near-linear time complexity.\n\n\n\n\n\nPerformance depends heavily on how the buckets are defined and how evenly the elements are distributed. Bucket sort is typically combined with another sorting algorithm, such as Insertion Sort, to sort individual buckets.\n\nThis chapter covers how Bucket Sort works, how to design effective bucket strategies, and when it outperforms traditional sorting algorithms.\n\n---\n\n# What Is Bucket Sort?\n\nInstead of repeatedly comparing pairs of elements, bucket sort divides the input into a fixed number of \"buckets\" (or bins), distributes elements into those buckets based on their value, sorts each bucket individually, and then concatenates the results.\n\nHere is the high-level idea:\n\n1. Create an array of empty buckets.\n2. Scatter: walk through the input and place each element into the bucket that covers its value range.\n3. Sort each bucket, typically with insertion sort or any other comparison sort.\n4. Gather: concatenate all buckets in order to produce the sorted output.\n\n\n```mermaid\nflowchart TD\n A[\"Input Array
[0.42, 0.32, 0.23, 0.52, 0.25, 0.47, 0.51]\"]:::primary\n\n A --> B[\"Create n empty buckets\"]:::teal\n B --> C[\"Distribute elements
into buckets by value\"]:::secondary\n C --> D[\"Sort each bucket
individually\"]:::green\n D --> E[\"Concatenate all
buckets in order\"]:::teal\n E --> F[\"Sorted Array
[0.23, 0.25, 0.32, 0.42, 0.47, 0.51, 0.52]\"]:::green\n\n classDef primary fill:#00ceff,stroke:#000,color:#000\n classDef secondary fill:#ffa94d,stroke:#000,color:#000\n classDef green fill:#69db7c,stroke:#000,color:#000\n classDef teal fill:#38d9a9,stroke:#000,color:#000\n```\n\n\nBucket sort resembles counting sort, and in fact generalizes it. Counting sort creates one \"bucket\" per distinct value and only works with integers. Bucket sort creates a fixed number of buckets, each covering a range of values, and works with any data type, including floating point numbers. When bucket count equals the range of integer values, bucket sort reduces to counting sort.\n\nBucket sort works best when the input is uniformly distributed over a known range. Data that falls between 0 and 1, or between 0 and 1000, with values spread roughly evenly across that range, suits bucket sort. When all values cluster into a single bucket, the algorithm's performance reduces to the performance of the inner sort.\n\n---\n\n# How It Works\n\nThe walkthrough below uses the canonical case: sorting floating point numbers in the range [0, 1).\n\n### Step 1: Create n Empty Buckets\n\nGiven n elements, create an array of n empty lists (buckets). Each bucket covers an equal slice of the value range.\n\nFor values in [0, 1) with n = 7 elements, the seven buckets cover these ranges:\n\n\n| Bucket | Range |\n|--------|-------|\n| 0 | [0.00, 0.14) |\n| 1 | [0.14, 0.29) |\n| 2 | [0.29, 0.43) |\n| 3 | [0.43, 0.57) |\n| 4 | [0.57, 0.71) |\n| 5 | [0.71, 0.86) |\n| 6 | [0.86, 1.00) |\n\n\n### Step 2: Distribute Elements into Buckets\n\nFor each element, compute its bucket index using the formula:\n\n\n```shell\nbucketIndex = floor(n * value)\n```\n\n\nFor a general range [minValue, maxValue], the formula becomes:\n\n\n```shell\nbucketIndex = floor(n * (value - minValue) / (maxValue - minValue))\n# When value == maxValue, clamp the result to n - 1\n```\n\n\nThis maps each value to a bucket in O(1) time: one multiplication, one floor, with a clamp to handle the edge case where `value == maxValue` produces an index of n.\n\n### Step 3: Sort Each Bucket\n\nSort each bucket individually. Insertion sort is the traditional choice. When elements are uniformly distributed, each bucket holds roughly n/b elements (where b is the number of buckets). With b = n, each bucket holds about 1 element on average, so insertion sort's O(m^2) cost per bucket is negligible. Even when a bucket has a few elements, insertion sort is fast on small inputs due to low overhead.\n\n### Step 4: Concatenate All Buckets\n\nWalk through the bucket array from index 0 to n-1, appending each bucket's sorted contents to the output. Since the buckets cover increasing value ranges, this produces a fully sorted array.\n\n\n```mermaid\nflowchart LR\n subgraph Input\n A[\"0.42\"]:::primary\n B[\"0.32\"]:::primary\n C[\"0.23\"]:::primary\n D[\"0.52\"]:::primary\n E[\"0.25\"]:::primary\n F[\"0.47\"]:::primary\n G[\"0.51\"]:::primary\n end\n\n subgraph Buckets\n B1[\"Bucket 1
[0.23, 0.25]\"]:::teal\n B2[\"Bucket 2
[0.32, 0.42]\"]:::teal\n B3[\"Bucket 3
[0.47, 0.51, 0.52]\"]:::teal\n end\n\n subgraph Output\n O[\"[0.23, 0.25, 0.32, 0.42, 0.47, 0.51, 0.52]\"]:::green\n end\n\n A --> B2\n B --> B2\n C --> B1\n D --> B3\n E --> B1\n F --> B3\n G --> B3\n\n B1 --> O\n B2 --> O\n B3 --> O\n\n classDef primary fill:#00ceff,stroke:#000,color:#000\n classDef teal fill:#38d9a9,stroke:#000,color:#000\n classDef green fill:#69db7c,stroke:#000,color:#000\n```\n\n\nThe diagram above simplifies the bucket indices for readability. The actual algorithm leaves several buckets empty and skips them during concatenation.\n\n---\n\n# Code Implementation\n\n\n```java\nimport java.util.ArrayList;\nimport java.util.Collections;\nimport java.util.List;\n\npublic class BucketSort {\n public static void bucketSort(float[] arr) {\n int n = arr.length;\n if (n <= 1) return;\n\n // Step 1: Create n empty buckets\n List> buckets = new ArrayList<>();\n for (int i = 0; i < n; i++) {\n buckets.add(new ArrayList<>());\n }\n\n // Step 2: Distribute elements into buckets\n for (float value : arr) {\n int bucketIndex = (int) (n * value);\n // Handle edge case where value == 1.0\n if (bucketIndex == n) bucketIndex = n - 1;\n buckets.get(bucketIndex).add(value);\n }\n\n // Step 3: Sort each bucket\n for (List bucket : buckets) {\n Collections.sort(bucket);\n }\n\n // Step 4: Concatenate all buckets\n int index = 0;\n for (List bucket : buckets) {\n for (float value : bucket) {\n arr[index++] = value;\n }\n }\n }\n\n public static void main(String[] args) {\n float[] arr = {0.42f, 0.32f, 0.23f, 0.52f, 0.25f, 0.47f, 0.51f};\n bucketSort(arr);\n // Output: [0.23, 0.25, 0.32, 0.42, 0.47, 0.51, 0.52]\n for (float v : arr) {\n System.out.print(v + \" \");\n }\n }\n}\n```\n\n```python\ndef bucket_sort(arr: list[float]) -> list[float]:\n n = len(arr)\n if n <= 1:\n return arr\n\n # Step 1: Create n empty buckets\n buckets = [[] for _ in range(n)]\n\n # Step 2: Distribute elements into buckets\n for value in arr:\n bucket_index = int(n * value)\n # Handle edge case where value == 1.0\n if bucket_index == n:\n bucket_index = n - 1\n buckets[bucket_index].append(value)\n\n # Step 3: Sort each bucket\n for bucket in buckets:\n bucket.sort()\n\n # Step 4: Concatenate all buckets\n result = []\n for bucket in buckets:\n result.extend(bucket)\n\n return result\n\narr = [0.42, 0.32, 0.23, 0.52, 0.25, 0.47, 0.51]\nprint(bucket_sort(arr))\n# Output: [0.23, 0.25, 0.32, 0.42, 0.47, 0.51, 0.52]\n```\n\n```cpp\n#include \n#include \n#include \n\nvoid bucketSort(std::vector& arr) {\n int n = arr.size();\n if (n <= 1) return;\n\n // Step 1: Create n empty buckets\n std::vector> buckets(n);\n\n // Step 2: Distribute elements into buckets\n for (float value : arr) {\n int bucketIndex = static_cast(n * value);\n // Handle edge case where value == 1.0\n if (bucketIndex == n) bucketIndex = n - 1;\n buckets[bucketIndex].push_back(value);\n }\n\n // Step 3: Sort each bucket\n for (auto& bucket : buckets) {\n std::sort(bucket.begin(), bucket.end());\n }\n\n // Step 4: Concatenate all buckets\n int index = 0;\n for (const auto& bucket : buckets) {\n for (float value : bucket) {\n arr[index++] = value;\n }\n }\n}\n\nint main() {\n std::vector arr = {0.42f, 0.32f, 0.23f, 0.52f, 0.25f, 0.47f, 0.51f};\n bucketSort(arr);\n // Output: 0.23 0.25 0.32 0.42 0.47 0.51 0.52\n for (float v : arr) {\n std::cout << v << \" \";\n }\n return 0;\n}\n```\n\n```go\npackage main\n\nimport (\n\t\"fmt\"\n\t\"sort\"\n)\n\nfunc bucketSort(arr []float64) []float64 {\n\tn := len(arr)\n\tif n <= 1 {\n\t\treturn arr\n\t}\n\n\t// Step 1: Create n empty buckets\n\tbuckets := make([][]float64, n)\n\n\t// Step 2: Distribute elements into buckets\n\tfor _, value := range arr {\n\t\tbucketIndex := int(float64(n) * value)\n\t\t// Handle edge case where value == 1.0\n\t\tif bucketIndex == n {\n\t\t\tbucketIndex = n - 1\n\t\t}\n\t\tbuckets[bucketIndex] = append(buckets[bucketIndex], value)\n\t}\n\n\t// Step 3: Sort each bucket\n\tfor i := range buckets {\n\t\tsort.Float64s(buckets[i])\n\t}\n\n\t// Step 4: Concatenate all buckets\n\tindex := 0\n\tfor _, bucket := range buckets {\n\t\tfor _, value := range bucket {\n\t\t\tarr[index] = value\n\t\t\tindex++\n\t\t}\n\t}\n\n\treturn arr\n}\n\nfunc main() {\n\tarr := []float64{0.42, 0.32, 0.23, 0.52, 0.25, 0.47, 0.51}\n\tbucketSort(arr)\n\t// Output: [0.23, 0.25, 0.32, 0.42, 0.47, 0.51, 0.52]\n\tfor _, v := range arr {\n\t\tfmt.Printf(\"%.2f \", v)\n\t}\n}\n```\n\n```csharp\nusing System;\nusing System.Collections.Generic;\n\npublic class BucketSort {\n public static void Sort(float[] arr) {\n int n = arr.Length;\n if (n <= 1) return;\n\n // Step 1: Create n empty buckets\n List[] buckets = new List[n];\n for (int i = 0; i < n; i++) {\n buckets[i] = new List();\n }\n\n // Step 2: Distribute elements into buckets\n foreach (float value in arr) {\n int bucketIndex = (int)(n * value);\n // Handle edge case where value == 1.0\n if (bucketIndex == n) bucketIndex = n - 1;\n buckets[bucketIndex].Add(value);\n }\n\n // Step 3: Sort each bucket\n foreach (var bucket in buckets) {\n bucket.Sort();\n }\n\n // Step 4: Concatenate all buckets\n int index = 0;\n foreach (var bucket in buckets) {\n foreach (float value in bucket) {\n arr[index++] = value;\n }\n }\n }\n\n public static void Main() {\n float[] arr = {0.42f, 0.32f, 0.23f, 0.52f, 0.25f, 0.47f, 0.51f};\n Sort(arr);\n // Output: 0.23 0.25 0.32 0.42 0.47 0.51 0.52\n Console.WriteLine(string.Join(\" \", arr));\n }\n}\n```\n\n```rust\npub struct BucketSort;\n\nimpl BucketSort {\n pub fn bucket_sort(arr: &mut Vec) {\n let n = arr.len();\n if n <= 1 {\n return;\n }\n\n // Step 1: Create n empty buckets\n let mut buckets: Vec> = vec![Vec::new(); n];\n\n // Step 2: Distribute elements into buckets\n for &value in arr.iter() {\n let mut bucket_index = (n as f32 * value) as usize;\n // Handle edge case where value == 1.0\n if bucket_index == n {\n bucket_index = n - 1;\n }\n buckets[bucket_index].push(value);\n }\n\n // Step 3: Sort each bucket\n for bucket in buckets.iter_mut() {\n bucket.sort_by(|a, b| a.partial_cmp(b).unwrap());\n }\n\n // Step 4: Concatenate all buckets\n let mut index = 0;\n for bucket in buckets.iter() {\n for &value in bucket.iter() {\n arr[index] = value;\n index += 1;\n }\n }\n }\n}\n\nfn main() {\n let mut arr = vec![0.42f32, 0.32f32, 0.23f32, 0.52f32, 0.25f32, 0.47f32, 0.51f32];\n BucketSort::bucket_sort(&mut arr);\n // Output: [0.23, 0.25, 0.32, 0.42, 0.47, 0.51, 0.52]\n for v in arr {\n print!(\"{} \", v);\n }\n}\n```\n\n```javascript\nfunction bucketSort(arr) {\n const n = arr.length;\n if (n <= 1) return arr;\n\n // Step 1: Create n empty buckets\n const buckets = Array.from({ length: n }, () => []);\n\n // Step 2: Distribute elements into buckets\n for (const value of arr) {\n let bucketIndex = Math.floor(n * value);\n // Handle edge case where value == 1.0\n if (bucketIndex === n) bucketIndex = n - 1;\n buckets[bucketIndex].push(value);\n }\n\n // Step 3: Sort each bucket\n for (const bucket of buckets) {\n bucket.sort((a, b) => a - b);\n }\n\n // Step 4: Concatenate all buckets\n let index = 0;\n for (const bucket of buckets) {\n for (const value of bucket) {\n arr[index++] = value;\n }\n }\n\n return arr;\n}\n\nconst arr = [0.42, 0.32, 0.23, 0.52, 0.25, 0.47, 0.51];\nconsole.log(bucketSort(arr));\n// Output: [0.23, 0.25, 0.32, 0.42, 0.47, 0.51, 0.52]\n```\n\n```typescript\nfunction bucketSort(arr: number[]): number[] {\n const n = arr.length;\n if (n <= 1) return arr;\n\n // Step 1: Create n empty buckets\n const buckets: number[][] = Array.from({ length: n }, () => []);\n\n // Step 2: Distribute elements into buckets\n for (const value of arr) {\n let bucketIndex = Math.floor(n * value);\n // Handle edge case where value == 1.0\n if (bucketIndex === n) bucketIndex = n - 1;\n buckets[bucketIndex].push(value);\n }\n\n // Step 3: Sort each bucket\n for (const bucket of buckets) {\n bucket.sort((a, b) => a - b);\n }\n\n // Step 4: Concatenate all buckets\n let index = 0;\n for (const bucket of buckets) {\n for (const value of bucket) {\n arr[index++] = value;\n }\n }\n\n return arr;\n}\n\nconst arr: number[] = [0.42, 0.32, 0.23, 0.52, 0.25, 0.47, 0.51];\nconsole.log(bucketSort(arr));\n// Output: [0.23, 0.25, 0.32, 0.42, 0.47, 0.51, 0.52]\n```\n\n\n---\n\n# Example Walkthrough\n\nTrace the algorithm on the array `[0.42, 0.32, 0.23, 0.52, 0.25, 0.47, 0.51]`. With n = 7 elements, the algorithm creates 7 buckets.\n\n### Distribution Phase\n\nFor each element, compute `bucketIndex = floor(7 * value)`:\n\n\n```shell\nvalue = 0.42 -> floor(7 * 0.42) = floor(2.94) = 2 -> Bucket 2\nvalue = 0.32 -> floor(7 * 0.32) = floor(2.24) = 2 -> Bucket 2\nvalue = 0.23 -> floor(7 * 0.23) = floor(1.61) = 1 -> Bucket 1\nvalue = 0.52 -> floor(7 * 0.52) = floor(3.64) = 3 -> Bucket 3\nvalue = 0.25 -> floor(7 * 0.25) = floor(1.75) = 1 -> Bucket 1\nvalue = 0.47 -> floor(7 * 0.47) = floor(3.29) = 3 -> Bucket 3\nvalue = 0.51 -> floor(7 * 0.51) = floor(3.57) = 3 -> Bucket 3\n```\n\n\nAfter distribution, the buckets look like this:\n\n\n| Bucket | Contents |\n|--------|----------|\n| 0 | [] |\n| 1 | [0.23, 0.25] |\n| 2 | [0.42, 0.32] |\n| 3 | [0.52, 0.47, 0.51] |\n| 4 | [] |\n| 5 | [] |\n| 6 | [] |\n\n\n### Sorting Phase\n\nSort each non-empty bucket individually:\n\n\n| Bucket | Before Sort | After Sort |\n|--------|------------|------------|\n| 1 | [0.23, 0.25] | [0.23, 0.25] |\n| 2 | [0.42, 0.32] | [0.32, 0.42] |\n| 3 | [0.52, 0.47, 0.51] | [0.47, 0.51, 0.52] |\n\n\nBucket 1 was already sorted. Bucket 2 needed one swap. Bucket 3 required a small rearrangement. Sorting many small groups is cheaper than sorting one large group, which is the core efficiency of bucket sort.\n\n### Concatenation Phase\n\nWalk through buckets 0 through 6 in order, collecting all elements:\n\n\n```shell\nBucket 0: (empty)\nBucket 1: 0.23, 0.25\nBucket 2: 0.32, 0.42\nBucket 3: 0.47, 0.51, 0.52\nBucket 4: (empty)\nBucket 5: (empty)\nBucket 6: (empty)\n\nResult: [0.23, 0.25, 0.32, 0.42, 0.47, 0.51, 0.52]\n```\n\n\n\n```mermaid\nflowchart TD\n subgraph Distribution\n I[\"[0.42, 0.32, 0.23, 0.52, 0.25, 0.47, 0.51]\"]:::primary\n end\n\n subgraph Buckets After Sort\n B0[\"Bucket 0: empty\"]:::secondary\n B1[\"Bucket 1: [0.23, 0.25]\"]:::teal\n B2[\"Bucket 2: [0.32, 0.42]\"]:::teal\n B3[\"Bucket 3: [0.47, 0.51, 0.52]\"]:::teal\n B4[\"Bucket 4-6: empty\"]:::secondary\n end\n\n subgraph Result\n R[\"[0.23, 0.25, 0.32, 0.42, 0.47, 0.51, 0.52]\"]:::green\n end\n\n I --> B0\n I --> B1\n I --> B2\n I --> B3\n I --> B4\n\n B0 --> R\n B1 --> R\n B2 --> R\n B3 --> R\n B4 --> R\n\n classDef primary fill:#00ceff,stroke:#000,color:#000\n classDef secondary fill:#ffa94d,stroke:#000,color:#000\n classDef teal fill:#38d9a9,stroke:#000,color:#000\n classDef green fill:#69db7c,stroke:#000,color:#000\n```\n\n\n---\n\n# Complexity Analysis\n\nThe performance of bucket sort depends heavily on how evenly the input data is distributed across buckets. In this chapter, **b** refers to the number of buckets, to avoid clashing with **k** used in counting sort (value range) and radix sort (digit base).\n\n\n| Case | Time Complexity | Explanation |\n|------|----------------|-------------|\n| **Best** | O(n + b) | Elements are uniformly distributed, each bucket has ~1 element, no real sorting needed within buckets. |\n| **Average** | O(n + n^2/b + b) | With b = n buckets and uniform distribution, this simplifies to O(n). Each bucket has O(1) elements on average. |\n| **Worst** | O(n^2) | All elements land in a single bucket and the inner sort is insertion sort. If the inner sort is a comparison sort like merge sort, the worst case is O(n log n). |\n| **Space** | O(n + b) | n elements stored across b buckets, plus the bucket array itself. |\n\n\nA more detailed look at the average case: with n elements and b buckets, uniform distribution gives each bucket approximately n/b elements. Sorting one bucket with insertion sort costs O((n/b)^2). Across b buckets, the total sorting cost is:\n\n\n```shell\nb * O((n/b)^2) = O(n^2 / b)\n```\n\n\nAdding the O(n) distribution step and O(b) bucket creation:\n\n\n```shell\nTotal = O(n + n^2/b + b)\n```\n\n\nWhen b = n (number of buckets equals number of elements), this becomes:\n\n\n```shell\nO(n + n^2/n + n) = O(n + n + n) = O(n)\n```\n\n\nWith b = n buckets and uniform distribution, bucket sort runs in linear time.\n\n### Stability\n\nWhether bucket sort is **stable** depends on the implementation, specifically the inner sorting algorithm and how elements are collected during concatenation. A stable inner sort (like insertion sort), combined with collecting elements in the original insertion order within each bucket, produces an overall stable sort. An unstable inner sort (like quicksort) makes bucket sort unstable.\n\nFor example, consider sorting `[0.32a, 0.45, 0.32b]` with 3 buckets. Both `0.32a` and `0.32b` land in the same bucket in that order. If the inner sort is insertion sort, the bucket stays `[0.32a, 0.32b]` and the final output preserves the original order. If the inner sort is quicksort, the bucket may end up as `[0.32b, 0.32a]`, breaking stability.\n\nWhen stability matters, pair bucket sort with insertion sort or another stable inner sort.\n\n---\n\n# When to Use Bucket Sort\n\nBucket sort is not a general-purpose sorting algorithm. It performs well in specific scenarios and poorly in others.\n\n### Good Use Cases\n\n- **Uniformly distributed floating point numbers.** Sorting random numbers between 0 and 1 is the canonical use case, where bucket sort is hard to beat.\n- **Data with a known, bounded range.** Values that fall between 0 and 10,000 (for example) allow buckets to cover that range efficiently.\n- **Sorting by a hash or computed key.** When the distribution function produces well-spread keys, bucket sort performs well.\n- **External sorting.** When data does not fit in memory, bucket sort's divide-and-conquer-by-range approach maps naturally to disk-based sorting, where each bucket can be a separate file.\n- **Histogram-based applications.** Applications that already bin data into ranges (statistics, image processing) fit bucket sort naturally.\n\n### Poor Use Cases\n\n- **Skewed or clustered distributions.** When most values cluster in a narrow range, most elements end up in the same bucket, and the inner sort runs on nearly all the data. Example: sorting ages of college students (mostly 18-22) with buckets spanning 0-100.\n- **Unknown value range.** Without the min and max, bucket indices cannot be computed efficiently. An extra pass would be required to find the range first.\n- **Small arrays.** The overhead of creating and managing buckets is not worth it for small inputs. A simple insertion sort or the language's built-in sort is faster.\n- **Integer data with large range.** Integers spread across billions need an impractical number of buckets. Radix sort is a better choice.\n\n\n```mermaid\nflowchart TD\n Q1{\"Known value range?\"}:::primary\n Q2{\"Uniform distribution?\"}:::primary\n Q3{\"Floating point data?\"}:::primary\n\n Y1[\"Bucket Sort\"]:::green\n N1[\"Consider Radix Sort
or Comparison Sort\"]:::secondary\n N2[\"Consider Counting Sort
or Comparison Sort\"]:::secondary\n N3[\"Comparison Sort
(Merge/Quick Sort)\"]:::red\n\n Q1 -->|Yes| Q2\n Q1 -->|No| N3\n Q2 -->|Yes| Q3\n Q2 -->|No| N1\n Q3 -->|Yes| Y1\n Q3 -->|No| N2\n\n classDef primary fill:#00ceff,stroke:#000,color:#000\n classDef secondary fill:#ffa94d,stroke:#000,color:#000\n classDef green fill:#69db7c,stroke:#000,color:#000\n classDef red fill:#ff8787,stroke:#000,color:#000\n```\n\n\n---\n\n# Quiz","pageType":"dsa"}

Get Premium