AlgoMaster Logo

Introduction to Bitmask DP

Last Updated: April 5, 2026

Ashish

Ashish Pratap Singh

Bitmask DP is used when the state involves a subset of elements. Instead of explicitly storing sets, we represent them as binary masks, allowing us to efficiently track which elements are included and perform fast transitions.

Bitmask DP lets you track subsets of items using plain integers. It turns problems like "assign n people to n tasks" or "visit all n cities" into DP problems where the state is just a number between 0 and 2^n - 1.

But there is a catch. It only works when n is small (typically n <= 20), because the number of subsets grows exponentially. But when the constraint fits, bitmask DP transforms seemingly impossible brute-force problems into clean, efficient solutions.

What Is Bitmask DP?

At its heart, bitmask DP is dynamic programming where each state represents a subset of elements, encoded as a binary number.

Consider a set of 4 elements: {A, B, C, D}. Each element is either included or not, which we can represent with a bit. The bitmask 1011 (which equals 11 in decimal) means elements at indices 0, 1, and 3 are included, while element 2 is not.

Here is a quick mapping for n = 3 elements {a, b, c}:

DecimalBinarySubset
0000{} (empty)
1001{a}
2010{b}
3011{a, b}
4100{c}
5101{a, c}
6110{b, c}
7111{a, b, c}

Notice how every possible subset maps to exactly one integer. This is the power of bitmask representation: you compress a subset into a single number, and that number becomes your DP state.

The following diagram shows the subset lattice for n = 3. Each node is a bitmask, and edges connect subsets that differ by exactly one element.

Each level represents subsets of a particular size. The DP transitions move upward through this lattice, building larger subsets from smaller ones.

Bit Manipulation Essentials

Before diving into problems, you need a handful of bit operations. These are the building blocks you will use in bitmask DP solutions.

OperationCode (Java)Example (mask = 1010, i = 1)Result
Check if i-th bit is set(mask >> i) & 1(1010 >> 1) & 11 (set)
Set the i-th bitmask | (1 << i)1010 | 00101010 (no change, already set)
Clear the i-th bitmask & ~(1 << i)1010 & 11011000
Toggle the i-th bitmask ^ (1 << i)1010 ^ 00101000
Count set bitsInteger.bitCount(mask)bitCount(1010)2
Check if mask is 0mask == 01010 == 0false
Full mask (all n bits set)(1 << n) - 1(1 << 4) - 11111 (15)

Let us walk through the most common pattern: iterating over all elements and trying to add each unused one to the current subset.

This loop checks each element. If it is not already in the subset (bit is 0), we create a new mask with that element included and make a DP transition. This pattern shows up in nearly every bitmask DP problem.

How to Identify Bitmask DP Problems

The strongest signal is a small n combined with subset selection. When you see n <= 20 (or sometimes n <= 15) and the problem involves choosing, assigning, or visiting elements, bitmask DP should be on your radar.

Here are the common problem types:

Problem TypeSignalExamples
Assignment/MatchingAssign n items to n slots optimallyMinimum Cost to Connect Sticks, Assign Tasks
Travelling SalesmanVisit all n nodes exactly onceShortest Hamiltonian Path
Subset PartitioningPartition n items into groupsPartition to K Equal Sum Subsets
Set CoverCover all elements using fewest subsetsMinimum Number of Work Sessions
CountingCount valid arrangements of n itemsNumber of Ways to Wear Hats

The key constraint to look for is n <= 20. If n can be up to 1000, bitmask DP is not the right approach because 2^1000 states are far too many. But 2^20 is about one million, which is perfectly manageable.

Another signal is when the problem asks you to track "which elements have been used." If you find yourself thinking about maintaining a boolean visited array across recursive calls, that visited array can often be compressed into a bitmask.

The Core Idea

The general framework for bitmask DP is:

  1. State: dp[mask] represents the answer (minimum cost, number of ways, boolean feasibility, etc.) when the set of elements included so far is described by mask.
  2. Base case: Usually dp[0] = initial value (empty subset).
  3. Transition: For each mask, try adding each element i that is not yet in the mask. Compute dp[mask | (1 << i)] from dp[mask].
  4. Answer: Typically dp[(1 << n) - 1], the state where all elements are included.

The time complexity is **O(2^n * n) in most cases: there are 2^n possible masks, and for each mask you iterate over n elements. The space complexity is O(2^n)** for the DP table.

This diagram shows how DP states transition from the empty set toward the full set. Each arrow represents adding one element. You can think of it as a BFS through subsets, where each level adds one more element.

Example Walkthrough: Partition to K Equal Sum Subsets (LeetCode 698)

Problem Statement

Given an integer array nums and an integer k, return true if it is possible to divide the array into k non-empty subsets whose sums are all equal.

Example: nums = [4, 3, 2, 3, 5, 2, 1], k = 4 Total sum = 20, target sum per subset = 20 / 4 = 5. One valid partition: {5}, {4, 1}, {3, 2}, {3, 2}. Return true.

Constraints: 1 <= k <= nums.length <= 16

Notice n <= 16. That is a strong hint toward bitmask DP.

Building the Intuition

The brute force approach would try every possible way to partition n elements into k groups. That is astronomically expensive. But we can reframe the problem.

Instead of thinking about "which group does each element go to," think about it differently: we process elements one by one, adding each to the current bucket. When a bucket reaches the target sum, we start a new bucket. The key question at each step is: "given the set of elements I have already used, can I successfully partition the remaining elements?"

This is where bitmask DP fits perfectly. We define:

  • State: dp[mask] = the current sum within the bucket we are currently filling, given that the elements indicated by mask have already been placed.

If dp[mask] equals the target, that means a bucket just got completed, so we reset to 0 and start filling the next bucket. If dp[mask] + nums[i] would exceed the target, we skip element i for this bucket.

The answer is true if dp[(1 << n) - 1] is well-defined (we managed to place all elements).

Here is a cleaner way to think about it. We use dp[mask] to store the sum of the current (partially filled) bucket. When the sum hits target, the bucket is full and we start fresh at 0 for the next bucket. The value stored is actually sum_of_used_elements % target, which tells us how full the current bucket is.

Step-by-Step Trace

Let us trace through a few key transitions with nums = [4, 3, 2, 3, 5, 2, 1], k = 4, target = 5.

We use -1 to mean "this state is unreachable."

The key insight is that every time the current bucket sum hits the target, we reset to 0. If we manage to fill all elements (reach the full mask) with a non-negative dp value, it means we successfully partitioned everything into buckets of exactly target sum.

Implementation

Complexity Analysis

  • Time Complexity: O(2^n n). We iterate over all 2^n masks, and for each mask we try all n elements. With n = 16, this is about 16 65536 = roughly 1 million operations.
  • Space Complexity: O(2^n) for the DP array. With n = 16, this is 65536 integers, well within memory limits.