Last Updated: March 31, 2026
Combinatorics is the study of counting, and it shows up in coding interviews whenever a problem asks “how many ways” something can be done.
A direct approach quickly becomes infeasible as input sizes grow. Combinatorics provides the tools to count efficiently using formulas, patterns, and mathematical reasoning instead of enumeration.
In this chapter, we will build a clear understanding of key concepts like permutations and combinations, and learn how to apply them to solve counting problems in a structured and efficient way.
The two fundamental building blocks of combinatorics are permutations and combinations. The difference comes down to one question: does order matter?
Formula for Permutations:
Formula for Combinations:
Notice that C(n, r) = P(n, r) / r!, because combinations remove the r! orderings that permutations count separately.
Quick example: From a group of 5 people {A, B, C, D, E}, suppose we want to pick 3.
The factorial function is the foundation of all combinatorial formulas. For a non-negative integer n:
Key properties you should know:
Pascal's Triangle is one of the most elegant structures in mathematics, and it gives us a way to compute C(n, r) without ever computing a factorial.
The idea is simple. Each entry in the triangle is the sum of the two entries directly above it:
This is called Pascal's identity. It works because choosing r items from n items can be split into two cases: either a specific item is included (leaving C(n-1, r-1) choices) or it is not (leaving C(n-1, r) choices).
Here is what the first six rows look like:
Before we move to implementation, here are identities you should know.
| Identity | Formula | Meaning |
|---|---|---|
| Symmetry | C(n, r) = C(n, n-r) | Choosing r to include = choosing n-r to exclude |
| Row sum | C(n,0) + C(n,1) + ... + C(n,n) = 2^n | Total subsets of an n-element set |
| Pascal's identity | C(n, r) = C(n-1, r-1) + C(n-1, r) | Include or exclude one element |
| Vandermonde's identity | C(m+n, r) = sum of C(m,k)*C(n,r-k) | Choosing from two groups |
| Hockey stick | C(r,r) + C(r+1,r) + ... + C(n,r) = C(n+1, r+1) | Sum along a diagonal |
The symmetry identity is especially useful for optimization. If you need C(100, 97), compute C(100, 3) instead, which requires far fewer multiplications.
When n is large (say, up to 10^6), the values of C(n, r) become astronomically large. Interview problems typically ask you to return the answer modulo 10^9 + 7 (a prime number). This is where modular arithmetic becomes essential.
The approach has three parts:
To compute modular inverse, we use Fermat's little theorem. For a prime p:
This works because a^(p-1) = 1 (mod p) for any a not divisible by p, so a^(p-2) * a = 1 (mod p).
The key optimization is computing inv_fact backwards. Instead of calling modular exponentiation n times, we compute inv_fact[n] once using fast exponentiation, then derive inv_fact[i] = inv_fact[i+1] * (i+1) mod p going downward. This brings the total precomputation to O(n) time.
The stars and bars technique answers a classic question: how many ways can you distribute n identical items into k distinct bins?
The answer is C(n + k - 1, k - 1).
The intuition is beautiful. Imagine n stars in a row representing the items. You need k-1 bars to separate them into k groups. The total number of symbols is n + k - 1, and you are choosing where to place the k - 1 bars.
Example: Distribute 5 identical balls into 3 bins.
We need C(5 + 3 - 1, 3 - 1) = C(7, 2) = 21 ways.
One such distribution: **|*|** means 2 balls in bin 1, 1 in bin 2, 2 in bin 3.
This technique is the foundation for many interview problems involving distributing resources, partitioning numbers, or counting solutions to equations like x1 + x2 + x3 = n where xi >= 0.
Let us build the first 5 rows step by step.
Using the cancellation trick:
There are 120 ways to choose 3 items from 10.
How many unique paths are there from the top-left to the bottom-right of a 4x3 grid, moving only right or down?
To go from (0,0) to (3,2), we must make exactly 3 right moves and 2 down moves, for a total of 5 moves. The number of paths is the number of ways to choose which 2 of the 5 moves are "down" (or equivalently, which 3 are "right"):
More generally, for an m x n grid: paths = C(m + n - 2, m - 1).
Below are three implementations: factorial-based nCr (for small values), Pascal's Triangle via DP, and nCr with modular arithmetic using precomputed factorials and inverse factorials.
This approach computes nCr directly using the cancellation trick. It works well for small n (roughly n < 20 to avoid overflow without modular arithmetic).
This approach builds the entire triangle up to row n. It is useful when you need multiple nCr values for the same n, or when you want to avoid division entirely.
This is the approach you need for competitive programming and most interview problems involving large n. It precomputes factorials and inverse factorials modulo a prime (10^9 + 7), allowing O(1) nCr queries after O(n) preprocessing.
| Approach | Time | Space | Best For |
|---|---|---|---|
| Factorial-based nCr | O(r) per query | O(1) | Small n, single query |
| Pascal's Triangle | O(n^2) build, O(1) query | O(n^2) | Multiple queries, small n |
| Modular nCr (precomputed) | O(n) build + O(log p) for inverse, O(1) query | O(n) | Large n, mod arithmetic |
For most interview problems, the modular approach is what you need. Precompute once, query in O(1). The Pascal's Triangle approach is great when n is small (say, under 1000) and you want to avoid dealing with modular inverse.
Here is a quick reference of how combinatorics maps to common problem types:
| Problem | Combinatorial Formula |
|---|---|
| Subsets of size k from n elements | C(n, k) |
| Total subsets of n elements | 2^n |
| Grid paths (m x n, only right/down) | C(m+n-2, m-1) |
| Distribute n identical items into k bins | C(n+k-1, k-1) |
| Arrange n items with duplicates (a of type 1, b of type 2, ...) | n! / (a! * b! * ...) |
| Choose and arrange r from n | P(n, r) = n! / (n-r)! |
| Catalan number (balanced parentheses, BSTs) | C(2n, n) / (n+1) |