AlgoMaster Logo

Repeated String Match

Last Updated: April 2, 2026

medium
5 min read

Understanding the Problem

We need to find the smallest number of times we can repeat string a so that b appears somewhere inside the repeated string. If b can never appear, we return -1.

Think about it this way: if b = "cdab" and a = "abcd", then b straddles a boundary between two copies of a. The "cd" comes from the end of one copy and the "ab" comes from the start of the next. So we need at least 2 copies of a to cover b, but in some cases we need one extra copy to handle the overlap at the boundary.

The key observation is this: we never need more than ceil(len(b) / len(a)) + 1 repetitions of a. Once we have enough copies to cover the length of b plus one extra copy for potential boundary alignment, either b is a substring or it never will be.

Key Constraints:

  • 1 <= a.length, b.length <= 10^4 -- Both strings can be up to 10,000 characters. The repeated string could be up to about 20,000 characters, so string operations on it are feasible.
  • a and b consist of lowercase English letters -- No special characters to worry about. Standard ASCII comparison.

Approach 1: Brute Force (Repeated Concatenation)

Intuition

The simplest idea: keep appending copies of a to a growing string until either b is found as a substring or we know it will never appear.

How do we know when to stop? If b has length m and a has length n, we need the repeated string to be at least as long as b. That means we need at least ceil(m / n) copies. But b might start partway through one copy of a and end partway through the next, so we might need one additional copy to catch that overlap. If b is not found after ceil(m / n) + 1 copies, it will never be found.

Why not? Because adding more copies of a beyond that point only extends the string further to the right. The substring b has already had every possible starting alignment within one full period of a. One extra copy covers all boundary cases.

Algorithm

  1. Initialize repeated as an empty string and count as 0.
  2. Keep appending a to repeated until repeated.length >= b.length.
  3. Check if b is a substring of repeated. If yes, return count.
  4. Append one more copy of a (to handle boundary overlap) and check again.
  5. If still not found, return -1.

Example Walkthrough

1Start: a="abc", b="cabcab". Build repeated until len >= 6
1/6

Code

The bottleneck here is the substring search. What if we could check whether b matches at each position in O(1) amortized time using a hash function?

Approach 2: Rabin-Karp (Rolling Hash)

Intuition

Instead of checking every possible alignment character by character, we can use a rolling hash to speed up the substring search. The idea behind Rabin-Karp: compute a hash of b, then slide a window of length len(b) across the repeated a string. At each position, compute the hash of the current window. If the hashes match, do a full character comparison to confirm (since hash collisions are possible).

The clever part is that we do not need to actually build the entire repeated string. We can simulate the repeated string by using modular indexing: position i in the repeated string corresponds to character a[i % len(a)]. This saves space and keeps the logic clean.

Rolling the hash forward by one position takes O(1) work: subtract the contribution of the outgoing character, multiply by the base, and add the incoming character. So we check all positions in O(n + m) time total.

Algorithm

  1. First, check if every character in b exists in a. If not, return -1 immediately.
  2. Compute the minimum repetitions needed: minReps = ceil(len(b) / len(a)). We will search over minReps + 1 repetitions (one extra for boundary alignment).
  3. Compute the hash of b using a polynomial rolling hash.
  4. Compute the hash of the first len(b) characters of the (virtual) repeated string.
  5. Slide the window one position at a time across all valid starting positions. At each position:
    • If the hash matches, verify character by character using modular indexing.
    • If confirmed, return ceil((startPos + len(b)) / len(a)) as the number of repetitions needed.
  6. If no match is found after checking all positions, return -1.

Example Walkthrough

1Virtual string: "abcabcabc" (3 copies). Slide window of len 6.
0
a
1
b
2
c
3
a
4
b
5
c
window
6
a
7
b
8
c
1/6

Code

Rabin-Karp is theoretically optimal, but it adds a lot of implementation complexity. Is there a way to get similar practical performance with much simpler code?

Approach 3: Optimized Concatenation (Practical Optimal)

Intuition

The Rabin-Karp approach is theoretically elegant but adds implementation complexity with modular arithmetic and modular inverse. In practice, for this problem's constraints (strings up to 10^4), a simpler approach works just as well and is much easier to write in an interview.

The idea: compute the exact number of repetitions needed mathematically, build the string, and use the language's built-in substring search. Most languages implement efficient string matching internally (often using variations of Boyer-Moore or similar), so the built-in contains/indexOf/find is fast in practice.

The difference from Approach 1 is that we compute the count mathematically upfront rather than concatenating in a loop, and we add an early exit by checking the character set first. We know we need ceil(len(b) / len(a)) copies at minimum, and at most one extra. So we check exactly two candidates.

Algorithm

  1. Check if all characters in b exist in a. If not, return -1.
  2. Compute minReps = ceil(len(b) / len(a)).
  3. Build repeated = a * minReps (repeat a exactly minReps times).
  4. If b is a substring of repeated, return minReps.
  5. Otherwise, build repeated = a * (minReps + 1) and check again.
  6. If still not found, return -1.

Example Walkthrough

1a="ab", b="babab" (len=5). minReps = ceil(5/2) = 3
1/5

Code