AlgoMaster Logo

Minimize Malware Spread

Last Updated: March 28, 2026

hard
4 min read

Understanding the Problem

The problem boils down to: given an undirected graph and a set of initially infected nodes, figure out which single node's removal from the initial set would leave the fewest total infected nodes after the malware finishes spreading.

The critical insight is that malware spreads across entire connected components. If a connected component contains at least one initially infected node, every node in that component will eventually be infected. So the question becomes: which initially infected node, when removed, would "save" the most nodes from infection?

Here's where it gets interesting. Removing an initially infected node only saves its component if that node is the sole initially infected node in that component. If a component has two or more nodes from initial, removing just one of them won't help because the other one will still infect the entire component. So the best candidate for removal is the node that's the only initially infected node in its component, and whose component is the largest.

Key Constraints:

  • 2 <= n <= 300 → With n up to 300, the adjacency matrix has at most 90,000 entries. O(n^2) is perfectly fine, and even O(n^3) would be under 27 million operations.
  • graph[i][j] == graph[j][i] → Undirected graph. We can use Union Find without worrying about edge direction.
  • 1 <= initial.length <= n → The initial set can be as large as the entire graph. Edge case: every node might be initially infected.

Approach 1: Brute Force (Simulate Each Removal)

Intuition

The most direct approach: try removing each node from initial one at a time, simulate the malware spread using BFS, and count how many nodes end up infected. Pick the removal that yields the fewest infections.

For each candidate node to remove, we create a modified initial set (excluding that node), then run BFS from every remaining initially infected node to find all reachable nodes. The total number of reachable nodes is the infection count for that removal.

Algorithm

  1. Sort the initial array (so we naturally handle tie-breaking by smallest index).
  2. For each node removeNode in initial:

a. Create a set of remaining infected nodes = initial minus removeNode.

b. Run BFS from all remaining infected nodes to count total infected nodes.

  1. Return the removeNode that resulted in the smallest infection count.

Example Walkthrough

1Initial infected nodes: [0, 1]. Try removing each one.
0
0
1
1
1/4

Code

We're running a full BFS for every node in initial, but the component structure doesn't change between removals. What if we computed the connected components just once?

Approach 2: Connected Components (DFS)

Intuition

Here's the key observation that makes this problem elegant: malware spreads across entire connected components. If any node in a component is initially infected, all nodes in that component become infected. So we only need to think at the component level.

The question becomes: for each connected component, how many nodes from initial does it contain?

  • If a component contains zero nodes from initial, it stays clean regardless of what we remove.
  • If a component contains exactly one node from initial, removing that node saves the entire component from infection.
  • If a component contains two or more nodes from initial, removing any one of them doesn't help because the others will still infect the whole component.

So the optimal node to remove is the one that's the sole infected node in the largest component. If no node is uniquely responsible for infecting its component, every removal saves zero nodes, and we return the smallest index in initial.

Algorithm

  1. Find all connected components using DFS. Assign each node a component ID and compute each component's size.
  2. For each component, count how many nodes from initial belong to it.
  3. For each node in initial: if it's the only initially infected node in its component, its "savings" equals the component size.
  4. Return the node with the highest savings. Break ties by returning the smallest index.

Example Walkthrough

1Initialize: no components assigned yet
0
-1
start DFS
1
-1
2
-1
1/6

Code

The DFS approach works well but requires explicit stack management. Union Find can build the same component structure more elegantly.

Approach 3: Union Find (Optimal)

Intuition

Union Find is a natural fit for connected component problems. Instead of running DFS to discover components, we can scan the adjacency matrix and union nodes that are connected. After processing all edges, nodes in the same component share the same root. We can then query component sizes and count initial nodes per component in a single pass.

The logic is identical to Approach 2: find a node in initial that's the sole infected node in its component, and pick the one in the largest component. Union Find just gives us a more elegant way to find and track components.

Algorithm

  1. Initialize a Union Find structure with n nodes.
  2. Scan the adjacency matrix. For each edge graph[i][j] == 1 where i < j, union nodes i and j.
  3. After all unions, for each node in initial, find its root (component representative).
  4. Count how many nodes from initial share each root. Also track each component's total size.
  5. Iterate through initial (sorted). For each node whose component has exactly 1 initial node, its savings equals the component size.
  6. Return the node with the highest savings, breaking ties by smallest index.

Example Walkthrough

1Initialize Union Find: each node is its own parent
0
0
1
1
2
2
1/6

Code