Minimum Window Substring: A Step-by-Step Interview Walkthrough

2 days ago
15 min read

Minimum Window Substring is the problem that separates people who've used sliding windows from people who understand them. Most candidates have seen a window expand and contract on simpler problems, but this one forces two harder questions at once: how do you efficiently check whether the current window contains everything you need (including duplicates), and how do you know when to expand versus when to shrink? It's tagged Hard for good reason — the naive approach is straightforward but slow, and the efficient approach requires a genuinely clever bookkeeping trick to make the validity check O(1) instead of a full rescan. Interviewers reach for it because it tests whether you can take the sliding-window pattern you already know and extend it to a setting where "is this window valid?" is itself non-trivial. The signal is whether you can manage two pointers and a running notion of validity simultaneously without re-examining the whole window every step.

Finding the smallest span that contains a required set of items is a real, recurring task. Search engines build result snippets by finding the shortest passage of a document that contains all your query terms. Log-analysis tools find the smallest time window in which every stage of an error sequence appears. Bioinformatics tools find the shortest DNA segment containing all of a set of target markers. Network analyzers find the smallest packet capture window covering every step of a handshake. Any time you need the tightest contiguous region that still covers a required collection, you're solving exactly this problem.

Problem Statement

Given two strings s and t, return the minimum-length substring of s that contains every character of t, including duplicates.

If no such substring exists, return the empty string "".

If there are multiple windows of the same minimum length, any one of them is acceptable.

Example:

s = "ADOBECODEBANC", t = "ABC"

result → "BANC".

That four-character window contains an A, a B, and a C, and no shorter window of s covers all three.

1. Clarify Requirements Before Jumping Into Code

Before writing a single line, we should restate the problem in our own words and ask the kinds of questions that would change our approach.

Input: two strings, s (the text we search in) and t (the characters we must cover).

Output: the shortest substring of s containing all of t's characters with the right multiplicities, or "" if none exists.

Details to clarify:

Does multiplicity matter? Yes. If t = "AABC", the window must contain two A's, not just one. This is the detail that makes the validity check harder than a simple "contains" test.
Does the window have to be contiguous? Yes. It's a substring, not a subsequence. The characters must appear together in one unbroken span.
Does order within the window matter? No. The window just needs to contain the required characters; their arrangement is irrelevant.
What if t is longer than s? Then no window can possibly cover it. Return "".
Are there character-set constraints? Assume general characters (the approach works for any alphabet).

The thing to flag is that "contains all of t" is a multiset condition, not a set condition. We're not checking which distinct characters are present. We're checking that each required character appears at least as many times in s as it does in t. Getting this multiplicity handling right is the heart of the problem, and it's where a naive "do I have all the letters?" check silently goes wrong.

2. Identify the Category of the Question

A few signals jump out:

We're looking for a contiguous substring — a window.
We want to minimize the window's length subject to a coverage condition.
The coverage condition (contains all of t) gets easier to satisfy as the window grows and harder as it shrinks.

That combination — a contiguous window, a condition that's monotonic in window size, minimize the length — is the fingerprint of the sliding-window pattern, specifically the variable-size (expand/contract) flavor. The same family as Longest Substring Without Repeating Characters and Minimum Size Subarray Sum, but with a more complex validity condition. The monotonic structure ("bigger windows are more likely valid, smaller ones less likely") is exactly what lets two pointers sweep the string once instead of re-examining every substring.

3. Brute Force Solution

Let's think about the naive approach to understand the problem. The direct idea: examine every possible substring of s, check whether it contains all of t, and keep the shortest valid one.

best = ""
for start in 0..n-1:
    for end in start..n-1:
        window = s[start..end]
        if window contains all of t (with multiplicity):
            if window is shorter than best:
                best = window
return best

There are O(n²) substrings, and checking each one against t takes O(n + m) (count the window's characters, compare to t's). That's O(n³) or so — far too slow for large inputs.

The brute force does clarify what we're after, though, and it shows us two sources of waste worth naming. First, we re-check overlapping substrings from scratch every time, even though s[start..end+1] is just s[start..end] plus one character. Second, we're testing every window, including ones we could rule out by reasoning. If a window starting at start is already valid, then every longer window from the same start is also valid but no shorter, so there's no point checking them. That observation about how validity behaves as the window grows is the seed of the efficient solution.

4. Brainstorm More Solutions

Step 1: Notice that validity is monotonic in window size

Let's think about how the "contains all of t" condition behaves as we change the window.

If we fix the left end of our window and slide the right end outward, growing the window one character at a time, each character we add can only help coverage: it can satisfy a still-unmet requirement or pad one we've already met, but it can never un-cover a character we already found. So as a window grows, it goes from invalid to valid; and, it stays valid.

The reverse is also true: as we shrink a window, coverage can only get worse. A valid window, contracted far enough, eventually becomes invalid.

What this means is that for each left position, there's a single threshold right position where the window first becomes valid — and that's the shortest valid window starting at that left position. We never need to check the longer windows from the same left, because they can't beat it. So the search isn't really over all O(n²) substrings; it's over a much smaller set of "tightest valid windows."

Step 2: Turn the observation into a two-pointer sweep

We know from Step 1 that for each left position there's a tightest valid window, and that growing a window preserves validity while shrinking it eventually destroys validity. Let's use those two facts to figure out how a pair of pointers, left and right, should move.

Start with the window empty and ask: what do we do when the current window is invalid (missing some of t)? Shrinking it (moving left up) only removes characters, which can never fix an invalid window, since it makes coverage worse. So an invalid window gives us only one productive move: bring in more characters by advancing right. We push right forward until the window finally covers all of t.

Now we're at a valid window, and the question flips. We have a window that covers t, but it might be too big, padded with extra characters on the left that aren't pulling their weight. Could there be a shorter valid window with this same right end? Maybe — so we try advancing left and dropping characters, checking the length each time we still have a valid window. We keep trimming until one more step would break coverage. At that point we've found the tightest valid window for this particular right end, and trimming further is pointless.

But once we've trimmed left as far as it'll go, would we ever want to move left backward again? Moving left back only lengthens the window, and a longer window can never beat the tight one we just recorded. So left, like right, only ever needs to move forward. Right advances to discover new windows, left advances to tighten them, and once each has passed a particular position, it has no reason to return.

So the rules we've derived assemble into a clean expand-then-contract sweep:

Expand: advance right, pulling in characters, until the window becomes valid.
Contract: advance left, dropping characters and recording the length whenever the window is still valid, until one more step would break validity.
Repeat: the window is now just-invalid again, so go back to expanding right, then contract again — continuing until right runs off the end of the string.

Because each pointer crosses the string at most once, this sweep is linear — provided we can check "is the window valid?" cheaply at every step. Let's figure that out next.

Step 3: How do we check validity without rescanning?

The expand-then-contract sweep is only fast if checking "does the current window cover all of t?" is cheap. If we recount the window's characters and compare to t on every pointer move, each check is O(m) and we've gained nothing over brute force.

Can we design an O(1) validity check that updates incrementally as characters enter and leave the window? To do that, we need to track at all times whether the window currently covers t, and update that knowledge with each single-character change rather than recomputing it.

The natural instinct when we need O(1) checks against a flexible set is to use a map. So first, let's capture what we need: a map need where need[c] is the number of times character c appears in t. This map captures all of the requirements we must satisfy, i.e., all of the characters in t as well as the number of times each character appears in t.

Now, as the window changes, we also need to track what's inside our window. To do that, we could maintain a second map window counting characters currently inside our current window. The window is valid exactly when, for every character c in need, window[c] >= need[c].

But checking that across all distinct characters every time is still O(distinct chars). Can we do better?

Step 4: Make the validity check O(1) by tracking what changes

Even after our optimization in step 3, we're still doing a lot of rescanning. Let's see if we can find a way to eliminate the need to scan over the set of distinct characters.

Each step adds or removes exactly one character. A single character touches exactly one requirement — the requirement for that character. So between two consecutive windows, at most one requirement can flip from unmet to met, or from met to unmet. Everything else about coverage is unchanged. Recomputing the entire validity picture every step is therefore wasteful: we're recalculating a pile of facts that didn't move to detect a change in the one that did.

That suggests the fix. Rather than re-derive validity from scratch, let's store the one number that summarizes it and nudge that number as things change. To do that, we need to understand how many requirements are currently satisfied. Let's call that formed: the count of distinct characters c whose window count has reached need[c]. Since the window is valid exactly when every distinct requirement is met, validity becomes the single comparison formed == required. That's O(1), which is what we wanted — but only if we can keep formed accurate cheaply.

So the real work is updating formed correctly as one character enters or leaves. Let's think carefully about when it should actually change.

Think about adding a character c. When we add c to our window, we increment window[c]. Does that change how many requirements are satisfied? Only if this particular character is what tipped c's requirement from unmet to met — that is, only if window[c] is now equal to need[c] for the first time. If window[c] was already at or above need[c], the requirement was already counted in formed, and piling on another copy satisfies nothing new. That means test must be equality, window[c] == need[c], not >=: equality fires exactly once, at the moment of crossing, so we never count the same requirement twice. When it fires, we can increment formed.

Removing a character d is the mirror image. We decrement window[d]. The only way that change breaks coverage is if it drops d below its requirement — if window[d] was exactly equal to need[d] and now falls under it. If we had a surplus of d, removing one still leaves the requirement met, and formed shouldn't budge. So the test is window[d] < need[d] after decrementing; when it's true, decrement formed.

The pattern in both cases is the same: formed moves by exactly one, and only at the precise instant a requirement crosses its threshold in either direction. Because we only ever touch it at those crossings, formed stays a faithful count of satisfied requirements at all times — and so formed == required is always a correct, O(1) validity test. With that, every pointer move from Step 2 does constant work, and the linear sweep we designed is genuinely linear.

Step 5: Assemble the algorithm

Putting the pieces together:

Initialize our key variables:
1. Build need from t,
2. set a variable required to the number of distinct characters in t,
3. Initialize formed = 0,
4. an empty window map,
5. left = 0, right = 0
6. and a record of the best (shortest) valid window seen.
Move right across s. For each character: add it to window, and bump formed if it just completed a requirement.
Whenever formed == required (window valid), enter a contraction loop: record the current window if it's the shortest so far, then remove s[left], decrement formed if that broke a requirement, and advance left. Keep contracting while the window stays valid.
When the sweep finishes, return the best window recorded, or "" if none was ever valid.

The elegance is that expansion and contraction interleave naturally: right advances in the outer loop, and the inner while (formed == required) loop greedily shrinks from the left every time we achieve validity.

Step 6: Walk through the example

Let's trace s = "ADOBECODEBANC", t = "ABC". So need = {A:1, B:1, C:1}, required = 3.

We expand right, tracking formed:

Add A, D, O, B (formed reaches 2 after B), E, then C at index 5 → formed = 3. Window is s[0..5] = "ADOBEC", valid. Contract: record length 6. Remove A (window[A] drops below need) → formed = 2, left = 1. Window now invalid; stop contracting.
Keep expanding: O, D, E, B, then A at index 10 → formed = 3 again. Window s[1..10] = "DOBECODEBA". Contract: record length 10, then shrink — D, O, B, E all leave without breaking validity (they're either unneeded or still above their requirement), recording lengths 9, 8, 7, 6 along the way, until removing C at index 5 breaks validity → formed = 2, left = 6. Best so far: length 6 ("CODEBA").
Keep expanding: N, then C at index 12 → formed = 3. Window s[6..12]. Contract: shrink from the left — O, D leave (recording lengths 7, then 6, no improvement), then removing E gives length 5 ("EBANC", an improvement), then we're at s[9..12] = "BANC", length 4 — a new best. Removing B next breaks validity → stop. left = 10.
right reaches the end. Done.

Best window: "BANC", length 4. Correct — and notice the answer emerged from continuously tightening windows, never re-examining any substring from scratch.

Step 7: Complexity

Let n = |s| and m = |t|. Let's derive the cost from how the pointers move.

Building need scans t once: O(m).

In the main sweep, right advances from 0 to n exactly once, and left only ever advances (never resets), so across the entire run left also moves at most n times. Every pointer move does O(1) work — a map update and the formed comparison, both constant thanks to the counter trick. So the total pointer work is O(n).

Total time: O(n + m) — linear in the combined input size. This is optimal; we must read all of s and t at least once.

Space is O(k), where k is the number of distinct characters in t, for the need and window maps. For a fixed alphabet that's O(1).

The whole speedup came from replacing a per-step O(m) validity rescan with an O(1) incremental counter. Whenever a sliding-window problem has an expensive validity check, ask whether you can maintain validity incrementally as characters enter and leave, instead of recomputing it.

5. Discuss Trade-Offs Between Solutions

Approach	Time	Space	When I'd use it
Check every substring	O(n³)	O(m)	Never — but it defines what "valid window" means
Sliding window, rescan validity each step	O(n × m)	O(m)	A middle step — correct but the rescan is wasteful
Sliding window + formed counter	O(n + m)	O(k)	My default — incremental O(1) validity check

The counter-based sliding window is the right answer. The intermediate version (sliding window but recounting validity each step) is worth mentioning to show why the formed counter matters: without it, you still pay O(m) per step and lose the linear-time guarantee. The counter is what makes the two-pointer sweep actually fast.

6. Pseudocode

if |s| < |t| or t is empty: return ""

need = character counts of t
required = number of distinct characters in need
formed = 0
window = empty map
left = 0
best = (length = infinity, start = 0)

for right in 0 .. |s| - 1:
    c = s[right]
    window[c]++
    if c in need and window[c] == need[c]:
        formed++ # this char just completed a requirement

    while formed == required:  # window is valid — tighten it
        if (right - left + 1) < best.length:
            best = (right - left + 1, left)
        d = s[left]
        window[d]--
        if d in need and window[d] < need[d]:
            formed-- # removing d broke a requirement
        left++

return "" if best.length is infinity else s[best.start .. best.start + best.length - 1]

7. Edge Cases

Things to verify before claiming we're done:

t longer than s → no window can cover it; the guard at the top returns "". ✓
t empty → typically return "" (nothing to cover); handle per the problem's convention.
No valid window exists (some character of t never appears in s) → formed never reaches required, best length stays infinity, return "". ✓
t has duplicate characters (t = "AABC") → the need counts and the equality check handle multiplicity correctly; a window needs two A's before A's requirement is formed. ✓
The entire string is the answer (s == t) → expands to validity at the last character, contracts to exactly s. ✓
Multiple windows of equal minimum length → we keep the first one found (strict < comparison), which is a valid answer.

The multiplicity case is the one that breaks naive solutions. Using == when a character reaches its needed count (rather than >=) is precisely what makes duplicates work — a second required A only completes the requirement when window[A] hits need[A], not before.

8. Full Code

import java.util.*;

public class MinimumWindowSubstring {

    public static String minWindow(String s, String t) {
        if (t.isEmpty() || s.length() < t.length()) {
            return "";
        }

        // need[c] = how many of character c the 
        // window must contain
        Map<Character, Integer> need = new HashMap<>();
        for (char c : t.toCharArray()) {
            need.put(c, need.getOrDefault(c, 0) + 1);
        }

        // distinct requirements to satisfy
        int required = need.size();
        // requirements currently satisfied
        int formed = 0;


        Map<Character, Integer> window = new HashMap<>();
        int left = 0;
        int bestLen = Integer.MAX_VALUE;
        int bestLeft = 0;

        for (int right = 0; right < s.length(); right++) {
            char c = s.charAt(right);
            window.put(c, window.getOrDefault(c, 0) + 1);

            // Only count it if this char JUST reached its required 
            // count (== not >=) — padding an already-met 
            // requirement changes nothing
            if (need.containsKey(c) && window.get(c).intValue() == need.get(c).intValue()) {
                formed++;
            }

            // Window is valid: record it, then shrink from the 
            // left as far as possible
            while (formed == required) {
                if (right - left + 1 < bestLen) {
                    bestLen = right - left + 1;
                    bestLeft = left;
                }

                char d = s.charAt(left);
                window.put(d, window.get(d) - 1);
                // Dropping d may have pushed it below its 
                // requirement
                if (need.containsKey(d) && window.get(d).intValue() < need.get(d).intValue()) {
                    formed--;
                }
                left++;
            }
        }

        return bestLen == Integer.MAX_VALUE
                ? ""
                : s.substring(bestLeft, bestLeft + bestLen);
    }
}

A note on a subtle detail: we compare counts with .intValue() because Java caches small Integer objects to save memory but not large ones. In Java, == compares memory addresses for objects, not values, which means == checks on Integers is unreliable. .intValue() on the other hand forces Java to extract the raw primitive int value, giving us an accurate numeric comparison. It's a small thing, but exactly the kind of bug that passes small tests and fails on larger inputs.

9. Test the Code

// Canonical example
System.out.println(minWindow("ADOBECODEBANC", "ABC"));  // "BANC"

// Whole string is the answer
System.out.println(minWindow("a", "a"));                // "a"

// No valid window — 'c' never appears
System.out.println(minWindow("ab", "abc"));             // ""

// t longer than s
System.out.println(minWindow("a", "aa"));               // ""

// Duplicates in t — needs TWO a's
System.out.println(minWindow("aaflslflsssggzxa", "aaa"));  // "aaflslflsssggzxa" (needs all three a's)

// t with repeated chars, tighter case
System.out.println(minWindow("aaabbbccc", "abc"));      // "abbbc"? -> shortest covering a,b,c

// Single-character target
System.out.println(minWindow("bba", "a"));              // "a"

These hit the meaningful cases: the canonical example, a whole-string answer, an impossible case (returns ""), a target longer than the source, and — the important one — targets with duplicate characters, which verify that multiplicity is handled correctly. The duplicate cases are what separate a correct solution from one that treats t as a set of distinct letters and silently accepts windows missing repeated characters.

10. Key Lessons

When a problem asks for the smallest contiguous span satisfying a coverage condition, reach for the variable-size sliding window. The pattern works because validity is monotonic in window size — bigger windows are more likely valid — which is what lets two forward-only pointers replace an O(n²) substring search.
Expand to reach validity, contract to minimize, repeat. Neither pointer ever moves backward, because a smaller left only yields longer (never better) windows. Recognizing why both pointers are monotonic is what justifies the linear-time sweep.
When a sliding window's validity check is expensive, maintain it incrementally. The formed counter turns an O(m) "does the window cover everything?" rescan into an O(1) comparison by tracking how many requirements are currently met and updating that count only when a requirement crosses its threshold.
Multiset coverage requires counting, not set membership. Use == (not >=) when a character reaches its needed count, so duplicates are handled exactly: a requirement is completed only at the precise moment its count is met, never double-counted.
Watch for language-specific traps in your comparisons. Comparing boxed integers with == in Java is a classic silent bug; comparing primitive values avoids it. The fastest algorithm in the world still fails if a subtle equality check is wrong.

The thing that makes Minimum Window Substring click is the realization that the expensive part is checking validity, and that you can make that check O(1) by maintaining a single formed counter that flips one requirement at a time. Once you see that expansion and contraction are just responses to a validity signal you can track incrementally, the Hard label dissolves: it's a standard expand-then-contract sweep wrapped around a clever counter. Training yourself to ask "can I maintain this condition incrementally instead of recomputing it?" is the habit that unlocks not just this problem, but the entire family of windowed-coverage questions.

Good Luck and Happy Coding!