Deriving KMP in Real Time
- mosesg1123
- Apr 17
- 4 min read
Updated: Apr 22
I’m staring at my whiteboard, fresh off writing the brute‑force substring search: two nested loops, compare needle at every haystack position, O(n·m) worst case.
My interviewer is waiting, so I need a faster approach—but KMP’s details are fuzzy. I remember it “skips” parts of the haystack by preprocessing the needle into a table. How would I rediscover that process in real time?
Noticing the Pain Point
Let me pick a simple example: haystack = “AAAABAAAAC”, needle = “AAAC”. Brute‑force will match the first three “A”s, then fail on the fourth character, then slide the needle by one and re‑match almost all those “A”s again. That repetition is the killer. I want to avoid re‑matching characters that I know already lined up.
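To make that wasted work concrete, here’s a quick Python sketch of the brute force I just wrote on the board (the function name brute_force_search is simply my own label for it):

```python
def brute_force_search(haystack: str, needle: str) -> int:
    """Return the index of the first match of needle in haystack, or -1."""
    n, m = len(haystack), len(needle)
    for start in range(n - m + 1):      # try every alignment of the needle
        j = 0
        while j < m and haystack[start + j] == needle[j]:
            j += 1                      # these comparisons get repeated at the next alignment
        if j == m:
            return start
    return -1
```

Every time the inner loop fails, the outer loop slides forward by one and re-compares characters it has already looked at.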
Hunting for Reusable Matches
So I ask myself: if I match k characters of the needle and then hit a mismatch, can I somehow shift the needle forward k positions? No, that would skip potential matches.
Instead, can I shift by less - just enough so that a prefix of the needle lines up with a suffix of what I’ve already matched?
Concretely, suppose I match “AAA” in the needle before the mismatch. Does “AAA” have any proper prefix that’s also a suffix? Sure - “AA” is both a prefix and a suffix of “AAA”. That means after the mismatch, I could shift the needle forward so that this “AA” lines up with the last two matched characters in the haystack, and resume matching at the third position of the needle, not back at zero.
Ok, so that reuses what I’ve already matched and skips exactly the right amount.
Formalizing Prefix‑Suffix Shifts
To make this systematic, I need, for each position j in the needle, the length of the longest proper prefix of needle[0…j] that’s also a suffix of that substring. Let’s call that array LPS (Longest Proper Prefix‑Suffix).
For a tiny pattern “ABABC”, I’d walk through:
j=0 (“A”): no proper prefix/suffix → LPS[0]=0
j=1 (“AB”): prefixes={“A”}, suffixes={“B”} → no match → LPS[1]=0
j=2 (“ABA”): prefixes={“A”,“AB”}, suffixes={“A”,“BA”} → “A” matches → LPS[2]=1
j=3 (“ABAB”): prefixes={“A”,“AB”,“ABA”}, suffixes={“B”,“AB”,“BAB”} → “AB” matches → LPS[3]=2
j=4 (“ABABC”): no prefix/suffix match → LPS[4]=0
With that table, whenever I have matched j characters and see a mismatch, I can set j = LPS[j-1] instead of j=0, and keep i where it is in the haystack.
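Before optimizing, a rough Python sketch that computes LPS straight from that definition helps keep me honest (the name lps_by_definition is mine, and this version is deliberately quadratic):

```python
def lps_by_definition(pattern: str) -> list[int]:
    """LPS[j] = length of the longest proper prefix of pattern[:j+1]
    that is also a suffix of pattern[:j+1]."""
    m = len(pattern)
    lps = [0] * m
    for j in range(m):
        sub = pattern[: j + 1]
        for length in range(j, 0, -1):   # try the longest candidate first
            if sub[:length] == sub[-length:]:
                lps[j] = length
                break
    return lps

# lps_by_definition("ABABC") -> [0, 0, 1, 2, 0]
```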
Building LPS Efficiently
I don’t want to compute LPS in O(m²) by checking all prefixes/suffixes at each j. Instead, I can build it in O(m) by reusing previous LPS information—much like dynamic programming on the pattern itself:
Initialize LPS[0] = 0.
Keep two pointers: prefixLength = 0 (length of current candidate prefix), and position = 1 (scanning the pattern).
While position < m:
If pattern[position] == pattern[prefixLength], then prefixLength++, LPS[position] = prefixLength, position++
Else if prefixLength > 0, set prefixLength = LPS[prefixLength - 1] and try again (don’t advance position)
Else (prefixLength == 0), set LPS[position] = 0, position++
This loop walks the pattern once, bouncing prefixLength back via previously computed LPS values until we either find a match or settle on zero.
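Here’s that loop as a Python sketch (build_lps is my own name for it; it mirrors the steps above):

```python
def build_lps(pattern: str) -> list[int]:
    """Compute the LPS table in O(m) by reusing previously computed entries."""
    m = len(pattern)
    lps = [0] * m
    prefix_length = 0      # length of the current candidate prefix-suffix
    position = 1           # LPS[0] is always 0, so start scanning at 1
    while position < m:
        if pattern[position] == pattern[prefix_length]:
            prefix_length += 1
            lps[position] = prefix_length
            position += 1
        elif prefix_length > 0:
            # fall back to the next shorter candidate; don't advance position
            prefix_length = lps[prefix_length - 1]
        else:
            lps[position] = 0
            position += 1
    return lps
```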
Executing the Accelerated Search
Now, armed with LPS, I do my haystack scan:
i = 0 (haystack), j = 0 (needle)
While i < n:
If haystack[i] == needle[j], i++, j++;
if j == m, we’ve found a full match starting at index i - m
Else if j > 0, set j = LPS[j-1] (skip in the needle)
Else (j == 0), just i++
This way, i never moves backward, and j only moves forward or jumps back to wherever the LPS table tells it—total time O(n + m).
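As a Python sketch of that scan, assuming the build_lps helper from the previous section:

```python
def kmp_search(haystack: str, needle: str) -> int:
    """Return the index of the first match of needle in haystack, or -1, in O(n + m)."""
    if not needle:
        return 0
    lps = build_lps(needle)
    n, m = len(haystack), len(needle)
    i = j = 0                      # i scans the haystack, j scans the needle
    while i < n:
        if haystack[i] == needle[j]:
            i += 1
            j += 1
            if j == m:
                return i - m       # match ends at i - 1, so it starts at i - m
        elif j > 0:
            j = lps[j - 1]         # reuse the prefix-suffix; i never moves back
        else:
            i += 1
    return -1
```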
Thinking Through an Example
Let’s say haystack “AAAABAAAAC” vs needle “AAAC”:
Build LPS for “AAAC”: LPS = [0,1,2,0].
Start i=0,j=0: match A/A → i=1,j=1 → A/A → i=2,j=2 → A/A → i=3,j=3
haystack[3]=A vs needle[3]=C → mismatch, j>0 so j = LPS[2] = 2
Don’t touch i, now compare haystack[3]=A with needle[2]=A → match → i=4,j=3
haystack[4]=B vs needle[3]=C → mismatch, j>0 so j = LPS[2] = 2
Compare haystack[4]=B vs needle[2]=A → mismatch, j>0 so j = LPS[1] = 1
Compare haystack[4]=B vs needle[1]=A → mismatch, j>0 so j = LPS[0] = 0
Now j=0, mismatch → i=5,j=0
Continue scanning: from i=5 the “A”s start re-matching one by one, and we find the full “AAAC” match starting at haystack index 6.
All the wasted re‑matches are gone.
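A quick check of the sketches above against this example (expected outputs in the comments):

```python
print(lps_by_definition("AAAC"))                 # [0, 1, 2, 0]
print(build_lps("AAAC"))                         # [0, 1, 2, 0], same table, linear time
print(kmp_search("AAAABAAAAC", "AAAC"))          # 6 (match starts at haystack index 6)
print(brute_force_search("AAAABAAAAC", "AAAC"))  # 6, same answer, many more comparisons
```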
Wrapping Up
I’ve just reverse‑engineered KMP completely by:
Recognizing that brute force repeats the same comparisons
Realizing that within any matched prefix there might be a smaller prefix that’s also a suffix
Precomputing those lengths in an LPS table using a linear sweep
Using LPS to skip ahead in the needle on mismatch, never moving the haystack pointer backward
That gives a true O(n + m) substring search. Even if I can’t recite every detail from memory, this line of reasoning shows I can derive KMP on the fly - one of the hallmarks of a strong algorithmic thinker.
A couple of tips to carry forward to future questions:
Walking through an example is a great way to get your mind working
Look for places in your algorithm where you are repeating computational work. That's often a great starting point when you are looking for ways to optimize your solutions.
Precomputing and storing the results of that computation is an effective approach for eliminating repeat calculations.
Good Luck and Happy Coding!

