Needle In a Haystack - Part 1
- mosesg1123
- Apr 17
- 3 min read
Updated: Apr 22
Searching for a substring within a larger string is a fundamental task in many real‑world applications—think text editors, search engines, or log analysis tools. The strStr() problem, also known as “needle in a haystack,” is a perfect warm‑up that tests your ability to implement basic string search efficiently.
Problem Statement
Implement a function strStr(haystack, needle) that returns the index of the first occurrence of needle in haystack, or –1 if needle is not part of haystack. If needle is an empty string, return 0.
1. Clarify Requirements Before Jumping Into Code
I need to confirm a few details first
Empty needle returns 0 by definition
Case matters—search is exact, not case‑insensitive
Only simple substring match, not regex or wildcards
Aim for optimal time if possible, but a correct brute‑force is acceptable as a first step
2. Identify the Category of the Question
This is a string search problem. At its simplest it’s a nested‑loop comparison, but it can also be optimized using pattern‑matching algorithms like Knuth–Morris–Pratt (KMP) or Rabin–Karp for linear or average‑linear time complexity.
3. Consider a Brute‑Force Approach to Understand the Problem
Basic approach: for each position i in haystack up to len(haystack) – len(needle), compare haystack[i:i+len(needle)] to needle character by character.
First match → return i; if none, return –1.
Time complexity O(n × m), space O(1).
4. Brainstorm More Solutions
Brute‑force substring compare: easy to write, O(n × m) time
KMP algorithm: preprocess needle to build lps array, then single-pass search in O(n + m) time, O(m) extra space
Rabin–Karp: rolling hash for average O(n + m), but worst‑case O(n × m) and possible hash collisions
Python built‑in: not allowed in interview but underscores the simplicity of brute‑force
5. Discuss Trade‑Offs Between Your Solutions
Approach | Time | Space | Pros | Cons |
Brute‑Force | O(n·m) | O(1) | Very simple | Too slow for large inputs |
Rabin–Karp | O(n + m) avg | O(1) | Average linear, simple logic | Collision handling, worst-case O(n·m) |
Knuth–Morris–Pratt | O(n + m) | O(m) | Guaranteed linear | More complex to implement |
KMP is best for guaranteed performance, but for now let's assume you don't remember the other algorithms and just need to get a solution out. We'll implement the brute force solution here and follow-up with a second post to talk through how you might work out the other algorithms in real time, even if you don't remember them.
6. Write Pseudocode to Structure Your Thoughts
function strStr(haystack, needle):
if needle is empty:
return 0
n = length(haystack)
m = length(needle)
for i from 0 to n - m:
j = 0
while j < m and haystack[i + j] == needle[j]:
j += 1
if j == m:
return i
return -1
7. Consider Edge Cases
Empty needle: returns 0
Empty haystack but non‑empty needle: returns –1
Needle longer than haystack: returns –1
Needle equals haystack exactly: returns 0
Multiple occurrences: should return the first index
8. Write Full Code Syntax
def str_str(haystack: str, needle: str) -> int:
if not needle:
return 0
n, m = len(haystack), len(needle)
for i in range(n - m + 1):
match = True
for j in range(m):
if haystack[i + j] != needle[j]:
match = False
break
if match:
return i
return -1
9. Test Your Code
assert str_str("hello", "ll") == 2
assert str_str("aaaaa", "bba") == -1
assert str_str("", "") == 0
assert str_str("", "a") == -1
assert str_str("abc", "abc") == 0
assert str_str("abcabc", "cab") == 2
assert str_str("mississippi", "issip") == 4
print("All tests passed!")
10. Key Lessons to Remember for Future Questions
For simple substring search, brute‑force nested loops often suffice and are quick to code. If you're in a time crunch and need to get a solution out, implementing the brute-force algorithm is better than nothing.
For top tech companies however, your best shot of getting offer would be if you remembered the optimized algorithms and be able to implement them in real time.
In a follow-up post, we'll talk through how you might go about working out the KMP algorithm in real time when you can only remember high level details but not the specifics.
Good Luck and Happy Coding!


Comments