Contains Duplicate – A Warm Up
- mosesg1123
- Apr 16
- 3 min read
Updated: 7 days ago
In today’s session, I’m tackling the Contains Duplicate problem, a common warm‑up that helps demonstrate your grasp of array manipulation and hash-based techniques.
Problem Statement
Given an integer array nums, return True if any value appears at least twice, and False if every element is distinct.
1. Clarify Requirements Before Jumping Into Code
I need to verify a few details. First, I assume that nums is an array of integers. I need to return True as soon as I find a duplicate; otherwise, I return False. I understand that I’m not supposed to modify the array, and I’m allowed to use extra space as needed. These questions ensure I have the complete picture before diving in.
2. Identify the Category of the Question
This question is fundamentally about array traversal and frequency counting. It immediately suggests the use of a hash set or hash table since I need to quickly check for the presence of an element that I’ve seen before. Recognizing that pattern points me toward an O(n) time solution using a hash set.
3. Consider a Brute‑Force Approach to Understand the Problem
For a brute‑force approach, I could use two nested loops, comparing every element with every other element. This solution would look something like this:
for i from 0 to length(nums) - 1:
for j from i+1 to length(nums) - 1:
if nums[i] equals nums[j]:
return True
return False
This approach is simple but runs in O(n²) time, which isn’t scalable for large arrays. It helps me understand the problem’s basics, but it’s too inefficient.
4. Brainstorm More Solutions
I can improve on the brute‑force solution by leveraging a hash set. I’ll iterate through the array once and store each element in the hash set. Before adding an element, I’ll check if it’s already in the set. If it is, I return True right away. This method cuts down the time complexity to O(n).
Another approach is to sort the array first. Then, I can compare adjacent elements to detect duplicates. However, sorting takes O(n log n) time and may require additional space if I need to preserve the original array.
5. Discuss Trade‑Offs Between Your Solutions
Brute‑Force:
Time: O(n²)
Space: O(1)
Pros: Straightforward and easy to implement.
Cons: Inefficient for large datasets.
Hash Set:
Time: O(n)
Space: O(n)
Pros: Fast; scales well with input size.
Cons: Requires extra space.
Sorting:
Time: O(n log n)
Space: O(1) if done in-place, O(n) otherwise.
Pros: Can avoid extra data structure if in-place sort is acceptable.
Cons: Slower than the hash set approach, and sorting might modify the array.
Given the requirements, the hash set approach is the most optimal.
6. Write Pseudocode to Structure Your Thoughts
function containsDuplicate(nums):
initialize empty set seen
for each number in nums:
if number is in seen:
return True
add number to seen
return False
7. Consider Edge Cases
Empty array ([]) – Should return False.
Array with one element ([x]) – Should return False.
Array with all unique elements – Should return False.
Array with duplicates (e.g. [1,2,3,1]) – Should return True.
Array with multiple duplicates (e.g. [1,1,1,2,2]) – Should return True immediately upon detecting the first duplicate.
8. Write Full Code Syntax
def contains_duplicate(nums):
seen = set()
for num in nums:
if num in seen:
return True
seen.add(num)
return False
9. Test Your Code
# Test 1: Duplicate present
nums1 = [1, 2, 3, 1]
assert contains_duplicate(nums1) == True
# Test 2: All unique elements
nums2 = [1, 2, 3, 4]
assert contains_duplicate(nums2) == False
# Test 3: Multiple duplicates
nums3 = [1, 1, 1, 2, 2]
assert contains_duplicate(nums3) == True
# Test 4: Single element
nums4 = [5]
assert contains_duplicate(nums4) == False
# Test 5: Empty array
nums5 = []
assert contains_duplicate(nums5) == False
print("All tests passed!")
10. Key Lessons to Remember for Future Questions
In problems that involve checking for duplicates or frequencies, a hash set is typically the fastest and most straightforward approach.
Since this question is pretty straight-forward, you may know the optimal solution off the top of your head. While that's impressive, it's still important to communicate the other possibilities! Even if one solution is obviously superior, it's still important to show the interviewer you know how to identify multiple solutions and consider the trade-offs.
Happy coding!
Comments