Top K Frequent Elements
- mosesg1123
- 7 hours ago
- 4 min read
Finding the most frequent items in a dataset is a common real-world task—think trending topics in social media, most visited pages on a website, or top-selling products. In interviews, Top K Frequent Elements tests your ability to count frequencies and then select the top performers efficiently.
Problem Statement
Given an integer array nums and an integer k, return the k most frequent elements in any order. You may assume k is always valid (1 ≤ k ≤ number of distinct elements).
public int[] topKFrequent(int[] nums, int k);
Aim for better than O(n log n) when possible.
1. Clarify Requirements Before Jumping Into Code
Let me confirm a few details:
k is between 1 and the number of unique elements.
If two elements have the same frequency, any relative order in the output is fine.
Input array can be large; we want something close to O(n).
We need to return exactly k elements, not all frequencies.
2. Identify the Category of the Question
This is a frequency counting plus selection problem. We need to count occurrences (hash map) and then find the top k by frequency. Patterns include sorting by frequency, heap (priority queue), bucket sort, or even quickselect on frequency pairs.
3. Consider a Brute-Force Approach to Understand the Problem
The simplest idea is:
Count all frequencies in a map.
Convert map entries to a list.
Sort the list descending by frequency.
Take the first k keys.
Time Complexity: O(n + d log d), where d is number of distinct elements
Space Complexity: O(n + d)
Drawback: Sorting all d entries is more work than needed if k ≪ d.
4. Brainstorm More Solutions
I need to avoid full sort when d is large. What are my options?
Min-Heap of size k: Push each distinct element with its frequency into a min-heap keyed by frequency. Keep heap size ≤ k by popping the smallest. After processing d entries, heap contains top k. This is O(d log k) time, O(d + k) space.
Bucket Sort on Frequencies: Frequencies range from 1 to n. What if I create an array of buckets where bucket[f] holds all keys with frequency f? Then, I can scan the buckets starting from the highest frequency, collecting elements until I've reached k elements. This is O(n + d) time, O(n + d) space.
Quickselect on Entries: Use a selection algorithm on the distinct entries array to partition around the kth most frequent pivot. Average O(d) time, O(d) space for recursion.
5. Discuss Trade-Offs Between Your Solutions
Approach | Time | Space | Pros | Cons |
Sort by frequency | O(n + d log d) | O(d) | Very simple | Sort cost when d large |
Min-Heap (size k) | O(n + d log k) | O(d + k) | Good when k ≪ d, incremental selection | Log factor and heap overhead |
Bucket Sort | O(n + d) | O(n + d) | True linear time, no heap, direct grouping by frequency | Extra array of size n |
Quickselect | O(n + d) avg | O(d) | Average linear selection without full sort | Complex pivot logic, worst O(d²) |
Bucket sort is fastest and simplest to implement in one pass after counting.
6. Write Pseudocode to Structure Your Thoughts
function topKFrequent(nums, k):
// 1. Count frequencies
freqMap = map from int to int
for num in nums:
freqMap[num]++
// 2. Build buckets
maxFreq = length(nums)
buckets = array of lists of size maxFreq + 1
for (num, freq) in freqMap:
buckets[freq].add(num)
// 3. Gather top k
result = empty list
for f from maxFreq down to 1:
for num in buckets[f]:
result.add(num)
if result.size == k:
return result as array
7. Consider Edge Cases
All elements identical (e.g. [2,2,2], k=1) → returns [2].
k equals number of distinct → returns all distinct elements.
n small, k=1 → returns the single most frequent.
Negative and positive values → map handles any integers.
nums empty → invalid per constraints (k at least 1), but we’d return an empty result.
8. Write Full Code Syntax
import java.util.*;
public class Solution {
public int[] topKFrequent(int[] nums, int k) {
// 1. Frequency count
Map<Integer,Integer> freqMap = new HashMap<>();
for (int num : nums) {
freqMap.put(num, freqMap.getOrDefault(num,0) + 1);
}
// 2. Bucket by frequency
int n = nums.length;
List<Integer>[] buckets = new List[n + 1];
for (int i = 0; i <= n; i++) {
buckets[i] = new ArrayList<>();
}
for (Map.Entry<Integer,Integer> entry : freqMap.entrySet()) {
int num = entry.getKey(), freq = entry.getValue();
buckets[freq].add(num);
}
// 3. Collect top k
List<Integer> resultList = new ArrayList<>();
for (int f = n; f >= 1 && resultList.size() < k; f--) {
for (int num : buckets[f]) {
resultList.add(num);
if (resultList.size() == k) {
break;
}
}
}
// Convert to array
int[] result = new int[k];
for (int i = 0; i < k; i++) {
result[i] = resultList.get(i);
}
return result;
}
}
9. Test Your Code
public static void main(String[] args) {
Solution sol = new Solution();
System.out.println(Arrays.toString(
sol.topKFrequent(new int[]{1,1,1,2,2,3}, 2)
)); // [1,2]
System.out.println(Arrays.toString(
sol.topKFrequent(new int[]{1}, 1)
)); // [1]
System.out.println(Arrays.toString(
sol.topKFrequent(new int[]{4,1,-1,2,-1,2,3}, 2)
)); // [-1,2] or [2,-1]
System.out.println(Arrays.toString(
sol.topKFrequent(new int[]{5,5,4,4,3,3,2,2,1,1}, 3)
)); // [5,4,3]
System.out.println(Arrays.toString(
sol.topKFrequent(new int[]{}, 0)
)); // [] (if k=0 allowed)
}
10. Key Lessons to Remember for Future Questions
Count then select is often a two-step pattern: map frequencies, then choose top elements.
Bucket sort on frequency is a linear-time trick when frequencies range from 0 to n.
When k ≪ d, a min-heap of size k can be simpler and memory-efficient.
Always iterate buckets or heap from highest frequency to collect top results.
Clear pseudocode that separates counting, bucketing, and selection clarifies complex logic.
Edge cases—single element, all identical, empty—ensure robust solutions under constraints.
Comentarios