top of page

Top K Frequent Elements

  • mosesg1123
  • Apr 28
  • 4 min read

Finding the most frequent items in a dataset is a common real-world task—think trending topics in social media, most visited pages on a website, or top-selling products. In interviews, Top K Frequent Elements tests your ability to count frequencies and then select the top performers efficiently.


Problem Statement

Given an integer array nums and an integer k, return the k most frequent elements in any order. You may assume k is always valid (1 ≤ k ≤ number of distinct elements).

public int[] topKFrequent(int[] nums, int k);

Aim for better than O(n log n) when possible.


1. Clarify Requirements Before Jumping Into Code

Let me confirm a few details:

  • k is between 1 and the number of unique elements.

  • If two elements have the same frequency, any relative order in the output is fine.

  • Input array can be large; we want something close to O(n).

  • We need to return exactly k elements, not all frequencies.


2. Identify the Category of the Question

This is a frequency counting plus selection problem. We need to count occurrences (hash map) and then find the top k by frequency. Patterns include sorting by frequency, heap (priority queue), bucket sort, or even quickselect on frequency pairs.


3. Consider a Brute-Force Approach to Understand the Problem

The simplest idea is:

  1. Count all frequencies in a map.

  2. Convert map entries to a list.

  3. Sort the list descending by frequency.

  4. Take the first k keys.

  5. Time Complexity: O(n + d log d), where d is number of distinct elements

  6. Space Complexity: O(n + d)

  7. Drawback: Sorting all d entries is more work than needed if k ≪ d.


4. Brainstorm More Solutions

I need to avoid full sort when d is large. What are my options?

  • Min-Heap of size k: Push each distinct element with its frequency into a min-heap keyed by frequency. Keep heap size ≤ k by popping the smallest. After processing d entries, heap contains top k. This is O(d log k) time, O(d + k) space.

  • Bucket Sort on Frequencies: Frequencies range from 1 to n. What if I create an array of buckets where bucket[f] holds all keys with frequency f? Then, I can scan the buckets starting from the highest frequency, collecting elements until I've reached k elements. This is O(n + d) time, O(n + d) space.

  • Quickselect on Entries: Use a selection algorithm on the distinct entries array to partition around the kth most frequent pivot. Average O(d) time, O(d) space for recursion.


5. Discuss Trade-Offs Between Your Solutions

Approach

Time

Space

Pros

Cons

Sort by frequency

O(n + d log d)

O(d)

Very simple

Sort cost when d large

Min-Heap (size k)

O(n + d log k)

O(d + k)

Good when k ≪ d, incremental selection

Log factor and heap overhead

Bucket Sort

O(n + d)

O(n + d)

True linear time, no heap, direct grouping by frequency

Extra array of size n

Quickselect

O(n + d) avg

O(d)

Average linear selection without full sort

Complex pivot logic, worst O(d²)

Bucket sort is fastest and simplest to implement in one pass after counting.


6. Write Pseudocode to Structure Your Thoughts

function topKFrequent(nums, k):
  // 1. Count frequencies
  freqMap = map from int to int
  for num in nums:
    freqMap[num]++

  // 2. Build buckets
  maxFreq = length(nums)
  buckets = array of lists of size maxFreq + 1
  for (num, freq) in freqMap:
    buckets[freq].add(num)

  // 3. Gather top k
  result = empty list
  for f from maxFreq down to 1:
    for num in buckets[f]:
      result.add(num)
      if result.size == k:
        return result as array

7. Consider Edge Cases

  • All elements identical (e.g. [2,2,2], k=1) → returns [2].

  • k equals number of distinct → returns all distinct elements.

  • n small, k=1 → returns the single most frequent.

  • Negative and positive values → map handles any integers.

  • nums empty → invalid per constraints (k at least 1), but we’d return an empty result.


8. Write Full Code Syntax

import java.util.*;

public class Solution {
    public int[] topKFrequent(int[] nums, int k) {
        // 1. Frequency count
        Map<Integer,Integer> freqMap = new HashMap<>();
        for (int num : nums) {
            freqMap.put(num, freqMap.getOrDefault(num,0) + 1);
        }

        // 2. Bucket by frequency
        int n = nums.length;
        List<Integer>[] buckets = new List[n + 1];
        for (int i = 0; i <= n; i++) {
            buckets[i] = new ArrayList<>();
        }
        for (Map.Entry<Integer,Integer> entry : freqMap.entrySet()) {
            int num = entry.getKey(), freq = entry.getValue();
            buckets[freq].add(num);
        }

        // 3. Collect top k
        List<Integer> resultList = new ArrayList<>();
        for (int f = n; f >= 1 && resultList.size() < k; f--) {
            for (int num : buckets[f]) {
                resultList.add(num);
                if (resultList.size() == k) {
                    break;
                }
            }
        }

        // Convert to array
        int[] result = new int[k];
        for (int i = 0; i < k; i++) {
            result[i] = resultList.get(i);
        }
        return result;
    }
}

9. Test Your Code

public static void main(String[] args) {
    Solution sol = new Solution();

    System.out.println(Arrays.toString(
        sol.topKFrequent(new int[]{1,1,1,2,2,3}, 2)
    ));  // [1,2]

    System.out.println(Arrays.toString(
        sol.topKFrequent(new int[]{1}, 1)
    ));  // [1]

    System.out.println(Arrays.toString(
        sol.topKFrequent(new int[]{4,1,-1,2,-1,2,3}, 2)
    ));  // [-1,2] or [2,-1]

    System.out.println(Arrays.toString(
        sol.topKFrequent(new int[]{5,5,4,4,3,3,2,2,1,1}, 3)
    ));  // [5,4,3]

    System.out.println(Arrays.toString(
        sol.topKFrequent(new int[]{}, 0)
    ));  // [] (if k=0 allowed)
}

10. Key Lessons to Remember for Future Questions

  • Count then select is often a two-step pattern: map frequencies, then choose top elements.

  • Bucket sort on frequency is a linear-time trick when frequencies range from 0 to n.

  • When k ≪ d, a min-heap of size k can be simpler and memory-efficient.

  • Always iterate buckets or heap from highest frequency to collect top results.

  • Clear pseudocode that separates counting, bucketing, and selection clarifies complex logic.

  • Edge cases—single element, all identical, empty—ensure robust solutions under constraints.

Recent Posts

See All
Merge Intervals

"Merge Intervals" is one of those classic algorithm problems that shows up frequently in technical interviews. It's a great test of your...

 
 
 
Jump Game II

Jump Game II is a classic follow-up to the original Jump Game problem. It’s not just about whether you can reach the end... now you have to do it in the fewest jumps possible! That small change turns

 
 
 
Jump Game

The "Jump Game" question is a popular one for interviews because it tests your ability to think greedily and work with dynamic movement through an array. It's a great warm-up for range-based greedy lo

 
 
 

Drop Me a Line, Let Me Know What You Think

Thanks for submitting!

© 2023 by Train of Thoughts. Proudly created with Wix.com

bottom of page