The O(n) Club: Top K Frequent Words—How To Count, Sort, and Not Cry
The O(n) Club: Top K Frequent Words—How To Count, Sort, and Not Cry
- 2 min read

The O(n) Club: Top K Frequent Words—How To Count, Sort, and Not Cry

On this page
Introduction

The O(n) Club: Top K Frequent Words—How To Count, Sort, and Not Cry

⚡ TL;DR

Count each word. Sort the results by how often each word shows up (descending), but alphabetically for ties. If you’re allergic to bugs, do both, not one. Java snack below:

🧠 How Devs Usually Mess This Up

Someone always forgets something. If you only sort by frequency, get ready for chaos whenever two words tie—Java’s hash order is more unpredictable than a manager’s lunch break. With heaps, if you skip customizing the comparator, you’ll get a ‘top K’ where ‘potato’ beats ‘apple’ and chaos reigns. And let’s not forget those who assume K can’t exceed the number of unique words (spoiler: it absolutely can). Pro tip: treat the tie-break as seriously as you treat your favorite coffee order.

🔍 What’s Actually Going On

Picture a spelling bee. Every word spelled gets a sticker. At the end, Ms. Lexicographical sorts the charts: most stickers on top, and for kids with the same count, she picks whoever’s name comes first alphabetically. In code, that’s counting, then two-level sorting: popularity first, then rigidity of the dictionary. Same as real search engine autocompletes or leaderboard apps—the algorithmic nerd’s version of award ceremonies.

🛠️ PseudoCode

  • Count frequencies: Use a HashMap<String, Integer> and tally each word.
  • Sort or Heap: Decide if you’re going with a classic full sort (easy, O(N log N)), or a min-heap of size K with a custom comparator (harder, O(N log K)).
    • If sorting: Put map keys into a list and sort by (frequency desc, lex asc).
    • If min-heap: Use PriorityQueue<String> and a comparator that puts least-frequent (and reverse lex for ties) on top, so you end up with the actual best K.
  • Return: After sort, grab K words; with a heap, pop and reverse because min-heap likes being backward.
// Count frequencies
Map&lt;String, Integer&gt; freq = new HashMap&lt;&gt;();
for each word in words:
    freq[word] = freq.get(word, 0) + 1
 // Sort
List words = freq.keys()
words.sort((a, b) -&gt; freq[b] - freq[a] != 0 ? freq[b] - freq[a] : a.compareTo(b))
// or heap with custom comparator if you like pain
 // Return top K
return words.subList(0, K)

💻 The Code

import java.util.*;
 public class TopKFrequentWords {
    public List&lt;String&gt; topKFrequent(String[] words, int k) {
        Map&lt;String, Integer&gt; freq = new HashMap&lt;&gt;();
        for (String word : words) {
            freq.put(word, freq.getOrDefault(word, 0) + 1);
        }
        List&lt;String&gt; wordList = new ArrayList&lt;&gt;(freq.keySet());
        wordList.sort((a, b) -&gt; {
            int countCompare = freq.get(b) - freq.get(a);
            return countCompare != 0 ? countCompare : a.compareTo(b);
        });
        return wordList.subList(0, Math.min(k, wordList.size()));
    }
}

⚠️ Pitfalls, Traps & Runtime Smacks

  • Lexicographical tie-break: Miss this, and your answer is about as reliable as a vending machine after 4pm.
  • K > unique words: Out-of-bounds? Not today. Use Math.min(k, list.size()) for that extra hygienic touch.
  • Heap confusion: For min-heap, comparator should put least-relevant word at the top—the opposite of what you want at karaoke night.
  • Complexity honesty: Sorting is fine unless your dataset is the size of the British Library. Otherwise, even interviewers just want to see if you can spell ‘comparator’.

🧠 Interviewer Brain-Tattoo

“Ties go to the dictionary—because even in code, alphabetical order is less controversial than rock-paper-scissors.”