Data Structures & Algorithms

Table of Contents

Huffman codes

lossless encoding of characters, the ones that occur more frequently have a shorter encoding

e.g. encode “mississippi”

to generate a tree:

  1. Build a table of values with frequencies:

    m i s p
    1 4 4 2
  2. Repeat until tree is complete: take two smallest nodes from list, create a new node that is sum of two smallest, add the two smallest as children (right > left), then add the parent node back into the list.

    screenshot.png

  3. Assign weights to left-right branches — left is 0, right is 1.

  4. Traverse tree to generate a code for each character:

    lettercode
    a 1100
    b 1101
    c 100
    d 101
    e 111
    f 0

Complexity

with a min-heap-based min-priority queue, in O(nlogn)