Back to Course


0% Complete
0/82 Steps
  1. Getting Started with Algorithm
    What is an Algorithm?
  2. Characteristics of Algorithm
    1 Topic
  3. Analysis Framework
  4. Performance Analysis
    3 Topics
  5. Mathematical Analysis
    2 Topics
  6. Sorting Algorithm
    Sorting Algorithm
    10 Topics
  7. Searching Algorithm
    6 Topics
  8. Fundamental of Data Structures
  9. Queues
  10. Graphs
  11. Trees
  12. Sets
  13. Dictionaries
  14. Divide and Conquer
    General Method
  15. Binary Search
  16. Recurrence Equation for Divide and Conquer
  17. Finding the Maximum and Minimum
  18. Merge Sort
  19. Quick Sort
  20. Stassen’s Matrix Multiplication
  21. Advantages and Disadvantages of Divide and Conquer
  22. Decrease and Conquer
    Insertion Sort
  23. Topological Sort
  24. Greedy Method
    General Method
  25. Coin Change Problem
  26. Knapsack Problem
  27. Job Sequencing with Deadlines
  28. Minimum Cost Spanning Trees
    2 Topics
  29. Single Source Shortest Paths
    1 Topic
  30. Optimal Tree Problem
    1 Topic
  31. Transform and Conquer Approach
    1 Topic
  32. Dynamic Programming
    General Method with Examples
  33. Multistage Graphs
  34. Transitive Closure
    1 Topic
  35. All Pairs Shortest Paths
    6 Topics
  36. Backtracking
    General Method
  37. N-Queens Problem
  38. Sum of Subsets problem
  39. Graph Coloring
  40. Hamiltonian Cycles
  41. Branch and Bound
    2 Topics
  42. 0/1 Knapsack problem
    2 Topics
  43. NP-Complete and NP-Hard Problems
    1 Topic
Lesson 30, Topic 1
In Progress

Huffman Trees and Codes

Lesson Progress
0% Complete

Huffman’s algorithm
Step 1
Initialize n one-node trees and label them with the symbols of the alphabet given. Record the frequency of each symbol in its tree’s root to indicate the tree’s weight. (More generally, the weight of a tree will be equal to the sum of the frequencies in the tree’s leaves.)

Step 2 Repeat the following operation until a single tree is obtained. Find two trees with the smallest weight. Make them the left and right subtree of a new tree and record the sum of their weights in the root of the new tree as its weight.

A tree constructed by the above algorithm is called a Huffman tree. It defines—in the manner described above—a Huffman code.

  1. Data can be encoded efficiently using Huffman Codes.
  2. It is a widely used and beneficial technique for compressing data.
  3. Huffman’s greedy algorithm uses a table of the frequencies of occurrences of each character to build up an optimal way of representing each character as a binary string.

Suppose we have 105 characters in a data file. Normal Storage: 8 bits per character (ASCII) – 8 x 105 bits in a file. But we want to compress the file and save it compactly. Suppose only six characters appear in the file:

Huffman Codes

How can we represent the data in a Compact way?

(i) Fixed length Code: Each letter represented by an equal number of bits. With a fixed length code, at least 3 bits per character:

For example:

a			000

b			001

c			010

d			011

e			100

f			101

For a file with 105 characters, we need 3 x 105 bits.

(ii) A variable-length code: It can do considerably better than a fixed-length code, by giving many characters short code words and infrequent character long codewords.

For example:

  a       0

  b      101

  c      100

  d      111

  e      1101

  f      1100
Number of bits = (45 x 1 + 13 x 3 + 12 x 3 + 16 x 3 + 9 x 4 + 5 x 4) x 1000
= 2.24 x 105bits

Thus, 224,000 bits to represent the file, a saving of approximately 25%.This is an optimal character code for this file.

Prefix Codes:

The prefixes of an encoding of one character must not be equal to complete encoding of another character, e.g., 1100 and 11001 are not valid codes because 1100 is a prefix of some other code word is called prefix codes.

Prefix codes are desirable because they clarify encoding and decoding. Encoding is always simple for any binary character code; we concatenate the code words describing each character of the file. Decoding is also quite comfortable with a prefix code. Since no codeword is a prefix of any other, the codeword that starts with an encoded data is unambiguous.

Greedy Algorithm for constructing a Huffman Code:

Huffman invented a greedy algorithm that creates an optimal prefix code called a Huffman Code.

Huffman Codes

The algorithm builds the tree T analogous to the optimal code in a bottom up manner. It starts with a set of |C| leaves (C is the number of characters) and performs |C| – 1 ‘merging’ operations to create the final tree. In the Huffman algorithm ‘n’ denotes the quantity of a set of characters, z indicates the parent node, and x & y are the left & right child of z respectively.

Algorithm of Huffman Code

Huffman (C)
1. n=|C|
2. Q ← C
3. for i=1 to n-1
4. do
5. z= allocate-Node ()
6. x=  left[z]=Extract-Min(Q)
7. y= right[z] =Extract-Min(Q)
8. f [z]=f[x]+f[y]
9. Insert (Q, z)
10. return Extract-Min (Q)

Example: Find an optimal Huffman Code for the following set of frequencies:

  1. a: 50   b: 25   c: 15   d: 40   e: 75  


Algorithm of Huffman Code


Algorithm of Huffman Code

Again for i=2

Algorithm of Huffman Code
Algorithm of Huffman Code
Algorithm of Huffman Code

Similarly, we apply the same process we get

Algorithm of Huffman Code
Algorithm of Huffman Code

Thus, the final output is:

Algorithm of Huffman Code

New Report