Huffman Codes: An Information Theory Perspective

222,080
0
Published 2021-07-30
Huffman Codes are one of the most important discoveries in the field of data compression. When you first see them, they almost feel obvious in hindsight, mainly due to how simple and elegant the algorithm ends up being. But there's an underlying story of how they were discovered by Huffman and how he built the idea from early ideas in information theory that is often missed. This video is all about how information theory inspired the first algorithms in data compression, which later provided the groundwork for Huffman's landmark discovery.

0:00 Intro
2:02 Modeling Data Compression Problems
6:20 Measuring Information
8:14 Self-Information and Entropy
11:03 The Connection between Entropy and Compression
16:47 Shannon-Fano Coding
19:52 Huffman's Improvement
24:10 Huffman Coding Examples
26:10 Huffman Coding Implementation
27:08 Recap

This video is supported by a community of Patreons
Special Thanks to the following Patreons:
Burt Humburg
Michael Nawenstein
Richard Wells
Sebastian Gamboa
Zac Landis

Corrections:
At 9:34, the entropy was calculated with log base 10 instead of the expected log base 2,. The correct values should be H(P) = 1.49 bits and H(P) = 0.47 bits.

At 16:23, all logarithms should be negated, I totally forgot about the negative sign.

At 27:24, I should have said the least likely symbols should have the *longest encoding.

This video wouldn't be possible without the open source library manim created by 3blue1brown: github.com/3b1b/manim

Here is link to the repository that contains the code used to generate the animations in this video: github.com/nipunramk/Reducible

Music attributions:
Midsummer Sky by Kevin MacLeod is licensed under a Creative Commons Attribution 4.0 license. creativecommons.org/licenses/...
Source: incompetech.com/music/royalty-...
Artist: incompetech.com/
Luminous Rain by Kevin MacLeod is licensed under a Creative Commons Attribution 4.0 license. creativecommons.org/licenses/...
Source: incompetech.com/music/royalty-...
Artist: incompetech.com/

This video put together a variety of great resources on information theory, data compression, Shannon-Fano codes, and Huffman codes. Here are some links that I found most helpful while doing research.

A great resource on Shannon's source coding theorem (with the proof): mbernste.github.io/posts/sourcecoding/

Huffman's original paper: compression.ru/download/articles/huff/huffman_1952…

Great chapter on Information Theory motivating various data compression ideas: www.princeton.edu/~cuff/ele201/kulkarni_text/infor…

Great book for further reading: Data Compression by Khalid Sayood

Algorithmic perspective on Huffman Codes: Section 5.2 of Algorithms by Papadimitriou et al

More elaboration on discovery of Huffman Codes: www.maa.org/sites/default/files/images/upload_libr…

A really nice resource on data compression and information theory: www.ece.virginia.edu/~ffh8x/moi/compression.html

Implementation of Huffman Codes: www.techiedelight.com/huffman-coding/

All Comments (21)
  • @SamBrickell
    I love the stories where students solve unsolved problems just because the professor neglected to tell them it was difficult.
  • @AA-gl1dr
    "Just call it entropy, nobody really knows what it is" Truly an intelligent man.
  • The section leading up to 8:00 epitomizes a problem solving technique that sets apart mathematicians: rather than directly search for a formula that describes a thing, instead list what properties you expect that thing to have. Make deductions from those properties, and eventually all kinds of insights and formulas fall on your lap.
  • @jimboli9400
    The legend is back. I love your work, the production quality, the content, everything! You are the computer science equivalent of 3b1b.
  • The wisest thing Huffman’s professor did was not mention it was an unsolved problem. Great, clear presentation, others on youTube are so fast they require you to pretty much already know how it works before you watch it, which is insane.
  • @mementomori7160
    20:26, "optimal binary representation must have the two least likely symbols at the bottom of the tree" My first thought: "So can we count it as one node and repeat the process? Nah, that can't be right, that would be too easy" XD turns out it really is easy
  • @alexismandelias
    Having also learned the Huffman encoding algorithm as an example of a greedy algorithm (along with its accompanying greedy-style proof), this video provided me with a new perspective which, combined with the interesting history of alternative algorithms, gave me a fuller understanding and appreciation for the topic, which I have to admit is extremely well presented!
  • @wampwamp1458
    This video is an amazing compilation of information, in a well presented order and in an engaging way. Information well encoded! Cheers
  • Your presentation is entertaining, thought-provoking and truly educational. One of the best channels on YouTube in my opinion.
  • I just recently learned about Huffman encoding and this video is absolutely AMAZING. You really motivated the solution incredibly well! Congratulations and hope you make more videos
  • @emmamovsesyan
    The smallest shift in perspective can sometimes make all the difference <3 Thank you so much! I have enjoyed this so much!
  • @pafloxyq
    Really loved it! It was knowing about Huffman codes that made me take information theory classes. It is really an elegant piece of mathematics.
  • @aethrya
    Man it's so amazing that any of this works at all. Being an autodidact philomath and self educated amateur mathematician and cryptographer, I am so glad to have all of this information available to learn at my own pace. Thank you.
  • @Danielle_1234
    Best explanation of Huffman Encoding I've seen. Bravo!
  • This is very impressive. It takes a lot of hard to present so much so well. Excellent vid!
  • @telemanchos7986
    Incredible educative video, I respect the amount of work you put into this! I enjoyed the overview of the field and the connection between information and probability theory was splendidly shown! I'm looking forward to your future videos!
  • @rccowboys
    This has got to be the most amazing video I have ever seen. Seriously! It was absolutely the most intriguing thing I have never pondered. Great explanation skills for slow people like me but yet very entertaining at the same time. Thank you sir for your effort and information!
  • @RequiosWoW
    I'm taking numerical analysis course right now, and my chapter was just on Huffman Codes, great timing!
  • @Ouuiea
    As always, amazing quality of a video!! loved it. Please keep doing them <3