It is designed for compressing (longer) texts by mapping each character to a bitfield of varying length. Suppose that the frequency of each character is given [or derive it from the text itself as (occurrences of the character) / (total number of characters in the text)].
Then the algorithm creates the mapping so that
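- the more frequent characters get shorter bitfields
- no assigned bitfield is the beginning (prefix) of another, so a concatenated sequence of them can be split back into characters unambiguously (that's how the binary trees enter the picture: each character is a leaf, and its bitfield is the path from the root to that leaf).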
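As a rough illustration of the two properties above, here is a minimal Python sketch. It assumes the algorithm in question is Huffman coding, which matches the description; the function names `build_codes`, `encode` and `decode` are made up for this example, and the frequencies are simply counted from the input text.

```python
import heapq
from collections import Counter


def build_codes(text):
    """Return a dict mapping each character of `text` to a string of '0'/'1'."""
    counts = Counter(text)  # occurrences of each character
    if not counts:
        return {}
    if len(counts) == 1:
        # Edge case: a single distinct character still needs a one-bit code.
        return {next(iter(counts)): "0"}

    # Heap entries are (weight, tie_breaker, tree).
    # A tree is either a single character (leaf) or a (left, right) tuple.
    heap = [(n, i, ch) for i, (ch, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)

    # Repeatedly merge the two lightest trees until one tree remains.
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next_id, (t1, t2)))
        next_id += 1

    # The code of a character is the path to its leaf: left = '0', right = '1'.
    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix

    walk(heap[0][2], "")
    return codes


def encode(text, codes):
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    # Because no code is a prefix of another, the first match is always right.
    inverse = {code: ch for ch, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)
```

For example, `codes = build_codes("abracadabra")` gives the most frequent character `a` the shortest code, and `decode(encode("abracadabra", codes), codes)` returns the original text. The exact bit patterns depend on how ties between equal weights are broken, so they may differ from those in other implementations.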
For details and examples, a good starting point is, e.g., Wikipedia or the following video.