Friday, September 21, 2007

Data Compression Puzzle

Data Compression Puzzle
This is the kind of logic puzzle that involves knowledge of data-compression techniques, or at least an ability to make a reasonable speculation as to how data compression works. The hint to get you started if you know absolutely nothing about data compression is given here, but it's not going to be nearly enough to be much use in actually explaining the answer to the question. One form of data compression:

Take a string of text 6 characters long - say, "camera", for instance.
Each two-letter combination can be assigned a certain value (determined before-hand and including all possible character combinations). Suppose "ca" = a, "me" = b and "ra" = c. So "camera" compresses initially as "abc".

Why can't this be done on "abc" to make it a two-letter combination, which could then be made into a one-letter combination? For "abc", there would be "ab" to compress, but just "c" by itself - it can be paired with a "dummy character" or something similar if needed to indicate end of line. Thus, the answer is _not_ "because 'abc' has an uneven number of letters'"
Now answer this big question
Why can't files be repeatedly compressed until they're only 1 byte big?