Sunday, June 25, 2017

A Simple Review of an Intuition on Data Compression

Team. Yesterday, we spoke with a young man at a company in the United States about this idea of compressing data so that it becomes arbitrarily small. So, we will make this idea "plain", because it spans a few older web-notes with other topics interspersed. In short, an integer of any size might have its value reduced by subtracting from it. This is something that one learns in elementary school. What one might learn in later years of education is that one can generate a stream of seemingly random values using a command found in most common programming languages used in modern computing. This command, often called "random", produces a pseudo-random stream of numbers. That means that the stream appears random; however, if one starts the random function with the same input, called a seed, the same stream will be produced.
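A small sketch may make the seed behaviour concrete. It assumes Python's standard `random` module; the helper name `seeded_stream`, the seed 42, and the upper bound of 100 are ours, chosen only for illustration.

```python
import random

def seeded_stream(seed, count, upper=100):
    """Return `count` pseudo-random integers in [1, upper] for a given seed."""
    rng = random.Random(seed)  # a private generator started from the seed
    return [rng.randint(1, upper) for _ in range(count)]

first = seeded_stream(seed=42, count=5)
second = seeded_stream(seed=42, count=5)

# The same seed reproduces the same stream.
print(first == second)  # True
```

Only the seed needs to be remembered in order to regenerate the whole stream later.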

So, if we start with 13 and subtract the sequence 7, 5, we have 1, which is smaller than 13 and requires less storage space, if one uses only the number of bits absolutely needed for representing the number. This is the task of compression.

Conversely, if we have 1, which we term the "remnant", and add in the same short sequence 7, 5, we have 13. This is the task of decompression.
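The arithmetic of the 13, 7, 5 example can be written out as a short sketch. The function names `compress` and `decompress` are ours, and the sequence is passed in by hand here rather than drawn from the random command.

```python
def compress(value, sequence):
    """Subtract each number in the sequence and return the remnant."""
    remnant = value
    for step in sequence:
        remnant -= step
    return remnant

def decompress(remnant, sequence):
    """Add the same sequence back to recover the starting value."""
    return remnant + sum(sequence)

remnant = compress(13, [7, 5])
restored = decompress(remnant, [7, 5])

print(remnant)                # 1
print(restored)               # 13
print((13).bit_length())      # 4 bits are needed for 13
print(remnant.bit_length())   # 1 bit is needed for 1
```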

Please note that the starting whole number can be enormous. And, the random command can produce a list of values as long as is needed for compressing a larger number, and the numbers produced by random can also be sizable.

The final piece of the puzzle is that we need some stop sign placed within the number. The stop sign, also called a processing sentinel, that we have chosen is a special number called a "check-sum". A check-sum is a special number generated from a much larger number and, for practical purposes, it identifies that larger number. Check-sums are commonly used as identifiers for files which one might find on the World Wide Web and place on one's notebook personal computer using the file transfer protocol, FTP. This use of a stop sign lets us know when we should stop calling the random command for more values when decompressing. And, if it was not immediately obvious, we simply stop calling this command, as we are compressing, when the remnant is small enough.
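The whole loop, with the check-sum acting as the stop sign, might be sketched as follows. This is only an illustration under our own assumptions: SHA-256 from Python's `hashlib` stands in for the check-sum, "small enough" is taken to mean that the next random value would exceed the remnant, and the names `checksum`, `compress`, `decompress`, and the bound `upper` are hypothetical.

```python
import hashlib
import random

def checksum(n):
    """Check-sum of a non-negative integer (SHA-256 is our assumption)."""
    length = max(1, (n.bit_length() + 7) // 8)
    return hashlib.sha256(n.to_bytes(length, "big")).hexdigest()

def compress(value, seed, upper=1000):
    """Subtract seeded random values until the next one no longer fits."""
    rng = random.Random(seed)
    target = checksum(value)        # the stop sign for decompression
    remnant = value
    while True:
        step = rng.randint(1, upper)
        if step > remnant:          # remnant is small enough, so stop
            break
        remnant -= step
    return remnant, target

def decompress(remnant, seed, target, upper=1000):
    """Add the same seeded values back until the check-sum matches."""
    rng = random.Random(seed)
    value = remnant
    while checksum(value) != target:
        value += rng.randint(1, upper)
    return value

original = 1_000_000
remnant, target = compress(original, seed=2017)
print(remnant < original)                                  # True
print(decompress(remnant, 2017, target) == original)       # True
```

In this sketch the seed and the check-sum are kept alongside the remnant, since decompression needs all three.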

This is only intuition; however, it is a very strong intuitive feeling.

This is the original intent of the phrase "DNA storage", crafted in a Mead notebook in the mid-1990s, with the acronym DNA standing for "distinct number algorithm" and not for the genetic material which governs the growth of animal and plant cells.

It has been said of our reasoning that it is "simple". And, that we are "simple" and "simple-minded". With that, we must agree, since we have been finding "simple approaches" for solving problems that our school teachers call "impossible". And, if we do not solve them in five minutes, we have often made great progress on challenging problems.

We are far from being the most elite in any branch of science, engineering, mathematics, or any other discipline established over man's history. But, over the years, we have received a reputation for producing notions and insights that have the potential for great profit. Not all have been that way; some have.

