Year 2008 :: Zip with Universe

We all use zip utilities in our daily work with software. They help us reduce the size of large files. The most common programs are WinZip, WinRAR and 7-Zip; WinRAR is probably the most famous one.

Existing zip utilities are quite efficient at their job. However, I have often wondered whether we can squeeze a file even further. Most existing programs use pattern-matching techniques to compress a file, which can shrink it only up to a certain limit. Technology will no doubt keep producing better compression algorithms, and we will certainly get more efficient programs in the future.

But... I thought of a completely different technique and I must confess that it is just a MAD idea. Still, there is something interesting in it. Something crazy or abnormal as well :)

How

When we look at the data stored in a file at the most granular level, it is nothing but a collection of bits. Each bit holds a value of 0 or 1, a binary value. We all know that. So if we spread out the content of a file, what we get is a series of 0s and 1s... maybe millions of them. Another file would have a different series of 0s and 1s; each file differs from every other. For a large file, we will first split it into smaller parts and then take the binary series of each part separately, as in the sketch below.
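To make this concrete, here is a rough Python sketch of that first step (my own illustration, not part of any existing zip utility; the 1024-byte chunk size and the file name are arbitrary examples):

def file_to_bit_chunks(path, chunk_size=1024):
    """Yield each chunk of the file as a string of '0'/'1' characters."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Every byte becomes exactly 8 bits, so leading zeros are preserved.
            yield "".join(f"{byte:08b}" for byte in chunk)

# Example: peek at the first 64 bits of some file.
# for bits in file_to_bit_chunks("example.bin"):
#     print(bits[:64], "...")
#     break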

Now, let us look at this series of 0s and 1s a little differently. Logically, it is nothing but one big, Super Big Number. It is hard to even imagine how big that number will be, but it still holds all the properties of a numeric value. Next, we append "sec" (second, the unit of time) to this number. What do we get? We get a time. This time, measured from the Big Bang, indicates a specific point in the life cycle of the Universe. We have many different ways to identify a time: year, month, day, hour and so on. For example, the content of a file may just point to 6:30:12 in the morning on 15th August 2050!
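Here is a rough sketch of that transformation (again my own illustration: the helper names are made up, and I use a flat 365-day year purely for the example; a real scheme would need a proper calendar, as discussed below):

def bits_to_number(bits):
    """Interpret a string of 0s and 1s as a single (possibly huge) integer."""
    return int(bits, 2)

def seconds_to_rough_calendar(total_seconds):
    """Split a count of seconds into (years, days, hours, minutes, seconds)."""
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    years, days = divmod(days, 365)
    return years, days, hours, minutes, seconds

# Example: a tiny 16-bit "file".
bits = "1010110011110001"
number = bits_to_number(bits)             # 44273
print(seconds_to_rough_calendar(number))  # (0, 0, 12, 17, 53)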

So once we transform the big series of 0s and 1s into a date/time format, we get content that is much shorter in length, and this short content can be used to bring back our original file. Likewise, if we replace each part of a large file with its date/time form, we achieve a much smaller file at the end of the exercise. More interestingly, this is lossless compression, i.e. we can retrieve the exact content of the original file by doing a reverse transformation. No information is actually lost :)
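And a sketch of the reverse direction. One small practical point: turning the bit series into a number drops any leading zeros, so we also have to remember how many bits the chunk originally had. With that, the round trip brings back the exact bits (number_to_bits is a hypothetical helper of mine):

def bits_to_number(bits):
    return int(bits, 2)

def number_to_bits(number, bit_length):
    """Turn the integer back into a bit string of the original length."""
    return format(number, f"0{bit_length}b")

original = "0010110011110001"
n = bits_to_number(original)                  # 11505
restored = number_to_bits(n, len(original))
assert restored == original                   # lossless round trip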

It may be noted that representing a date/time value on the time scale of the Universe may not be possible with our existing calendars alone; for that, we might have to develop a better calendar system. For example, just as one "Year" represents 365 days, we may need to group 1000 years and call it, say, a "Zear". Similarly, we could group 1000 Zears and call it a "Xear"!
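Splitting a year count into these bigger units works just like grouping seconds into minutes and hours; a tiny sketch (Zear and Xear are, of course, my made-up names):

def years_to_xzy(years):
    """Split a year count into (Xears, Zears, Years)."""
    zears, remaining_years = divmod(years, 1000)
    xears, zears = divmod(zears, 1000)
    return xears, zears, remaining_years

print(years_to_xzy(1_234_567))   # (1, 234, 567) -> 1 Xear, 234 Zears, 567 Years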

Limitations

As I confessed already, this idea is just freaky. Whether it will actually work, I am not sure. I love to dream, but the actual implementation is not my job, unless you pull me in and make me get my hands dirty. When I thought deeply about this compression theory, I found plenty of drawbacks myself that would make the approach ineffective. So I admit it needs many more improvements. I am adding it to my blog simply because the thought crossed my mind. One day it may be picked up by a mastermind, and a more efficient technique may be derived from this idea... you never know :)

