Optimized Min Heap Based Similarity Detection For Delta Encoding

While similarity-based delta encoding has been used before, the algorithm described here uses a unique variant of the MinHash technique to compute a hash-based Similarity Index value for each data chunk. This value can then be compared with the values of other chunks to detect similar chunks.
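
The abstract does not spell out the variant itself, so the following is only a minimal sketch of the general idea, assuming a plain bottom-k MinHash: hash every fixed-size window of a chunk, use a min heap to pull out the k smallest hashes, and fold those into a single index value. The function names, the window size, the value of k, and the use of CRC32 are all illustrative assumptions and are not taken from the publication.

```python
import heapq
import zlib


def bottom_k_sketch(chunk: bytes, window: int = 8, k: int = 4) -> tuple:
    """Return the k smallest distinct window hashes of a chunk, ascending.

    Hypothetical helper: a generic bottom-k MinHash sketch, not the
    publication's own variant.
    """
    n = max(1, len(chunk) - window + 1)
    # Hash every sliding window; the set drops duplicate hash values.
    heap = list({zlib.crc32(chunk[i:i + window]) for i in range(n)})
    heapq.heapify(heap)  # min heap: the smallest hash sits at the root
    return tuple(heapq.heappop(heap) for _ in range(min(k, len(heap))))


def similarity_index(chunk: bytes, window: int = 8, k: int = 4) -> int:
    """Fold the sketch into one 32-bit value (assumed combining step)."""
    sketch = bottom_k_sketch(chunk, window, k)
    return zlib.crc32(b"".join(h.to_bytes(4, "big") for h in sketch))


def sketch_overlap(a: bytes, b: bytes) -> float:
    """Fraction of sketch entries two chunks share (rough resemblance)."""
    sa, sb = bottom_k_sketch(a), bottom_k_sketch(b)
    return len(set(sa) & set(sb)) / max(len(sa), len(sb))
```

Under these assumptions, two chunks whose k smallest window hashes agree get the same index value and can be treated as candidates for delta encoding against each other; `sketch_overlap` gives a coarser signal when the index values differ.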

Publication Date
12 March 2013

data de-duplication, MinHash, delta encoding



What are we trying to do?

We are attempting to mobilize the creativity and innovative capacities of the Linux and broader open source community to codify the universe of preexisting inventions in defensive publications. Upon publication in the IP.COM database, these will immediately serve as effective prior art that prevents anyone from having a patent issued that claims inventions already documented in a defensive publication. In addition to creating a vehicle for this highly effective form of IP rights management for known inventions, it is hoped that the community will use defensive publications as a means of codifying future inventions, should the inventors prefer not to make their invention the subject of a patent disclosure and application.