Commit 697a45b

feat: small refinement of hashfusing article
1 parent 946a653 commit 697a45b

1 file changed

Lines changed: 23 additions & 14 deletions

src/math/hashing/hashfusing.clj

@@ -23,21 +23,24 @@
 ;; sufficiently low entropy data is not uniquely represented by fusing hashes
 ;; together.

-;; The good news is that low entropy data is compressible and a practical
-;; fusing operator can throw an exception when fusing produces 'bad' hash
-;; values.
+;; The good news is that low entropy data may always be converted into high
+;; entropy data either via compression or injecting noise. A practical fusing
+;; operator can easily detect fuses of low entropy data and throw an exception
+;; when fusing produces those 'bad' hash values.

 ;; ## Introduction

 ;; I have begun working on implementing distributed immutable data structures
 ;; as a way to efficiently share structured data between multiple clients. One
 ;; of the key challenges in this area is being able to efficiently reference
-;; ordered collections of data. The usual approach is to use Merkle Trees. But,
-;; since I plan on using Finger Trees to represent ordered collections of data,
+;; ordered collections of data. The usual approach is to use Merkle Trees which
+;; have the limitation that they are non-associative and the shape of the tree
+;; determines the final hash value at the root of the tree. But, since I plan
+;; on using Finger Trees to efficiently represent ordered collections of data,
 ;; I need computed hashes that are insensitive to the tree shape. This means
 ;; that I need to be able to fuse hashes together in a way that is associative
-;; and not commutative. As a starting point, I have been reading the HP paper
-;; describing hash fusing via matrix multiplication:
+;; and also non-commutative. As a starting point, I have been reading the HP
+;; paper describing hash fusing via matrix multiplication:
 ;; [https://www.labs.hpe.com/techreports/2017/HPE-2017-08.pdf](https://www.labs.hpe.com/techreports/2017/HPE-2017-08.pdf)

 ;; ## Hash Fusing via Upper Triangular Matrix Multiplication
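The Introduction text in this hunk says that a Merkle root depends on the shape of the tree. A minimal sketch of that limitation follows, assuming a Merkle node is simply SHA-256 over the concatenated child hashes. The names `sha256-hex` and `merkle-node` are hypothetical helpers for illustration and are not part of `hashfusing.clj`.

```clojure
(import 'java.security.MessageDigest)

(defn sha256-hex
  "Hex-encoded SHA-256 of the UTF-8 bytes of string s."
  [^String s]
  (let [digest (.digest (MessageDigest/getInstance "SHA-256")
                        (.getBytes s "UTF-8"))]
    (apply str (map #(format "%02x" %) digest))))

(defn merkle-node
  "Assumed Merkle combine step: hash the concatenation of two child hashes."
  [left right]
  (sha256-hex (str left right)))

;; The same three leaves, in the same order, grouped two different ways.
(def leaves (mapv sha256-hex ["a" "b" "c"]))

(def left-leaning
  (let [[a b c] leaves] (merkle-node (merkle-node a b) c)))

(def right-leaning
  (let [[a b c] leaves] (merkle-node a (merkle-node b c))))

;; The two roots differ, so the combine step is not associative.
(= left-leaning right-leaning)
```

Because the grouping changes the root, two Finger Trees holding the same sequence with different internal shapes would not agree on a Merkle-style hash, which is the motivation the hunk gives for an associative fusing operator.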
@@ -70,11 +73,11 @@
 (def c-hex (random-hex 32))
 ^:kindly/hide-code c-hex

-;; To fuse two hashes, we convert each hash to its corresponding upper
-;; triangular matrix and then multiply the two matrices together. The result is
-;; another upper triangular matrix which can be converted back to a hash by
-;; taking the elements above the main diagonal. The following several sections
-;; defines this mapping between hashes and upper triangular matrices. For the
+;; To fuse two hashes, we convert each hash to an upper triangular matrix and
+;; then multiply the two matrices together. The result is another upper
+;; triangular matrix which can be converted back to a hash by taking the
+;; elements above the main diagonal. The following several sections defines
+;; this mapping between hashes and upper triangular matrices. For the
 ;; experiments below four different bit sizes of cells and four corresponding
 ;; matrices are defined.

@@ -499,8 +502,8 @@
 ab*c (utm-multiply ab c)
 a*bc (utm-multiply a bc)]
 {:cell-size cell-size
-:commutative? (= ab ba)
-:associative? (= ab*c a*bc)}))
+:associative? (= ab*c a*bc)
+:commutative? (= ab ba)}))
 (kind/table))

 ;; ## Experiment 1: Random Fuses
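The associativity and commutativity check in the hunk above can be reproduced in miniature. This is a toy sketch under stated assumptions, not the file's implementation: a "hash" is three 8-bit cells placed above the diagonal of a 3x3 matrix with ones on the diagonal, and products are reduced modulo 256. The names `cells->utm`, `utm->cells`, `mat-mul`, and `sketch-fuse` are hypothetical and separate from the file's own `utm-multiply`.

```clojure
(def cell-mask 0xFF) ; assumption: 8-bit cells

(defn cells->utm
  "Place cells [x y z] above the diagonal of a 3x3 unit upper triangular matrix."
  [[x y z]]
  [[1 x z]
   [0 1 y]
   [0 0 1]])

(defn utm->cells
  "Read the cells above the main diagonal back out as [x y z]."
  [m]
  [(get-in m [0 1]) (get-in m [1 2]) (get-in m [0 2])])

(defn mat-mul
  "Plain square matrix product with every entry masked to the cell size."
  [a b]
  (let [n (count a)]
    (vec (for [i (range n)]
           (vec (for [j (range n)]
                  (bit-and cell-mask
                           (reduce + (for [k (range n)]
                                       (* (get-in a [i k])
                                          (get-in b [k j])))))))))))

(defn sketch-fuse
  "Fuse two toy hashes by multiplying their matrices."
  [h1 h2]
  (utm->cells (mat-mul (cells->utm h1) (cells->utm h2))))

(def property-check
  (let [a [1 2 3] b [4 5 6] c [7 8 9]
        ab (sketch-fuse a b)
        ba (sketch-fuse b a)]
    {:associative? (= (sketch-fuse ab c) (sketch-fuse a (sketch-fuse b c)))
     :commutative? (= ab ba)}))
;; property-check => {:associative? true, :commutative? false}
```

Associativity comes for free from matrix multiplication, while the `x1*y2` cross term in the top-right cell is what makes the order of the operands matter.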
@@ -693,15 +696,21 @@
 {:a a-hex :b b-hex :fused (utm64->hex ab)}))
 (utm64->hex ab))))

+;; Example of high entropy data fusing successfully:
 (high-entropy-fuse a-hex b-hex)

+;; Example of low entropy data causing an error:
 (try
 (high-entropy-fuse zero-hex zero-hex)
 (catch Exception e
 (.getMessage e)))

 ;; ## Appendix: Why Low Entropy Data Fails (AI Explanation)

+;; _The following explanation was generated by Copilot in response to the
+;; question: "Why does low entropy data fail to be represented by hash fusing
+;; via upper triangular matrix multiplication?"_
+
 ;; The reason low entropy data fails to be represented by hash fusing is that
 ;; the multiplication of upper triangular matrices causes the lower bits of the
 ;; resulting matrix to rapidly approach zero when the same matrix is repeatedly
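The effect described in the appendix text can be observed in a toy setting. The sketch below assumes a 3x3 matrix with ones on the diagonal and three 8-bit cells `[x y z]` above it, with the matrix product written out in closed form. The name `sketch-fuse-3` and the sample cell values are hypothetical, and the file's real cell sizes and layout may differ.

```clojure
(defn sketch-fuse-3
  "Closed-form product of two 3x3 unit upper triangular matrices,
   each given as cells [x y z], with every cell reduced modulo 256."
  [[x1 y1 z1] [x2 y2 z2]]
  (mapv #(bit-and 0xFF %)
        [(+ x1 x2)
         (+ y1 y2)
         (+ z1 z2 (* x1 y2))]))

;; An all-zero hash maps to the identity matrix, so fusing it with itself
;; yields the all-zero hash again and carries no information.
(def zero-fused (sketch-fuse-3 [0 0 0] [0 0 0]))

;; Repeatedly fuse a hash with itself: element n is the hash fused 2^n times.
(def doublings
  (vec (take 10 (iterate #(sketch-fuse-3 % %) [0x9D 0x4B 0xE7]))))

;; Each doubling multiplies the x and y cells by two, clearing one more low
;; bit per step; with 8-bit cells, 2^9 = 512 copies of any hash fuse to zero.
(nth doublings 8) ; => [0 0 128]
(nth doublings 9) ; => [0 0 0]
```

In this toy model a long run of identical values loses one low bit per doubling until nothing is left, which is the kind of 'bad' hash value that the exception-throwing fuse in the hunk above is meant to detect.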
