Overview
A Merkle tree (also called a hash tree) is a data structure invented by Ralph Merkle in 1979 that allows efficient and secure verification of the contents of large data sets. In Bitcoin, Merkle trees are used to organize all transactions within a block into a hierarchical hash structure, culminating in a single Merkle root that is stored in the block header.
Structure
Level 3 (Root): H(ABCD)
/ \
Level 2: H(AB) H(CD)
/ \ / \
Level 1: H(A) H(B) H(C) H(D)
| | | |
Level 0: Tx A Tx B Tx C Tx D
(Leaves)
H() = double SHA-256 hash
Each leaf is the hash of a transaction. Each internal node is the hash of its two children concatenated. This continues up the tree until a single root hash remains.
Efficiency of Merkle Proofs
The key advantage of Merkle trees is that you can prove a transaction is included in a block without downloading all transactions. A Merkle proof for a transaction in a block with N transactions requires only log2(N) hashes.
To prove Tx B is in the block, you need:
1. H(A) — sibling at level 1
2. H(CD) — sibling at level 2
3. Merkle root to compare against
Verify: H(H(H(A) + H(B)) + H(CD)) == Merkle Root
For a block with 4,000 transactions, a Merkle proof requires only about 12 hashes instead of all 4,000 transaction hashes. This enables SPV clients to verify transaction inclusion with minimal bandwidth.
Common Misconceptions
Merkle trees do not encrypt data — they only provide integrity verification. Anyone can see the transactions; the Merkle tree simply allows efficient proof that a given transaction was included. Also, Bitcoin's Merkle tree implementation uses double SHA-256 (hashing twice), not single SHA-256.