Current Hashing Approach
Lodestar's architecture relies heavily on maintaining a full merkle tree of the beacon state. We represent the tree as a linked data structure in which each node is (1) immutable and (2) lazily computes, then caches, the hash of its children.
This minimizes the number of hashes we need to perform during state transition. Since all hashes of a prestate are maintained, only the paths through the tree to the "diff" need to be re-hashed. This reduces the computational cost of hashing.
It also allows structural sharing: beacon states with shared subtrees share the memory for those subtrees. For example, between epochs in a sync period, where sync committees remain constant, if we maintain references to two beacon states, both states share the same underlying subtrees for the sync committees. This reduces the memory cost of maintaining several related states.
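A minimal sketch of the two properties described above (lazy, cached hashing and structural sharing) may make them concrete. This is not Lodestar's actual code: `TreeNode`, `fromLeaves`, `setLeaf` and the `hashCount` counter are hypothetical names, and Node's built-in `crypto` module stands in for as-sha256.

```typescript
import { createHash } from "node:crypto";

// Counts hash invocations so the saving is visible (illustrative only).
let hashCount = 0;

function hash64(left: Uint8Array, right: Uint8Array): Uint8Array {
  hashCount++;
  return new Uint8Array(createHash("sha256").update(left).update(right).digest());
}

interface TreeNode {
  readonly root: Uint8Array;
}

class LeafNode implements TreeNode {
  constructor(readonly root: Uint8Array) {}
}

class BranchNode implements TreeNode {
  private cached: Uint8Array | null = null;
  constructor(readonly left: TreeNode, readonly right: TreeNode) {}

  // The hash is computed on first access only, then served from the cache.
  get root(): Uint8Array {
    if (this.cached === null) {
      this.cached = hash64(this.left.root, this.right.root);
    }
    return this.cached;
  }
}

// Builds a full binary tree; assumes leaves.length is a power of two.
function fromLeaves(leaves: TreeNode[]): TreeNode {
  let layer = leaves;
  while (layer.length > 1) {
    const next: TreeNode[] = [];
    for (let i = 0; i < layer.length; i += 2) {
      next.push(new BranchNode(layer[i], layer[i + 1]));
    }
    layer = next;
  }
  return layer[0];
}

// Returns a NEW root. Only nodes on the path to `index` are recreated;
// every sibling subtree is reused by reference, cached hash included.
function setLeaf(node: TreeNode, depth: number, index: number, leaf: TreeNode): TreeNode {
  if (depth === 0) return leaf;
  if (!(node instanceof BranchNode)) throw new Error("path is deeper than the tree");
  const goRight = ((index >> (depth - 1)) & 1) === 1;
  return goRight
    ? new BranchNode(node.left, setLeaf(node.right, depth - 1, index, leaf))
    : new BranchNode(setLeaf(node.left, depth - 1, index, leaf), node.right);
}

const depth = 3;
const leaves = Array.from({ length: 8 }, (_, i) => new LeafNode(new Uint8Array(32).fill(i)));
const pre = fromLeaves(leaves);
const preRoot = pre.root; // 7 hashes: one per branch node

const before = hashCount;
const post = setLeaf(pre, depth, 5, new LeafNode(new Uint8Array(32).fill(0xff)));
const postRoot = post.root; // 3 hashes: only the path to leaf 5
const rehashed = hashCount - before;
```

Both `pre` and `post` remain valid states. The untouched left half of the tree is the same object in both, which is the memory saving the sync committee example relies on.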
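Hashing Function
We use as-sha256 with several optimizations:
We rely on hash inputs always being 64 bytes. This allows us to precompute part of the sha256 internals for a decent gain.
We avoid allocations inside the library, using only fixed input/output buffers.
Related hashing perf analysis: "Lodestar tree hashing performance" (lodestar#2206).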
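The text does not say which internals are precomputed. One natural candidate, assumed here, is the second SHA-256 block: for a 64-byte input it contains only padding and the length, so its message schedule is constant and can be expanded once. The sketch below is a plain TypeScript illustration of that idea with fixed scratch buffers, not the as-sha256 source; the names `PAD_SCHEDULE`, `expandSchedule` and `compress` are hypothetical.

```typescript
// SHA-256 round constants and initial state (FIPS 180-4).
const K = new Uint32Array([
  0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
  0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
  0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
  0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
  0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
  0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
  0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
  0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2,
]);
const IV = new Uint32Array([
  0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a, 0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19,
]);

const rotr = (x: number, n: number): number => (x >>> n) | (x << (32 - n));

// Expands words 0..15 of a block into the full 64-word schedule, in place.
function expandSchedule(w: Uint32Array): void {
  for (let i = 16; i < 64; i++) {
    const x = w[i - 15];
    const y = w[i - 2];
    const s0 = rotr(x, 7) ^ rotr(x, 18) ^ (x >>> 3);
    const s1 = rotr(y, 17) ^ rotr(y, 19) ^ (y >>> 10);
    w[i] = w[i - 16] + s0 + w[i - 7] + s1; // Uint32Array wraps modulo 2^32
  }
}

// One SHA-256 compression over an already expanded schedule.
function compress(state: Uint32Array, w: Uint32Array): void {
  let a = state[0], b = state[1], c = state[2], d = state[3];
  let e = state[4], f = state[5], g = state[6], h = state[7];
  for (let i = 0; i < 64; i++) {
    const t1 = (h + (rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)) + ((e & f) ^ (~e & g)) + K[i] + w[i]) | 0;
    const t2 = ((rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)) + ((a & b) ^ (a & c) ^ (b & c))) | 0;
    h = g; g = f; f = e; e = (d + t1) | 0;
    d = c; c = b; b = a; a = (t1 + t2) | 0;
  }
  state[0] += a; state[1] += b; state[2] += c; state[3] += d;
  state[4] += e; state[5] += f; state[6] += g; state[7] += h;
}

// Precomputed once: the padding block of every 64-byte message is identical.
const PAD_SCHEDULE = new Uint32Array(64);
PAD_SCHEDULE[0] = 0x80000000; // the single 1 bit that follows the message
PAD_SCHEDULE[15] = 512; // message length in bits
expandSchedule(PAD_SCHEDULE);

// Fixed scratch buffers: no allocation per call.
const W = new Uint32Array(64);
const STATE = new Uint32Array(8);

function digest64(input: Uint8Array, output: Uint8Array): void {
  if (input.length !== 64 || output.length !== 32) throw new Error("expected 64 bytes in, 32 bytes out");
  for (let i = 0; i < 16; i++) {
    W[i] = (input[4 * i] << 24) | (input[4 * i + 1] << 16) | (input[4 * i + 2] << 8) | input[4 * i + 3];
  }
  expandSchedule(W);
  STATE.set(IV);
  compress(STATE, W); // data block
  compress(STATE, PAD_SCHEDULE); // padding block, schedule reused
  for (let i = 0; i < 8; i++) {
    output[4 * i] = STATE[i] >>> 24;
    output[4 * i + 1] = (STATE[i] >>> 16) & 0xff;
    output[4 * i + 2] = (STATE[i] >>> 8) & 0xff;
    output[4 * i + 3] = STATE[i] & 0xff;
  }
}

const toHex = (bytes: Uint8Array): string =>
  Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
```

Under this assumption the saving is the schedule expansion for the second block on every call. The caller supplies both buffers, so a hot loop can reuse them.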
Hash Cache Representation
Coming from a systems-language intuition, the memory usage of each type of object in JavaScript is not very efficient. We do NOT store hash objects as 32-byte Uint8Arrays: a 32-byte Uint8Array takes 223 total bytes! A bunch of pointers and additional bookkeeping are stored behind the scenes.
Instead, we store hash objects as objects with 8 uint32 numbers, e.g. {h0: 0, h1: 0, ..., h7: 0}. This takes somewhere between 88 bytes and 216 bytes, depending on the sizes of the individual component numbers. Smaller numbers are represented as a Smi (small integer), an immediate value, while larger numbers are stored on the heap. In practice, this happens TODO.
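The representation above can be sketched as a pair of conversions plus a size estimate. The helper names, the big-endian word order, and the default Smi limit are all assumptions for illustration. The Smi range depends on the V8 build, so it is a parameter. The 16 bytes per heap number is not stated in the text; it is just (216 - 88) / 8, the value implied by the two quoted bounds.

```typescript
interface HashObject {
  h0: number; h1: number; h2: number; h3: number;
  h4: number; h5: number; h6: number; h7: number;
}

// Packs 32 bytes into eight uint32 fields (big-endian word order is an assumption).
function bytesToHashObject(bytes: Uint8Array): HashObject {
  const view = new DataView(bytes.buffer, bytes.byteOffset, 32);
  return {
    h0: view.getUint32(0), h1: view.getUint32(4), h2: view.getUint32(8), h3: view.getUint32(12),
    h4: view.getUint32(16), h5: view.getUint32(20), h6: view.getUint32(24), h7: view.getUint32(28),
  };
}

function hashObjectToBytes(obj: HashObject): Uint8Array {
  const out = new Uint8Array(32);
  const view = new DataView(out.buffer);
  const words = [obj.h0, obj.h1, obj.h2, obj.h3, obj.h4, obj.h5, obj.h6, obj.h7];
  words.forEach((word, i) => view.setUint32(4 * i, word));
  return out;
}

const BASE_BYTES = 88; // lower bound quoted in the text: every field is a Smi
const HEAP_NUMBER_BYTES = 16; // (216 - 88) / 8, implied by the two quoted bounds

// Rough estimate only. `smiMax` is an assumed limit (31-bit Smis); pass
// 2 ** 31 - 1 for builds with 32-bit Smis.
function estimateBytes(obj: HashObject, smiMax: number = 2 ** 30 - 1): number {
  const words = [obj.h0, obj.h1, obj.h2, obj.h3, obj.h4, obj.h5, obj.h6, obj.h7];
  const heapNumbers = words.filter((word) => word > smiMax).length;
  return BASE_BYTES + HEAP_NUMBER_BYTES * heapNumbers;
}

const zeros = bytesToHashObject(new Uint8Array(32));
const ones = bytesToHashObject(new Uint8Array(32).fill(0xff));
const estimates = [estimateBytes(zeros), estimateBytes(ones)]; // [88, 216]
```

Real hashes are effectively random, so under these assumptions most fields of a typical hash object would fall outside the Smi range; the sketch does not measure actual heap usage.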
How to improve?
The hashing speed in Lodestar is quite low compared to other implementations. Options to investigate:
Investigate the hashtree implementation, which computes multiple hashes at once.
Investigate using a systems language implementation, e.g. using napi-rs. This could be tried using popular libraries, which likely attempt some hardware acceleration. It could also be tried using a transliteration of as-sha256.
The memory used by our hashes makes up a large share of a running beacon node's memory. And our memory usage, measured per hash, is still very large compared to systems languages. Options to investigate:
Investigate using a systems language implementation, e.g. using napi-rs.
Investigate using an alternative memory layout in JavaScript.
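As one hypothetical example of an alternative layout, a whole tree level can live in a single contiguous Uint8Array, so the per-hash cost is 32 bytes plus a share of one array header. The sketch below only illustrates that layout; it is not taken from any of the branches, `hashLayer` and `merkleRoot` are made-up names, and it hashes pairs one at a time with Node's `crypto` module rather than several at once as hashtree does.

```typescript
import { createHash } from "node:crypto";

// Hashes one tree level. `input` holds N 64-byte sibling pairs back to back;
// `output` receives the N 32-byte parents.
function hashLayer(input: Uint8Array, output: Uint8Array): void {
  if (input.length % 64 !== 0 || output.length !== input.length / 2) {
    throw new Error("input must be N * 64 bytes and output N * 32 bytes");
  }
  for (let i = 0; i < input.length / 64; i++) {
    const digest = createHash("sha256").update(input.subarray(i * 64, (i + 1) * 64)).digest();
    output.set(digest, i * 32);
  }
}

// Reduces a flat array of 32-byte chunks to its merkle root.
// Assumes the chunk count is a power of two.
function merkleRoot(chunks: Uint8Array): Uint8Array {
  const count = chunks.length / 32;
  if (!Number.isInteger(count) || count < 1 || (count & (count - 1)) !== 0) {
    throw new Error("expected a power-of-two number of 32-byte chunks");
  }
  let layer = chunks;
  while (layer.length > 32) {
    const next = new Uint8Array(layer.length / 2);
    hashLayer(layer, next);
    layer = next;
  }
  return layer;
}

const toHexString = (bytes: Uint8Array): string =>
  Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");

// Four all-zero chunks reduce to the depth-2 zero hash.
const zeroRoot = toHexString(merkleRoot(new Uint8Array(4 * 32)));
```

The trade-off is that a flat level gives up the per-node structural sharing of the linked tree, so a design along these lines would need some other way to share unchanged subtrees between states.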
Results
Some experiments were made; results are below.
cayman/hash-object
This branch has some experiments using napi-rs to store hash objects 'in rust' and using napi-rs for hashing.
hashtree does exceptionally well operating on large Uint8Arrays, not as well on rust HashObjects (see hashtree uint8array row). The hashtree code has since been pulled into its own repo: @chainsafe/hashtree-js.
Using rust libraries for hashing was slower (see rust row), and using a rust port of as-sha256 was also slower (see rust object rs-sha256).
node -r ts-node/register node_modules/.bin/benchmark packages/ssz/test/perf/hash.test.ts
Unfortunately, a rust HashObject is more expensive, memory-wise, than the status quo.
node -r ts-node/register --expose-gc packages/ssz/test/memory/hash.test.ts
cayman/hash-cache
node -r ts-node/register node_modules/.bin/benchmark packages/ssz/test/perf/eth2/hashTreeRoot.test.ts
[Benchmark output: hash cache vs. master]
node -r ts-node/register --expose-gc packages/ssz/test/memory/eth2Objects.test.ts
[Memory output: hash cache vs. master]
cayman/napi-merkle-node
This branch pulls "BranchNode" and "LeafNode" into rust and exposes a napi wrapper Node to JavaScript. This design allows most of the tree to exist in rust, with napi pointers into the tree as navigation demands.
Current testing shows this to be extremely slow, with the beacon node unable to stay synced. Have not had time to diagnose.