This PR is the first step in the trienode history series.
It introduces the `nodeWithOrigin` struct in the path database, which tracks
the original values of dirty nodes to support trienode history construction.
Note, the original value is always empty in this PR, so it won't break the
existing journal for encoding and decoding. The compatibility of journal
should be handled in the following PR.
Previously, PathDB used a single buffer to aggregate database writes,
which needed to be flushed atomically. However, flushing large amounts
of data (e.g., 256MB) caused significant overhead, often blocking the
system for around 3 seconds during the flush.
To mitigate this overhead and reduce performance spikes, a double-buffer
mechanism is introduced. When the active buffer fills up, it is marked
as frozen and a background flushing process is triggered. Meanwhile, a
new buffer is allocated for incoming writes, allowing operations to
continue uninterrupted.
This approach reduces system blocking times and provides flexibility in
adjusting buffer parameters for improved performance.
This pull request introduces a mechanism to improve state lookup
efficiency in pathdb by maintaining a lookup structure that eliminates
unnecessary iteration over diff layers.
The core idea is to track a mutation history for each dirty state entry
residing in the diff layers. This history records the state roots of all layers
in which the entry was modified, sorted from oldest to newest.
During state lookup, this mutation history is queried to find the most
recent layer whose state root either matches the target root or is a
descendant of it. This allows us to quickly identify the layer containing
the relevant data, avoiding the need to iterate through all diff layers from
top to bottom.
Besides, the overhead for state lookup is constant, no matter how many
diff layers are retained in the pathdb, which unlocks the potential to hold
more diff layers.
Of course, maintaining this lookup structure introduces some overhead.
For each state transition, we need to:
(a) update the mutation records for the modified state entries, and
(b) remove stale mutation records associated with outdated layers.
On our benchmark machine, it will introduce around 1ms overhead which is
acceptable.