This pull request optimizes history indexing by splitting a single large
database
batch into multiple smaller chunks.
Originally, the indexer will resolve a batch of state histories and
commit all
corresponding index entries atomically together with the indexing
marker.
While indexing more state histories in a single batch improves
efficiency, excessively
large batches can cause significant memory issues.
To mitigate this, the pull request splits the mega-batch into several
smaller batches
and flushes them independently during indexing. However, this introduces
a potential
inconsistency that some index entries may be flushed while the indexing
marker is not,
and an unclean shutdown may leave the database in a partially updated
state.
This can corrupt index data.
To address this, head truncation is introduced. After a restart, any
excessive index
entries beyond the expected indexing marker are removed, ensuring the
index remains
consistent after an unclean shutdown.
`binary.AppendUvarint` offers better performance than using append
directly, because it avoids unnecessary memory allocation and copying.
In our case, it can increase the performance by +35.8% for the
`blockWriter.append` function:
```
benchmark old ns/op new ns/op delta
BenchmarkBlockWriterAppend-8 5.97 3.83 -35.80%
```
---------
Signed-off-by: jsvisa <delweng@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
The implementation of `parseIndexBlock` used a reverse loop with slice
appends to build the restart points, which was less cache-friendly and
involved unnecessary allocations and operations. In this PR we change
the implementation to read and validate the restart points in one single
forward loop.
Here is the benchmark test:
```bash
go test -benchmem -bench=BenchmarkParseIndexBlock ./triedb/pathdb/
```
The result as below:
```
benchmark old ns/op new ns/op delta
BenchmarkParseIndexBlock-8 52.9 37.5 -29.05%
```
about 29% improvements
---------
Signed-off-by: jsvisa <delweng@gmail.com>