I removed `Iterator.Count` in #33840, because it appeared to be unused
and did not provide the documented invariant: the returned count should
always be an upper bound on the number of iterations allowed by `Next`.
In order to make `Count` work, the semantics of `CountValues` has to
change to return the number of items up and including the invalid one. I
have reviewed all callsites of `CountValues` to assess if changing this
is safe. There aren't that many, and the only call that doesn't check
the error and return is in the trie node parser,
`trie.decodeNodeUnsafe`. There, we distinguish the node type based on
the number of items, and it previously returned an error for item count
zero. In order to avoid any potential issue that could result from this
change, I'm adding an error check in that function, though it isn't
necessary.
This changes `RawList` to ensure the count of items is always valid.
Lists with invalid structure, i.e. ones where an element exceeds the
size of the container, are now detected during decoding of the `RawList`
and thus cannot exist.
Also remove `RawList.Empty` since it is now fully redundant, and
`Iterator.Count` since it returns incorrect results in the presence of
invalid input. There are no callers of these methods (yet).
Most uses of the iterator are like this:
it, _ := rlp.NewListIterator(data)
for it.Next() {
do(it.Value())
}
This doesn't require the iterator to be a pointer and it's better to
have it stack-allocated. AFAIK the compiler cannot prove it is OK to
stack-allocate when it is returned as a pointer because the methods of
`Iterator` use pointer receiver and also mutate the object.
The iterator type was not exported until very recently, so I think it is
still OK to change this API.
This adds a new type wrapper that decodes as a list, but does not
actually decode the contents of the list. The type parameter exists as a
marker, and enables decoding the elements lazily. RawList can also be
used for building a list incrementally.
This pull request introduces a mechanism to compress trienode history by
storing only the node diffs between consecutive versions.
- For full nodes, only the modified children are recorded in the history;
- For short nodes, only the modified value is stored;
If the node type has changed, or if the node is newly created or
deleted, the entire node value is stored instead.
To mitigate the overhead of reassembling nodes from diffs during history
reads, checkpoints are introduced by periodically storing full node values.
The current checkpoint interval is set to every 16 mutations, though
this parameter may be made configurable in the future.
The list iterator previously returned true on parse errors without
advancing the input, which could lead to non-advancing infinite loops
for callers that do not check Err() inside the loop; to make iteration
safe while preserving error visibility, Next() now marks the iterator as
finished when readKind fails, returning true for the error step so
existing users that check Err() can handle it, and then false on
subsequent calls, and the function comment was updated to document this
behavior and the need to check Err().
Changelog: https://golangci-lint.run/product/changelog/#1610
Removes `exportloopref` (no longer needed), replaces it with
`copyloopvar` which is basically the opposite.
Also adds:
- `durationcheck`
- `gocheckcompilerdirectives`
- `reassign`
- `mirror`
- `tenv`
---------
Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>
* rlp/rlpgen: remove build tag
This tag was supposed to prevent unstable output when types reference each other. Imagine
there are two struct types A and B, where a reference to type B is in A. If I run rlpgen
on type B first, and then on type A, the generator will see the B.EncodeRLP method and
call it. However, if I run rlpgen on type A first, it will inline the encoding of B.
The solution I chose for the initial release of rlpgen was to just ignore methods
generated by rlpgen using a build tag. But there is a problem with this: if any code in
the package calls EncodeRLP explicitly, the package can't be loaded without errors anymore
in rlpgen, because the loader ignores it. Would be nice if there was a way to just make it
ignore invalid functions during type checking (they're not necessary for rlpgen), but
golang.org/x/tools/go/packages does not provide a way of ignoring them.
Luckily, the types we use rlpgen with do not reference each other right now, so we can
just remove the build tags for now.
This adds built-in support in package rlp for encoding, decoding and generating code dealing with uint256.Int.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
Instead of using a limit of three nodes per message, we can pack more nodes
into each message based on ENR size. In my testing, this halves the number
of sent NODES messages, because ENR size is usually < 300 bytes.
This also adds RLP helper functions that compute the encoded size of
[]byte and string.
Co-authored-by: Martin Holst Swende <martin@swende.se>
This changes the CI / release builds to use the latest Go version. It also
upgrades golangci-lint to a newer version compatible with Go 1.19.
In Go 1.19, godoc has gained official support for links and lists. The
syntax for code blocks in doc comments has changed and now requires a
leading tab character. gofmt adapts comments to the new syntax
automatically, so there are a lot of comment re-formatting changes in this
PR. We need to apply the new format in order to pass the CI lint stage with
Go 1.19.
With the linter upgrade, I have decided to disable 'gosec' - it produces
too many false-positive warnings. The 'deadcode' and 'varcheck' linters
have also been removed because golangci-lint warns about them being
unmaintained. 'unused' provides similar coverage and we already have it
enabled, so we don't lose much with this change.
This enables the following linters
- typecheck
- unused
- staticcheck
- bidichk
- durationcheck
- exportloopref
- gosec
WIth a few exceptions.
- We use a deprecated protobuf in trezor. I didn't want to mess with that, since I cannot meaningfully test any changes there.
- The deprecated TypeMux is used in a few places still, so the warning for it is silenced for now.
- Using string type in context.WithValue is apparently wrong, one should use a custom type, to prevent collisions between different places in the hierarchy of callers. That should be fixed at some point, but may require some attention.
- The warnings for using weak random generator are squashed, since we use a lot of random without need for cryptographic guarantees.
This change speeds up trie hashing and all other activities that require
RLP encoding of trie nodes by approximately 20%. The speedup is achieved by
avoiding reflection overhead during node encoding.
The interface type trie.node now contains a method 'encode' that works with
rlp.EncoderBuffer. Management of EncoderBuffers is left to calling code.
trie.hasher, which is pooled to avoid allocations, now maintains an
EncoderBuffer. This means memory resources related to trie node encoding
are tied to the hasher pool.
Co-authored-by: Felix Lange <fjl@twurst.com>
This change adds a code generator tool for creating EncodeRLP method
implementations. The generated methods will behave identically to the
reflect-based encoder, but run faster because there is no reflection overhead.
Package rlp now provides the EncoderBuffer type for incremental encoding. This
is used by generated code, but the new methods can also be useful for
hand-written encoders.
There is also experimental support for generating DecodeRLP, and some new
methods have been added to the existing Stream type to support this. Creating
decoders with rlpgen is not recommended at this time because the generated
methods create very poor error reporting.
More detail about package rlp changes:
* rlp: externalize struct field processing / validation
This adds a new package, rlp/internal/rlpstruct, in preparation for the
RLP encoder generator.
I think the struct field rules are subtle enough to warrant extracting
this into their own package, even though it means that a bunch of
adapter code is needed for converting to/from rlpstruct.Type.
* rlp: add more decoder methods (for rlpgen)
This adds new methods on rlp.Stream:
- Uint64, Uint32, Uint16, Uint8, BigInt
- ReadBytes for decoding into []byte
- MoreDataInList - useful for optional list elements
* rlp: expose encoder buffer (for rlpgen)
This exposes the internal encoder buffer type for use in EncodeRLP
implementations.
The new EncoderBuffer type is a sort-of 'opaque handle' for a pointer to
encBuffer. It is implemented this way to ensure the global encBuffer pool
is handled correctly.
* core: fix warning flagging the use of DeepEqual on error
* apply the same change everywhere possible
* revert change that was committed by mistake
* fix build error
* Update config.go
* revert changes to ConfigCompatError
* review feedback
Co-authored-by: Felix Lange <fjl@twurst.com>
As per benchmark results below, these changes speed up encoding/decoding of
consensus objects a bit.
name old time/op new time/op delta
EncodeRLP/legacy-header-8 384ns ± 1% 331ns ± 3% -13.83% (p=0.000 n=7+8)
EncodeRLP/london-header-8 411ns ± 1% 359ns ± 2% -12.53% (p=0.000 n=8+8)
EncodeRLP/receipt-for-storage-8 251ns ± 0% 239ns ± 0% -4.97% (p=0.000 n=8+8)
EncodeRLP/receipt-full-8 319ns ± 0% 300ns ± 0% -5.89% (p=0.000 n=8+7)
EncodeRLP/legacy-transaction-8 389ns ± 1% 387ns ± 1% ~ (p=0.099 n=8+8)
EncodeRLP/access-transaction-8 607ns ± 0% 581ns ± 0% -4.26% (p=0.000 n=8+8)
EncodeRLP/1559-transaction-8 627ns ± 0% 606ns ± 1% -3.44% (p=0.000 n=8+8)
DecodeRLP/legacy-header-8 831ns ± 1% 813ns ± 1% -2.20% (p=0.000 n=8+8)
DecodeRLP/london-header-8 824ns ± 0% 804ns ± 1% -2.44% (p=0.000 n=8+7)
* rlp: pass length to byteArrayBytes
This makes it possible to inline byteArrayBytes. For arrays, the length is known
at encoder construction time, so the call to v.Len() can be avoided.
* rlp: avoid IsNil for pointer encoding
It's actually cheaper to use Elem first, because it performs less checks
on the value. If the pointer was nil, the result of Elem is 'invalid'.
* rlp: minor optimizations for slice/array encoding
For empty slices/arrays, we can avoid storing a list header entry in the
encoder buffer. Also avoid doing the tail check at encoding time because
it is already known at encoder construction time.