This PR modifies how the metrics library handles `Enabled`: previously,
the package `init` decided whether to serve real metrics or just
dummy-types.
This has several drawbacks:
- During pkg init, we need to determine whether metrics are enabled or
not. So we first hacked in a check if certain geth-specific
commandline-flags were enabled. Then we added a similar check for
geth-env-vars. Then we almost added a very elaborate check for
toml-config-file, plus toml parsing.
- Using "real" types and dummy types interchangeably means that
everything is hidden behind interfaces. This has a performance penalty,
and also it just adds a lot of code.
This PR removes the interface stuff, uses concrete types, and allows for
the setting of Enabled to happen later. It is still assumed that
`metrics.Enable()` is invoked early on.
The somewhat 'heavy' operations, such as ticking meters and exp-decay,
now check the enabled flag to prevent resource leaks.
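
As a rough illustration of the idea above (not the code in this PR), here is a minimal sketch of a concrete meter type whose periodic tick is skipped while metrics are disabled. The names `enabled`, `Meter`, `tick` and the fields are hypothetical; only `Enable()` is named in the description.

```golang
package main

import (
	"fmt"
	"sync/atomic"
)

// enabled is a hypothetical package-level flag; it can be set after init.
var enabled atomic.Bool

// Enable switches metrics collection on. It should be called early on.
func Enable() { enabled.Store(true) }

// Meter is a concrete type (no interface, no dummy implementation).
type Meter struct {
	uncounted atomic.Int64 // marks received since the last tick
	count     atomic.Int64 // total folded in by tick
}

// Mark records n events; it is cheap and safe for concurrent use.
func (m *Meter) Mark(n int64) { m.uncounted.Add(n) }

// tick stands in for the "heavy" periodic work. It returns immediately
// while metrics are disabled, so no work is done for unused meters.
func (m *Meter) tick() {
	if !enabled.Load() {
		return
	}
	m.count.Add(m.uncounted.Swap(0))
}

// Count returns the total number of events folded in so far.
func (m *Meter) Count() int64 { return m.count.Load() }

func main() {
	m := new(Meter)
	m.Mark(3)
	m.tick() // skipped: metrics are not enabled yet
	fmt.Println(m.Count())
	Enable()
	m.tick() // now the pending marks are folded in
	fmt.Println(m.Count())
}
```

Because the flag is read at the point of use rather than at package init, the decision to enable metrics can be made after flags, env vars, or config files have been parsed.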
The change may be large, but it's mostly pretty trivial, and from the
last time I gutted the metrics, I ensured that we have fairly good test
coverage.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
This change includes a lot of things, listed below.
The interfaces have been split up into one write-interface and one read-interface, with `Snapshot` being the gateway from write to read. This simplifies the semantics _a lot_.
Example of splitting up an interface into one readonly 'snapshot' part, and one updatable writeonly part:
```golang
type MeterSnapshot interface {
	Count() int64
	Rate1() float64
	Rate5() float64
	Rate15() float64
	RateMean() float64
}

// Meters count events to produce exponentially-weighted moving average rates
// at one-, five-, and fifteen-minutes and a mean rate.
type Meter interface {
	Mark(int64)
	Snapshot() MeterSnapshot
	Stop()
}
```
This PR makes the concurrency model clearer. We have actual meters and snapshots of meters. The `meter` is the thing which can be accessed from the registry, and updates can be made to it.
- For all `meters` (`Gauge`, `Timer` etc), it is assumed that they are accessed by different threads, making updates. Therefore, all update methods on `meters` (`Inc`, `Add`, `Update`, `Clear` etc) need to be concurrency-safe.
- All `meters` have a `Snapshot()` method. This method is _usually_ called from one thread, a backend-exporter. But it's fully possible to have several exporters simultaneously: therefore this method should also be concurrency-safe.
TLDR: `meter`s are accessible via registry, all their methods must be concurrency-safe.
For all `Snapshot`s, it is assumed that an individual exporter-thread has obtained a `meter` from the registry, and called the `Snapshot` method to obtain a readonly snapshot. This snapshot is _not_ guaranteed to be concurrency-safe. There's no need for a snapshot to be concurrency-safe, since exporters should not share snapshots.
Note, though, that by happenstance a lot of the snapshots _are_ concurrency-safe, being immutable minimal representations of a value. Only the more complex ones are _not_ threadsafe: those that lazily calculate things like `Variance()` and `Mean()`.
Example of how a background exporter typically works, obtaining the snapshot and sequentially accessing the non-threadsafe methods in it:
```golang
ms := metric.Snapshot()
...
fields := map[string]interface{}{
	"count":    ms.Count(),
	"max":      ms.Max(),
	"mean":     ms.Mean(),
	"min":      ms.Min(),
	"stddev":   ms.StdDev(),
	"variance": ms.Variance(),
}
```
TLDR: `snapshots` are not guaranteed to be concurrency-safe (but often are).
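
To make the write/read split concrete, here is a small sketch under stated assumptions: a gauge whose update methods are concurrency-safe via atomics, and whose `Snapshot()` returns an immutable value. The type and method names mirror the ones mentioned above, but the bodies are illustrative only and are not taken from this PR.

```golang
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// GaugeSnapshot is a read-only copy of a gauge value. Being an immutable
// value, it happens to be safe to share, although that is not required.
type GaugeSnapshot int64

// Value returns the value captured when the snapshot was taken.
func (s GaugeSnapshot) Value() int64 { return int64(s) }

// Gauge is the writable side; all of its methods are concurrency-safe.
type Gauge struct {
	value atomic.Int64
}

// Update replaces the gauge value.
func (g *Gauge) Update(v int64) { g.value.Store(v) }

// Inc adds delta to the gauge value.
func (g *Gauge) Inc(delta int64) { g.value.Add(delta) }

// Snapshot is the gateway from the write side to the read side.
func (g *Gauge) Snapshot() GaugeSnapshot { return GaugeSnapshot(g.value.Load()) }

func main() {
	g := new(Gauge)
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() { // concurrent writers
			defer wg.Done()
			g.Inc(1)
		}()
	}
	wg.Wait()

	snap := g.Snapshot() // the exporter takes a snapshot
	g.Inc(100)           // later updates do not affect it
	fmt.Println(snap.Value(), g.Snapshot().Value())
}
```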
I also changed the `Sample` type: previously, it iterated the samples fully every time `Mean()`, `Sum()`, `Min()` or `Max()` was invoked. Since we now have readonly base data, we can just iterate it once, in the constructor, and set all four values at once.
The same thing has been done for runtimehistogram.
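
A minimal sketch of that single-pass approach, assuming a hypothetical `sampleSnapshot` type and `newSampleSnapshot` constructor (these names are illustrative, not the ones used in the PR):

```golang
package main

import "fmt"

// sampleSnapshot holds read-only sample data plus statistics that are
// computed exactly once, when the snapshot is constructed.
type sampleSnapshot struct {
	values        []int64
	min, max, sum int64
	mean          float64
}

// newSampleSnapshot copies the values and derives min, max, sum and mean
// in a single pass, so the getters never need to iterate again.
func newSampleSnapshot(values []int64) *sampleSnapshot {
	s := &sampleSnapshot{values: append([]int64(nil), values...)}
	if len(values) == 0 {
		return s
	}
	s.min, s.max = values[0], values[0]
	for _, v := range values {
		s.sum += v
		if v < s.min {
			s.min = v
		}
		if v > s.max {
			s.max = v
		}
	}
	s.mean = float64(s.sum) / float64(len(values))
	return s
}

func (s *sampleSnapshot) Min() int64    { return s.min }
func (s *sampleSnapshot) Max() int64    { return s.max }
func (s *sampleSnapshot) Sum() int64    { return s.sum }
func (s *sampleSnapshot) Mean() float64 { return s.mean }

func main() {
	s := newSampleSnapshot([]int64{4, 1, 7})
	fmt.Println(s.Min(), s.Max(), s.Sum(), s.Mean())
}
```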
Back when ResettingTimer was implemented, as part of https://github.com/ethereum/go-ethereum/pull/15910, Anton implemented a `Percentiles` method on the new type. However, the method did not conform to the other existing types which also had a `Percentiles`.
1. The existing ones, on input, took `0.5` to mean `50%`. Anton used `50` to mean `50%`.
2. The existing ones returned `float64` outputs, thus interpolating between values. A value-set of `0, 10`, at `50%` would return `5`, whereas Anton's would return either `0` or `10`.
This PR removes the 'new' version, and uses only the 'legacy' percentiles, also for the ResettingTimer type.
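
To illustrate the 'legacy' semantics described above, here is a sketch of an interpolating percentile that takes `0.5` to mean `50%`. The function name `percentile` and the exact interpolation rule are assumptions chosen to reproduce the `0, 10` example; they are not necessarily the library's exact code.

```golang
package main

import "fmt"

// percentile returns the p-th percentile (p in [0, 1]) of an already
// sorted slice, interpolating linearly between neighbouring values.
func percentile(sorted []float64, p float64) float64 {
	size := len(sorted)
	if size == 0 {
		return 0
	}
	// Position in 1-based rank space; fractional part drives interpolation.
	pos := p * float64(size+1)
	switch {
	case pos < 1:
		return sorted[0]
	case pos >= float64(size):
		return sorted[size-1]
	}
	idx := int(pos) // truncates; pos is in [1, size) here
	lower, upper := sorted[idx-1], sorted[idx]
	return lower + (pos-float64(idx))*(upper-lower)
}

func main() {
	// A value set of 0, 10 at 50% interpolates to 5.
	fmt.Println(percentile([]float64{0, 10}, 0.5))
}
```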
The resetting timer snapshot was also defined so that it would expose the internal values. This has been removed, and getters for `Max, Min, Mean` have been added instead.
A lot of types were exported, but do not need to be. This PR unexports quite a lot of them.
metrics: refactor metrics (28035)
* intro new timeout (#651)
* intro new timeout
* correct comment
* disable ProcessForensics
* disable ProcessForensics
* change version
* enable periodicProfilingFlag
* fix: ignore old timeout msg
* fix: ignore old timeout msg including equal to the current round
* update version file
* move masternode in v2 config
* update number to meet 7 vote for current setup
* add test
* update all failed test
* fix test
* remove comment
* remove comment
* fix test
This PR makes committed blocks non-reorg-able inside the `Blockchain` struct. This upholds the V2 consensus safety property with respect to the blockchain head: the state of committed blocks will never be reorged, and users will always see committed blocks (or their child blocks) as the current head of the blockchain.
* stop reorg at committed blocks
* fix tests
* fix tests
* V2 truncate MaxMasternodes from candidates after penalty,
V1 same as before
TestUpdateMultipleMasterNodes: test V2, in snapshot we have all candidates, but at epoch switch, we pick MaxMasternodes
* code looks better
* Fix issue when resync is not getting the right consensus config values
* add test and fix log bug
* fix test
* delete temp file
Co-authored-by: Liam Lai <liam.icheng.lai@gmail.com>
* fix wrong config hash and update v2 params on mainnet
* update config and all the test
* hard code binary into code
* add default config for testing
* update test timestamp
* update forensics proof data structure to accommodate vote type
* refactor log
* change blocknum type to uint64
* fix test
Co-authored-by: Liam Lai <liam.icheng.lai@gmail.com>
* process forensics
* Found common signers at same round for forensics
* find attackers
* add test for forensics
* run setCommittedQCs after processForensics
* clean up the pool old round
* add unit test to cover the vote key format
* add gapNumber to the vote pool key
* fix race condition in pool
* remove verify gap number in vote handler
* typo and checkYourturnWithinFinalisedMasternodes func name to yourturn
* remove redundant code from verifyQC
* Verify QC to optionally pass parent header. This is used to help verifyHeaders
* move difficulty into its own file
* verify header including validator
* re-structure v1 v2 tests
* remove unused test function
* add test to check coinbase and validator address matches
* refactor engine v2 to group private functions into same file
* v2 Hook Reward, need test
* test reward
* fix RewardHook due to modifying params config directly (#56)
* more test
* finish test
Co-authored-by: Jerome <wjrjerome@gmail.com>
* move config into code
* set devnet switch block number very high
* increase timeout and certThreshold for devnet config
Co-authored-by: Jianrong <wjrjerome@gmail.com>
* fix vote and block insertion race condition
* fix race condition in the vote handler using multiple go routine
* check go routine race condition during ci cd
* remove race check as there are eth code that is failing
* remove unused signature list variable
* add isEpochSwitch function and refactor utils
* fix broken first v2 epoch switch block
* use adaptor epoch switch function to determine v1 v2 epoch switch block
* add test for the GetMasternodesByNumber and GetCurrentEpochSwitchBlock function
* add v2 test for isAuthroisedAddress
* Use GetCurrentEpochSwitchBlock in findNearestSignedBlock api
* New struct in consensus/XDPoS/utils/types.go, util functions, and test. (#14)
* define vote, timeout, sync info, qc, tc, extra fields in types.go, add test in types_test.go
* add json tag in types.go, refine encoder decoder of extra fields
* refactor types.go utils.go
* re-write types, comments
* add Hash SigHash for types, and tests
* define Round type
* remove unnecessary logs
* add v2 engine functions placeholder
* typo fix on the consensus v2 function placeholders
* add countdown timer
* make initilised private to countdown
* add v2 specific config struct
* rename some config variables
* Implement BFT Message receiver (#13)
* fix or skip tests due to PR-136 changes
* add bft receiver functions
* add bft receiver functions
* rename tc to TimeoutCert
* implement more functions
* New struct in consensus/XDPoS/utils/types.go, util functions, and test. (#14)
* define vote, timeout, sync info, qc, tc, extra fields in types.go, add test in types_test.go
* add json tag in types.go, refine encoder decoder of extra fields
* refactor types.go utils.go
* re-write types, comments
* add Hash SigHash for types, and tests
* define Round type
* remove unnecessary logs
* add temp functions
* add v2 engine functions placeholder
* typo fix on the consensus v2 function placeholders
* add countdown timer
* make initilised private to countdown
* push verify function
* add test on receiving vote
* revert type change
* add async on broadcast function
* add quit initial
* fix test
Co-authored-by: Jianrong <wjrjerome@gmail.com>
Co-authored-by: wgr523 <wgr523@gmail.com>
* generate and verify timeout message
* Consensus V2 variable, timeout pool (#19)
* fill in XDPoS_v2 variables and processQC/TC
* add timeout pool, refine engine variables
* refactor type functions
* solve a small pointer bug
* create general pool and its test, refine engine
* refine pool, add xdpos v2 config cert threshold
* refine config
* vote and timeout handlers
* fix pool test
* bft miner preparation
* review comment improvement
* update
* relocate tests
* add and remove comment
* fix the syntax error
* update network layer and add handler functions (#23)
* update network layer and add handler functions
* fix test syntax error
* add ProcessQC implementation
* add ProcessQC tests
* add snapshot test
* add wait qc process
* remove testing files
* add route snapshot
* fix merge issue
* add default v2 behaviour (#24)
* add v2 ecrecover functions and refactor test
* fix all the tests
* put minimum lock variable
* debugging prepare and seal v2 blocks
* Trigger proposeBlockHandler after v2 block received and verified in fetcher
* skip snapshot apply related tests
* update test check
* rename bfter to bft handler and ignore normal behaviour
* fix bugs during local 4 node run
* fix test
* fix sync info test
* fix bugs during local 4 node run
* rebase and fix bug
* remove hook validators function
Co-authored-by: wgr523 <wgr523@gmail.com>
Co-authored-by: Jianrong <wjrjerome@gmail.com>