Benchmarking the ethdebug implementation #16537
Open
Labels
ethdebug
high effort: A lot to implement but still doable by a single person. The task is large or difficult.
high impact: Changes are very prominent and affect users or the project in a major way.
must have: Something we consider an essential part of Solidity 1.0.
Abstract
Stringing along debugging data must not have a significant impact on the time it takes to invoke the compiler when debugging info is switched off. Any regression must be justified.
To ensure this, benchmarks should be run regularly.
Specification
Benchmarking real-world contracts can be slow for big projects, which are also the most representative ones. For faster feedback loops, performance proxy contracts / source files should be devised, e.g., by programmatically generating code. These are expected to be smaller and faster to compile while still giving a reasonable indication of the expected outcome of benchmarks run on real-world data. Such proxy contracts should explore the characteristics of real-world contracts in terms of structural patterns (e.g., deeply nested vs. wide) as well as feature usage (e.g., heavy use of ABI, storage, control flow, etc.).
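As a rough illustration of what programmatic generation could look like, the following sketch emits synthetic Solidity contracts parameterized by nesting depth and width, so the "deeply nested" and "wide" structural shapes can be benchmarked separately. The generator, its name, and its parameters are hypothetical, not an existing tool in the repository.

```python
# Hypothetical generator for synthetic benchmark sources; the shape
# parameters (depth, width) are illustrative, not part of any existing tool.

def generate_contract(name: str, depth: int, width: int) -> str:
    """Emit a Solidity source whose functions nest `depth` if-blocks and
    whose contract body repeats `width` such functions, so that deeply
    nested vs. wide shapes can be generated from the same template."""
    functions = []
    for i in range(width):
        body = f"x = x + {i};"
        for level in range(depth):
            # Wrap the body in another conditional to increase nesting depth.
            body = f"if (x > {level}) {{ {body} }}"
        functions.append(
            f"    function f{i}(uint x) public pure returns (uint) {{\n"
            f"        {body}\n"
            f"        return x;\n"
            f"    }}"
        )
    return (
        "// SPDX-License-Identifier: MIT\n"
        "pragma solidity ^0.8.0;\n"
        f"contract {name} {{\n" + "\n".join(functions) + "\n}\n"
    )

deep = generate_contract("DeepNested", depth=8, width=2)  # nested shape
wide = generate_contract("WideFlat", depth=1, width=50)   # wide shape
```

Feature-usage dimensions (heavy ABI, storage, control flow) could be covered analogously by additional templates.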
For bigger PRs and changes, run the real-world benchmarks, too. For these, a single standard-json file should be extracted from each external benchmark project so that costs and variances inherent to the frameworks are not measured.
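A minimal sketch of building such a standard-json input, assuming the project's sources have already been collected into a name-to-content mapping (e.g., by globbing `*.sol` files); the settings shown are placeholder assumptions, not the issue's prescribed configuration:

```python
import json

def build_standard_json(sources: dict) -> str:
    """Collapse a project's sources into one solc standard-json request,
    so benchmarks measure only the compiler invocation, not the framework.
    `sources` maps virtual file names to Solidity source text."""
    request = {
        "language": "Solidity",
        "sources": {name: {"content": text} for name, text in sources.items()},
        "settings": {
            # Placeholder settings; a real extraction would mirror the
            # project's own compiler configuration.
            "optimizer": {"enabled": True, "runs": 200},
            "outputSelection": {"*": {"*": ["evm.bytecode"]}},
        },
    }
    return json.dumps(request, indent=2)

request = build_standard_json(
    {"A.sol": "pragma solidity ^0.8.0; contract A {}"}
)
```

The resulting file can then be fed directly to `solc --standard-json`.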
All the benchmarks themselves run on a dedicated machine with minimal external load and an as-stable-as-possible CPU frequency to produce reproducible results, triggered manually and/or via labels. Each benchmark should run at least three times (ideally more often, depending on invocation speed) to obtain more reliable data. Report at least the median, mean, and variance.
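The repeated-run reporting could be sketched as follows; the benchmark callable and repetition count are placeholders:

```python
import statistics
import time

def benchmark(run, repetitions: int = 3) -> dict:
    """Invoke `run` at least three times and report the median, mean,
    and variance of the wall-clock durations in seconds."""
    durations = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run()
        durations.append(time.perf_counter() - start)
    return {
        "median": statistics.median(durations),
        "mean": statistics.mean(durations),
        "variance": statistics.variance(durations),
    }

# Placeholder workload standing in for a compiler invocation.
report = benchmark(lambda: sum(range(10_000)), repetitions=5)
```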
The scripts to run the benchmarks should be structured so that they can also be run locally with ease, to inform about performance implications while, e.g., developing code. To this end, there should be a single entry point for running performance tests, with the ability to select individual benchmarks (real-world as well as generated), the full suite, or a subset thereof.
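Such an entry point might parse its selection like the sketch below; the benchmark names and registry are illustrative assumptions, not an existing script in the repository:

```python
import argparse

# Hypothetical registry mapping benchmark names to runner callables.
BENCHMARKS = {
    "generated-nested": lambda: None,  # placeholder runners
    "generated-wide": lambda: None,
    "real-world-project": lambda: None,
}

def select_benchmarks(argv=None) -> list:
    """Resolve the command line to the list of benchmarks to run:
    no arguments or 'all' selects the full suite; otherwise run the
    named subset."""
    parser = argparse.ArgumentParser(
        description="Run compiler performance benchmarks"
    )
    parser.add_argument(
        "names", nargs="*", default=[],
        help="benchmarks to run; omit or pass 'all' for the full suite",
    )
    args = parser.parse_args(argv)
    unknown = set(args.names) - set(BENCHMARKS) - {"all"}
    if unknown:
        parser.error(f"unknown benchmarks: {sorted(unknown)}")
    if not args.names or "all" in args.names:
        return list(BENCHMARKS)
    return args.names

selected = select_benchmarks(["generated-nested", "generated-wide"])
```

The same selection logic would serve both CI (label-triggered runs) and local development.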