HypoFuzz is both an active research project and a production-ready tool.
We believe that this dual nature is a strength for both: users get to leverage a tool based on cutting-edge research, but also ensure that the problems it solves are ones that really matter. There’s a lot of applied research out there which never makes the translation to practice - and we don’t intend to add to it.
HypoFuzz uses calendar-based versioning, with a
Because HypoFuzz is closely tied to Hypothesis internals, we recommend pinning your transitive dependencies using a tool like pip-compile, poetry, or pipenv. This makes it possible to reproduce old environments based only on your VCS history, and ensures that any compatibility problems in new releases can be handled when it suits you.
HypoFuzz’s direct dependencies are:
install_requires = [ "coverage >= 5.2.1", "dash >= 1.15.0", "hypothesis[cli] >= 5.29.4", "pandas >= 1.0.0", "psutil >= 3.0.0", "pytest >= 6.0.1", "requests >= 2.24.0", ]
API deprecation policy¶
The command-line interface will not make incompatible changes without at least three months notice, during which passing deprecated options will emit runtime warnings (e.g. via the terminal, displayed on the dashboard, etc.).
User-accesible integration points with Hypothesis, such as the
ExampleDatabase API, are considered
stable upstream and are unlikely to change even in major version releases.
If it is impractical to continue to support old versions of Hypothesis (or other dependencies) after a major update, the minimum supported version will be updated. If you would like to use new versions of HypoFuzz with old versions of its dependencies, please get in touch to discuss your specific needs.
See Summary of related research for notes on where some these ideas come from.
While we’re interested in the features described below, HypoFuzz remains a research project and plans may change as new information comes to light. Please get in touch to let us know which you would prioritize, or if we’re missing something important to you.
Workflow and development lifecycle¶
expand user-facing documentation, examples, tips, etc.
better reporting of test collection, e.g. tests skipped due to use of fixtures
warn about tests where found examples can’t be replayed because of settings decorator with derandomize=True or database=None; recommend profiles instead
support collecting tests with
unittestas well as pytest
exploit VCS metadata, i.e. target recently-changed parts of the SUT and new / recently changed tests (c.f. pypi-testmon)
Monitoring and reporting¶
The current dashboard is a good start, but there’s plenty of room for improvement.
Main wishlist item: good support for starting the dashboard and worker processes separately.
Cuumulative and per-input coverage reports like coverage html
Compare branch-hit frequency between blackbox and mutational modes, to assist in designing strategies (inspired by this blog post).
integrate a time-travelling debugger like https://pytrace.com/
implement predictive fuzzing like Pythia, and use that for prioritization (currently number of inputs since last discovery)
(maybe) support moving targets between processes; could be more efficient in the limit but constrains scaling. Randomised assignment on restart probably better.
Better mutation logic¶
structure-aware operators for crossover / replacement / splicing etc
validity-aware mutations (Zest/RLCheck), based on structure
Nezha-like differential coverage via dynamic contexts
Guiding towards what?¶
Typically, fuzzers haven’t really been guided towards new arcs (src-dest branches) as they have been good at exploiting them once found at random. We can do that, but we can probably also do better.
use CFG from coverage to tell if new branches are actually available from a given path. If not, we can hit it less often. Note that branch coverage != available bugs; the control flow graph is not identical to the behaviour partition of the program.
try using a custom trace function, investigate performance and use of alternative coverage metrics (e.g. length-n path segments, callstack-aware coverage, etc.)
fuzz arbitrary scores with