Features
The core principle of HypoFuzz is that it should be effortless to adopt: if you have Hypothesis tests, everything else is automatic. If you’re curious about what that “everything else” involves, this page is for you.
Collecting tests
HypoFuzz uses pytest to collect the test functions to fuzz, with almost identical command-line interfaces. If you’re using pytest ... to run your tests, hypothesis fuzz -- ... will fuzz them.
Note that tests which use pytest fixtures, including autouse fixtures, are not collected as they may behave differently outside of the pytest runtime. We recommend using a context manager and the with statement instead.
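As a minimal sketch of that recommendation, the test below does its setup and teardown with a context manager inside the test body instead of a fixture. The names scratch_dir and test_write_then_read_roundtrip are hypothetical and chosen only for illustration.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path

from hypothesis import given, strategies as st


@contextmanager
def scratch_dir():
    """Hypothetical stand-in for a pytest fixture that provides a temporary directory."""
    with tempfile.TemporaryDirectory() as name:
        yield Path(name)


@given(payload=st.binary())
def test_write_then_read_roundtrip(payload):
    # Setup and teardown happen inside the test body, once per example,
    # so the test does not rely on the pytest runtime to provide resources.
    with scratch_dir() as directory:
        path = directory / "payload.bin"
        path.write_bytes(payload)
        assert path.read_bytes() == payload
```

Because the resource is created inside the function, the test takes no fixture arguments and behaves the same however it is executed.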
Support for other test runners, such as unittest, is on our roadmap.
Execution model
HypoFuzz runs as one or more worker processes, by default one per available core, and an additional process serving the live dashboard as a website.
In each worker process, HypoFuzz prioritizes tests which discover new coverage, which maximises the rate of discovery and therefore minimises the time taken to cover each branch in your code. This adaptive approach is one of HypoFuzz’s advantages over other fuzzing workflows - and the reason you can apply it to a whole test suite at a time.
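To illustrate the general idea of prioritizing tests that keep discovering coverage, here is a toy greedy scheduler. It is not HypoFuzz's actual algorithm: the Target class, the estimated_rate heuristic, and the toy_run function are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    name: str
    runs: int = 0
    discoveries: int = 0
    seen: set = field(default_factory=set)

    def estimated_rate(self) -> float:
        # Optimistic prior: a target that has never run looks maximally promising.
        return (self.discoveries + 1) / (self.runs + 1)


def schedule(targets, run_once, budget):
    """Spend `budget` runs, each time on the target with the best estimated rate."""
    for _ in range(budget):
        best = max(targets, key=Target.estimated_rate)
        new = set(run_once(best)) - best.seen
        best.seen |= new
        best.discoveries += int(bool(new))
        best.runs += 1
    return targets


def toy_run(target):
    # Hypothetical stand-in for running one input and collecting branch coverage.
    if target.name == "shallow":
        return {"a"}  # always the same branch, so it saturates immediately
    return {f"b{target.runs}"}  # a fresh branch on every run


shallow, deep = schedule([Target("shallow"), Target("deep")], toy_run, budget=20)
print(shallow.runs, deep.runs)  # prints "2 18"
```

The saturated target stops receiving runs as soon as its estimated rate drops, and the remaining budget goes to the target that is still finding new branches.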
HypoFuzz dashboard
The HypoFuzz dashboard (an online demo is available) shows the current state of the overall fuzzing campaign, with a sub-page for each test that shows more information.
Fuzzer details
HypoFuzz is, compared to other fuzzers in the literature, a bizarre mixture of every technique that seems to work. Instead of being based on “one brilliant idea” (oversimplifying, AFL = “coverage-guided mutation”, [BohmePhamRoychoudhury19] = “bias towards rare branches”, etc.), we have a single simple goal: fuzzing your property-based test suite should be effortless.
Because HypoFuzz is designed to exploit features that already exist in Hypothesis, you can write tests which are designed to be fuzzed, but idiomatic @given tests already work just fine.
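One plausible reading of a test "designed to be fuzzed" is an ordinary @given test that also reports extra feedback through the standard Hypothesis functions event() and target(). The sketch below assumes that reading; the test name, the label text, and the choice of metric are illustrative only.

```python
from hypothesis import event, given, strategies as st, target


@given(xs=st.lists(st.integers(min_value=-1000, max_value=1000)))
def test_sorted_is_ordered(xs):
    # event() attaches a user-defined label to this example.
    event(f"empty={not xs}")
    # target() reports a real-valued metric; here, longer lists score higher.
    target(float(len(xs)), label="list length")

    result = sorted(xs)
    assert all(a <= b for a, b in zip(result, result[1:]))
    assert len(result) == len(xs)
```

Without the event() and target() calls this is still a valid property-based test, so the extra feedback is optional.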
Basic design
It’s a standard feedback-directed greybox fuzzer. The interesting parts are:
HypoFuzz tests Python code, not native executables
we exploit property-based tests to detect semantic bugs, not just crashes
we use Hypothesis to generate highly-structured and typically valid data
we leverage a wider variety of feedbacks than most fuzzers
we fuzz very many more targets than most fuzzing campaigns
Corpus distillation
We exploit Hypothesis’ world-class test-case reduction logic (“shrinking”) to maintain a seed pool of minimal covering examples for each branch - or other reason to retain a seed.
Those other reasons include user-defined labels via hypothesis.event(), real-valued metrics with hypothesis.target(), and more to come.
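The bookkeeping behind such a seed pool can be sketched as a mapping from each reason to the smallest input seen so far that triggers it. This is a simplified illustration, not HypoFuzz's implementation: it only compares inputs it has already seen and does not perform shrinking, and the function name, the byte-string inputs, and the length-then-lexicographic ordering are assumptions.

```python
def update_seed_pool(pool, data, reasons):
    """Keep, for each reason, the smallest input seen so far that triggers it.

    `pool` maps a reason (a branch, an event label, ...) to a bytes seed.
    Returns True if the pool changed.
    """
    changed = False
    for reason in reasons:
        current = pool.get(reason)
        # Shorter inputs win; ties are broken lexicographically.
        if current is None or (len(data), data) < (len(current), current):
            pool[reason] = data
            changed = True
    return changed


pool = {}
update_seed_pool(pool, b"\x05\x05\x05", {"branch-1", "branch-2"})
update_seed_pool(pool, b"\x07", {"branch-2"})
update_seed_pool(pool, b"\x09\x09", {"branch-1", "branch-2"})
print(pool["branch-1"], pool["branch-2"])  # prints b'\t\t' b'\x07'
```

Keying the pool by reason means a single input can serve as the minimal example for several branches or labels at once.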
Mutation logic
The mutation logic is minimum-viable at the moment. It works shockingly well, thanks to Hypothesis’ input structure, but substantial improvements are on the roadmap.
Ensemble fuzzing
HypoFuzz natively supports ensemble fuzzing [CJM+19] by periodically loading any new examples from the database. This works in --unsafe mode, where each test function might run in multiple fuzzer processes at the same time, and with other fuzzer tools leveraging e.g. the .hypothesis.fuzz_one_input hook.
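For reference, the fuzz_one_input hook is an attribute that Hypothesis attaches to every @given test, and it accepts a bytestring to use as the source of test data. The sketch below drives it with random bytes purely as a placeholder; in practice an external fuzzer would supply the buffers, and the test itself is a hypothetical example.

```python
import os

from hypothesis import given, strategies as st


@given(st.lists(st.integers()))
def test_reverse_twice_is_identity(xs):
    assert list(reversed(list(reversed(xs)))) == xs


# The hook takes a bytestring and runs the test body on the data it decodes to.
fuzz = test_reverse_twice_is_identity.hypothesis.fuzz_one_input

# Placeholder driver: an external fuzzing tool would normally choose these buffers.
for _ in range(100):
    fuzz(os.urandom(64))
```

Buffers that do not decode to a valid example are simply rejected, while a failing example raises the test's exception as usual.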
Ensemble fuzzing can also be modelled as a mixture of the ensembled behaviours, and HypoFuzz therefore attempts to run an adaptive mixture of all the useful behaviours we can implement. To the extent that this works, we get the benefits of ensembling and consume the minimum possible resources required to do so.