This was somehow wrong and corrected by codegen later on, but we should be reusing the same generic id across an entire definition when the variable refers to the same element.
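To illustrate the intended behavior, here's a minimal sketch (the names
below are hypothetical stand-ins, not Aiken's actual inference code):
one id table per definition guarantees that repeated occurrences of the
same type variable share one generic id.

```rust
use std::collections::HashMap;

/// One table per definition: every occurrence of the same type
/// variable within that definition resolves to the same generic id.
struct GenericIds {
    next: u64,
    by_name: HashMap<String, u64>,
}

impl GenericIds {
    fn new() -> Self {
        GenericIds { next: 0, by_name: HashMap::new() }
    }

    /// Return the existing id for `name`, or allocate a fresh one.
    fn get_or_insert(&mut self, name: &str) -> u64 {
        if let Some(&id) = self.by_name.get(name) {
            return id;
        }
        let id = self.next;
        self.next += 1;
        self.by_name.insert(name.to_string(), id);
        id
    }
}

fn main() {
    let mut ids = GenericIds::new();
    // In `fn pick(a: a, b: a) -> a`, all three `a`s share one id.
    let first = ids.get_or_insert("a");
    let again = ids.get_or_insert("a");
    assert_eq!(first, again);
}
```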
Before this commit, we would parse 'Pair' as a user-defined
data-type, and thus piggyback on the whole record system. While
perhaps handy for some things, it's also semantically wrong and
induces a lot more complexity in codegen, which then needs to
systematically distinguish every data-type access between pairs and
everything else.
So it's better to have it as a separate expression, and handle it
similarly to tuples (since it's fundamentally a 2-tuple with a special
serialization).
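For illustration, here's a minimal sketch of what "a separate
expression" could look like at the AST level (simplified stand-in
types, not the real Aiken AST):

```rust
/// `Pair` gets its own constructor next to `Tuple` instead of being
/// routed through the user-defined record machinery.
enum UntypedExpr {
    Var(String),
    Tuple(Vec<UntypedExpr>),
    // Fundamentally a 2-tuple, but kept distinct because it
    // serializes to a PlutusData pair rather than a list.
    Pair(Box<UntypedExpr>, Box<UntypedExpr>),
}

fn main() {
    let pair = UntypedExpr::Pair(
        Box::new(UntypedExpr::Var("key".into())),
        Box::new(UntypedExpr::Var("value".into())),
    );
    // Codegen can now branch on the dedicated variant directly.
    match &pair {
        UntypedExpr::Pair(_, _) => println!("pair serialization"),
        _ => println!("other expression"),
    }
}
```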
This makes the search for counterexamples slower in some cases by 30-40%, with the hope of finding better counterexamples. We might want to add a '--simplification-level' flag to the command line to let users decide on the level of simplification.
And move some logic out of project/lib to live near the CheckedModule
instead. The project API is already quite heavy and long, so making it
more lightweight is generally the direction we want to tend toward.
This change ensures that we only compile modules from dependencies
that are used (directly or transitively) in the project. This allows
discarding entire compilation steps at the module level, for modules
that we do not use.
The main goal of this change isn't performance. It's about making
dependency management slightly easier while we decide whether and how
we want to manage transitive dependencies in Aiken.
A concrete case here is aiken-lang/stdlib, which will soon depend on
aiken-lang/fuzz. However, we do not want to require every single
project depending on stdlib to also require fuzz. So instead, we want
to segregate the fuzz API from stdlib into separate modules, and only
compile those if they appear in the pruned dependency graph.
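To make the pruning idea concrete, here's a minimal sketch (module
names and the graph representation are illustrative, not Aiken's
internals): starting from the project's own modules, walk import edges
and keep only the dependency modules that are reachable.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Keep only modules reachable from the project's own modules by
/// following import edges; everything else is never compiled.
fn prune(
    roots: &[&str],
    imports: &HashMap<&str, Vec<&str>>,
) -> HashSet<String> {
    let mut keep = HashSet::new();
    let mut queue: VecDeque<&str> = roots.iter().copied().collect();
    while let Some(module) = queue.pop_front() {
        if keep.insert(module.to_string()) {
            for dep in imports.get(module).into_iter().flatten().copied() {
                queue.push_back(dep);
            }
        }
    }
    keep
}

fn main() {
    let mut imports: HashMap<&str, Vec<&str>> = HashMap::new();
    imports.insert("my_project/validator", vec!["aiken/list"]);
    imports.insert("aiken/list", vec![]);
    // Never imported, so pruned from the compilation plan.
    imports.insert("aiken/fuzz_helpers", vec![]);
    let keep = prune(&["my_project/validator"], &imports);
    assert!(keep.contains("aiken/list"));
    assert!(!keep.contains("aiken/fuzz_helpers"));
}
```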
While the goal isn't performance, here are some benchmarks analyzing
the impact of dependency pruning on a simple project that depends on a
few modules from stdlib:
Benchmark 1: ./aiken-without-deps-pruning check scratchpad
  Time (mean ± σ):     190.3 ms ± 101.1 ms    [User: 584.5 ms, System: 14.2 ms]
  Range (min … max):   153.0 ms … 477.7 ms    10 runs

Benchmark 2: ./aiken-with-deps-pruning check scratchpad
  Time (mean ± σ):     162.3 ms ±  46.3 ms    [User: 572.6 ms, System: 14.0 ms]
  Range (min … max):   142.8 ms … 293.7 ms    10 runs
As we can see, this change seems to have an overall positive impact on
the compilation time.
It might be slightly cleaner and more extensible to change this to return a summary, potentially even having it track the tests, coverage, etc. so it can be serialized to JSON. But for now, this is much simpler, and the approach that KtorZ suggested.
We'll piggyback on the tracing capabilities of the VM to provide labelling for prop tests. To ensure we do not interfere with normal traces, we only count traces that start with a NUL byte as labels. That convention is assumed to be known by the companion fuzz library, which should then provide the labelling capabilities as a dedicated function.
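For illustration, a minimal sketch of the counting convention (the
trace source here is simulated, and the function name is hypothetical):

```rust
use std::collections::HashMap;

/// Traces beginning with a NUL byte are treated as labels for
/// property tests; everything else is a normal user trace and is
/// left alone.
fn count_labels(traces: &[String]) -> HashMap<String, usize> {
    let mut labels = HashMap::new();
    for trace in traces {
        if let Some(label) = trace.strip_prefix('\0') {
            *labels.entry(label.to_string()).or_insert(0) += 1;
        }
    }
    labels
}

fn main() {
    let traces = vec![
        "\0even".to_string(),
        "plain debugging trace".to_string(), // not counted
        "\0even".to_string(),
    ];
    assert_eq!(count_labels(&traces).get("even"), Some(&2));
}
```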
I've been benchmarking this by shrinking 'large' lists, and the cache brings about a 1.5x speed increase. For small and simple cases, the cache has no visible effect (positive or negative).
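For illustration, here's a minimal sketch of the caching idea (the
shrinking step below is a stand-in, not the real shrinker): results
are memoized by input, which pays off when shrinking repeatedly
revisits the same large values.

```rust
use std::collections::HashMap;

/// Memoizes shrink candidates by input value, so revisiting the same
/// value during the search is a cheap map lookup.
struct Shrinker {
    cache: HashMap<Vec<u8>, Vec<Vec<u8>>>,
}

impl Shrinker {
    fn shrink(&mut self, value: &[u8]) -> Vec<Vec<u8>> {
        if let Some(hit) = self.cache.get(value) {
            return hit.clone();
        }
        // Stand-in candidate generation: drop one element at a time.
        let candidates: Vec<Vec<u8>> = (0..value.len())
            .map(|i| {
                let mut smaller = value.to_vec();
                smaller.remove(i);
                smaller
            })
            .collect();
        self.cache.insert(value.to_vec(), candidates.clone());
        candidates
    }
}

fn main() {
    let mut shrinker = Shrinker { cache: HashMap::new() };
    let first = shrinker.shrink(&[1, 2, 3]);
    let second = shrinker.shrink(&[1, 2, 3]); // served from the cache
    assert_eq!(first, second);
}
```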
Before this commit, we would always show the 'declared form' of type aliases, with their generic, non-instantiated parameters. This now tries to unify the annotation with the underlying inferred type to provide even better alias pretty printing.
Using ByteArrays as vectors on-chain is a lot more efficient than relying on an actual Data list of values. From the Rust end, it doesn't change much, as we were already manipulating vectors anyway.
Also, this commit makes `apply_term` automatically re-intern the
program, since it isn't safe to apply just any term onto a UPLC
program. In particular, terms that introduce new let-bindings (via
lambdas) will mess with the already-generated de Bruijn indices.
The problem doesn't occur for pure constant terms like Data, so we
still have a safe and fast 'apply_data' version for when it's needed.
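To illustrate why re-interning is needed, here's a toy sketch
(simplified types, not the uplc crate's real ones): de Bruijn indices
are only valid for the exact tree they were computed on, so after
grafting a new term in, they are recomputed from a named form rather
than trusted.

```rust
#[derive(Clone)]
enum Named {
    Var(String),
    Lam(String, Box<Named>),
    App(Box<Named>, Box<Named>),
}

/// Recompute de Bruijn indices from scratch ("re-intern"): each
/// variable's index is its distance to the binder that introduced it.
fn to_debruijn(term: &Named, scope: &mut Vec<String>) -> String {
    match term {
        Named::Var(name) => {
            let depth = scope.iter().rev().position(|n| n == name).unwrap();
            format!("{depth}")
        }
        Named::Lam(name, body) => {
            scope.push(name.clone());
            let body = to_debruijn(body, scope);
            scope.pop();
            format!("(lam {body})")
        }
        Named::App(f, x) => {
            format!("[{} {}]", to_debruijn(f, scope), to_debruijn(x, scope))
        }
    }
}

fn main() {
    // Grafting happens on the named form; indices are then recomputed
    // so binders introduced by the applied term can't corrupt old ones.
    let id = Named::Lam("x".into(), Box::new(Named::Var("x".into())));
    let applied = Named::App(Box::new(id.clone()), Box::new(id));
    println!("{}", to_debruijn(&applied, &mut Vec::new()));
}
```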
This was a mess, to say the least. The mess started when we wanted
to make all definitions in codegen use immutable maps of references --
which was, and still is, a good idea. Yet the population of the data
type and function definitions was somehow done in a separate step,
in a rather ad-hoc manner.
This commit changes that to ensure the project's data_types and
functions are populated while type checking the AST, so that we need
not redo it afterwards.
The code for registering the data type definitions and function
definitions was also duplicated in at least 3 places. It is now a
method of the TypedModule.
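For illustration, a minimal sketch of that refactoring (types and the
method name are simplified stand-ins for the real aiken-lang ones):
the module itself pours its data types and functions into the shared
definition tables, so the registration logic lives in one method
instead of three call sites.

```rust
use std::collections::HashMap;

struct TypedModule {
    name: String,
    data_types: Vec<String>,
    functions: Vec<String>,
}

impl TypedModule {
    /// Register this module's definitions into the project-wide
    /// tables, qualified by module name.
    fn register_definitions(
        &self,
        data_types: &mut HashMap<String, String>,
        functions: &mut HashMap<String, String>,
    ) {
        for dt in &self.data_types {
            data_types.insert(format!("{}.{}", self.name, dt), dt.clone());
        }
        for f in &self.functions {
            functions.insert(format!("{}.{}", self.name, f), f.clone());
        }
    }
}

fn main() {
    let module = TypedModule {
        name: "aiken/list".into(),
        data_types: vec![],
        functions: vec!["map".into()],
    };
    let (mut data_types, mut functions) = (HashMap::new(), HashMap::new());
    // Called once, right after type checking the module.
    module.register_definitions(&mut data_types, &mut functions);
    assert!(functions.contains_key("aiken/list.map"));
}
```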
Note: this change isn't just cosmetic; it's also necessary for the
commit that follows, which aims at adding tests to the set of
available function definitions, thus allowing property tests to be
made callable.
Those end-to-end tests are useful, both for controlling the behavior of the shrinker and for double-checking the reification of Plutus Data back into untyped expressions.
I had to work around a few things to get opaque types and private types to play nice. I also found a weird bug due to how we apply parameters after unique de Bruijn indices have already been applied. A work-around is to re-intern the program.