Constants are like tiny programs, so they are bound by the same rules
as validators and other programs. In fact, functions are slightly more
flexible in that they allow generic expressions such as `List<a>`.
Yet, there is no way to construct such a generic structure with actual
inhabitants in a way that satisfies the type-checker. In the example
of `List<a>`, the only inhabitant of that type that we can construct
is the empty list; anything else would require holding onto some
generic value.
In addition, we can't force literal values into a generic annotation;
something like:
```
const foo: List<a> = [1, 2, 3]
```
wouldn't type-check either, since the right-hand side unifies to
`List<Int>`. And again, the only right-hand side that can type-check
is the empty list, which holds no inhabitant.
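Concretely, the only generic constant that even unifies is one whose
right-hand side is the empty list itself, along the lines of (a small
sketch, reusing the constant syntax above):
```
const empty: List<a> = []
```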
The added restriction on generic functions is necessary because, while
we allow constants to evaluate to lambdas, we cannot (easily) generate
UPLC that is generic in its arguments. By the time we generate UPLC,
the underlying types have to be known.
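As a sketch of what this rules out (the exact surface syntax for
function-typed constants is an assumption here): a constant holding a
monomorphic lambda poses no problem, whereas a generic one has no
concrete UPLC representation:
```
// fine: every type is concrete by the time we generate UPLC
const increment: fn(Int) -> Int = fn(n) { n + 1 }

// rejected: `a` is never instantiated, so no concrete UPLC exists for it
// const identity: fn(a) -> a = fn(x) { x }
```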
This is not a "proper" fix as it simply gets rid of the warning
altogether (whether or not you use the destructured values).
The reason for removing the warning entirely is that (1) it's simpler,
but more importantly (2) there's no impact on the final code produced
_anyway_. Redundant let-bindings are already removed by the compiler;
and while that implicit behaviour warrants a proper warning when the
binding comes from a user-defined assignment, here the redundant
assignment is introduced by the compiler to begin with, as another
implicit behaviour!
So we have one implicit behaviour triggering a warning on another
implicit behaviour. Truth is, there's no impact in having those
parameters destructured yet unused. And since users aren't even aware
that this results in an implicit let-binding being inserted in place
for them, there's no need for the warning at all.
This is only a start. It compiles, but with a few TODOs left open. In particular, it doesn't currently handle constants depending on other constants or functions; nor does it hoist constants.
Technically, we always need a fallback just because of the way the
UPLC is going to work. The last case in the handler pattern-matching
is always going to be an `else ...`.
We could optimize that away and, when the validator is exhaustive,
make the last handler the fallback. Yet, it's really a
micro-optimization that saves us one extra if/else. So, for the sake
of getting things working, we always assume that there's a fallback,
with the extra condition that when the validator is exhaustive (i.e.
there's a handler covering all purposes), the fallback HAS TO BE the
default fallback (i.e. `(_) => fail`).
This allows us to gracefully format it out, and also raise an error in
cases where there's an extraneous custom fallback.
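As an illustration (handler signatures and the fallback syntax here
are simplified assumptions), a non-exhaustive validator keeps an
explicit fallback, while an exhaustive one may only carry the default
fallback, which the formatter then elides:
```
validator my_script {
  spend(_datum, _redeemer, _own_ref, _self) {
    True
  }

  // covers every other purpose; for an exhaustive validator, only this
  // default form would be accepted, and it would be formatted out.
  else(_) {
    fail
  }
}
```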
This is a little trick which detects record accesses and replaces them
with a simple var. The var itself is the validator handler's name;
since it contains dots, it cannot be referred to by users explicitly.
Yet, fundamentally, it is semantically equivalent to just calling the
function by its name.
Note that this commit also removes the weird backdoor that allowed
importing validators in modules starting with `tests`. Allowing
validator handlers to be used in importable modules requires more work
and is only arguably useful; so we will wait until someone complains
and then reconsider the proper way to do it.
Without that, we may encounter weird error messages when writing
validators without an explicit `else`: since we automatically fill it
in with a `fail`, without an annotation it unifies to a generic
parameter.
The existing check that would look for the body being an error term is
ill-advised, as it stops working as soon as one adds tracing or makes
the validator a parameterized validator. Plus, it may simply trigger
the wrong behaviour, since one could annotate a validator with
_whatever_ and get past the type-checker by dropping a `fail` keyword
in as the body.
- We now consistently desugar an expect in the last position as
`Void`, regardless of the pattern. Desugaring to a boolean value is
deemed too confusing.
- This commit also removes the desugaring for let-bindings. It's only
ever allowed for _expect_, which then behaves like a side effect.
- We also now allow tests to return either `Bool` or `Void`. A test
that returns `Void` is treated the same as a test returning `True`.
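For instance, a test whose last expression is an `expect` now desugars
to `Void` and passes just like a test returning `True` (a minimal
sketch):
```
test trailing_expect_is_void() {
  // the trailing expect desugars to Void, regardless of the pattern;
  // a Void-returning test is treated the same as one returning True.
  expect Some(_) = Some(42)
}
```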
We simply provide a flag with a free-form value which acts as the name
of the module to look up in the 'env' folder. The strategy is to
replace the environment module name on-the-fly when a user tries to
import 'env'.
If the environment isn't found, an 'UnknownModule' error is raised
(which I will slightly adjust in a following commit to something more
specifically about environments).
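A minimal sketch of how this is meant to be used (the module contents
and the exact flag name, e.g. `--env`, are assumptions):
```
// env/preview.ak -- selected with something like `aiken build --env preview`
pub const network: ByteArray = "preview"

// lib/my_module.ak -- imports 'env', resolved to the chosen environment
use env

pub fn network_tag() -> ByteArray {
  env.network
}
```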
There are a few important consequences to this design which may not
seem immediately obvious:
1. We parse and type-check every env module, even those that aren't
used. This ensures that code doesn't break with a compilation error
simply because people forgot to type-check a given env.
Note that compilation could still fail because the env module
itself could provide an invalid API. So this only prevents each
module from being independently wrong when taken in isolation.
2. Technically, this also means that one can import env modules into
other env modules by their names. I don't know if it's a good or
bad idea at this point, but it doesn't really do any wrong;
dependencies and cycles are handled all the same.
- Trace-if-false operators are now completely discarded in compact mode.
- Only the label (i.e. the first trace argument) is preserved.
- When compiling with _compact_ tracing, the label MUST unify to a
string. This shouldn't generally be an issue, and it enforces that
traces follow the pattern
```
label: arg_0[, arg_1, ..., arg_n]
```
Note that, although it isn't obvious, these changes mean we now
support what the "emit" keyword was trying to achieve: we can compile
with user-defined traces only, and in compact mode keep only the event
labels in the final contract, while allowing larger payloads with
verbose tracing.
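A small sketch of the resulting shape (the CLI flag used to select
compact tracing is not shown and is an assumption):
```
fn validate(flag: Bool) -> Bool {
  // with compact tracing, only the label "validate" is kept in the
  // compiled contract; the payload after the colon only survives with
  // verbose tracing.
  trace @"validate": @"checking flag"
  flag
}
```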
There's no reason for this to be a property of only `ArgName::Named` to begin with. And now, with the extra indirection introduced for `arg_name`, it may lead to subtle issues when pattern args are used in validators.
The current inference system walks expressions from top to bottom,
starting from definitions higher in the source file and going down.
When a call is encountered, we use whatever information is known about
the callee definition at the moment the call is inferred.
This causes interesting issues in the case where the callee doesn't
have annotations and is only partially known. For example:
```
pub fn list(fuzzer: Option<a>) -> Option<List<a>> {
inner(fuzzer, [])
}
fn inner(fuzzer, xs) -> Option<List<b>> {
when fuzzer is {
None -> Some(xs)
Some(x) -> Some([x, ..xs])
}
}
```
In this small program, we infer `list` first and run into `inner`.
Yet, the arguments for `inner` are not annotated, so, since we haven't
inferred `inner` yet, we create two unbound variables for them.
And naturally, we link the type of `[]` to the type of `xs` -- which
is still unbound at this point. The return type of `inner` is given by
the annotation, so all-in-all, unification succeeds without ever
having to commit to a type for `[]`.
It is only later, when `inner` is inferred, that we generalise the
unbound type of `xs` into a generic -- the same as `b` in the
annotation. At this point, `[]` is also typed with that same generic,
which has a different id than `a` in `list` since it comes from
another type definition.
This is unfortunate and will cause issues down the line for code
generation. The problem doesn't occur when `inner`'s arguments are
properly annotated, or when `inner` happens to be inferred first.
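For instance, annotating `inner` ties `xs` (and therefore `[]`)
directly to the same generic as the return type, which sidesteps the
issue entirely:
```
fn inner(fuzzer: Option<b>, xs: List<b>) -> Option<List<b>> {
  when fuzzer is {
    None -> Some(xs)
    Some(x) -> Some([x, ..xs])
  }
}
```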
Hence, I saw two possible avenues for fixing this problem:
1. Detect the presence of 'uncongruous generics' in definitions after
they've all been inferred, and raise a user error asking for more
annotations.
2. Infer definitions in dependency order, with definitions used in
others inferred first.
This commit does (2) (although it may still be a good idea to do (1)
eventually) since it offers a much better user experience. One way to
do (2) is to construct a dependency graph between function calls and
perform a topological sort.
Building such a graph is, however, quite tricky as it requires walking
through the AST while maintaining scope etc., which is more-or-less
already what the inference step does; so it feels like double work.
Thus, instead, this commit does a depth-first inference and "pauses"
the inference of a definition when encountering a call, in order to
fully infer the callee first. To achieve this properly, we must ensure
that we do not infer the same definition twice, so we now "remember"
already-inferred definitions in the environment.
Discard patterns are _dangerous_ if used recklessly. The problem comes
up during maintenance, when adding new fields: we usually don't get
any compiler warnings, which may lead to missed spots and confusing
behaviours.
So I have, in some cases, inlined discards to explicitly list all
fields. That's a bit more cumbersome to write but will hopefully catch
a few things for us in the future.
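The hazard is the same in any language with record patterns; here it
is illustrated in Aiken syntax purely for the sake of example (the
change itself is to the Rust codebase, and the type below is made up):
```
pub type Datum {
  owner: ByteArray,
  deadline: Int,
}

fn owner_only(datum: Datum) -> ByteArray {
  // `let Datum { owner, .. } = datum` would silently ignore any field
  // added to Datum later; spelling out every field instead turns a new
  // field into a compile-time error right at the spot needing updates.
  let Datum { owner, deadline: _ } = datum
  owner
}
```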
This was a mess, to say the least. The mess started when we wanted to
make all definitions in codegen use immutable maps of references --
which was, and still is, a good idea. Yet, the population of the data
type and function definitions was somehow done in a separate step, in
a rather ad-hoc manner.
This commit changes that to ensure the project's data_types and
functions are populated while type-checking the AST, so that we need
not redo it afterwards.
The code for registering the data type definitions and function
definitions was also duplicated in at least 3 places. It is now a
method of the TypedModule.
Note: this change isn't just cosmetic; it's also necessary for the
commit that follows, which aims at adding tests to the set of
available function definitions, thus making property tests callable.
Those end-to-end tests are useful, both for controlling the behaviour of the shrinker and for double-checking the reification of Plutus Data back into untyped expressions.
I had to work around a few things to get opaque types and private types to play nice. I also found a weird bug due to how we apply parameters after unique de Bruijn indexes have already been applied. A work-around is to re-intern the program.
This is very, very rough at the moment. But it does a couple of things:
1. The 'ArgVia' now contains an Expr/TypedExpr which should unify to a Fuzzer. This is to avoid having to introduce custom logic to handle fuzzer referencing. So this now accepts function calls, field accesses, etc., so long as they unify to the right thing.
2. I've done quite a lot of cleanup in aiken-project, mostly around the tests and the naming surrounding them. What we used to call 'Script' is now called 'Test' and is an enum between UnitTest (ex-Script) and PropertyTest. I've moved some boilerplate and relevant functions under those modules' impl.
3. I've completed the end-to-end pipeline of:
- Compiling the property test
- Compiling the fuzzer
- Generating an initial seed
- Running property tests sequentially, threading the seed through each step.
An interesting finding is that I had to wrap the property test in a wrapper similar to the one we use for validators, to ensure we convert primitive types wrapped in Data back to UPLC terms. This is necessary because the fuzzer returns a ProtoPair (and soon an Array) which holds 'Data'.
At the moment, we do nothing with the size, though the size should ideally grow after each iteration (up to a certain cap).
In addition, there are a couple of todo/fixme comments that I left in the code as reminders of what's left to do beyond the obvious (error and success reporting, testing, etc.).
The parameter is special in that it takes no annotation but a 'via' keyword followed by an expression that should unify to a Fuzzer<a>, where Fuzzer<a> = fn(Seed) -> (Seed, a). The current commit only allows name identifiers for now. Ultimately, this may allow full expressions.
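A hypothetical example of the surface syntax, where `any_int` is a
made-up identifier assumed to be in scope with type `Fuzzer<Int>`:
```
test prop_addition_commutes(n via any_int) {
  n + 1 == 1 + n
}
```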
This commit allows Data to be optionally annotated with a phantom
type. This doesn't change anything in codegen, but we can now leverage
this information to generate better blueprint schemas.
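A brief sketch of what such an annotation could look like (the types
below are made up; the point is that `Data<Price>` behaves exactly
like `Data` at runtime while informing the blueprint schema):
```
pub type Price {
  numerator: Int,
  denominator: Int,
}

pub type OrderDatum {
  owner: ByteArray,
  // plain Data at runtime, but the blueprint can now document that this
  // field is expected to decode as a Price.
  price: Data<Price>,
}
```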