This is less confusing than getting an 'UnknownModule' error reporting
an altogether different module name than the one actually being
imported ('env').
Also, this commit fixes a few errors found in the type-checker
when reporting 'UnknownModule' errors. About half the time, we would
actually attach _imported modules_ instead of _importable modules_
to the error, making the neighboring suggestion much worse (nay,
useless).
The original goal for this commit was to allow casting from Data in
patterns without an annotation. For example, given some custom type
'OrderDatum':
```
expect OrderDatum { requested_handle, destination, .. }: OrderDatum = datum
```
would work fine, whereas:
```
expect OrderDatum { requested_handle, destination, .. } = datum
```
would not. Yet, the annotation feels unnecessary at this point because
the type can be inferred from the pattern itself. So this commit
allows, whenever possible (i.e. when the pattern is neither a discard
nor a var), inferring the type from the pattern.
Along the way, I also found a couple of weird behaviours surrounding
this kind of assignment, in particular in combination with `let`. I'll
highlight those in the next PR (#979).
- Trace-if-false traces are now completely discarded in compact mode.
- Only the label (i.e. the first trace argument) is preserved.
- When compiling with _compact_ tracing, the label MUST unify to a
string. This shouldn't generally be an issue, and it enforces that
traces follow the pattern
```
label: arg_0[, arg_1, ..., arg_n]
```
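For illustration, here is a hedged sketch (the function and its names
are made up) of a trace in that shape; only the label would remain in
compact mode, while the payload shows up with verbose tracing:
```ak
fn must_start_after(deadline: Int, now: Int) -> Bool {
  // `@"claim"` is the label kept in compact mode; the second argument
  // is the payload, only rendered when compiling with verbose traces.
  trace @"claim": @"comparing deadline against the current time"
  now >= deadline
}
```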
Note that, while not obvious, these changes mean we now support what
the "emit" keyword was trying to achieve: we can compile with
user-defined traces only, and in compact mode keep only the event
labels in the final contract, while still allowing larger payloads
with verbose tracing.
This is not fully satisfactory as it pollutes the prelude a bit. Ideally, those functions should only be visible
and usable by the underlying trace code. But for now, we'll just go with it.
This commit introduces a new feature into
the parser, typechecker, and formatter.
The work for code gen will be in the next commit.
I was able to leverage some existing infrastructure
by making use of `AssignmentPattern`. A new field
`is` was introduced into `IfBranch`. This field holds
a generic `Option<Is>` meaning a new generic has to be
introduced into `IfBranch`. When used in `UntypedExpr`,
`IfBranch` must use `AssignmentPattern`. When used in
`TypedExpr`, `IfBranch` must use `TypedPattern`.
The parser was updated such that we can support this
kind of pseudo-grammar:
`if <expr:condition> [is [<pattern>: ]<annotation>]`
This can be read as: when parsing an `if` expression, always expect an
expression after the keyword `if`; then there may optionally be an
`is` clause, within which you may optionally find a pattern followed
by a colon, and always an annotation.
The first expression is still saved as the field
`condition` in `IfBranch`. If `pattern` is not there
AND `expr:condition` is an `UntypedExpr::Var`, we can set
the pattern to be a `Pattern::Var` with the same name. From
there, shadowing should allow this syntax sugar to feel
kinda magical within the `IfBranch` block that follows.
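To make this concrete, here is a hedged sketch of both forms, assuming
a single-constructor type `OrderDatum` (made up for the example):
```ak
pub type OrderDatum {
  requested_handle: ByteArray,
  destination: ByteArray,
}

fn handle(data: Data) -> ByteArray {
  // Explicit form: `order` is the pattern, `OrderDatum` the annotation.
  if data is order: OrderDatum {
    order.requested_handle
  } else {
    fail @"not an OrderDatum"
  }
}

fn handle_sugared(datum: Data) -> ByteArray {
  // Sugared form: no pattern and the condition is a var, so `datum` is
  // shadowed inside the block with the type OrderDatum.
  if datum is OrderDatum {
    datum.requested_handle
  } else {
    fail @"not an OrderDatum"
  }
}
```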
The typechecker doesn't need to be aware of the sugar
described above. The typechecker looks at `branch.is`
and, if it's `Some(is)`, it'll use `infer_assignment`
for some help. Because of the way that `is` can inject
variables into the scope of the branch's block, and since
it's basically just like how `expect` works minus the error,
we get to re-use that helper method.
It's important to note that in the typechecker, if `is`
is `Some(_)` then we do not enforce that `condition` is
of type `Bool`. This is because the resulting bool is whether or not
the `is` pattern holds true given a PlutusData payload.
When `is` is None, we do exactly what was being done
previously so that plain `if` expressions remain unaffected
with no semantic changes.
The formatter had to be made aware of the new syntax; the changes
there are simple and need no further explanation.
This is mainly a syntactic trick/sugar, but it's been pretty annoying
to me for a while that we can't simply pattern-match/destructure
single-variant constructors directly from the args list. A classic
example is when writing property tests:
```ak
test foo(params via both(bytearray(), int())) {
  let (bytes, ix) = params
  ...
}
```
This can now simply be replaced with:
```ak
test foo((bytes, ix) via both(bytearray(), int())) {
  ...
}
```
It feels natural, especially coming from the JavaScript, Haskell or
Rust worlds, and is mostly convenient. Behind the scenes, the compiler
does nothing more than rewriting the AST into the first form, with
pre-generated arg names. Then, we fully rely on the existing
type-checking capabilities, so it works seamlessly, as if we were just
pattern matching inline.
There's no reason for this to be a property of only ArgName::Named to begin with. And now, with the extra indirection introduced for arg_name, it may lead to subtle issues when pattern args are used in validators.
This is the best we can do for this without re-architecting
how backpassing is rewritten into plain ol' assignments.
In this case, if we see
a var and there is no annotation (thus probably not a cast),
then it's safe to rewrite to a `let` instead of an `expect`.
This way, we don't get a warning that is **unfixable**.
We are not trying to solve every little warning edge
case with this fix. We simply can't allow there
to be a warning that the user can't make go away through
some means. All other edge cases, like pattern matching on
a single-constructor type with expect warnings, can be fixed
via other means.
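As a hedged sketch of the case in question (`with_default` is a
made-up helper, defined only to keep the example self-contained):
```ak
fn with_default(default: Int, continue: fn(Int) -> Int) -> Int {
  continue(default)
}

fn example() -> Int {
  // The left-hand side of `<-` is a bare var with no annotation, so
  // the desugared assignment can safely be a `let` rather than an
  // `expect`, avoiding a warning the user would have no way to
  // silence.
  expect n <- with_default(42)
  n + 1
}
```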
This is crucial as some checks regarding variable usage depend on
warnings; so we may accidentally remove variables from the AST as a
consequence of backtracking for deep inference.
The current inference system walks expressions from "top to bottom",
starting from definitions higher in the source file and going down.
When a call is encountered, we use whatever information we have about
the callee definition at the moment the call is inferred.
This causes interesting issues in the case where the callee doesn't
have annotations and is only partially known. For example:
```
pub fn list(fuzzer: Option<a>) -> Option<List<a>> {
  inner(fuzzer, [])
}

fn inner(fuzzer, xs) -> Option<List<b>> {
  when fuzzer is {
    None -> Some(xs)
    Some(x) -> Some([x, ..xs])
  }
}
```
In this small program, we infer `list` first and run into `inner`.
Yet, the arguments for `inner` are not annotated, so since we haven't
inferred `inner` yet, we will create two unbound variables.
And naturally, we will link the type of `[]` to the type of `xs` --
which is still unbound at this point. The return type of `inner` is
given by the annotation, so all-in-all, the unification will work
without ever having to commit to a type for `[]`.
It is only later, when `inner` is inferred, that we will generalise
the unbound type of `xs` to a generic which is the same as `b` in the
annotation. At this point, `[]` is also typed with this same generic,
which has a different id than `a` in `list` since it comes from
another type definition.
This is unfortunate and will cause issues down the line for the code
generation. The problem doesn't occur when `inner`'s arguments are
properly annotated, or when `inner` is actually inferred first.
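For instance, a hedged sketch of the annotated variant that side-steps
the issue, since `xs` (and thus `[]`) is tied to `b` up front:
```ak
fn inner(fuzzer: Option<b>, xs: List<b>) -> Option<List<b>> {
  when fuzzer is {
    None -> Some(xs)
    Some(x) -> Some([x, ..xs])
  }
}
```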
Hence, I saw two possible avenues for fixing this problem:
1. Detect the presence of 'uncongruous generics' in definitions after
they've all been inferred, and raise a user error asking for more
annotations.
2. Infer definitions in dependency order, so that definitions used by
others are inferred first.
This commit does (2) (although it may still be a good idea to do (1)
eventually) since it offers a much better user experience. One way to
do (2) is to construct a dependency graph between function calls and
perform a topological sort.
Building such a graph is, however, quite tricky as it requires walking
through the AST while maintaining scope etc., which is more-or-less
what the inference step already does; so it feels like double
work.
Thus instead, this commit does a deep-first inference and "pauses" the
inference of a definition when encountering a call, in order to fully
infer the callee first. To achieve this properly, we must ensure that
we do not infer the same definition twice, so we now "remember"
already-inferred definitions in the environment.
Before this commit, we would parse 'Pair' as a user-defined
data-type, and thus piggyback on the whole record system. While
perhaps handy for some things, it's also semantically wrong and
induces a lot more complexity in codegen, which now needs to
systematically distinguish every data-type access between pairs and
others.
So it's better to have it as a separate expression, and handle it
similarly to tuples (since it's fundamentally a 2-tuple with a special
serialization).
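As a hedged sketch of how this reads at the language level (the
function itself is made up):
```ak
fn swap(p: Pair<Int, ByteArray>) -> Pair<ByteArray, Int> {
  // Constructed and destructured like a 2-tuple, but with its own
  // on-chain serialization.
  let Pair(left, right) = p
  Pair(right, left)
}
```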
Also slightly extended the check test 'framework' to allow registering side-dependencies and using them from another module. This allows checking the interplay between opaque types from within and outside of their host module.
Discard patterns are _dangerous_ if used recklessly. The problem comes
with maintenance, when adding new fields: we usually don't get any
compiler warning, which may lead to missed spots and confusing
behaviors.
So I have, in some cases, inlined discards to explicitly list all
fields. That's a bit more cumbersome to write but hopefully will catch
a few things for us in the future.
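A hedged illustration of the difference, using a made-up two-field
type:
```ak
pub type Account {
  owner: ByteArray,
  balance: Int,
}

fn describe(account: Account) -> (ByteArray, Int) {
  // Listing every field explicitly means that adding a third field to
  // Account becomes a compile-time error right here, instead of being
  // silently accepted by a `..` discard.
  let Account { owner, balance } = account
  (owner, balance)
}
```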
The main trick here was transforming Assignment
to contain `Vec<(UntypedPattern, Option<Annotation>)>`
in a field called patterns. This then meant that I
could remove the `pattern` and `annotation` fields
from `Assignment`. The parser handles `=` and `<-`
just fine because, in the future, `=` with multiple
patterns will mean some kind of optimization on tuples.
But, since we don't have that optimization yet, when
someone uses multiple patterns with an `=`, an error is
returned from the type checker right where `infer_seq`
looks for `backpassing`. From there, the rest of the work
was in `Project::backpassing`, where I only needed to rework
some things to work with a list of patterns instead of just one.
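A hedged sketch of the multi-pattern form (`with_pair` is a made-up
helper, defined only to keep the example self-contained):
```ak
fn with_pair(continue: fn(Int, Int) -> ret) -> ret {
  continue(1, 2)
}

fn example() -> Int {
  // Both `x` and `y` end up in the new `patterns` field, and the whole
  // thing desugars to roughly: with_pair(fn(x, y) { x + y })
  let x, y <- with_pair()
  x + y
}
```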
The third assignment kind (Bind) is gone and is now reflected through a boolean parameter. Note that this parameter is completely erased by the type-checker, so the rest of the pipeline (i.e. code generation) doesn't have to make any assumptions; it simply can't see a backpassing let or expect.