This is debatable, but I would argue that it's been sufficiently
annoying for people and such a low-hanging fruit that we ought to do
something about it.
The strategy here is simple: when we find a sequence of expressions
that ends with an assignment (let or expect), we simply desugar it
into two expressions: the assignment followed by either `Void` or a
boolean.
The latter is used when the assignment pattern is itself a boolean;
the patterned boolean then becomes the returned value. The former,
`Void`, is used for everything else. Said differently, any trailing
assignment implicitly _returns Void_, except for boolean patterns,
which return the actual patterned bool.
<table>
<thead><tr><th>expression</th><th>desugar into</th></tr></thead>
<tbody>
<tr>
<td>
```aiken
fn expect_bool(data: Data) -> Void {
expect _: Bool = data
}
```
</td>
<td>
```aiken
fn expect_bool(data: Data) -> Void {
expect _: Bool = data
Void
}
```
</td>
</tr>
<tr>
<td>
```aiken
fn weird_maths() -> Bool {
expect 1 == 2
}
```
</td>
<td>
```aiken
fn weird_maths() -> Bool {
expect True = 1 == 2
True
}
```
</td>
</tr>
</tbody>
</table>
Using 'pallas' as a dependency brings in utxo-rpc and other annoying dependencies such as _tokio_. This not only makes the overall build longer, but it also prevents it from even working when targeting wasm.
This commit introduces a new feature into
the parser, typechecker, and formatter.
The work for code gen will be in the next commit.
I was able to leverage some existing infrastructure
by making use of `AssignmentPattern`. A new field
`is` was introduced into `IfBranch`. This field holds
a generic `Option<Is>`, meaning a new generic parameter has to be
introduced into `IfBranch`. When used in `UntypedExpr`,
`IfBranch` must use `AssignmentPattern`. When used in
`TypedExpr`, `IfBranch` must use `TypedPattern`.
The parser was updated such that we can support this
kind of pseudo-grammar:
`if <expr:condition> [is [<pattern>: ]<annotation>]`
This can be read as: when parsing an `if` expression,
always expect an expression after the keyword `if`. Then,
optionally, there may be an `is` clause; within it, you
may optionally find a pattern followed by a colon, and you will
always find an annotation.
The first expression is still saved as the field
`condition` in `IfBranch`. If `pattern` is absent
AND `expr:condition` is an `UntypedExpr::Var`, we can set
the pattern to be a `Pattern::Var` with the same name. From
there, shadowing should allow this syntax sugar to feel
kinda magical within the `IfBranch` block that follows.
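To illustrate, here is a rough sketch of the surface syntax this enables (the function names are made up for illustration, not taken from the test suite): one form with an explicit pattern and annotation, and one relying on the variable-shadowing sugar.

```aiken
fn get_int(data: Data) -> Int {
  // Explicit pattern and annotation after `is`.
  if data is n: Int {
    n
  } else {
    0
  }
}

fn non_negative(data: Data) -> Bool {
  // No pattern given: the condition is a bare variable, so `data` is
  // shadowed as an `Int` within the branch's block.
  if data is Int {
    data >= 0
  } else {
    False
  }
}
```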
The typechecker doesn't need to be aware of the sugar
described above. The typechecker looks at `branch.is`
and if it's `Some(is)` then it'll use `infer_assignment`
for some help. Because of the way that `is` can inject
variables into the scope of the branch's block, and since
it's basically just like how `expect` works minus the error,
we get to re-use that helper method.
It's important to note that in the typechecker, if `is`
is `Some(_)` then we do not enforce that `condition` is
of type `Bool`. This is because the resulting bool will be
whether or not the `is` cast holds true given a PlutusData
payload.
When `is` is None, we do exactly what was being done
previously so that plain `if` expressions remain unaffected
with no semantic changes.
The formatter had to be made aware of the new syntax; the changes
there are simple and need no further explanation.
There's no reason for this to be a property of only ArgName::Named to begin with. And now, with the extra indirection introduced for arg_name, it may lead to subtle issues when pattern args are used in validators.
While we agree on the idea of having some way of emitting events, the
design hasn't been completely fleshed out and it is unclear whether
events should have a well-defined format independent of the framework
/ compiler, and what this format should be.
So we need more time discussing and agreeing on what use case we
are actually trying to solve with this.
Irrespective of that, some cleanup was also needed on the UPLC side
anyway since the PR introduced a lot of needless duplications.
Temporarily using the 'specialize-dict-key' branch from the stdlib
which makes use of Pair where relevant. Once this is merged back into
'main' we should update the acceptance test toml files to keep getting
them automatically upgraded.
This commit also fixes an oversight in the reification of data-types,
now properly distinguishing between pairs and 2-tuples.
Co-authored-by: Microproofs <kasey.white@cardanofoundation.org>
Before this commit, we would parse 'Pair' as a user-defined
data-type, and thus piggyback on that whole record system. While
perhaps handy for some things, it's also semantically wrong and
induces a lot more complexity in codegen, which then needs to
systematically distinguish every data-type access between pairs and
others.
So it's better to have it as a separate expression, and handle it
similarly to tuples (since it's fundamentally a 2-tuple with a special
serialization).
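For reference, a minimal sketch of what this looks like at the surface level (the function is made up for illustration), with `Pair` handled as its own expression rather than a record constructor:

```aiken
fn swap(pair: Pair<Int, ByteArray>) -> Pair<ByteArray, Int> {
  // Destructure the pair, then rebuild it the other way around.
  let Pair(left, right) = pair
  Pair(right, left)
}
```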
The main trick here was transforming Assignment
to contain a `Vec<(UntypedPattern, Option<Annotation>)>`
in a field called patterns. This then meant that I
could remove the `pattern` and `annotation` fields
from `Assignment`. The parser handles `=` and `<-`
just fine because, in the future, `=` with multiple
patterns will mean some kind of optimization on tuples.
But, since we don't have that optimization yet, when
someone uses multiple patterns with an `=`, an error is
returned from the type checker right where `infer_seq`
looks for `backpassing`. From there, the rest of the work
was in `Project::backpassing`, where I only needed to rework
some things to work with a list of patterns instead of just one.
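As a hypothetical sketch of what multi-pattern backpassing looks like (names invented, assuming the continuation receives one argument per pattern):

```aiken
// Hypothetical callback-style helper, used only for illustration.
fn with_bounds(_xs: List<Int>, and_then: fn(Int, Int) -> Bool) -> Bool {
  and_then(0, 100)
}

fn example(xs: List<Int>) -> Bool {
  // Two patterns bound at once with `<-`; the rest of the block becomes
  // a two-argument continuation passed to `with_bounds`.
  let lower, upper <- with_bounds(xs)
  lower <= 42 && 42 <= upper
}
```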
The third kind of assignment (Bind) is gone and is now reflected through a boolean parameter. Note that this parameter is completely erased by the type-checker, so the rest of the pipeline (i.e. code generation) doesn't have to make any assumptions: it simply can't see a backpassing let or expect.
This is more holistic and less awkward than having a monadic bind working only with some pre-defined types. Backpassing works with _any_ function, and can be implemented relatively easily by rewriting the AST on-the-fly.
Also, it is far easier to explain than trying to explain what a monadic bind is, how its behavior differs from type to type, and why it isn't generally available for any monadic type.
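Concretely, here is a minimal sketch of the rewrite (the helper names are made up): the remainder of the block becomes an anonymous function appended as the last argument of the call on the right-hand side of `<-`.

```aiken
// Hypothetical helper expecting a callback as its last argument.
fn twice(x: Int, and_then: fn(Int) -> Int) -> Int {
  and_then(x * 2)
}

fn example() -> Int {
  let y <- twice(3)
  y + 1
}

// ... which the compiler rewrites on the fly into:
fn example_desugared() -> Int {
  twice(3, fn(y) { y + 1 })
}
```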
This was a mess, to say the least. The mess started when we wanted
to make all definitions in codegen use immutable maps of references --
which was and still is a good idea. Yet, the population of the data
types and functions definitions was done somehow in a separate step,
in a rather ad-hoc manner.
This commit changes that to ensure the project's data_types and
functions are populated while type checking the AST, such that we
don't need to redo it afterwards.
The code for registering the data type definitions and function
definitions was also duplicated in at least 3 places. It is now a
method of the TypedModule.
Note: this change isn't just cosmetic, it's also necessary for
the commit that follows, which aims at adding tests to the set of
available function definitions, thus making property tests
callable.
Those end-to-end tests are useful, both for controlling the behavior of the shrinker and for double-checking the reification of Plutus Data back into untyped expressions.
I had to work around a few things to get opaque types and private types to play nice. I also found a weird bug due to how we apply parameters after unique de Bruijn indices have already been applied. A work-around is to re-intern the program.
- Add support to the formatter for these doc comments
- Add a new field `doc: Option<String>` to `Arg`
- Don't attach docs immediately after typechecking a module
- instead we should do it on demand in docs, build, and lsp
- the check command doesn't need to have any docs attached
- doing it more lazily defers the computation until later, making
  typechecking feedback a bit faster
- Add support for function arg and validator param docs in
  `attach_module_docs` methods (see the sketch below)
- Update some snapshots
- Add put_doc to Arg
Closes #685
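As an assumed sketch of the feature (the function and its arguments are made up), doc comments can now sit in front of individual arguments and are only attached on demand when generating docs:

```aiken
/// Compute the total fee for a transaction.
fn total_fee(
  /// The base fee, in lovelace.
  base: Int,
  /// A per-byte multiplier applied to the transaction size.
  per_byte: Int,
  /// The serialised size of the transaction, in bytes.
  size: Int,
) -> Int {
  base + per_byte * size
}
```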
The main goal is to make the parser more reusable so that it can be used for when-clauses, instead of the expression parser. A side goal has been to make it more readable by moving the construction of some untyped expressions into methods on UntypedExpr. Doing so, I got rid of the extra temporary 'ParseArg' type and re-used the generic 'CallArg' instead, simply using an Option<UntypedExpr> as value to get the same semantics as 'ParseArg' (which would distinguish between plain call args and holes). The chained parser is now in a more reusable state.
We do not actually ever parse negative values in there, as a negative value is a combination of a 'Negate' and a 'UInt' expression.
However, for patterns and constants, it'll be simpler to parse whole Int values as there's no ambiguity with arithmetic operations
there. To avoid the confusion of having some 'Int' constructors containing only non-negative values, and some covering the whole range,
I've renamed the constructor to 'UInt' to make this more obvious.
This was a bit more tricky than anticipated but played out nicely in
the end. Now we have one holistic way of parsing todos and errors
instead of it being duplicated between when/clause and sequence. The
error/todo parser has been moved up to the expression part rather than
being managed when parsing sequences. Not sure what motivated that to
begin with.
Fixes #621.
This is simply syntactic sugar which desugars into a function call with two arguments mapped to the specified binary operator.
It only works for '>' at this stage as a proof of concept; it will be extended to all binary operators in the next commit.
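A minimal sketch, with a made-up higher-order function, of what the sugar amounts to:

```aiken
// Hypothetical helper expecting a two-argument comparison function.
fn compare_with(left: Int, right: Int, op: fn(Int, Int) -> Bool) -> Bool {
  op(left, right)
}

fn example() -> Bool {
  // The bare operator is sugar for an anonymous function, so this is
  // equivalent to: compare_with(2, 1, fn(left, right) { left > right })
  compare_with(2, 1, >)
}
```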