This leads to more consistent formatting across entire Aiken programs.
Before that commit, only long expressions would be formatted on a
newline, causing inconsistent formatting and an additional reading
barrier when looking at source code.
Programs now also take more vertical space, which makes for friendlier
diffing in version control systems (especially git).
It is now possible to leave a hole in a type annotation and have the compiler fill in the expected type for us.
This is a pretty useful debugging tool when playing with complex functions.
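As a minimal sketch, assuming the hole is written `_` (the function below is made up for illustration):
```
// Leaving a hole in the return annotation: the compiler reports the
// type it inferred where `_` stands.
fn double(n: Int) -> _ {
  n * 2
}
```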
The difference between 'FlexBreak' and 'Break(Mode::Strict/Flexible)' has always confused me; and it turned out that the 'FlexBreak' thingy is never used. This is dead code, so I removed it.
Rules are now as follows:
- If a pipeline contains a newline, then the entire pipeline is formatted over multiple lines.
- If it doesn't, then it's formatted as a single line UNLESS it cannot fit; in which case, we fall back to multiline again (see the example below).
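For illustration, a rough sketch of both cases (the functions `sort`, `take_top_three` and `sum` are hypothetical, and the exact indentation is whatever the formatter settles on):
```
// Short and written without a newline: kept on a single line.
let total = scores |> sort |> sum

// Contains a newline (or doesn't fit): every step on its own line.
let total =
  scores
    |> sort
    |> take_top_three
    |> sum
```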
This was a bit tricky and I ended up breaking things down a lot and
trying different paths. This commit is the result of the most
satisfying one.
It introduces a new 'concept' and two new types: Definitions and Reference.
These elements are meant to reflect JSON pointers and JSON-schema
definitions which we now use for pretty much all user-defined
data-types.
In fact, Schemas are no longer inlined, but are always referencing
some schema under "definitions".
This indirection is necessary in order to cope with recursive types.
And while it's only truly necessary for recursive types, using it
consistently makes it both easier to produce and easier to consume.
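As a concrete (hypothetical) example, a recursive user-defined type like the one below can only be described by pointing at its own entry under "definitions"; inlining its schema would expand forever:
```
// The Node constructor mentions Tree itself, so the blueprint schema
// for Tree has to be a reference rather than an inline definition.
type Tree {
  Leaf(Int)
  Node(Tree, Tree)
}
```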
---
The blueprint generation for recursive types here also works thanks to
the 'Definitions' data-structure wrapper around a BTreeMap. This uses
a strategy where:
(1) schemas are only generated if they haven't been seen before
(2) schemas are marked as seen BEFORE actually being generated (to
effectively stop a recursive generation).
This relies on one important aspect: the key must uniquely identify a
given schema. This means that we also have to monomorphize data-types
with generic parameters here, and use keys that are specialized to one
data-type.
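For instance, with a hypothetical generic type, each concrete instantiation gets its own definition entry:
```
// Box<Int> and Box<ByteArray> are monomorphized into two distinct
// schemas, each stored under its own specialized key in "definitions".
type Box<a> {
  Box(a)
}
```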
---
In this large overhaul we've also lost one thing which I didn't bother
re-introducing yet to keep the work manageable: titles for record
fields. Before, we used to pull those from the record constructor when
available, yet now, every record constructor has been replaced by a
`$ref`. We could theoretically attach a title to the reference. I'll
try to quickly add that in a later commit.
Having the data's schema be optional at the level of the 'Schema' did not allow representing cases where there would be opaque data at an arbitrary nesting. So I introduced a new variant 'Opaque' on 'Data' to fill that gap.
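A hedged sketch of the kind of situation this covers (the type below is made up for illustration):
```
// The `metadata` field is raw, opaque Plutus Data nested inside a
// record; its schema is what the new 'Opaque' variant represents.
type Wrapped {
  Wrapped { owner: ByteArray, metadata: Data }
}
```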
These functions relied on the same dependency and had the same scope, so insertion happened in encounter order rather than in an order determined by dependency handling. We now prioritize dependency order to prevent free uniques.
- Builtins IR now acts like Record IR in terms of argument consumption.
- UnConstrData returns a Pair(Data, Data) to conform with how pairs are treated behind the scenes.
This has been removed from the CIP-0057 specification since validators
are often re-used for multiple purposes (especially validators with
arity 2). It's misleading to assign a validator a purpose since the
purpose distinction actually happens _within_ the validator itself.
This has been bothering me and the more I thought of it the more I
disliked the idea of a warning. The rationale being that in this very
context, there's absolutely no ambiguity. So it is only frustrating
that the parser is even able to make the exact suggestion of what
should be fixed, but still fails.
I can imagine it is going to be very common for people to type:
```
trace "foo"
```
...yet terribly frustrating if they have to remember each time that
this should actually be a string. Because of the `trace`, `todo` and
`error` keywords, we know exactly the surrounding context and what to
expect here. So we can handle it nicely.
However, the formatter will re-format it to:
```
trace @"foo"
```
Just for the sake of remaining consistent with the type-system. This
way, we still only manipulate `String` in the AST, but we conveniently
parse a double-quoted UTF-8 literal when coupled with one of these
specific keywords.
I believe that's the best of both worlds.
This will probably save people minutes/hours of puzzled debugging. This is only a warning because there may be cases where one does actually want to specify a hex-encoded bytearray. In which case, they can get rid of the warning by using the plain bytearray syntax (i.e. as an array of bytes).
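A hedged sketch of that escape hatch, assuming the `#[..]` array-of-bytes syntax (the value is made up):
```
// Really meant as the literal UTF-8 text "abba"? Spell the bytes out
// to keep the intent explicit and silence the warning.
let tag = #[97, 98, 98, 97]
```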
The core observation is that **in the context of Aiken** (i.e. on-chain logic)
people do not generally want to use String. Instead, they want
bytearrays.
So, it should be easy to produce bytearrays when needed and it should
be the default. Before this commit, `"foo"` would parse as a `String`.
Now, it parses as a `ByteArray` whose bytes are the UTF-8 encoding
of "foo".
Now, to make this change really "fool-proof", we want to:
- [ ] Emit a parse error if we parse a UTF-8 bytearray literal in a
place where we would expect a `String`. For example, `trace`,
`error` and `todo` can only be followed by a `String`.
So when we see something like:
```
trace "foo"
```
we know it's a mistake and we can suggest users to use:
```
trace @"foo"
```
instead.
- [ ] Emit a warning if we ever see a UTF-8 bytearray literal which
is either 56 or 64 characters long and is a valid hexadecimal string.
For example:
```
let policy_id = "29d222ce763455e3d7a09a665ce554f00ac89d2e99a1a83d267170c6"
```
This is _most certainly_ a mistake, as this generates a ByteArray of
56 bytes (the UTF-8 encoding of the text itself), rather than the
28-byte value that the hex string denotes.
In this scenario, we want to warn the user and inform them they probably meant to use:
```
let policy_id = #"29d222ce763455e3d7a09a665ce554f00ac89d2e99a1a83d267170c6"
```
This is not supported by the code generation, so it's a bit of a lie
to have them in the language in the first place. There's arguably not
even any use for constant records, lists and tuples to begin with. So
this cleans this up everywhere for the sake of moving forward with the
alpha release.
This now reduces constants to:
- Integer
- ByteArray
- String
Anything else can be declared via a function anyway. We can revisit
this choice later.... or not.
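For reference, a sketch of what remains expressible as module constants (the `const` syntax and annotations are assumed here; names are made up):
```
// The three remaining constant kinds: integers, bytearrays and strings.
const max_supply: Int = 1000000
const owner_hash: ByteArray = #"0011223344"
const greeting: String = @"hello"
```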
Tracing is now turned OFF by default when:
- building project
- building documentation
- building dependencies
It can be turned ON only when building the project, using `--keep-traces`.
That means it's not possible to build dependencies with traces. The
`--rebuild` flag of the `address` command will also rebuild without traces.
Tracing is however turned ON by default when:
- checking the project (and running tests).
In this scenario, tracing can be disabled using `--no-traces` (if, for
example, one wants to analyze the execution units of specific functions
without having to manually remove traces from the code).
This caused me some trouble. In my first approach, I ended up having
multiple traces because nested values would be evaluated twice; once
as condition, and once as part of the continuation.
To prevent this, we can simply evaluate the condition once, and return
a plain True / False boolean as the outcome. So this effectively
transforms any expression:
```
expr
```
into
```
if expr { True } else { trace("...", False) }
```
Interestingly enough, chumsky seems to fail when given a 'choice' with
more than 25 elements. That's why this commit groups together some of
the choices as another nested 'choice'.
The goal is to handle this without bothering the code generation down the line. That is, we can handle it when transforming from the untyped AST to the typed one. That's why there's no 'TraceIfFalse' constructor in the typed AST. It has disappeared during type-check.
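For context, a minimal sketch of how this surfaces in user code, assuming the trace-if-false behavior is exposed as a postfix `?` operator (an assumption on my part; names are made up):
```
// If the comparison evaluates to False, a trace is emitted alongside
// the False result; if True, it behaves like the bare expression.
let is_valid = (quantity >= 1)?
```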
We want the lookup to yield a result when there's only a single
validator; and no title is provided. So that users can simply do
'aiken address' in their project if it's unambiguous. The validator's
name is only required to disambiguate between multiple validators.
I also noticed that the order of arguments in with_validator was
wrong. Somehow.
Todo is fundamentally just a trace and an error. The only reason we kept it as a separate element in the AST is for the formatter to work out whether it should format something back to a todo or something else.
However, this introduces redundancy in the code internally and makes the AIR more complicated than it needs to be. Both todo and errors can actually be represented as trace + errors, and we only need to record their preferred shape when parsing so that we can format them back to what's expected.
We now parse errors as a combination of a trace plus an error term. This is a baby step towards simplifying the code generation down the line and the internal representation of todo / errors.
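A hedged sketch of the equivalence being exploited, in surface syntax only (the exact default messages are up to the compiler):
```
// What one writes:
todo @"not implemented"

// What it amounts to internally: a trace followed by a failing term.
trace @"not implemented"
error
```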
This however enforces that the argument unifies to a `String`. So this
is more flexible than the previous form, but does fundamentally the
same thing.
Fixes #378.
Not sure what this special case was trying to achieve, but it's not right. There's no need to handle a function call with a single argument differently from the others.
List clause patterns now handle var cases.
Fixed an issue with tuple clauses where the last clause was not a tuple.
Redid how zero-arg functions and dependencies are handled. Tough one lol.
And also return a structured output as JSON, so it's more easily used
by other tools.
```
Parsing script context
Simulating 78ec148ea647cf9969446891af31939c5d57b275a2455706782c6183ef0b62f1
Redeemer Spend → 0
{"mem":151993,"cpu":58180696}
```
The current implementation assumed that ALL withdrawals present in a
transaction had to be locked by a script, and failed otherwise. But a
transaction can actually mix script and non-script withdrawals. So
instead of failing, we should rather just ignore withdrawals that
can't be referenced by redeemers.
There's arguably no use case ever for that in the context of on-chain
Plutus. Strings are really just meant to be used for tracing. They
aren't meant to be manipulated as heavily as in classic programming
languages.
Before that commit, the type-checker would allow unsafe list patterns
such as:
```
let [x] = xs
when xs is {
  [x] -> ...
  [x, ..] -> ...
}
```
This is quite unsafe and can lead to confusing situations. Now at
least the compiler warns about this. It isn't perfect though,
especially in the presence of clause guards. But that's a start.
Whoopsie... || and && were treated with the same precedence, causing very surprising behavior down the line.
I noticed this because of the auto-formatter adding parentheses where it really shouldn't. The problem actually came from the parser and how it constructed the AST.
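A small illustration of the intended behavior (identifiers are hypothetical):
```
// && binds tighter than ||, so this must parse as `a || (b && c)`;
// with equal precedence it would wrongly read as `(a || b) && c`.
let ok = a || b && c
```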
fix conversion from inner opaque type for when and assignment
This fixes Clause being used in cases where ListClause or TupleClause should be used
Reset defined and zero arg functions between each code gen
Fixes for optimizations when encountering shadowed variables
This is still a bit clunky as the interface is expecting parameters in UPLC form and we don't do any kind of verification. So it is easy to shoot oneself in the foot at the moment (for example, to apply an integer into something that should have received a data). To be improved later.
Without that, we have no way to distinguish between fully applied
validators and those that still require some hard-coded parameters.
The next steps are to make it easier to apply parameters to those, as
well as to forbid the creation of addresses for validators that aren't
fully applied.
* fix assert on pattern Var
* fix tuple index unwrapping, closes #334
* allow wrapping when casting with let
* allow wrapping when casting via function call
I decided to invert how I'm doing it. I'm passing
a new argument, `allow_cast: bool`, to unify in the
environment; at various unification sites I can
then control whether or not I want to allow
casting to occur at all. So it's false by default,
and we turn it on in a few places, rather than
opening the flood gates and locking it down at
various sites
as they come up.
* you cannot cast FROM Data with a `let`
* you cannot cast FROM Data by passing
Data to non-Data when calling a function
* you MUST use `assert` to cast from Data
* you can cast INTO Data with a `let`
* you can cast INTO Data by passing non-Data
to Data when calling a function
* you cannot assert-cast Data without an
annotation (see the sketch below)
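A hedged sketch of these rules from the user's side (the types and bindings are made up; `assert` is used as the casting construct per the rules above):
```
// INTO Data: allowed with a plain `let`.
let as_data: Data = my_datum

// FROM Data: requires `assert` and an explicit annotation.
assert datum: MyDatum = some_data

// Not allowed: casting FROM Data with a `let`.
// let datum: MyDatum = some_data
```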
Weirdly enough, we got the parsing wrong for byte literals in expressions (but did okay in constants), yet got the formatting wrong in constants (and did okay when formatting expressions). I've factored out the code in both cases to avoid the duplication that led to this in the first place, and added test coverage to make sure this doesn't happen again in the future.
This calculates a validator's address from validators found in a blueprint. It also provides a convenient way to attach a delegation part to the address if need be. The command is meant to provide a nice user experience and works 'out of the box' for projects that have only a single validator: just call 'aiken address' to get the validator's address.
Note that the command-line doesn't provide any option to configure the target network. It automatically assumes testnet, and will until we deem the project ready for mainnet. Those brave enough to run an Aiken program on mainnet will find a way anyway.
Here's a trick though: I got lazy (a bit) and did not write a full deserializer for Schema because this is busywork and not at all necessary at this stage. Instead, I've made the blueprint parameterized by a generic type <T>, which represents the type of the underlying blueprint's schema. When deserializing from JSON, we can default to 'Value' and get a deserializer for free. Since all we're interested in is the program and the metadata (purpose and title) of a validator, it works nicely.
Serialization however expects a Blueprint<Schema>, and most of the functions operate over a Blueprint<Schema> anyway.
This will be useful for re-using this behavior in other structures that contain a Program<DeBruijn>, without having to manually serialize or deserialize the entire structure.
In an ideal world, I should have handled that directly at the conflicting commit in the rebase, but this would have bubbled up through all commits... which I wasn't really quite keen on going through. So here's an extra ugly commit that comes and 'fixes the rebase'.
This is quite something, because now we have a testing pipeline that
can also be used for testing other compiler-related stuff such as the
type-checker or the code generator.
This also now introduces two levels of representable types (because it's needed at least for tuples):
Plutus Data (a.k.a. Data) and UPLC primitives / constants (a.k.a. Schema).
In practice, we don't want to specify blueprints that use direct UPLC primitives because there's little support for producing those in the ecosystem. So we should aim for producing only Data whenever we can. Yet we don't want to forbid it either in case people know what they're doing. Which means that we need to capture that difference well in the type modelling (in Rust and in the CIP-0057 specification).
I've also simplified the error type for now, just to provide some degree of feedback while working on this. I'll refine it later with proper errors.