Aims:

- Describe the pipeline, and the components involved, in getting from aiken to uplc.

## Preface

Aiken is undergoing active development.
This post was started at Aiken ~v1.14.
With Aiken v1.15, there were already reasonably significant changes to the compilation pipeline.
The word is that no changes as big are expected in the near future, but
this article will undoubtedly begin to diverge from the current codebase even before publishing.

## Aiken build

Tracing `aiken build`, the pipeline is roughly something like:
```
. -> Project::read_source_files ->
Vec<Source> -> Project::parse_sources ->
ParsedModules -> Project::type_check ->
CheckedModules -> CodeGenerator::build ->
AirTree -> AirTree::to_vec ->
Vec<Air> -> CodeGenerator::uplc_code_gen ->
Program / Term<Name> -> serialize ->
.
```
We'll pick our way through these steps.

At a high level we are trying to do something straightforward: reformulate aiken code as uplc.
Some aiken expressions are relatively easy to handle: for example, an aiken `Int` goes to an `Int` in uplc.
Some aiken expressions require more involved handling: for example, an aiken `if ... else if ... else`
must have its branches "nested" in uplc.

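To make the "nesting" concrete, here is a minimal sketch on a toy expression type.
Nothing here is aiken's real AST: `Expr` and `nest_branches` are hypothetical names, and the only assumption carried over is that the target has a two-way conditional.

```rust
/// Toy expression type; NOT aiken's real AST. All names are hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Bool(bool),
    Int(i64),
    /// A two-way conditional: the only branching form in this toy target.
    IfThenElse(Box<Expr>, Box<Expr>, Box<Expr>),
}

/// Turn `if c1 { b1 } else if c2 { b2 } ... else { fallback }` into nested
/// two-way conditionals by folding from the last branch outwards.
pub fn nest_branches(branches: Vec<(Expr, Expr)>, fallback: Expr) -> Expr {
    branches
        .into_iter()
        .rev()
        .fold(fallback, |otherwise, (condition, then)| {
            Expr::IfThenElse(Box::new(condition), Box::new(then), Box::new(otherwise))
        })
}
```

Each `else if` ends up as the "otherwise" branch of the conditional before it.
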
### The Preamble

#### cli handling

The cli enters at `aiken/src/cmd/mod.rs`, which parses the command.
After establishing some context, the program enters `Project::build` (`crates/aiken-project/src/lib.rs`),
which in turn calls `Project::compile`.

#### File crawl

The program looks for aiken files in both the `./lib` and `./validators` subdirs.
For each it walks over all contents (recursively) looking for files with the `.ak` extension.
It treats these two sets of files a little differently.
Only validator files can contain the special validator functions.

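The crawl itself is nothing exotic.
A minimal sketch using only the standard library might look like the following; `find_aiken_files` is a hypothetical helper, not the function the project actually uses, and treating a missing directory as empty is an assumption of the sketch.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect every file under `dir` whose extension is `ak`.
/// A missing directory is treated as empty (an assumption of this sketch).
pub fn find_aiken_files(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    if !dir.is_dir() {
        return Ok(found);
    }
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            // Descend into subdirectories.
            found.extend(find_aiken_files(&path)?);
        } else if path.extension().map_or(false, |ext| ext == "ak") {
            found.push(path);
        }
    }
    // Sort so the result does not depend on directory iteration order.
    found.sort();
    Ok(found)
}
```

Calling it once on `./lib` and once on `./validators` gives the two sets of files that are then treated differently.
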
#### Parse and Type check

`Project::parse_sources` parses the module source code.
The heavy lifting is done by `aiken_lang::parser::module`, which is evaluated on each file.
It produces a `Module` containing a list of the parsed definitions in the file (functions, types, _etc_),
together with "metadata" like docstrings and the file path.

`Project::type_check` inspects the parsed modules and, as the name implies, checks the types.
It flags type-level warnings and errors.
It constructs a hash map of `CheckedModule`s.

#### Code generator

The code generator `CodeGenerator` (`aiken-lang/src/gen_uplc.rs`) is given
the definitions found from the previous step,
together with the plutus builtins.
It has additional fields for things like debugging.

This is handed over to a `Blueprint` (`aiken-project/src/blueprint/mod.rs`).
A blueprint does little more than find the validators on which to run the code gen.
The heavy lifting is done by `CodeGenerator::generate`.

We are now ready to take the source code and create plutus.

### Up in the air

Things become a bit intimidating at this point in terms of sheer lines of code:
`gen_uplc.rs` and the three modules in `gen_uplc/` total more than 8500 LoC.

Aiken has its own _intermediate representation_ called `air` (as in Aiken Intermediate Representation).
Intermediate representations are common in compilers.
`Air` is defined in `aiken-lang/src/gen_uplc/air.rs`.
Unsurprisingly, it looks a little bit like a language between aiken and plutus.

In fact, Aiken has another intermediate representation: `AirTree`.
This is constructed between the `TypedExpr` and `Vec<Air>`, _ie_ between parsed aiken and air.

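The step from a tree to a vector is easy to picture with a toy example.
The sketch below flattens a tiny tree in pre-order (node first, then children).
`Tree` and `Flat` are invented stand-ins for `AirTree` and `Air`, and the pre-order choice is an assumption, not something checked against `AirTree::to_vec`.

```rust
/// Toy tree; NOT aiken's `AirTree`.
#[derive(Debug, Clone, PartialEq)]
pub enum Tree {
    Int(i64),
    Add(Box<Tree>, Box<Tree>),
}

/// Toy flat instruction; NOT aiken's `Air`.
#[derive(Debug, Clone, PartialEq)]
pub enum Flat {
    Int(i64),
    Add,
}

impl Tree {
    /// Pre-order flattening: emit the node first, then its children.
    pub fn to_vec(&self) -> Vec<Flat> {
        let mut out = Vec::new();
        self.flatten_into(&mut out);
        out
    }

    fn flatten_into(&self, out: &mut Vec<Flat>) {
        match self {
            Tree::Int(n) => out.push(Flat::Int(*n)),
            Tree::Add(left, right) => {
                out.push(Flat::Add);
                left.flatten_into(out);
                right.flatten_into(out);
            }
        }
    }
}
```

Because every node has a fixed number of children, the flat vector still determines the tree.
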
#### AirTree

Within `CodeGenerator::generate`, `CodeGenerator::build` is called on the function body.
This constructs and returns an `AirTree`.
More on what an airtree is and its construction below.
At the same time `self` is treated as `mut`, so we need to keep an eye on this too.
The method which is called and uses this mutability of self is `self.assignment`.
It does so by
```sample
self.assignment >> self.expect_type_assign >> self.code_gen_functions.insert
```
and thus is creating a hashmap of all the functions that appear in the definition.
(`self.handle_each_clause` is also called with `mut`; it in turn calls `self.build`, for which `mut` is needed.
`self.clause_pattern` is called with `mut` but it isn't used.)

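The pattern of "return a tree, but also fill in a map on `self` along the way" can be sketched in a few lines.
Only the field name `code_gen_functions` is taken from the text above; `ToyGenerator`, `Expr`, and the key and value types of the map are invented for illustration.

```rust
use std::collections::HashMap;

/// Toy expression; NOT aiken's `TypedExpr`.
pub enum Expr {
    Int(i64),
    /// A call to a named function with some arguments.
    Call(String, Vec<Expr>),
}

/// Toy generator. The field name mirrors `code_gen_functions`,
/// but its key and value types are hypothetical.
#[derive(Default)]
pub struct ToyGenerator {
    pub code_gen_functions: HashMap<String, usize>,
}

impl ToyGenerator {
    /// Walks the expression and, as a side effect, records every called
    /// function (name -> number of arguments). Returns the node count.
    pub fn build(&mut self, expr: &Expr) -> usize {
        match expr {
            Expr::Int(_) => 1,
            Expr::Call(name, args) => {
                self.code_gen_functions.insert(name.clone(), args.len());
                // The recursion needs `&mut self` because nested calls insert too.
                1 + args.iter().map(|arg| self.build(arg)).sum::<usize>()
            }
        }
    }
}
```

This is why `mut` propagates: any method that may reach the insert, directly or through recursion, has to take `&mut self`.
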
###### Codegen assignment

~200 LoC

###### Codegen expect type assign

~400 LoC

###### ... Back to build

Validators in aiken are boolean functions while in uplc they are unit-valued (aka void-valued) functions.
Thus the airtree is wrapped such that `false` results in an error (`wrap_validator_condition`).
(Ed: I don't know why there is a prevailing thought that boolean functions are preferable to functions
that simply error if anything is wrong.)

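The wrapping can be sketched on a toy term language: a boolean body becomes "if body then unit else error".
`Term`, `wrap_condition` and `run` are hypothetical names for this sketch, which is not the code behind `wrap_validator_condition`.

```rust
/// Toy term type; NOT aiken's `AirTree` or uplc's `Term`.
#[derive(Debug, Clone, PartialEq)]
pub enum Term {
    Bool(bool),
    Unit,
    Error,
    IfThenElse(Box<Term>, Box<Term>, Box<Term>),
}

/// Wrap a boolean-valued body so that `true` becomes unit and `false` an error.
pub fn wrap_condition(body: Term) -> Term {
    Term::IfThenElse(Box::new(body), Box::new(Term::Unit), Box::new(Term::Error))
}

/// Tiny evaluator for the toy terms: `Ok(())` for unit, `Err` otherwise.
pub fn run(term: &Term) -> Result<(), String> {
    match term {
        Term::Unit => Ok(()),
        Term::Error => Err("validator failed".to_string()),
        Term::Bool(_) => Err("expected unit, found a boolean".to_string()),
        Term::IfThenElse(cond, then, otherwise) => match cond.as_ref() {
            Term::Bool(true) => run(then),
            Term::Bool(false) => run(otherwise),
            _ => Err("condition is not a boolean literal".to_string()),
        },
    }
}
```
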
`check_validator_args` again extends the airtree from the previous step,
and again calls `self.assignment`, mutating self.
Something interesting is happening here.
Script context is the final argument of a validator - for any script purpose.
`check_validator_args` treats the script context like it is an unused argument.
We'll circle back to how this works later on.

Next we encounter
```rust
AirTree::no_op().hoist_over(validator_args_tree);
```
It's not very apparent why we need to do this. Let's look ahead and consider this later.

The final airtree step(s) are in `self.hoist_functions_to_validator`.
TODO: What happens here?!

Note that `AirTree` and its methods aren't fully typesafe.
For example `hoist_over` will throw an error if called on an `Expression`.
As `AirTree` is for internal use only, the scope for potential problems is reasonably contained.

The AirTree has the following definition
```rust
pub enum AirTree {
    Statement {
        statement: AirStatement,
        hoisted_over: Option<Box<AirTree>>,
    },
    Expression(AirExpression),
    UnhoistedSequence(Vec<AirTree>),
}
```
We can see it has a tree-like structure, as the name suggests.

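To get a feel for `hoist_over`, here is a guess at its shape using the enum above with placeholder payloads.
The variants of `AirStatement` and `AirExpression` below are stand-ins, and the behaviour (a statement stores the next tree, an expression panics, a sequence is chained back to front) is an assumption drawn from the description in this post, not a copy of the real method.

```rust
/// Placeholder payloads; the real types have many more constructors.
#[derive(Debug, Clone, PartialEq)]
pub enum AirStatement {
    Let(String),
    NoOp,
}

#[derive(Debug, Clone, PartialEq)]
pub enum AirExpression {
    Int(i64),
}

#[derive(Debug, Clone, PartialEq)]
pub enum AirTree {
    Statement {
        statement: AirStatement,
        hoisted_over: Option<Box<AirTree>>,
    },
    Expression(AirExpression),
    UnhoistedSequence(Vec<AirTree>),
}

impl AirTree {
    /// Attach `next` as the tree this statement is hoisted over.
    /// Guessed behaviour: only statements (or sequences of them) can be hoisted.
    pub fn hoist_over(self, next: AirTree) -> AirTree {
        match self {
            AirTree::Statement { statement, hoisted_over: None } => AirTree::Statement {
                statement,
                hoisted_over: Some(Box::new(next)),
            },
            AirTree::Statement { .. } => panic!("statement is already hoisted over a tree"),
            AirTree::Expression(_) => panic!("cannot hoist an expression over a tree"),
            // Chain the sequence back to front: first over (second over (... over next)).
            AirTree::UnhoistedSequence(seq) => seq
                .into_iter()
                .rev()
                .fold(next, |rest, tree| tree.hoist_over(rest)),
        }
    }
}
```
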
`AirExpression` has multiple constructors. These include (non-exhaustive):

- air primitives (including all the ones that appear in plutus)
- constructors `Call` and `Fn` to handle functions
- binary and unary operators
- handling when and if
- error and tracing

`AirStatement` also has multiple constructors. [TODO]

## Down to uplc

## Air

Aiken compiles aiken code to uplc via _air_:
Aiken Intermediate Representation.

## Trace

Running `aiken build`...

The cli (see `aiken/src/cmd/mod.rs`) parses the command,
finds the context and calls `Project::build` (`crates/aiken-project/src/lib.rs`),
which in turn calls `Project::compile`.

#### `Project::compile`

1. Check dependencies are available, _eg_ aiken stdlib.
2. Read source files.
   1. Walk over `./lib` and `./validators` and push aiken modules onto `Project.sources`.
3. Parse each source in sources:
   1. Generate a `ParsedModule` containing the `ast`, `docs`, _etc_.
      The `ast` here is an `UntypedModule`, which contains untyped definitions.
4. Type check each parsed module.
   1. For each untyped module, create a `CheckedModule`.
      This includes typed definitions.
5. `compile` forks into two depending on whether it's been called with `build` or `check`.
6. From `CheckedModules` construct a `CodeGenerator`.
7. Pass the generator to construct a new `Blueprint`.
   1. The blueprint finds validators from checked modules.
   2. From each it constructs a `Validator` with the constructor `Validator::from_checked_module` (which returns a vector of validators).
      1. It's here that the magic happens: the method `generator.generate(def)` is called,
         where `def` is the typed validator(s).
         This method outputs a `Program<Name>` which contains the UPLC.
      2. These are collected together.
   3. The rest is collecting and handling the errors and warnings and writing the blueprint.

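The overall shape of these steps (each stage consumes the previous stage's output, then a fork on the command) can be sketched with toy types.
Every name below is hypothetical and the bodies are stubs.
In particular, what `check` does after the fork is not described here, so the sketch just counts modules.

```rust
/// Which cli command asked for compilation (the fork in step 5).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Mode {
    Build,
    Check,
}

// Toy stand-ins for the real stage types; each just wraps the module text.
pub struct Source(pub String);
pub struct ParsedModule(pub String);
pub struct CheckedModule(pub String);

#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// One generated "program" per module (a stub for the blueprint).
    Blueprint(Vec<String>),
    /// Stub for the `check` side of the fork: just the module count.
    Checked(usize),
}

fn parse(source: Source) -> Result<ParsedModule, String> {
    if source.0.trim().is_empty() {
        Err("empty module".into())
    } else {
        Ok(ParsedModule(source.0))
    }
}

fn type_check(parsed: ParsedModule) -> Result<CheckedModule, String> {
    // A real type checker would reject ill-typed modules here.
    Ok(CheckedModule(parsed.0))
}

/// Parse, type check, then fork on the mode. The first error stops the run.
pub fn compile(sources: Vec<Source>, mode: Mode) -> Result<Outcome, String> {
    let parsed = sources.into_iter().map(parse).collect::<Result<Vec<_>, _>>()?;
    let checked = parsed.into_iter().map(type_check).collect::<Result<Vec<_>, _>>()?;
    match mode {
        Mode::Check => Ok(Outcome::Checked(checked.len())),
        Mode::Build => Ok(Outcome::Blueprint(
            checked.into_iter().map(|m| format!("uplc for {}", m.0)).collect(),
        )),
    }
}
```

Giving each stage its own type means a later step cannot be handed the output of the wrong stage.
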
#### `CodeGenerator::generate`

1. Create a new `AirStack`.

#### `AirStack`

Consists of:

1. An Id
2. A `Scope`
3. A vector of `Air`

The Scope keeps track of ... [TODO]

#### Air

Air is a typed language... [TODO]