edits to tracing aiken
parent f3b88b8446
commit 38e68b5316
				| 
						 | 
					@ -0,0 +1,275 @@
 | 
				
			||||||
 | 
					Aims: 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					> Describe the pipeline and components getting from aiken to uplc. 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## The Preface
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					### Motivations
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The motivation for writing this came from a desire to add features to aiken that are not yet available.
 | 
				
			||||||
 | 
					One such feature would make an arbitrary aiken function callable, and so evaluable, from javascript. 
 | 
				
			||||||
 | 
					This would help a lot when testing that on-chain and off-chain code agree. 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Another, more of a pipe dream, is ad hoc function extraction: from a span of code, generate a function.
 | 
				
			||||||
 | 
					A digression to answer _why would this be at all helpful?!_
 | 
				
			||||||
 | 
					Validator logic often needs a broad context throughout.
 | 
				
			||||||
 | 
					How then to best factor code?
 | 
				
			||||||
 | 
					Possible solutions: 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					1. Introduce types / structs 
 | 
				
			||||||
 | 
					2. Have functions with lots of arguments
 | 
				
			||||||
 | 
					3. Don't
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The respective problems are:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					1. Requires relentless constructing and deconstructing across the function call.
 | 
				
			||||||
 | 
					And this adds cost in aiken. 
 | 
				
			||||||
 | 
					2. Becomes tedious aligning the definition and function call.  
 | 
				
			||||||
 | 
					3. End up with very long validators which are hard to unit test. 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					My current preferred way is to accept that validator functions are long.
 | 
				
			||||||
 | 
					Ad hoc function extraction would allow for sections of code to be tested without needing to be factored out.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					To do either of these, we need to get to grips with the aiken compilation pipeline.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					### This won't age well 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Aiken is undergoing active development. 
 | 
				
			||||||
 | 
					This post started life with Aiken ~v1.14. 
 | 
				
			||||||
 | 
					With Aiken v1.15, there were already reasonably significant changes to the compilation pipeline. 
 | 
				
			||||||
 | 
					Word is that no changes as big are expected in the near future, 
 | 
				
			||||||
 | 
					but this article will undoubtedly begin to diverge from the current codebase even before publishing.  
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					### Limitations of narrating code
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Narrating code becomes a compromise between being honest and accurate, and being readable and digestible. 
 | 
				
			||||||
 | 
					Following the command `aiken build` covers well in excess of 10,000 LoC.
 | 
				
			||||||
 | 
					The writing of this post ground slowly to a halt as it progressed deeper into the code 
 | 
				
			||||||
 | 
					with the details seeming to increase in importance. 
 | 
				
			||||||
 | 
					At some point I had to draw a line and resign myself to the fact that some parts will remain black boxes for now. 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Aiken build
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Tracing `aiken build`, the pipeline is roughly: 
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					  .               -> Project::read_source_files -> 
 | 
				
			||||||
 | 
					  Vec<Source>     -> Project::parse_sources ->
 | 
				
			||||||
 | 
					  ParsedModules   -> Project::type_check ->
 | 
				
			||||||
 | 
					  CheckedModules  -> CodeGenerator::build ->  
 | 
				
			||||||
 | 
					  AirTree         -> AirTree::to_vec -> 
 | 
				
			||||||
 | 
					  Vec<Air>        -> CodeGenerator::uplc_code_gen -> 
 | 
				
			||||||
 | 
					  Program / Term<Name> -> serialize -> 
 | 
				
			||||||
 | 
					  .
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					We'll pick our way through these steps.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					At a high level we are trying to do something straightforward: reformulate aiken code as uplc.
 | 
				
			||||||
 | 
					Some aiken expressions are relatively easy to handle: for example, an aiken `Int` goes to an `Int` in uplc. 
 | 
				
			||||||
 | 
					Some aiken expressions require more involved handling: for example, an aiken `if ... else if ... else` 
 | 
				
			||||||
 | 
					must have the branches "nested" in uplc.
 | 
				
			||||||
 | 
					Aiken also has lots of nice-to-haves like pattern matching, modules, and generics.
 | 
				
			||||||
 | 
					Uplc has none of these.
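To make the "nesting" concrete, here is a minimal sketch in Rust. The `Surface` and `Target` types and the `lower` function are invented for illustration and are not Aiken's actual data structures: the point is only that a flat chain of branches can be folded into two-way conditionals, each later branch sitting in the `else` arm of the one before.

```rust
/// Toy surface expression: an `if / else if / else` chain, as written in source.
#[derive(Debug, Clone, PartialEq)]
enum Surface {
    Int(i64),
    Bool(bool),
    /// A list of (condition, body) branches plus a final `else` body.
    IfChain {
        branches: Vec<(Surface, Surface)>,
        final_else: Box<Surface>,
    },
}

/// Toy target expression: only a two-way conditional exists.
#[derive(Debug, Clone, PartialEq)]
enum Target {
    Int(i64),
    Bool(bool),
    IfThenElse(Box<Target>, Box<Target>, Box<Target>),
}

/// Lower a surface expression, nesting each later branch inside the
/// `else` arm of the previous one.
fn lower(expr: &Surface) -> Target {
    match expr {
        Surface::Int(n) => Target::Int(*n),
        Surface::Bool(b) => Target::Bool(*b),
        Surface::IfChain { branches, final_else } => {
            // Fold from the right so the first branch ends up outermost.
            branches
                .iter()
                .rev()
                .fold(lower(final_else), |else_arm, (cond, body)| {
                    Target::IfThenElse(
                        Box::new(lower(cond)),
                        Box::new(lower(body)),
                        Box::new(else_arm),
                    )
                })
        }
    }
}
```

A chain with two branches and an `else` therefore lowers to one conditional whose `else` arm is itself a conditional.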
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					### The Preamble 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					#### Cli handling
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The cli enters at `aiken/src/cmd/mod.rs` which parses the command. 
 | 
				
			||||||
 | 
					With some establishing of context, the program enters `Project::build` (`crates/aiken-project/src/lib.rs`),
 | 
				
			||||||
 | 
					which in turn calls `Project::compile`. 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					#### File crawl
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The program looks for aiken files in both `./lib` and `./validators` subdirs. 
 | 
				
			||||||
 | 
					For each it walks over all contents (recursively) looking for files with the `.ak` extension. 
 | 
				
			||||||
 | 
					It treats these two sets of files a little differently. 
 | 
				
			||||||
 | 
					For example, only validator files can contain the special validator functions.
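The crawl itself is simple enough to sketch with the standard library alone. The function name `collect_ak_files` is made up for this post, and treating a missing directory as "no files" is my assumption rather than something checked against Aiken's behaviour.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect every file under `root` whose extension is `ak`.
fn collect_ak_files(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    if !root.is_dir() {
        // Assumption: a missing directory simply yields nothing.
        return Ok(found);
    }
    for entry in fs::read_dir(root)? {
        let path = entry?.path();
        if path.is_dir() {
            // Descend into subdirectories.
            found.extend(collect_ak_files(&path)?);
        } else if path.extension().and_then(|e| e.to_str()) == Some("ak") {
            found.push(path);
        }
    }
    // Sort for a deterministic order; purely for readability.
    found.sort();
    Ok(found)
}
```

Calling it once on the library directory and once on the validator directory gives the two sets of files that are then treated differently.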
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					#### Parse and Type check
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					`Project::parse_sources` parses the module source code.
 | 
				
			||||||
 | 
					The heavy lifting is done by `aiken_lang::parser::module`, which is evaluated on each file. 
 | 
				
			||||||
 | 
					It produces a `Module` containing a list of parsed definitions of the file: functions, types _etc_,
 | 
				
			||||||
 | 
					together with metadata like docstrings and the file path. 
 | 
				
			||||||
 | 
					 
 | 
				
			||||||
 | 
					`Project::type_check` inspects the parsed modules and, as the name implies, checks the types. 
 | 
				
			||||||
 | 
					It flags type level warnings and errors and constructs a hash map of `CheckedModule`s.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					#### Code generator
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The code generator `CodeGenerator` (`aiken-lang/src/gen_uplc.rs`) is given 
 | 
				
			||||||
 | 
					the definitions found from the previous step, 
 | 
				
			||||||
 | 
					together with the plutus builtins. 
 | 
				
			||||||
 | 
					It has additional fields for things like debugging. 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					This is handed over to a `Blueprint` (`aiken-project/src/blueprint/mod.rs`).
 | 
				
			||||||
 | 
					The blueprint does little more than find the validators on which to run the code gen. 
 | 
				
			||||||
 | 
					The heavy lifting is done by `CodeGenerator::generate`.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					We are now ready to take the source code and create plutus. 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					### In the air
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Things become a bit intimidating at this point in terms of sheer lines of code:
 | 
				
			||||||
 | 
					`gen_uplc.rs` and the three modules in `gen_uplc/` total more than 8,500 LoC.  
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Aiken has its own _intermediate representation_ called `air` (as in Aiken Intermediate Representation). 
 | 
				
			||||||
 | 
					Intermediate representations are common in compilers.
 | 
				
			||||||
 | 
					`Air` is defined in `aiken-lang/src/gen_uplc/air.rs`. 
 | 
				
			||||||
 | 
					Unsurprisingly, it looks a little bit like a language between aiken and plutus. 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					In fact, Aiken has another intermediate representation: `AirTree`. 
 | 
				
			||||||
 | 
					This is constructed between the `TypedExpr` and `Vec<Air>`, i.e. between type-checked aiken and air. 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					#### Climbing the AirTree 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Within `CodeGenerator::generate`, `CodeGenerator::build` is called on the function body. 
 | 
				
			||||||
 | 
					This takes a `TypedExpr` and constructs and returns an `AirTree`.
 | 
				
			||||||
 | 
					The construction is recursive as it traverses the recursive `TypedExpr` data structure.
 | 
				
			||||||
 | 
					More on what an airtree is and its construction below.
 | 
				
			||||||
 | 
					At the same time `self` is treated as `mut`, so we need to keep an eye on this too.
 | 
				
			||||||
 | 
					The method which is called and uses this mutability of self is `self.assignment`. 
 | 
				
			||||||
 | 
					It does so by
 | 
				
			||||||
 | 
					```sample 
 | 
				
			||||||
 | 
					  self.assignment >> self.expect_type_assign >> self.code_gen_functions.insert
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					and thus is creating a hashmap of all the functions that appear in the definition.
 | 
				
			||||||
 | 
					From call to return, `self.assignment` covers more than 600 LoC, so we'll otherwise leave it as a black box.
 | 
				
			||||||
 | 
					(`self.handle_each_clause` is also called with `mut`, and it in turn calls `self.build`, for which `mut` is needed.) 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Validators in aiken are boolean functions while in uplc they are unit-valued (aka void-valued) functions.
 | 
				
			||||||
 | 
					Thus the airtree is wrapped such that `false` results in an error (`wrap_validator_condition`). 
 | 
				
			||||||
 | 
					I don't know why there is a prevailing thought that boolean functions are preferable to functions 
 | 
				
			||||||
 | 
					that error if anything is wrong - which is what validators are.
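The idea behind the wrapping can be sketched in a few lines of Rust. Nothing here is Aiken's code: `ValidationFailed` and `wrap_validator` are hypothetical names, and the real `wrap_validator_condition` works on an airtree rather than on a closure.

```rust
/// Hypothetical error type standing in for a failed script evaluation.
#[derive(Debug, PartialEq)]
struct ValidationFailed;

/// Wrap a boolean validator so that `false` becomes an error and `true`
/// becomes unit.
fn wrap_validator<C, F>(validator: F) -> impl Fn(&C) -> Result<(), ValidationFailed>
where
    F: Fn(&C) -> bool,
{
    move |ctx: &C| {
        if validator(ctx) {
            Ok(())
        } else {
            Err(ValidationFailed)
        }
    }
}
```

So a predicate such as `|n: &i64| *n > 0` becomes a function that returns `Ok(())` for positive inputs and an error otherwise.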
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					`check_validator_args` again extends the airtree from the previous step, 
 | 
				
			||||||
 | 
					and again calls `self.assignment` mutating self.
 | 
				
			||||||
 | 
					Something interesting is happening here. 
 | 
				
			||||||
 | 
					Script context is the final argument of a validator - for any script purpose.
 | 
				
			||||||
 | 
					`check_validator_args` treats the script context like it is an unused argument. 
 | 
				
			||||||
 | 
					The importance of this is not immediately clear, and I have yet to appreciate why this happens.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Let's take a look at what AirTree actually is:
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					pub enum AirTree {
 | 
				
			||||||
 | 
					    Statement {
 | 
				
			||||||
 | 
					        statement: AirStatement,
 | 
				
			||||||
 | 
					        hoisted_over: Option<Box<AirTree>>,
 | 
				
			||||||
 | 
					    },
 | 
				
			||||||
 | 
					    Expression(AirExpression),
 | 
				
			||||||
 | 
					    UnhoistedSequence(Vec<AirTree>),
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					Note that `AirStatement` and `AirExpression` are mutually recursive definitions with `AirTree`. 
 | 
				
			||||||
 | 
					Otherwise, it would be unclear from first inspection how tree-like this really is. 
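A stripped-down sketch shows where the tree shape comes from. The `Tree`, `Stmt` and `Expr` types below are simplified stand-ins of my own, with one or two illustrative variants each; the real `AirStatement` and `AirExpression` have many more.

```rust
/// Simplified stand-ins for the real types; the variants are illustrative only.
enum Tree {
    Statement { statement: Stmt, hoisted_over: Option<Box<Tree>> },
    Expression(Expr),
    UnhoistedSequence(Vec<Tree>),
}

enum Stmt {
    /// `let name = value`, where the value is itself a tree.
    Let { name: String, value: Box<Tree> },
}

enum Expr {
    Int(i64),
    /// A call whose function and arguments are themselves trees.
    Call { func: Box<Tree>, args: Vec<Tree> },
}

/// Count every `Tree` node reachable from `tree`, following the recursion
/// through statements and expressions.
fn count_nodes(tree: &Tree) -> usize {
    1 + match tree {
        Tree::Statement { statement, hoisted_over } => {
            let Stmt::Let { value, .. } = statement;
            count_nodes(value) + hoisted_over.as_deref().map_or(0, count_nodes)
        }
        Tree::Expression(Expr::Int(_)) => 0,
        Tree::Expression(Expr::Call { func, args }) => {
            count_nodes(func) + args.iter().map(count_nodes).sum::<usize>()
        }
        Tree::UnhoistedSequence(items) => items.iter().map(count_nodes).sum::<usize>(),
    }
}
```

The children live inside the statement and expression payloads, which is why the top-level enum alone does not look very tree-like.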
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					`AirExpression` has multiple constructors. These include (non-exhaustive)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- air primitives (including all the ones that appear in plutus)
 | 
				
			||||||
 | 
					- constructors `Call` and `Fn` to handle anonymous functions
 | 
				
			||||||
 | 
					- binary and unary operators
 | 
				
			||||||
 | 
					- handling when and if
 | 
				
			||||||
 | 
					- handling error and tracing
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					`AirStatement` also has multiple constructors. These include 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- let assignments and named function definitions
 | 
				
			||||||
 | 
					- handling expect assignments 
 | 
				
			||||||
 | 
					- pattern matching 
 | 
				
			||||||
 | 
					- unwrapping datastructures
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Note that `AirTree` has many methods that are partial functions, 
 | 
				
			||||||
 | 
					as in there are possible states that are not considered legitimate 
 | 
				
			||||||
 | 
					at different points of its construction and use.
 | 
				
			||||||
 | 
					For example `hoist_over` will throw an error if called on an `Expression`.
 | 
				
			||||||
 | 
					As `AirTree` is for internal use only, the scope for potential problems is reasonably contained.
 | 
				
			||||||
 | 
					It seems likely this is to avoid similar-yet-different IRs between steps.
 | 
				
			||||||
 | 
					However, the trade-off is that it partially obfuscates which states are valid where. 
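For contrast, here is a toy version in which the partiality is visible in the signature. This is not how Aiken does it: the `Tree` type is a cut-down stand-in, and returning a `Result` with a hypothetical `HoistError` is my own choice to show which states are being ruled out.

```rust
#[derive(Debug, PartialEq)]
enum Tree {
    Statement { name: String, hoisted_over: Option<Box<Tree>> },
    Expression(i64),
}

#[derive(Debug, PartialEq)]
enum HoistError {
    /// An expression has no slot to hang a continuation on.
    NotAStatement,
    /// The statement already has something hoisted over it.
    AlreadyHoisted,
}

impl Tree {
    /// Attach `next` as the continuation of a statement.
    /// The invalid states are reported as `Err` rather than by failing at runtime.
    fn hoist_over(self, next: Tree) -> Result<Tree, HoistError> {
        match self {
            Tree::Statement { name, hoisted_over: None } => Ok(Tree::Statement {
                name,
                hoisted_over: Some(Box::new(next)),
            }),
            Tree::Statement { .. } => Err(HoistError::AlreadyHoisted),
            Tree::Expression(_) => Err(HoistError::NotAStatement),
        }
    }
}
```

The cost of this style is that every caller has to handle the error cases, which may be part of why an internal-only IR skips it.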
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					What is hoisting? Hoisting gives the airtree depth. 
 | 
				
			||||||
 | 
					The motivation is that by the time we hit uplc it is "generally better"
 | 
				
			||||||
 | 
					that 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- function definitions appear once rather than being inlined multiple times
 | 
				
			||||||
 | 
					- the definition appears as close to use as possible 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Hoisting creates tree paths. 
 | 
				
			||||||
 | 
					The final airtree-to-airtree step is `self.hoist_functions_to_validator`, which traverses these paths.
 | 
				
			||||||
 | 
					There is a lot of mutating of self, making it quite hard to keep a handle on things. 
 | 
				
			||||||
 | 
					In all these (several thousand?) LoC, it is essentially ascertaining in which node of the tree
 | 
				
			||||||
 | 
					to insert each function definition. 
 | 
				
			||||||
 | 
					In a resource constrained environment like plutus, this effort is warranted.
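One way to picture the placement problem, under heavy simplification: if each use of a function is recorded as a path from the root, then the deepest node sitting above every use is the longest common prefix of those paths. The sketch below assumes paths are plain lists of child indices and uses a made-up `common_ancestor` helper; Aiken's own tree path type and its placement logic are considerably more involved.

```rust
/// A path from the root of the tree to a node: the child index taken at each level.
type TreePath = Vec<usize>;

/// The longest prefix shared by every path, i.e. the deepest node that
/// sits above all the given use sites. Returns `None` if there are no uses.
fn common_ancestor(use_sites: &[TreePath]) -> Option<TreePath> {
    let (first, rest) = use_sites.split_first()?;
    let mut prefix_len = first.len();
    for path in rest {
        // Number of leading steps this path shares with the first one.
        let shared = first.iter().zip(path).take_while(|(a, b)| a == b).count();
        prefix_len = prefix_len.min(shared);
    }
    Some(first[..prefix_len].to_vec())
}
```

Defining the function at that node satisfies both goals listed above: it appears once, and no higher in the tree than its uses require.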
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					At the same time this function deals with 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- monomorphisation - no more generics
 | 
				
			||||||
 | 
					- erasing opaque types
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Neither generics nor opaque types exist at the uplc level. 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					#### Into Air
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The step `to_vec : AirTree -> Vec<Air>` is much easier to digest. 
 | 
				
			||||||
 | 
					For one, it is not evaluated in the context of the CodeGenerator,
 | 
				
			||||||
 | 
					and two, there is no mutation of the airtree. 
 | 
				
			||||||
 | 
					The function recursively takes nodes of the tree and maps them to entries in a mutable vector.
 | 
				
			||||||
 | 
					It flattens the tree to a vec.
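The shape of such a flattening, on a toy tree, looks like the sketch below. The `Tree` and `Flat` types are invented for the example, and I am assuming a pre-order traversal (parent first, then children); the real `to_vec` handles many more node kinds.

```rust
/// Toy tree: a node label plus children.
enum Tree {
    Leaf(&'static str),
    Node(&'static str, Vec<Tree>),
}

/// Flat instruction: the label, plus how many children follow it.
#[derive(Debug, PartialEq)]
struct Flat {
    label: &'static str,
    arity: usize,
}

/// Pre-order traversal pushing onto a mutable vector; the tree itself is only read.
fn to_vec(tree: &Tree) -> Vec<Flat> {
    fn go(tree: &Tree, out: &mut Vec<Flat>) {
        match tree {
            Tree::Leaf(label) => out.push(Flat { label: *label, arity: 0 }),
            Tree::Node(label, children) => {
                out.push(Flat { label: *label, arity: children.len() });
                for child in children {
                    go(child, out);
                }
            }
        }
    }
    let mut out = Vec::new();
    go(tree, &mut out);
    out
}
```

Recording the arity alongside each label is what lets a later step rebuild the nesting from the flat vector.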
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					### Down to uplc 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Next we go from `Vec<Air> -> Term<Name>`.
 | 
				
			||||||
 | 
					This step is a little more involved than the previous. 
 | 
				
			||||||
 | 
					For one, this is executed in the context of the code generator. 
 | 
				
			||||||
 | 
					Moreover, the code generator is treated as mutable - ouch.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					On further inspection we see that the only mutation is setting `self.needs_field_access = true`.
 | 
				
			||||||
 | 
					This flag informs the compiler that, if true, additional terms must be added in one of the final steps
 | 
				
			||||||
 | 
					(see `CodeGenerator::finalize`).
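The pattern is "raise a flag while generating, act on it once at the end". A toy rendition, with an invented `Generator` type, an invented `field:` instruction format and strings standing in for terms, might look like this:

```rust
/// Toy generator: a flag is raised during code generation and read once at the end.
#[derive(Default)]
struct Generator {
    needs_field_access: bool,
}

impl Generator {
    /// Emit a term for one instruction; only a field access mutates `self`.
    fn emit(&mut self, instruction: &str) -> String {
        match instruction.strip_prefix("field:") {
            Some(name) => {
                self.needs_field_access = true;
                format!("(field_access {name})")
            }
            None => instruction.to_string(),
        }
    }

    /// Wrap the program with the shared helper only if some step asked for it.
    fn finalize(&self, body: String) -> String {
        if self.needs_field_access {
            format!("(let field_access = <helper> in {body})")
        } else {
            body
        }
    }
}
```

Programs that never touch a field pay nothing for the helper, which is presumably the point of tracking the flag.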
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					As noted above, some of the mappings from air to terms are immediate like `Air::Bool -> Term::bool`.  
 | 
				
			||||||
 | 
					Others are less so.
 | 
				
			||||||
 | 
					Some examples:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- `Air::Var` requires 100 LoC of case handling for the different constructors. 
 | 
				
			||||||
 | 
					- Lists in air have no immediate analogue in uplc
 | 
				
			||||||
 | 
					- builtins, as in built-in functions (standard shorthand), have to be mediated 
 | 
				
			||||||
 | 
					with some combination of `force` and `delay` in order to behave as they should.
 | 
				
			||||||
 | 
					- user functions must be "curried", i.e. treated as a sequence of single-argument functions, 
 | 
				
			||||||
 | 
					and recursion must be handled
 | 
				
			||||||
 | 
					- record updates need some magic to be handled efficiently.
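Of the items above, the treatment of multi-argument functions is the easiest to show in isolation. The `Term` type below is a toy with only variables, single-parameter lambdas and single-argument application; `curry_lambda` and `curry_call` are hypothetical helpers, not functions from the Aiken codebase, and recursion is not handled.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Term {
    Var(String),
    /// Single-parameter lambda, the only kind the toy target supports.
    Lambda(String, Box<Term>),
    /// Application of a function to exactly one argument.
    Apply(Box<Term>, Box<Term>),
}

/// Turn `fn(a, b, c) { body }` into `\a -> \b -> \c -> body`.
fn curry_lambda(params: &[&str], body: Term) -> Term {
    params
        .iter()
        .rev()
        .fold(body, |inner, p| Term::Lambda(p.to_string(), Box::new(inner)))
}

/// Turn `f(x, y, z)` into `((f x) y) z`.
fn curry_call(func: Term, args: Vec<Term>) -> Term {
    args.into_iter()
        .fold(func, |f, arg| Term::Apply(Box::new(f), Box::new(arg)))
}
```

Definitions nest to the right and calls nest to the left, so the first argument applied meets the outermost lambda.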
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					#### Cranking the Optimizer
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					There is a sequence of operations performed on the uplc, each mapping `Term<Name> -> Term<Name>`.
 | 
				
			||||||
 | 
					These remove inconsequential parts of the logic that appear in the generated term.
 | 
				
			||||||
 | 
					These include: 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- removing application of the identity function
 | 
				
			||||||
 | 
					- directly substituting where a lambda is applied to a constant or builtin
 | 
				
			||||||
 | 
					- inlining or simplifying where a lambda is applied and its parameter appears once or not at all
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Each of these optimizing methods has its own relatively narrow focus, 
 | 
				
			||||||
 | 
					and so although there is a fair number of LoC, it's reasonably straightforward to follow.
 | 
				
			||||||
 | 
					Some are applied multiple times. 
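To give a flavour of how narrow each pass is, here is a sketch of the first one on a toy `Term` type. The type and the function names are mine, not those of the `uplc` crate, and the real pass works on a richer term language.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Term {
    Var(String),
    Int(i64),
    Lambda(String, Box<Term>),
    Apply(Box<Term>, Box<Term>),
}

/// True for `\x -> x`.
fn is_identity(term: &Term) -> bool {
    match term {
        Term::Lambda(param, body) => matches!(&**body, Term::Var(v) if v == param),
        _ => false,
    }
}

/// Rewrite `(\x -> x) t` to `t`, bottom-up, everywhere in the term.
fn remove_identity_applications(term: Term) -> Term {
    match term {
        Term::Apply(func, arg) => {
            let func = remove_identity_applications(*func);
            let arg = remove_identity_applications(*arg);
            if is_identity(&func) {
                arg
            } else {
                Term::Apply(Box::new(func), Box::new(arg))
            }
        }
        Term::Lambda(param, body) => {
            Term::Lambda(param, Box::new(remove_identity_applications(*body)))
        }
        // Variables and constants have nothing to rewrite.
        leaf => leaf,
    }
}
```

Because one rewrite can expose another, running a pass like this more than once is a cheap way to catch the leftovers.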
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					### The End 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The generated program can now be serialized and included in the blueprint.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					### Plutus Core Signposting
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					All this fuss is to get us to a point where we can write uplc - and good uplc at that. 
 | 
				
			||||||
 | 
					Note that there are many ways to generate code and most of them are bad.  
 | 
				
			||||||
 | 
					The various design decisions and compilation steps make more sense 
 | 
				
			||||||
 | 
					when we have a better understanding of the target language. 
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Uplc is a lambda calculus. 
 | 
				
			||||||
 | 
					For a comprehensive definition of uplc, check out the specification found 
 | 
				
			||||||
 | 
					[here](https://github.com/input-output-hk/plutus/#specifications-and-design) from the plutus github repo. 
 | 
				
			||||||
 | 
					(I imagine this link will be maintained for longer than a direct link to the current specification.)
 | 
				
			||||||
 | 
					If you're not at all familiar with lambda calculus I recommend 
 | 
				
			||||||
 | 
					[an unpacking](https://crypto.stanford.edu/~blynn/lambda/) by Ben Lynn.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					### What next?
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					I think it would be helpful to have some examples... Watch this space.
 | 
				
			||||||
| 
						 | 
					@ -1,228 +0,0 @@
 | 
				
			||||||
Aims: 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
- Describe the pipeline, and components getting from aiken to uplc. 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
## Preface
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Aiken is undergoing active development. 
 | 
					 | 
				
			||||||
This post was started Aiken ~v1.14. 
 | 
					 | 
				
			||||||
With Aiken v1.15, there were already reasonably significant changes to the compilation pipeline. 
 | 
					 | 
				
			||||||
The word is that there aren't as big changes in the near future, but 
 | 
					 | 
				
			||||||
this article will undoubtedly begin to diverge from the current codebase even before publishing.  
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
## Aiken build
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Tracing `aiken build`, the pipeline is roughly something like: 
 | 
					 | 
				
			||||||
```
 | 
					 | 
				
			||||||
  .               -> Project::read_source_files -> 
 | 
					 | 
				
			||||||
  Vec<Source>     -> Project::parse_sources ->
 | 
					 | 
				
			||||||
  ParsedModules   -> Project::type_check ->
 | 
					 | 
				
			||||||
  CheckedModules  -> CodeGenerator::build ->  
 | 
					 | 
				
			||||||
  AirTree         -> AirTree::to_vec -> 
 | 
					 | 
				
			||||||
  Vec<Air>        -> CodeGenerator::uplc_code_gen -> 
 | 
					 | 
				
			||||||
  Program / Term<Name> -> serialize -> 
 | 
					 | 
				
			||||||
  .
 | 
					 | 
				
			||||||
```
 | 
					 | 
				
			||||||
We'll pick our way through these steps
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
At a high level we are trying to do something straightforward: reformulate aiken code as uplc.
 | 
					 | 
				
			||||||
Some aiken expressions are relatively easy to handle for example an aiken `Int` goes to an `Int` in uplc. 
 | 
					 | 
				
			||||||
Some aiken expressions require more involved handling, for example an aiken `If... If Else... Else ` 
 | 
					 | 
				
			||||||
must have the branches "nested" in uplc.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
### The Preamble 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
#### cli handling
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The cli enters at `aiken/src/cmd/mod.rs` which parses the command. 
 | 
					 | 
				
			||||||
With some establishing of context, the program enters `Project::build` (`crates/aiken-project/src/lib.rs`),
 | 
					 | 
				
			||||||
which in turn calls `Project::compile`. 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
#### File crawl
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The program looks for aiken files in both `./lib` and `./validators` subdirs. 
 | 
					 | 
				
			||||||
For each it walks over all contents (recursively) looking for `.ak` extensions. 
 | 
					 | 
				
			||||||
It treats these two sets of files a little differently. 
 | 
					 | 
				
			||||||
Only validator files can contain the special validator functions.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
#### Parse and Type check
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
`Project::parse_sources` parses the module source code.
 | 
					 | 
				
			||||||
The heavy lifting is done by `aiken_lang::parser::module`, which is evaluated on each file. 
 | 
					 | 
				
			||||||
It produces a `Module` containing a list of parsed definitions of the file: functions, types _etc_,
 | 
					 | 
				
			||||||
together with "metadata" like docstrings and the file path. 
 | 
					 | 
				
			||||||
 
 | 
					 | 
				
			||||||
`Project::type_check` inspects the parsed modules and, as the name implies, checks the types. 
 | 
					 | 
				
			||||||
It flags type level warnings and errors. 
 | 
					 | 
				
			||||||
It constructs a hash map of `CheckedModule`s.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
#### Code generator
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The code generator `CodeGenerator` (`aiken-lang/src/gen_uplc.rs`) is given 
 | 
					 | 
				
			||||||
the definitions found from the previous step, 
 | 
					 | 
				
			||||||
together with the plutus builtins. 
 | 
					 | 
				
			||||||
It has additional fields for things like debugging. 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This is handed over to a `Blueprint` (`aiken-project/src/blueprint/mod.rs`).
 | 
					 | 
				
			||||||
A blueprint does little more than find the validators on which to run the code gen. 
 | 
					 | 
				
			||||||
The heavy lifting is done by `CodeGenerator::generate`.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
We are now ready to take the source code and create plutus. 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
### Up in the air
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Things become a bit intimidating at this point in terms of sheer lines of code:
 | 
					 | 
				
			||||||
`gen_uplc.rs` and three modules in `gen_uplc/` totals > 8500 LoC.  
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Aiken has its own _intermediate representation_ called `air` (as in Aiken Intermediate Representation). 
 | 
					 | 
				
			||||||
These are common in compiled languages.
 | 
					 | 
				
			||||||
`Air` is defined in `aiken-lang/src/gen_uplc/air.rs`. 
 | 
					 | 
				
			||||||
Unsurprisingly, it looks little bit like a language between aiken and plutus. 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
In fact, Aiken has another intermediate representation: `AirTree`. 
 | 
					 | 
				
			||||||
This is constructed between the `TypedExpr` and `Vec<Air>` ie between parsed aiken and air. 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
#### AirTree 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Within `CodeGenerator::generate`, `CodeGenerator::build` is called on the function body. 
 | 
					 | 
				
			||||||
This constructs and returns an `AirTree`.
 | 
					 | 
				
			||||||
More on what an airtree is and its construction below.
 | 
					 | 
				
			||||||
At the same time `self` is treated as `mut`, so we need to keep an eye on this too.
 | 
					 | 
				
			||||||
The method which is called and uses this mutability of self is `self.assignment`. 
 | 
					 | 
				
			||||||
It does so by
 | 
					 | 
				
			||||||
```sample 
 | 
					 | 
				
			||||||
  self.assignment >> self.expect_type_assign >> self.code_gen_functions.insert
 | 
					 | 
				
			||||||
```
 | 
					 | 
				
			||||||
and thus is creating a hashmap of all the functions that appear in the definition.
 | 
					 | 
				
			||||||
(`self.handle_each_clause` is also called with `mut` which in turn calls `self.build` for which `mut` it is needed.
 | 
					 | 
				
			||||||
`self.clause_pattern` is called with `mut` but it isn't used.) 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
###### Codegen assignment 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
~200 LoC 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
###### Codegen expect type assign
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
~400 LoC 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
###### ... Back to build 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Validators in aiken are boolean functions while in uplc they are unit-valued (aka void-valued) functions.
 | 
					 | 
				
			||||||
Thus the airtree is wrapped such that `false` results in an error (`wrap_validator_condition`). 
 | 
					 | 
				
			||||||
(Ed: I don't know why there is a prevailing thought that boolean functions are preferable than functions 
 | 
					 | 
				
			||||||
that simply error if anything is wrong.)
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
`check_validator_args` again extends the airtree from the previous step, 
 | 
					 | 
				
			||||||
and again calls `self.assignment` mutating self.
 | 
					 | 
				
			||||||
Something interesting is happening here. 
 | 
					 | 
				
			||||||
Script context is the final argument of a validator - for any script purpose.
 | 
					 | 
				
			||||||
`check_validator_args` treats the script context like it is an unused argument. 
 | 
					 | 
				
			||||||
We'll circle back to how this works later on.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Next we encounter 
 | 
					 | 
				
			||||||
```rust
 | 
					 | 
				
			||||||
  AirTree::no_op().hoist_over(validator_args_tree);
 | 
					 | 
				
			||||||
```
 | 
					 | 
				
			||||||
It's not very apparent why we need to do this. Let's look ahead and consider this later.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The final airtree to step(s) are in `self.hoist_functions_to_validator`.
 | 
					 | 
				
			||||||
TODO: What happens here?!
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Note that `AirTree` and its methods aren't fully typesafe.
 | 
					 | 
				
			||||||
For example `hoist_over` will throw an error if called on an `Expression`.
 | 
					 | 
				
			||||||
As `AirTree` is for internal use only, the scope for potential problems is reasonably contained.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The AirTree has the following definition
 | 
					 | 
				
			||||||
```rust
 | 
					 | 
				
			||||||
pub enum AirTree {
 | 
					 | 
				
			||||||
    Statement {
 | 
					 | 
				
			||||||
        statement: AirStatement,
 | 
					 | 
				
			||||||
        hoisted_over: Option<Box<AirTree>>,
 | 
					 | 
				
			||||||
    },
 | 
					 | 
				
			||||||
    Expression(AirExpression),
 | 
					 | 
				
			||||||
    UnhoistedSequence(Vec<AirTree>),
 | 
					 | 
				
			||||||
}
 | 
					 | 
				
			||||||
```
 | 
					 | 
				
			||||||
We can see it has a tree-like structure, as the name suggests. 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
`AirExpression` has multiple constructors. These include (non-exhaustive)
 | 
					 | 
				
			||||||
- air primitives (including all the ones that appear in plutus)
 | 
					 | 
				
			||||||
- constructors `Call` and `Fn` to handle functions
 | 
					 | 
				
			||||||
- binary and unary operators
 | 
					 | 
				
			||||||
- handling when and if
 | 
					 | 
				
			||||||
- error and tracing
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
`AirStatement` also has multiple constructors. 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
for handling functions, `plutus primitives, along with 
 | 
					 | 
				
			||||||
An `AirStatement` 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
## Down to uplc 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
## Air 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Aiken compiles aiken code to uplc via _air_: 
 | 
					 | 
				
			||||||
Aiken Intermediate Representation. 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
## Trace
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Running  `aiken build`...
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The cli (See `aiken/src/cmd/mod.rs`) parses the command, 
 | 
					 | 
				
			||||||
finds the context and calls `Project::build` (`crates/aiken-project/src/lib.rs`),
 | 
					 | 
				
			||||||
which in turn calls `Project::compile`. 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
#### `Project::compile`
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1. Check dependencies are available _eg_ aiken stdlib. 
 | 
					 | 
				
			||||||
2. Read source files.  
 | 
					 | 
				
			||||||
  1. Walk over `./lib` and `./validators` and push aiken modules onto `Project.sources`.
 | 
					 | 
				
			||||||
3. Parse each source in sources: 
 | 
					 | 
				
			||||||
  1. Generate a `ParsedModule` containing the `ast`, `docs`, _etc_.
 | 
					 | 
				
			||||||
  The `ast` here is an `UntypedModule`, which contains untyped definitions.
 | 
					 | 
				
			||||||
4. Type check each parsed module.
 | 
					 | 
				
			||||||
  1. For each untyped module, create a `CheckedModule`. 
 | 
					 | 
				
			||||||
  This includes typed definitions. 
 | 
					 | 
				
			||||||
5. `compile` forks into two depending on whether it's been called with `build` or `check`. 
 | 
					 | 
				
			||||||
6. From `CheckedModules` construct a `CodeGenerator`
 | 
					 | 
				
			||||||
7. Pass the generator to construct a new `Blueprints`.
 | 
					 | 
				
			||||||
  1. Blueprints finds validators from checked modules. 
 | 
					 | 
				
			||||||
  2. From each it constructs a `Validator` with the constructor `Validator::from_checked_module` (which returns a vector of validators)
 | 
					 | 
				
			||||||
      1. It's here that the magic happens: The method `generator.generate(def)` is called, 
 | 
					 | 
				
			||||||
        where `def` is the typed validator(s). 
 | 
					 | 
				
			||||||
        This method outputs a `Program<Name>` which contains the UPLC.
 | 
					 | 
				
			||||||
      2. These are collected together.
 | 
					 | 
				
			||||||
  3. The rest is collecting and handling the errors and warnings and writing the blueprint.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
#### `CodeGenerator::generate`
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1. Create a new `AirStack`.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
#### `AirStack`
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Consists of:
 | 
					 | 
				
			||||||
1. An Id
 | 
					 | 
				
			||||||
2. A `Scope`
 | 
					 | 
				
			||||||
3. A vector of `Air` 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The Scope keeps track of ... [TODO]
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
#### Air 
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Air is a typed language... [TODO]
 | 
					 | 
				
			||||||