tracing aiken build: proofread

2023-09-02 20:05:48 +00:00 · b340cfd2f0
parent e2db317b30
commit b340cfd2f0
1 changed files with 21 additions and 23 deletions
--- a/content/drafts/tracing-aiken-build.md
+++ b/content/drafts/tracing-aiken-build.md
@@ -8,7 +8,7 @@ Aims:

 The motivation for writing this came from a desire to add additional features to Aiken not yet available.
 One such feature would evaluate an arbitrary function in Aiken callable from JavaScript. 
-This would help a lot with testing trying to align on and off-chain code. 
+This would help a lot with testing and when trying to align on and off-chain code. 

 Another more pipe dreamy, ad-hoc function extraction - from a span of code, generate a function.
 A digression to answer _why would this be at all helpful?!_
@@ -23,9 +23,9 @@ Possible solutions:
 The problems are:

 1. Requires relentless constructing and deconstructing across the function call.
-And this is adds costs in Aiken. 
+This adds costs. 
 2. Becomes tedious aligning the definition and function call.  
-3. End up with very long validators which are hard to unit test. 
+3. Ends up with very long validators which are hard to unit test. 

 My current preferred way is to accept that validator functions are long.
 Ad-hoc function extraction would allow for sections of code to be tested without needing to be factored out.
@@ -35,18 +35,17 @@ To do either of these, we need to get to grips with the Aiken compilation pipeli
 ### This won't age well 

 Aiken is undergoing active development. 
-This post was started life with Aiken ~v1.14. 
-With Aiken v1.15, there were already reasonably significant changes to the compilation pipeline. 
-The word is that there aren't as big changes in the near future, 
+This post started life with Aiken ~v1.14. 
+Aiken v1.15 introduced reasonably significant changes to the compilation pipeline. 
+The word is that there aren't any more big changes in the near future, 
 but this article will undoubtedly begin to diverge from the current code-base even before publishing.  

 ### Limitations of narrating code

 Narrating code becomes a compromise between being honest and accurate, and being readable and digestible. 
-Following the command `aiken build` covers well in excess of 10,000 LoC.
-The writing of this post ground slowly to a halt as it progressed deeper into the code 
-with the details seeming to increase in importance. 
-At some point I had to draw a line and resign to fact that some parts will remain black boxes for now. 
+The command `aiken build` covers well in excess of 10,000 LoC.
+The writing of this post ground to a halt as it reached deeper into the code-base.
+To redeem it, some (possibly large) sections remain black boxes.

 ## Aiken build

@@ -67,7 +66,7 @@ At a high level we are trying to do something straightforward: reformulate Aiken
 Some Aiken expressions are relatively easy to handle for example an Aiken `Int` goes to an `Int` in Uplc. 
 Some Aiken expressions require more involved handling, for example an Aiken `If... If Else... Else ` 
 must have the branches "nested" in Uplc.
-Aiken also have lots of nice-to-haves like pattern matching, modules, and generics.
+Aiken has lots of nice-to-haves like pattern matching, modules, and generics;
 Uplc has none of these.

 ### The Preamble 
@@ -114,9 +113,9 @@ Things become a bit intimidating at this point in terms of sheer lines of code:
 `gen_uplc.rs` and three modules in `gen_uplc/` totals > 8500 LoC.  

 Aiken has its own _intermediate representation_ called `air` (as in Aiken Intermediate Representation). 
-These are common in compiled languages.
+Intermediate representations are common in compiled languages.
 `Air` is defined in `aiken-lang/src/gen_uplc/air.rs`. 
-Unsurprisingly, it looks little bit like a language between Aiken and plutus. 
+Unsurprisingly, it looks a little bit like a language between Aiken and plutus. 

 In fact, Aiken has another intermediate representation: `AirTree`. 
 This is constructed between the `TypedExpr` and `Vec<Air>` ie between parsed Aiken and air. 
@@ -134,12 +133,12 @@ It does so by
  self.assignment >> self.expect_type_assign >> self.code_gen_functions.insert
 ```
 and thus is creating a hashmap of all the functions that appear in the definition.
-From the call to return of `assign` covers > 600 LoC so we'll leave this as otherwise a black box.
+From the call to return of `assign` covers > 600 LoC so we'll leave this as a black box.
 (`self.handle_each_clause` is also called with `mut` which in turn calls `self.build` for which `mut` it is needed.) 

 Validators in Aiken are boolean functions while in Uplc they are unit-valued (aka void-valued) functions.
 Thus the air tree is wrapped such that `false` results in an error (`wrap_validator_condition`). 
-I don't know why there is a prevailing thought that boolean functions are preferable than functions 
+I don't know why there is a prevailing thought that boolean functions are preferable to functions 
 that error if anything is wrong - which is what validators are.

 `check_validator_args` again extends the airtree from the previous step, 
@@ -186,7 +185,7 @@ As `AirTree` is for internal use only, the scope for potential problems is reaso
 It seems likely this is to avoid similar-yet-different IRs between steps.
 However, the trade off is that it partially obfuscates what is a valid state where. 

-What is hoisting? hoisting gives the airtree depth. 
+What is hoisting? Hoisting gives the airtree depth. 
 The motivation is that by the time we hit Uplc it is "generally better"
 that 

@@ -194,7 +193,7 @@ that
 - the definition appears as close to use as possible 

 Hoisting creates tree paths. 
-The final airtree to airtree step is`self.hoist_functions_to_validator` traverses the paths.
+The final airtree to airtree step, `self.hoist_functions_to_validator`, traverses these paths.
 There is a lot of mutating of self, making it quite hard to keep a handle on things. 
 In all this (several thousand?) LoC, it is essentially ascertaining in which node of the tree
 to insert each function definition. 
@@ -220,7 +219,7 @@ It flattens the tree to a vec.
 Next we go from `Vec<Air> -> Term<Name>`.
 This step is a little more involved than the previous. 
 For one, this is executed in the context of the code generator. 
-Moreover, the code generator is treated mutable - ouch.
+Moreover, the code generator is treated as mutable - ouch.

 On further inspection we see that the only mutation is setting `self.needs_field_access = true`.
 This flag informs the compiler that, if true, additional terms must be added in one of the final steps
@@ -232,7 +231,7 @@ Some examples:

 - `Air::Var` require 100 LoC to do case handling on different constructors. 
 - Lists in air have no immediate analogue in uplc
-- builtins, as in built-in functions (standard shorthand), have to mediated 
+- builtins, as in built-in functions (standard shorthand), have to be mediated 
 with some combination of `force` and `delay` in order to behave as they should.
 - user functions must be "uncurried", ie treated as a sequence of single argument functions, 
 and recursion must be handled
@@ -240,9 +239,8 @@ and recursion must be handled

 #### Cranking the Optimizer

-There is a sequence of operations performed on the Uplc mapping `Term<Name> -> Term<Name>`.
-These remove inconsequential parts of the logic which will appear.
-These include: 
+There is a sequence of operations performed on the Uplc, mapping `Term<Name> -> Term<Name>`.
+This removes inconsequential parts of the logic which have been generated, including: 

 - removing application of the identity function
 - directly substituting where apply lambda is applied to a constant or builtin
@ -259,7 +257,7 @@ The generated program can now be serialized and included in the blueprint.
 ### Plutus Core Signposting

 All this fuss is to get us to a point where we can write Uplc - and good Uplc at that. 
-Note that there's many ways to generate code and most of them are bad.  
+Note that there are many ways to generate code and most of them are bad.  
 The various design decisions and compilation steps make more sense 
 when we have a better understanding of the target language.