aiken

src	Infer callee first in function call	2024-05-06 15:17:01 -04:00
Cargo.toml	chore: Release	2024-03-25 22:09:37 -04:00

KtorZ a124bdbb05 Infer callee first in function call

The current inference system walks expressions from "top to bottom",
  starting from definitions higher in the source file and working down.
  When a call is encountered, we use whatever is known about the callee
  definition at the moment the call is inferred.

  This causes interesting issues in the case where the callee doesn't
  have annotations and is only partially known. For example:

  ```
  pub fn list(fuzzer: Option<a>) -> Option<List<a>> {
    inner(fuzzer, [])
  }

  fn inner(fuzzer, xs) -> Option<List<b>> {
    when fuzzer is {
      None -> Some(xs)
      Some(x) -> Some([x, ..xs])
    }
  }
  ```

  In this small program, we infer `list` first and run into `inner`.
  Yet the arguments of `inner` are not annotated and we haven't
  inferred `inner` yet, so we create two unbound type variables for them.

  And naturally, we will link the type of `[]` to being of the same type
  as `xs` -- which is still unbound at this point. The return type of
  `inner` is given by the annotation, so all-in-all, the unification
  will work without ever having to commit to a type of `[]`.

  It is only later, when `inner` is inferred, that we will generalise
  the unbound type of `xs` to a generic which is the same as `b` in the
  annotation. At this point, `[]` is also typed with this same generic,
  which has a different id than `a` in `list` since it comes from
  another type definition.

  This is unfortunate and will cause issues down the line for the code
  generation. The problem doesn't occur when `inner`'s arguments are
  properly annotated, or when `inner` is actually inferred first.

  Hence, I saw two possible avenues for fixing this problem:

  1. Detect the presence of 'incongruous generics' in definitions after
     they've all been inferred, and raise a user error asking for more
     annotations.

  2. Infer definitions in dependency order, with definitions used by
     others inferred first.

  This commit does (2) (although it may still be a good idea to do (1)
  eventually) since it offers a much better user experience. One way to
  do (2) is to construct a dependency graph between function calls, and
  perform a topological sort.

  Building such a graph is, however, quite tricky as it requires walking
  through the AST while maintaining scope etc., which is more-or-less
  already what the inference step is doing; so it feels like double
  work.

  Thus instead, this commit does a depth-first inference and
  "pauses" the inference of a definition when it encounters a call, in
  order to fully infer the callee first. To achieve this properly, we
  must ensure that we do not infer the same definition again, so we now
  "remember" already inferred definitions in the environment.
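
The depth-first strategy described above can be sketched as follows. This is a minimal illustration, not the code from this commit: `Definition`, `Environment`, `infer_definition` and `infer_module` are hypothetical names, real type inference is replaced by simply recording the order in which definitions finish, and the `in_progress` guard against recursive calls is an assumption added here so the sketch terminates.

```rust
use std::collections::{HashMap, HashSet};

/// Toy stand-in for a function definition: its name and the names it calls.
/// (Hypothetical; the real compiler works on a full AST.)
struct Definition {
    name: &'static str,
    callees: Vec<&'static str>,
}

/// Hypothetical environment that "remembers" already inferred definitions.
#[derive(Default)]
struct Environment {
    inferred: HashSet<&'static str>,
    // Assumption of this sketch: definitions currently being inferred are
    // tracked so that (mutually) recursive calls do not loop forever.
    in_progress: HashSet<&'static str>,
    // Order in which definitions were fully inferred.
    order: Vec<&'static str>,
}

/// Infer `name`, pausing at each call to fully infer the callee first.
fn infer_definition(
    name: &'static str,
    definitions: &HashMap<&'static str, &Definition>,
    env: &mut Environment,
) {
    // Skip definitions that are already done or already on the call stack.
    if env.inferred.contains(name) || !env.in_progress.insert(name) {
        return;
    }
    if let Some(def) = definitions.get(name) {
        for &callee in &def.callees {
            infer_definition(callee, definitions, env);
        }
    }
    // All callees are known now; a real implementation would type the body here.
    env.in_progress.remove(name);
    env.inferred.insert(name);
    env.order.push(name);
}

/// Walk definitions top to bottom, as before, but callee-first at each call.
fn infer_module(source_order: &[Definition]) -> Vec<&'static str> {
    let by_name: HashMap<&'static str, &Definition> =
        source_order.iter().map(|d| (d.name, d)).collect();
    let mut env = Environment::default();
    for def in source_order {
        infer_definition(def.name, &by_name, &mut env);
    }
    env.order
}
```

With the example program from this message, where `list` appears first in the source and calls `inner`, `infer_module` finishes `inner` before `list`, and the later top-level visit of `inner` is a no-op because it is already remembered.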

2024-05-06 15:17:01 -04:00
