Commit Graph

76 Commits

Author SHA1 Message Date
KtorZ d7ec2131ef
Automatically merge import lines from same module.
I slightly altered the way we parse import definitions to ensure we
  merge imports from the same modules (that aren't aliased) together.

  This prevents an annoying warning with duplicated import lines and
  makes it just more convenient overall.

  As a trade-off, we can no longer interleave import definitions with
  other definitions. This should be a minor setback only since the
  formatter was already ensuring that all import definitions would be
  grouped at the top.

  ---

  Note that, I originally attempted to implement this in the formatter
  instead of the parser. As it felt more appropriate there. However, the
  formatter operates on (unmutable) borrowed definitions, which makes it
  annoyingly hard to perform any AST manipulations. The `Document`
  returns by the format carries a lifetime that prevents the creation of
  intermediate local values.

  So instead, slightly tweaking the parser felt like the right thing to
  do.
2024-06-04 10:48:42 +02:00
KtorZ 8ffa68d2f0
Fix && and || associativity.
Somehow, these have always been right-associative, when the natural thing to expect is left-associativity. It now matters when trying to crawl down binary tree to display them properly.
2024-03-07 01:17:05 +01:00
KtorZ 59c784778e
Convert span's start to line number + col
This requires to make line numbers a first-class citizen in the module
  hierarchy but it is fortunately not _too involved_.
2024-01-19 14:30:15 +01:00
KtorZ a306d6e9f2
Move chain and chained parsing into their own submodule
Alleviate a bit more the top-level expression parser. Note that we
probably need a bit more disciplined in what we export and at what level
because there doesn't seem to be much logic as for whether a parser is
private, exported to the crate only or to the wide open. I'd be in favor
of exporting everything by default.
2023-07-05 15:18:07 +02:00
KtorZ 66296df9c3
Move parsing of literals under new 'literal' parser module group
Also moved the logic for 'int' and 'string' there though it is trivial. Yet, for bytearray, it tidies things nicely by removing them from the 'utils' module.
2023-07-05 14:37:29 +02:00
rvcas 5e8edcb340
test(parser): finish moving tests to their correct modules 2023-07-04 17:48:48 -04:00
rvcas 47567c5e6f
test(parser): some adjustments after rebase with @ktorz fix 2023-07-04 17:19:30 -04:00
rvcas b25db429be
test(parser): anon binop and ambiguous sequence 2023-07-04 17:19:30 -04:00
rvcas bd8c13c372
test(parser): move over the validator tests and some misc tests to parser 2023-07-04 17:19:29 -04:00
rvcas 6b05d6a91e
test(parser): rename definitions to definition and more tests 2023-07-04 17:19:29 -04:00
rvcas f878ef7cef
feat: move some token processing to the lexer 2023-07-04 17:19:28 -04:00
rvcas 2226747dc1
feat: finish splitting up parsers 2023-07-04 17:19:28 -04:00
rvcas eea94fc9a4
feat: move anon fn, let, and expect 2023-07-04 17:19:28 -04:00
rvcas e3ed5d3b00
feat: move expr_parser and remove module.rs to definitions 2023-07-04 17:19:28 -04:00
rvcas 3339d41fdd
feat: finish moving definitions and start exprs 2023-07-04 17:19:27 -04:00
rvcas fc580d4fa0
feat(parser): move definitions to their own modules 2023-07-04 17:19:27 -04:00
KtorZ 5a6cc855e6 Use byte count for token span in the lexer.
Somehow, miette doesn't play well with spans when using chars indices.
  So we have to count the number of bytes in strings / chars, so that
  spans align accordingly.
2023-07-04 16:51:59 -04:00
KtorZ 91f03abb7b
Support all binary operator in the anonymous binop parser. 2023-06-17 08:44:45 +02:00
KtorZ d0b4c1c3b5
Add remaining boolean comparison operator to anon binop parser.
Nothing to see here as they all have the same signature. Implementing
  arithmetic bin-operators and boolean logic operators will require some
  more logic.
2023-06-17 07:57:37 +02:00
KtorZ ec94230294
Extend parser to accept anonymous binop as expressions.
This is simply a syntactic sugar which desugarize to a function call with two arguments mapped to the specified binary operator.
  Only works for '>' at this stage as a PoC, extending to all binop in the next commit.
2023-06-17 07:36:11 +02:00
KtorZ ba911d48ea
Refactor 'is_capture' field on function expressions.
Refactored into an enum to make it easier to extend with a new variant to support binary operators.
2023-06-17 07:26:46 +02:00
KtorZ 79a2174f0a
Extend parser to support int as hexadecimal and numeric underscore.
We only allow numeric underscore for decimal numbers as I am not sure how we can define it for non-decimal numbers?
2023-06-08 15:33:50 +02:00
rvcas 3fc9c8e0db
chore: re-add empty line handling by @KtorZ
Co-authored-by: KtorZ
2023-06-07 17:21:04 -04:00
rvcas 2860bac4c6
fix: bad parsing for module select type annotations closes #550 2023-05-30 10:39:49 -04:00
rvcas 7b3e1c6952
feat: adjust failing test syntax
* also add a formatter test
2023-05-25 18:21:12 -04:00
rvcas a124a16a61
feat(tests): implement a way to express that tests can fail 2023-05-25 16:54:53 -04:00
rvcas d8cbcde61d feat(validators): unused param warning
Params being unused were being incorrectly reported.
This was because params need to be initialized
at a scope above both the validator functions. This
manifested when using a multi-validator where one of
the params was not used in both validators.

The easy fix was to add a field called
`is_validator_param` to `ArgName`. Then
when infering a function we don't initialize args
that are validator params. We now handle this
in a scope that is created before in the match branch for
validator in the `infer_definition` function. In there
we call `.in_new_scope` and initialize params for usage
detection.
2023-03-30 21:15:27 -04:00
KtorZ 8a2af4cd2e
Fix lexer throwing errors when parsing a too large tuple index. 2023-03-18 16:13:50 +01:00
KtorZ bc690c5410
Generated wrapped schemas for multi-validators' redeemers 2023-03-17 18:40:49 -04:00
Lucas 22880a300f
Update crates/aiken-lang/src/parser.rs
Co-authored-by: Matthias Benkort <5680256+KtorZ@users.noreply.github.com>
2023-03-17 18:40:48 -04:00
Lucas 950598b534
Update crates/aiken-lang/src/parser.rs
Co-authored-by: Matthias Benkort <5680256+KtorZ@users.noreply.github.com>
2023-03-17 18:40:48 -04:00
rvcas ed92869fb9
feat(validator): parsing and typechecking for double validators 2023-03-17 18:38:24 -04:00
KtorZ 28a3844d54
Cleanup implementation from multi-subjects when/is 2023-03-17 13:06:39 +01:00
KtorZ c113582404 Support multi-clause patterns as syntactic sugar
And disable multi-patterns clauses. I was originally just controlling
  whether we did disable that from the parser but then I figured we
  could actually support multi-patterns clauses quite easily by simply
  desugaring a multi-pattern into multiple clauses.

  This is only a syntactic sugar, which means that the cost of writing
  that on-chain is as expensive as writing the fully expanded form; yet
  it seems like a useful shorthand; especially for short clause
  expressions.

  This commit however disables multi-pattern when clauses, which we do
  not support in the code-generation. Instead, one pattern on tuples for
  that.
2023-03-16 19:45:41 -04:00
KtorZ 1311d9bd27 Support flexible pipe operator formatting
Rules are now as follows:

  - If a pipeline contains a newline, then the entire pipeline is formatted over multiple lines.
  - If it doesn't, then it's formatted as a single-line UNLESS it cannot fit; in which case, we fallback to multiline again.
2023-03-14 16:47:43 -04:00
KtorZ f307e214c3
Remove parse error on bytearray literals for trace, todo & error, parse as String instead.
This has been bothering me and the more I thought of it the more I
  disliked the idea of a warning. The rationale being that in this very
  context, there's absolutely no ambiguity. So it is only frustrating
  that the parser is even able to make the exact suggestion of what
  should be fixed, but still fails.

  I can imagine it is going to be very common for people to type:

  ```
  trace "foo"
  ```

  ...yet terribly frustrating if they have to remember each time that
  this should actually be a string. Because of the `trace`, `todo` and
  `error` keywords, we know exactly the surrounding context and what to
  expect here. So we can work it nicely.

  However, the formatter will re-format it to:

  ```
  trace @"foo"
  ```

  Just for the sake of remaining consistent with the type-system. This
  way, we still only manipulate `String` in the AST, but we conveniently
  parse a double-quote utf-8 literal when coupled with one of the
  specific keywords.

  I believe that's the best of both worlds.
2023-02-19 10:10:42 +01:00
KtorZ d72e13c7c8
Emit parse error when finding a ByteArray literal instead of String literal. 2023-02-19 10:10:42 +01:00
KtorZ 53fb821b62
Use double-quotes for utf-8 bytearrays, and @"..." for string literals
The core observation is that **in the context of Aiken** (i.e. on-chain logic)
  people do not generally want to use String. Instead, they want
  bytearrays.

  So, it should be easy to produce bytearrays when needed and it should
  be the default. Before this commit, `"foo"` would parse as a `String`.
  Now, it parses as a `ByteArray`, whose bytes are the UTF-8 bytes
  encoding of "foo".

  Now, to make this change really "fool-proof", we now want to:

  - [ ] Emit a parse error if we parse a UTF-8 bytearray literal in
    place where we would expect a `String`. For example, `trace`,
    `error` and `todo` can only be followed by a `String`.

    So when we see something like:

    ```
    trace "foo"
    ```

    we know it's a mistake and we can suggest users to use:

    ```
    trace @"foo"
    ```

    instead.

  - [ ] Emit a warning if we ever see a bytearray literals UTF-8, which
    is either 56 or 64 character long and is a valid hexadecimal string.
    For example:

    ```
    let policy_id = "29d222ce763455e3d7a09a665ce554f00ac89d2e99a1a83d267170c6"
    ```

    This is _most certainly_ a mistake, as this generates a ByteArray of
    56 bytes, which is effectively the hex-encoding of the provided string.

    In this scenario, we want to warn the user and inform them they probably meant to use:

    ```
    let policy_id = #"29d222ce763455e3d7a09a665ce554f00ac89d2e99a1a83d267170c6"
    ```
2023-02-19 10:09:22 +01:00
KtorZ 98b89f32e1
Preserve bytearray format choice from input. 2023-02-19 10:09:22 +01:00
KtorZ cd4ceb219c
Remove complex and compound constants.
This is not supported by the code generation, so it's a bit of a lie
  to have them in the language in the first place. There's arguably not
  even any use for constant records, list and tuples to begin with. So
  this cleans this up everywhere for the sake of moving forward with the
  alpha release.

  This now reduces constants to:

  - Integer
  - ByteArray
  - String

  Anything else can be declared via a function anyway. We can revisit
  this choice later.... or not.
2023-02-17 17:31:15 +01:00
KtorZ 6a50bde666 Implement parser & formater for 'TraceIfFalse'
Interestingly enough, chumsky seems to fail when given a 'choice' with
  more than 25 elements. That's why this commit groups together some of
  the choices as another nested 'choice'.
2023-02-16 20:29:41 -05:00
rvcas b057d27465 fix: some updates from latest main 2023-02-16 00:05:55 -05:00
rvcas a88a193383 fix: properly lex new token and adjust parsed spans 2023-02-16 00:05:55 -05:00
rvcas a044c3580e feat: typecheck validators 2023-02-16 00:05:55 -05:00
rvcas 2e7fe191db feat(definitions):
* add parsing for new validator defs
* start adding typechecking
* add a unit test for parsing
2023-02-16 00:05:55 -05:00
KtorZ 56258dc815
Fix todo/error parser on when clauses. 2023-02-16 00:40:49 +01:00
KtorZ 808ff97c68
Preserve trace, error & todo formatting. 2023-02-15 23:19:07 +01:00
KtorZ 6525f21712
Remove 'Todo' from the AST & AIR
Todo is fundamentally just a trace and an error. The only reason we kept it as a separate element in the AST is for the formatter to work out whether it should format something back to a todo or something else.

  However, this introduces redundancy in the code internally and makes the AIR more complicated than it needs to be. Both todo and errors can actually be represented as trace + errors, and we only need to record their preferred shape when parsing so that we can format them back to what's expected.
2023-02-15 21:57:08 +01:00
KtorZ 7b676643bd
Lift 'error' up one level in the parser and remove its label.
We now parse errors as a combination of a trace plus and error term. This is a baby step in order to simplify the code generation down the line and the internal representation of todo / errors.
2023-02-15 21:09:03 +01:00
KtorZ 7abd76b6ad
Allow to trace expressions (and not only string literals)
This however enforces that the argument unifies to a `String`. So this
  is more flexible than the previous form, but does fundamentally the
  same thing.

  Fixes #378.
2023-02-15 21:07:56 +01:00