This is outside of the scope of the sophia parser, but is a simple generalization to
'sophia terms' to make them able to represent any FATE term anonymously.
We also parse these anonymous variant expressions without type info, since it is convenient
for users to copy the output of one call into another call.
Anonymous parsing of None and Some was also added, since new users would be shocked if this
doesn't work, and advanced users will greatly appreciate that it does. The resulting FATE
terms are still rendered as variant([0, 1], ...), since user defined types can also have [0, 1]
as their arity list, and since automation and tooling programmers hate special case exceptions like that.
Anonymous parsing of other Chain and AENS terms are not added, since anonymous variants already cover those types,
so very little is gained by hard-coding such complex types into the term parser. Complex, version-specific compiler
types are already supported by hakuzaru, in the form of the ACI/AACI; parsing without AACI, on the other hand, is
intended to support language-agnostic communication using the primitives of FATE, and in general, variants
in FATE are anonymous.
This one was much simpler to do than hz_aaci, since it doesn't introduce any new types.
I could have made note of some of the conventions, where a type can be represented in multiple
ways in Sophia syntax, or where these functions are actually more lenient than the compiler, but
it isn't as easy to break those notes up from the basic function usage, like it was in hz_aaci,
where those aforementioned new types are used.
Any error reasons or paths are just term() still, and ACI doesn't have a defined spec in the compiler, so whatever, but the AACI types, the erlang representation of terms, and the four different kinds of coerce function are all spec'd now.
Also some internal type substitution functions were given types, just in the hopes of catching some errors, but dyalizer doesn't seem to complain at all no matter how badly I break my code. Strange approach to making a type system, but oh well.
Sophia bitstrings aren't really something you initialize manually, so we have to make up a literal format for them. Failing that, we just accept arbitrary integers and bytearrays as bitstrings.
There are four major fixes here:
1. some eof tokens were being pattern matched with the wrong arity
2. tuples that are too long actually speculatively parse as an untyped tuple, and then complain that there were too many elements,
3. singleton tuples with a trailing comma are now handled differently to grouping parentheses, consistently between typed and untyped logic
4. the extra return values used to detect untyped singleton tuples are also used to pass the close paren position, so that too_many_elements can report the correct file position too.
Point 4. also completely removes the need for tracking open paren positions that I was doing, and that I thought I would need to do even more of in the ambiguous-open-paren-stack case.
This seemed like it was going to be insanely insanely complex, but
then it turns out the compiler doesn't accept spaces in qualified
names, so I can just dump periods in the lexer and hit it with
string:split/3. Easy.
This saves some effort and probably some performance for things like integers, but I'm mainly doing this in anticipation of string literals, because it would just be ridiculous to read code that lexes string literals twice.
Now tests compare the literal parser against the output of the
compiler. The little example contracts we are compiling for the
AACI already had the FATE value in them, in the form of the
instruction
{'RETURNR', {immediate, FateValue}}
so we just extract that and use it for the tests.
We tokenize, and then do the simplest possible recursive descent.
We don't want to evaluate anything, so infix operators are out,
meaning no shunting yard or tree rearranging or LR(1) shenanigans
are necessary, just write the code.
If we want to 'peek', just take the next token, and pass it around
from that point on, until it can actually be consumed.