Update docs for Tolk v0.7
tolk-vm committed Jan 13, 2025
1 parent 0e53e25 commit 2fca78d
Showing 6 changed files with 218 additions and 31 deletions.
4 changes: 4 additions & 0 deletions cspell.json
@@ -181,6 +181,7 @@
"hehehe",
"highlevel",
"highload",
"Hindley",
"hmac",
"howto",
"HOWTO",
@@ -201,6 +202,7 @@
"keyblocks",
"keypair",
"keystream",
"Kolya",
"leaderboard",
"leaderboards",
"libmicrohttpd",
@@ -215,6 +217,7 @@
"mintless",
"micropayment",
"micropayments",
"Milner",
"mintable",
"moddiv",
"multichain",
@@ -228,6 +231,7 @@
"nanotons",
"newkeypair",
"nextra",
"Nikolay",
"nmon",
"nonexist",
"nonfinal",
27 changes: 25 additions & 2 deletions docs/v3/documentation/smart-contracts/tolk/changelog.md
@@ -3,6 +3,17 @@
When new versions of Tolk are released, they will be mentioned here.


## v0.7

1. Under the hood: refactor compiler internals; AST-level semantic analysis kernel
2. Under the hood: rewrite the type system from Hindley-Milner to static typing
3. Clear and readable error messages on type mismatch
4. Generic functions `fun f<T>(...)` and instantiations like `f<int>(...)`
5. The `bool` type; type casting via `value as T`

More details [on GitHub](todo).


## v0.6

The first public release. Here are some notes about its origin:
@@ -20,7 +31,7 @@ For several months, I have worked on Tolk privately. I have implemented a giant
And it's not only about the syntax. For instance, Tolk has an internal AST representation, completely missed in FunC.

On TON Gateway, on 1-2 November in Dubai, I had a speech presenting Tolk to the public, and we released it the same day.
Once the video is available, I'll attach it here.
The video is available [on YouTube](https://www.youtube.com/watch?v=Frq-HUYGdbI).

Here is the very first pull request: ["Tolk Language: next-generation FunC"](https://github.com/ton-blockchain/ton/pull/1345).

@@ -29,4 +40,16 @@ The first version of the Tolk Language is v0.6, a metaphor of FunC v0.5 that mis

## Meaning of the name "Tolk"

I'll update this section after announcing Tolk on TON Gateway.
"Tolk" is a very beautiful word.

In English, it's consonant with *talk*. Because, generally, what do we need a language for? We need it *to talk* to computers.

In all Slavic languages, the root *tolk* and the phrase *"to have tolk"* mean "to make sense"; "to have deep internals".

But actually, **TOLK** is an abbreviation.
You know that TON is **The Open Network**.
By analogy, TOLK is **The Open Language K**.

What is K, you may ask? Perhaps "kot", the nickname of Nikolay Durov? Or Kolya? Kitten? Kernel? Kit? Knowledge?
The right answer: none of these. This letter does not mean anything. It's open.
*The Open Letter K*
23 changes: 10 additions & 13 deletions docs/v3/documentation/smart-contracts/tolk/overview.mdx
@@ -133,24 +133,21 @@ Anyway, no matter what language you use, you should cover your contracts with te
The first released version of Tolk is v0.6, emphasizing [missing](/v3/documentation/smart-contracts/tolk/changelog#how-tolk-was-born) FunC v0.5.

Here are some (yet not all and not ordered in any way) points to be investigated:
- type system improvements: boolean type, nullability, dictionaries
- structures, with auto-packing to/from cells, probably integrated with message handlers
- structures with methods, probably generalized to cover built-in types
- some integrations with TL scheme, either syntactical or via code generation
- human-readable compiler errors
- type system improvements: nullability, fixed-size integers, union types, dictionaries
- structures and generics
- auto-pack structures to/from cells, probably integrated with message handlers
- methods for structures, generalized to cover built-in types
- easier message sending
- better experience for common use-cases (jettons, nft, etc.)
- gas and stack optimizations, AST inlining
- extending and maintaining stdlib
- think about some kind of ABI (how explorers "see" bytecode)
- think about gas and fee management in general
- some kind of ABI (how explorers "see" bytecode)
- gas and fee management in general

Note, that most of the points above are a challenge to implement.
At first, FunC kernel must be fully refactored to "interbreed" with abilities it was not designed for.

Also, I see Tolk evolution partially guided by community needs.
It would be nice to talk to developers who have created interconnected FunC contracts,
to absorb their pain points and discuss how things could be done differently.
The next strategic goal for **Tolk v1.0** is **structures with auto-serialization into cells**.
This will eliminate the need for manual manipulations with builders and slices, allowing data and messages to be described declaratively.
Closely related to this is the **ABI (interface) of contracts**.
Well-designed structures actually make up the majority of an ABI.


## Issues and Contacts
189 changes: 175 additions & 14 deletions docs/v3/documentation/smart-contracts/tolk/tolk-vs-func/in-detail.mdx
@@ -260,11 +260,9 @@ fun do_smth(c, n)
fun do_smth(c: cell, n: int)
```

There is an `auto` type, so `fun f(a: auto)` is valid, though not recommended.

If parameter types are mandatory, return type is not (it's often obvious of verbose). If omitted, it means `auto`:
While parameter types are mandatory, the return type is not (it's often obvious or verbose). If omitted, it's inferred automatically:
```tolk
fun x() { ... } // auto infer return
fun x() { ... } // auto infer from return statements
```

For local variables, types are also optional:
@@ -316,15 +314,180 @@ fun send(msg: cell) {
✅ Changes in the type system
</h3>

Type system in the first Tolk release is the same as in FunC, with the following modifications:
- `void` is effectively an empty tensor (more canonical to be named `unit`, but `void` is more reliable); btw, `return` (without expression) is actually `return ()`, a convenient way to return from void functions
FunC's type system is based on Hindley-Milner. This is a common approach for functional languages, where types are inferred from usage through unification.

In Tolk v0.7, the type system is rewritten from scratch.
In order to add booleans, fixed-width integers, nullability, structures, and generics, we need a static type system (like TypeScript or Rust), because Hindley-Milner would clash with structure methods, struggle with proper generics, and become entirely impractical for union types (despite claims that it was "designed for union types").

We have the following types:
- `int`, `bool`, `cell`, `slice`, `builder`, untyped `tuple`
- typed tuple `[T1, T2, ...]`
- tensor `(T1, T2, ...)`
- callables `fun(TArgs) -> TResult`
- `void` (more canonical to be named `unit`, but `void` is more reliable)
- `self`, to make chainable methods, described below; actually it's not a type, it can only occur instead of return type of a function

The type system obeys the following rules:
- variable types can be specified manually or are inferred from declarations, and never change after being declared
- function parameters must be strictly typed
- function return types, if unspecified, are inferred from return statements, similarly to TypeScript; in case of recursion (direct or indirect), the return type must be explicitly declared somewhere
- generic functions are supported
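
A minimal sketch of these rules in practice (the function names `sumTo` and `demo` are hypothetical, chosen only for illustration):

```tolk
// recursion: the return type is declared explicitly
fun sumTo(n: int): int {
    if (n <= 0) {
        return 0;
    }
    return n + sumTo(n - 1);
}

// no return type given: it is inferred from the `return` statement
fun demo() {
    var total = sumTo(10);      // inferred as int, stays int
    var ok: bool = total > 0;   // declared manually
    return ok;                  // so `demo` is inferred to return bool
}
```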


<h3 className="cmp-func-tolk-header">
✅ Clear and readable error messages on type mismatch
</h3>

In FunC, due to Hindley-Milner, type mismatch errors are very hard to understand:
```
error: previous function return type (int, int)
cannot be unified with implicit end-of-block return type (int, ()):
cannot unify type () with int
```

In Tolk, they are human-readable:
```
1) can not assign `(int, slice)` to variable of type `(int, int)`
2) can not call method for `builder` with object of type `int`
3) can not use `builder` as a boolean condition
4) missing `return`
...
```


<h3 className="cmp-func-tolk-header">
✅ <code>bool</code> type, casting <code>boolVar as int</code>
</h3>

Under the hood, **`bool` is still -1 and 0 at TVM level**, but from the type system's perspective, `bool` and `int` are now different.

Comparison operators `== / >= /...` return `bool`. Logical operators `&& ||` return `bool`. Constants `true` and `false` have the `bool` type.
Lots of stdlib functions now return `bool`, not `int` (having -1 and 0 at runtime):
```tolk
fun setContractData(c: cell): void
    asm "c4 POP";
var valid = isSignatureValid(...); // bool
var end = cs.isEndOfSlice(); // bool
```

Operator `!x` supports both `int` and `bool`. Condition of `if` and similar accepts both `int` (!= 0) and `bool`.
Logical `&&` and `||` accept both `bool` and `int`, preserving compatibility with constructs like `a && b` where `a` and `b` are integers (!= 0).

Arithmetic operators are restricted to integers; only bitwise and logical operators are allowed for bools:
```tolk
valid && end; // ok
valid & end; // ok, bitwise & | ^ also work if both are bools
if (!end) // ok
if (~end) // error, use !end
valid + end; // error
8 & valid; // error, int & bool not allowed
```

Note that the logical operators `&& ||` (missing in FunC) always use the IF/ELSE asm representation.
In the future, for optimization, they could be automatically replaced by `& |` when it's safe (example: `a > 0 && a < 10`).
To manually optimize gas consumption, you can still use `& |` (allowed for bools), but remember that they do not short-circuit.
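
As a sketch of this trade-off, assuming a hypothetical helper `isDigit`, the same check can be written both ways:

```tolk
// short-circuit: the right operand is evaluated only if the left one is true
fun isDigit(c: int): bool {
    return c >= 48 && c <= 57;
}

// bitwise `&` on two bools: both comparisons are always evaluated
fun isDigitBitwise(c: int): bool {
    return (c >= 48) & (c <= 57);
}
```

Both variants are expected to return the same result here, since neither comparison has side effects; which one costs less gas is worth measuring rather than assuming.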

**`bool` can be cast to `int` via `as` operator**:
```tolk
var i = boolValue as int; // -1 / 0
```

There are no runtime transformations. `bool` is guaranteed to be -1/0 at TVM level, so this is type-only casting.
But generally, if you need such a cast, you're probably doing something wrong (unless you're doing a tricky bitwise optimization).
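
One case where such a cast might be reasonable, sketched with a hypothetical function `countIfValid`, is branch-free arithmetic that relies on `true` being -1:

```tolk
fun countIfValid(count: int, valid: bool): int {
    // `valid as int` is -1 or 0, so subtracting it adds 1 only when `valid` is true
    return count - (valid as int);
}
```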


<h3 className="cmp-func-tolk-header">
✅ Generic functions and instantiations like <code>f&lt;int&gt;(...)</code>
</h3>

In FunC, there were "forall" functions:
```func
forall X -> tuple tpush(tuple t, X value) asm "TPUSH";
```
- `auto` mean "auto infer"; in FunC, `_` was used for that purpose; note, that if a function doesn't specify return type, it's `auto`, not `void`
- `self`, to make chainable methods, described below; actually it's not a type, it can only occur instead of return type of a function
- `cont` renamed to `continuation`

Tolk introduces properly made generic functions. Their syntax resembles that of mainstream languages:
```tolk
fun tuplePush<T>(mutate self: tuple, value: T): void
    asm "TPUSH";
```

When `f<T>` is called, `T` is detected (in most cases) from the provided arguments:
```tolk
t.tuplePush(1); // detected T=int
t.tuplePush(cs); // detected T=slice
t.tuplePush(null); // error, need to specify "null of what type"
```

The syntax `f<int>(...)` is also supported:
```tolk
t.tuplePush<int>(1); // ok
t.tuplePush<int>(cs); // error, can not pass slice to int
t.tuplePush<int>(null); // ok, null is "null of type int"
```

User-defined functions may also be generic:
```tolk
fun replaceLast<T>(mutate self: tuple, value: T) {
    val size = self.tupleSize();
    self.tupleSetAt(value, size - 1);
}
```

Calling `replaceLast<int>` and `replaceLast<slice>` results in TWO generated asm (Fift) functions.
In effect, they resemble "template" functions: for each unique instantiation, the function's body is fully cloned under a new name.

There may be multiple generic parameters:
```tolk
fun replaceNulls<T1, T2>(tensor: (T1, T2), v1IfNull: T1, v2IfNull: T2): (T1, T2) {
    var (a, b) = tensor;
    return (a == null ? v1IfNull : a, b == null ? v2IfNull : b);
}
```

A generic parameter `T` may be something complex:
```tolk
fun duplicate<T>(value: T): (T, T) {
    var copy: T = value;
    return (value, copy);
}
duplicate(1); // duplicate<int>
duplicate([1, cs]); // duplicate<[int, slice]>
duplicate((1, 2)); // duplicate<(int, int)>
```

`T` may even be a function; this also works:
```tolk
fun callAnyFn<TObj, TResult>(f: fun(TObj) -> TResult, arg: TObj) {
    return f(arg);
}
fun callAnyFn2<TObj, TCallback>(f: TCallback, arg: TObj) {
    return f(arg);
}
```

Note that while a generic `T` is mostly detected from arguments, there are less obvious corner cases where `T` does not depend on the arguments:
```tolk
fun tupleLast<T>(self: tuple): T
    asm "LAST";
var last = t.tupleLast(); // error, can not deduce T
```

To make this valid, `T` should be provided externally:
```tolk
var last: int = t.tupleLast(); // ok, T=int
var last = t.tupleLast<int>(); // ok, T=int
var last = t.tupleLast() as int; // ok, T=int
someF(t.tupleLast()); // ok, T=(parameter's declared type)
return t.tupleLast(); // ok if function specifies return type
```

Also note that `T` for asm functions must occupy 1 stack slot (otherwise, the asm body is unable to handle it properly), whereas for a user-defined function, `T` could be of any shape.

In the future, when structures and generic structures are implemented, all the power of generic functions will come into play.


<h3 className="cmp-func-tolk-header">
@@ -807,9 +970,7 @@ Tolk supports logical operators. They behave exactly as you get used to (right c

Keywords `ifnot` and `elseifnot` were removed, since now we have logical not (for optimization, Tolk compiler generates `IFNOTJMP`, btw). Keyword `elseif` was replaced by traditional `else if`.

Note, that it does NOT mean that Tolk language has `bool` type. No, comparison operators still return an integer. A `bool` type support will be available someday, after hard work on the type system.

Remember, that `true` is -1, not 1. Both in FunC and Tolk. It's a TVM representation.
Remember that a boolean `true`, cast `as int`, is -1, not 1. This is the TVM representation.


<h3 className="cmp-func-tolk-header">
@@ -44,6 +44,8 @@ get currentCounter(): int { ... }
7. A function can be called even if declared below; forward declarations not needed; the compiler at first does parsing, and then it does symbol resolving; there is now an AST representation of source code
8. stdlib functions renamed to ~~verbose~~ clear names, camelCase style; it's now embedded, not downloaded from GitHub; it's split into several files; common functions available always, more specific available with `import "@stdlib/tvm-dicts"`, IDE will suggest you; here is [a mapping](/v3/documentation/smart-contracts/tolk/tolk-vs-func/stdlib)
9. No `~` tilde methods; `cs.loadInt(32)` modifies a slice and returns an integer; `b.storeInt(x, 32)` modifies a builder; `b = b.storeInt()` also works, since it not only modifies, but returns; chained methods work identically to JS, they return `self`; everything works exactly as expected, similar to JS; no runtime overhead, exactly same Fift instructions; custom methods are created with ease; tilde `~` does not exist in Tolk at all; [more details here](/v3/documentation/smart-contracts/tolk/tolk-vs-func/mutability)
10. Clear and readable error messages on type mismatch
11. `bool` type support

#### Tooling around
- JetBrains plugin exists
4 changes: 2 additions & 2 deletions src/theme/prism/prism-tolk.js
@@ -18,11 +18,11 @@
}
],

'type-hint': /\b(type|enum|int|cell|void|bool|auto|slice|tuple|builder|continuation)\b/,
'type-hint': /\b(type|enum|int|cell|void|bool|slice|tuple|builder|continuation)\b/,

'boolean': /\b(false|true|null)\b/,

'keyword': /\b(do|if|try|else|while|break|throw|catch|return|assert|repeat|continue|asm|builtin|import|export|true|false|null|redef|mutate|tolk|global|const|var|val|fun|get|struct)\b/,
'keyword': /\b(do|if|as|try|else|while|break|throw|catch|return|assert|repeat|continue|asm|builtin|import|export|true|false|null|redef|mutate|tolk|global|const|var|val|fun|get|struct)\b/,

'self': /\b(self)\b/,

