
With your IDE example you need the full parser and type checker to be "tolerant". For recursive-descent parsing, there isn't much to say about theory. You try to pick reliable synchronization points and prevent cascading errors. Here's the classic example:

In a statement-oriented language like C#, synchronizing to the next statement upon finding an error by scanning for a semicolon token is a good place to start. (Statements like 'if' with nested statement blocks as arguments have their own sync logic.) Aside from having reliable sync points, statements (unlike expressions) don't have a type that you need to propagate, so you don't usually have to worry about cascading errors in the type checker from skipping a faulty statement. That said, if you skip a faulty statement that was the sole reference to a local variable, that might get flagged as an 'unreferenced variable' warning. It's often a good idea to disable sensitive warnings (and some errors) for the rest of the function as soon as an error is found.
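The semicolon-sync idea can be sketched in a few lines. This is a minimal, hypothetical recursive-descent fragment (the toy statement grammar and names are made up for illustration): on a parse error, skip tokens until the next ';' and resume, so one bad statement produces one diagnostic instead of a cascade.

```python
def parse_statement_list(tokens):
    """tokens: list of token strings; returns (statements, errors)."""
    stmts, errors, i = [], [], 0
    while i < len(tokens):
        try:
            stmt, i = parse_statement(tokens, i)
            stmts.append(stmt)
        except SyntaxError as e:
            errors.append(str(e))
            # Panic-mode sync: scan forward for ';' so the error
            # doesn't cascade into the following statements.
            while i < len(tokens) and tokens[i] != ";":
                i += 1
            i += 1  # consume the ';' itself

    return stmts, errors

def parse_statement(tokens, i):
    # Toy "statement" for the sketch: NAME '=' NAME ';'
    if i + 3 < len(tokens) and tokens[i + 1] == "=" and tokens[i + 3] == ";":
        return (tokens[i], tokens[i + 2]), i + 4
    raise SyntaxError(f"bad statement at token {i}")
```

A real parser would also treat tokens that can begin a statement ('if', 'while', '{', ...) as sync points, so recovery doesn't overshoot when a semicolon was the thing that went missing.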

Probably the most important sync point is at the level of symbol declarations. Even if a function's body is totally botched, as long as you can successfully resolve the function's signature (name, parameter types and return type) you don't get cascading errors from other functions that reference that function. Resyncing to top-level declarations is so important that if you're designing the syntax it's worth having a dedicated declaration keyword (especially for functions) which is only valid at top level. That way you have a reliable sync point even if everything else is out of whack (unbalanced braces, etc). Something like Go's 'func' keyword is close enough: it can appear in function literals and function types as 'func' followed by '(' but 'func' followed by a name is only valid in top-level declarations.

Fine-grained error recovery for expressions is the biggest problem from both a parsing and type checking perspective. There aren't any reliable expression sync points in a C#-like language, and you need to fabricate best-effort types for the faulty expressions (with an 'error' type as a fallback) and make sure that the various operators for combining expressions have heuristics so you don't get spurious cascading errors. It generally involves numerous special cases to get good results. If you have a choice in the matter, don't worry about expression error recovery: report the error, sync to the next statement and go for the lower-hanging fruit instead. In my experience it isn't worth it in a statement-oriented language.
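The 'error' type fallback is worth a tiny sketch. The checker shape and type names below are hypothetical; the point is the heuristic: once an operand already has the error type, operators stay quiet and propagate it instead of reporting again.

```python
ERROR, INT, STRING = "error", "int", "string"

def check_binary_add(left_type, right_type, report):
    """Type-check 'a + b'; report takes a diagnostic message string."""
    if ERROR in (left_type, right_type):
        # Swallow: the root cause was already reported upstream,
        # so re-reporting here would just be a cascading error.
        return ERROR
    if left_type == right_type == INT:
        return INT
    if left_type == right_type == STRING:
        return STRING
    report(f"cannot add {left_type} and {right_type}")
    return ERROR
```

Used on a chain like '(x + y) + z' where 'x + y' was already faulty, the outer '+' produces no second diagnostic.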

If you want to eliminate as many spurious warnings/errors like this as possible during error recovery, you do end up adding many special cases over time. But it's an incremental process and you can get good results immediately with basic recovery techniques.

Various comments:

On the lexer side of things, if you get to design the syntax it helps to avoid multi-line lexemes in the common cases. That's why I cringed a little when I learned that Rust's "..." string literals are multi-line. It's fine to have multi-line lexemes like /* ... */ comments in C and """...""" string literals in Python but make the default choices be single-line lexemes like "..." and // ... so you can sync to the newline.
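Here's a minimal sketch of why single-line string literals help (hypothetical lexer fragment, not from any real implementation): an unterminated "..." can be closed at the newline with one local error, instead of swallowing the rest of the file until the next stray quote.

```python
def lex_string(source, i, report):
    """i points at the opening quote; returns (lexeme, next_index)."""
    j = i + 1
    # Single-line rule: a string may not span a newline.
    while j < len(source) and source[j] not in '"\n':
        j += 1
    if j < len(source) and source[j] == '"':
        return source[i:j + 1], j + 1
    report("unterminated string literal")
    return source[i:j], j  # sync at the newline (or EOF)
```

With multi-line literals as the default, the same error would turn everything up to the next quote character into one giant bogus token.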

A benefit of a syntax with indentation-defined block structure is that you don't need to rely on balanced grouping tokens like { ... }, so it's easy and reliable to sync to the outer block levels. (In Python there's a caveat that the lexer suspends indentation tracking when the ([{ nesting level is nonzero, but it's still a robust heuristic even when recovering from an error in a nested state.) In particular, if you require top-level declarations to be at column 0 this avoids the need for dedicated keywords for reliable declaration resync. Then the only thing you have to worry about are unbalanced multi-line lexemes gobbling up chunks of your programs.

While I talked about error recovery, a lot of these syntactic properties help with fast symbol indexing. E.g. it's easy to write a fast symbol indexer when you can just sync to "\n<letter>" for top-level declarations and otherwise only need to worry about rare multi-line lexemes. This lets your outer scan loop avoid the byte-at-a-time bottleneck.
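A sketch of that fast outer scan loop (names are illustrative): instead of lexing byte-at-a-time, jump between newline-then-letter positions with str.find, which runs in optimized native code, and only drop into the slow path at those candidate offsets.

```python
def scan_top_level_starts(source):
    """Yield offsets of lines that start with a letter (candidate decls)."""
    if source and source[0].isalpha():
        yield 0
    pos = 0
    while True:
        # Fast skip: let the runtime's substring search find the next
        # newline rather than inspecting every byte in Python code.
        pos = source.find("\n", pos)
        if pos == -1:
            return
        pos += 1
        if pos < len(source) and source[pos].isalpha():
            yield pos
```

A real indexer would then check each candidate for a declaration keyword and handle the rare multi-line lexemes; the win is that the common case never looks at individual bytes.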



Great details, thanks!

> A benefit of a syntax with indentation-defined block structure is that you don't need to rely on balanced grouping tokens like { ... }

In fact, from the lexer's perspective there isn't a big difference: the matching INDENT/DEDENT pair is the same kind of token that { and } would be.


Indeed, the difference is that the lexer offers guarantees about the synthetic INDENT/DEDENT tokens. From an error sync perspective, the benefit is that the programmer (redundantly) re-asserts the block level on every line through its indentation. As a small addendum on Python's suspension of indentation tracking when nesting > 0: when I designed the syntax for indentation-based block structure in another language, I required such nested code to always be indented beyond the current block's level even though no INDENT/DEDENT/NEWLINE tokens are emitted in this state. So this was legal:

    x = (1 +
        2)

    x = (1 +
            2)

    x = (1 +
      2)

    x = (1 +
     2)
But this was illegal:

    x = (1 +
    2)
The legal variants are all identical to

    x = (1 + 2)
from the parser's perspective. Adding this restriction (which is already the idiomatic way to indent nested multi-line expressions) means that you can reliably sync to block levels even when recovering from an error in a nested state. If your lexer already strips leading indentation from multi-line string literals you could add a similar constraint for them.
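That continuation-indentation constraint is easy to sketch as a standalone check (simplified for illustration: it counts brackets naively and ignores string literals): inside brackets, every non-blank line must be indented beyond the enclosing block's level.

```python
def check_continuation_indent(lines, block_indent):
    """Return a list of (line_no, message) violations."""
    errors = []
    nesting = 0
    for n, line in enumerate(lines, 1):
        if nesting > 0 and line.strip():
            indent = len(line) - len(line.lstrip(" "))
            if indent <= block_indent:
                errors.append((n, "continuation line must be indented "
                                  "beyond the enclosing block"))
        # Naive bracket tracking; a real lexer would skip brackets
        # inside string literals and comments.
        for ch in line:
            if ch in "([{":
                nesting += 1
            elif ch in ")]}":
                nesting -= 1
    return errors
```

Because every continuation line is forced deeper than the block, a line at or below the block's indentation is a guaranteed sync point even while the lexer is in a nested (bracketed) state.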

The moral of a lot of these tricks is that by turning idioms and conventions into language-enforced constraints you can detect programmer errors more reliably and do a better job of error recovery. That said, even in a curly brace language like C# you could still use the indentation structure as a heuristic guide for error recovery--it's just going to be less reliable.


Yeah, this makes sense, thanks.



