I find this to be a good example where a language with well defined "foundations" can really shine.
If a programming language supports union types and type inference then there is no need for such a "special" language feature as described here. The return type will simply have the form "goodresult | error" and the compiler will infer it based on the function body alone.
At the same time, the following problem of Zig is automatically a non-issue:
> A serious challenge with Zig's simple approach to errors is that our errors are nothing more than enum values.
If the result is "goodresult | error" then I can choose what type "error" is, and I'm not bound to any predefined types that the language forces upon me.
In fact, the article then shows how to work around the issue with a tagged union. The problem is that this creates another issue: the composition of different error types.
E.g. function foo calls bar and baz, where bar can fail with barError and baz can fail with bazError. Then, without my having to do anything, the compiler should infer that foo's return type is "goodresult | barError | bazError". This won't work if tagged unions are used as in the example - they have to be mapped by hand all the time, creating a lot of noise.
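Roughly what I have in mind, sketched in TypeScript (which already has union types and return-type inference): the early returns that propagate errors are still explicit, but the union return type falls out of inference. All the names here are made up for illustration.

    class BarError { constructor(readonly code: number) {} }
    class BazError { constructor(readonly reason: string) {} }

    function bar(): number | BarError {
      return Math.random() > 0.5 ? 42 : new BarError(7);
    }

    function baz(): string | BazError {
      return Math.random() > 0.5 ? "ok" : new BazError("baz failed");
    }

    // No annotation on foo: the compiler infers its return type as
    // string | BarError | BazError from the body alone.
    function foo() {
      const b = bar();
      if (b instanceof BarError) return b;   // propagate bar's error
      const z = baz();
      if (z instanceof BazError) return z;   // propagate baz's error
      return `${b}: ${z}`;                   // the good result
    }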
While Zig's error handling is not bad and miles above Go's, it makes me sad to see that we still lack ergonomic error handling in so many languages, despite knowing pretty much all of the language features necessary to make it much easier.
The "goodResult | barError | bazError" approach just doesn't scale; eventually, especially if you use third-party libraries, you may end up with dozens if not hundreds of error types this way.
The way Rust does it with its try operator (?), you have to decide whether the combined error type is barError (in which case it's up to you to say how bazError is convertible to it), or perhaps bazError, or something else altogether (e.g. a sum type like you suggested, but the conversions still need to be supplied). There are also some libraries that provide error types for the lazy that are convertible from everything, so you can use the try operator without much thinking, but in libraries you almost never use them.
I use exactly that way of error handling and I find it scales very very well and makes refactoring a breeze.
Nothing stops you from defining certain "borders" in your application where you wrap errors into something meaningful. E.g. if I write a file-IO library, I will return a custom "CannotOpenFile" error in my public methods, which encapsulates the underlying error(s) instead of returning a list of the potential low-level errors that caused it.
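A rough sketch of such a border in TypeScript; CannotOpenFile and the low-level error types are invented for illustration:

    class IoError { constructor(readonly errno: number) {} }
    class PermissionError { constructor(readonly user: string) {} }

    class CannotOpenFile {
      constructor(readonly path: string, readonly cause: IoError | PermissionError) {}
    }

    // Low-level helper; its error union is inferred and stays internal.
    function readRaw(path: string) {
      if (path.length === 0) return new IoError(2);
      if (path.startsWith("/root/")) return new PermissionError("me");
      return "raw file contents";   // inferred: string | IoError | PermissionError
    }

    // Public border: the annotation pins the abstraction level, so the
    // low-level error union never leaks to callers.
    export function openConfig(path: string): string | CannotOpenFile {
      const raw = readRaw(path);
      if (raw instanceof IoError || raw instanceof PermissionError) {
        return new CannotOpenFile(path, raw);   // wrap at the border
      }
      return raw;
    }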
The difference is that with union types it is me who decides the level of abstraction and granularity, whereas without them I'm forced into manually handling error types pretty much everywhere.
Rust is an example where you are always forced either to align error types or to lose information about what kind of error it could be by using a more abstract error type (which is the last thing you mentioned).
I'm confused because I didn't say that Rust makes refactoring a breeze right? In fact, I said it does the opposite. So I don't really understand your response.
Ah, I read "I use exactly that way of error handling..." as "I use exactly that way of error handling [in Rust]..." (given the post you were replying to), but I guess you were talking about Zig. Sorry for the misunderstanding.
You are right. This is a result of modeling tagged unions in terms of enums. Hare [0] avoids this issue by using global tags: every type has a unique global tag. This allows merging tagged unions.
Wouldn't that mean that every type always has to include the type tag as an extra byte, making every type larger overall? Or is it only included if a union type is created from that type?
No, the tag is only part of the union.
The tag is deterministically and globally computed by the compiler. Tags are encoded in a fixed number of bytes.
> The return type will simply have the form "goodresult | error" and the compiler will infer it based on the function body alone.
I'm 0% convinced that inferred return types are a feature that a production language should have, ever. I've used them before and it's infuriating, on top of producing bad code.
If a function is annotated f: T -> U where U is not an error, the contract with the caller is that f does not error, and cannot be changed to produce an error, and the compiler upholds this contract for all callers and definitions or redefinitions of f. It also means that another compilation unit can refer to f without having to type check its body, which, sure, is a leaky abstraction, but empowers parallel compilation.
Changing the signature is a breaking change and as such should have a higher amount of friction. Changing the error type doubly so.
And on top of all that, errors should not be non-discriminated union types. The example is Result<int, int> - you need the error to have a discriminant to distinguish between a success and an error code.
> Changing the signature is a breaking change and as such should have a higher amount of friction.
Why?
I'm all for very rigorous static type checking. But if we can make refactorings and changes frictionless — let's. Having the ability to move things around and reorder them to work better results in more iterations and better code.
As long as each iteration is very rigorously type-checked, of course.
It seems that you're talking about a code that is exposed to some third-party clients where you need to maintain compatibility. This is, of course, a valid concern, and you're absolutely right that this layer, exposed to third party clients absolutely does need increased level of friction.
However, most applications don't have any such exposure, all of their code is consumed internally. And even when they do, their external API surface which is actually exposed as some REST or ABI is minuscule in comparison to all the internal APIs — method calls and types that are never exposed anywhere.
I think that this problem, which is certainly a valid one, should be solved with methods which would not negatively affect developer experience of working with 99% of all the completely private APIs that don't need to be backwards compatible.
Yes, certain language designs make it easier to break APIs for clients, just like the one mentioned above. I was pointing out that this is bad and that we should try to prevent it from happening.
No, the one mentioned above makes it easier to not break clients. A language that exports an API without checked exceptions can change the implementation and break clients silently in production. A language with checked exceptions will not break clients silently, ever. That's strictly superior from a robustness perspective.
If, as a library author, you want to define a strict contract with clients that you never break, then you explicitly specify the exception signature and you don't let the exceptions be inferred. Then if the implementation changes in a way that introduces a new exception that doesn't match the client contract, then the API author gets the error at compile-time.
That depends a bit on the definition I guess - let me explain why:
If you call a method in e.g. Java which is annotated to return a string, then by definition you have to expect the following possible outcomes:
- The method returns a string
- The method throws an exception (or rather: a Throwable)
- The method does not return at all
That's just how it is in Java and also in most other languages. Therefore, throwing an exception must always be expected. Only if there were only checked exceptions and no unchecked exceptions would this be different - but then how do you handle things like stack overflow or OOM exceptions?
In fact, if a method says it returns a string, what that really means is that it will not return something other than a string. That is the real meaning. It cannot e.g. return an integer. But that's about all the guarantee it gives.
So you see, it's tricky.
There are, however, languages where you can really rely on methods returning what they claim. But those languages then force you to prove that a method actually returns within a finite amount of time and such things. Not very practical to use in many cases.
As others have said, I think there is a misunderstanding here.
First, methods can (and often should) be annotated with their return types. And if a method is annotated as "T -> U" then it cannot return an error and trying to do so would fail the compilation - so I totally agree with you on that.
But inside of (non public) application or library code, there are a lot of small methods calling each other before there is some higher-up public method that will be called not by me but someone else.
And for those kinds of methods, type inference is great when it comes to refactoring. Even if I make a mistake and make such a method return an error even though it shouldn't, the mistake will be caught a bit higher up, where the return type is annotated. In my experience this has almost no drawbacks and drastically simplifies refactorings.
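A small TypeScript sketch of what I mean (names invented): the helper's return type is inferred, and the annotated public function is where any mistake surfaces.

    class ParseError { constructor(readonly message: string) {} }

    // Small internal helper: the return type is left to inference on purpose.
    function parsePort(raw: string) {
      const n = Number(raw);
      if (!Number.isInteger(n) || n < 0 || n > 65535) {
        return new ParseError(`not a valid port: ${raw}`);
      }
      return n;   // inferred overall: number | ParseError
    }

    // The annotated entry point is where the contract lives. If the
    // ParseError check below were missing, the inferred type of parsePort
    // would no longer satisfy the `number` annotation and this would not compile.
    export function serverPort(raw: string): number {
      const p = parsePort(raw);
      if (p instanceof ParseError) return 8080;   // fall back to a default
      return p;
    }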
I don't think there's a misunderstanding, I stand by what I said.
Inferred return types don't make refactoring easier. In my experience they make it much more difficult, because you have to look at function implementation to understand what it does. Particularly for error handling, you always want to see what errors a function may return even in library code.
And all that said, there's no distinction to me between internal and externally facing code when it comes to quality and standards. If you have a function, even if it is called once, it should always have its return type annotated. It is more correct than leaving it to be inferred by the compiler, because you are explicitly annotating the intended behavior of the abstraction. This makes it easier to write correct code and easier for a reviewer to verify.
> Particularly for error handling, you always want to see what errors a function may return even in library code.
Any half-decent tooling should do that, since GP was talking about compile-time type inference. Of course there are some situations where the type hints aren't available, but then we're back to a general discussion about type inference.
FWIW, I don't actually agree with the idea that set-theoretic unions solve error handling, but I'm fully on-board with opt-in type inference for errors.
> Inferred return types don't make refactoring easier. In my experience they make it much more difficult, because you have to look at function implementation to understand what it does.
My editor/IDE does show the inferred type to me though. I don't have to look at the function implementation. If I had to, I would agree with you.
I think locking use of a language into particular editors is a step several decades backward into Borland-land. I'm not at all confident that "just stop writing types in source code because the magic editor will show them anyway" is a better posture than "just have the magic editor auto-fill the types".
I totally get your point and it's valid. But I think this ship has sailed a long time ago.
With modern languages having ad-hoc polymorphism, inheritance and different kinds of dispatch, as well as extension methods etc., you are already lost without an IDE anyway.
E.g. in Rust you won't know where a trait is implemented and how it will behave; in Kotlin you won't know if a method exists on the object or comes from somewhere else, etc.
The trend is toward more supportive but also more complex languages, and tooling is naturally adapting to that.
> If a function is annotated f: T -> U where U is not an error, the contract with the caller is that f does not error, and cannot be changed to produce an error, and the compiler upholds this contract for all callers and definitions or redefinitions of f
The OP is not suggesting that. They're suggesting that U can be an extensible union type that can be inferred from the code. Of course, if you declare it to not be such a union, then you guarantee no errors.
Just think of it like checked exceptions from Java, except the exception signature is inferred from the function's code. If you explicitly declare "does not throw", then the compiler produces a type error that you have not handled all exceptions.
> there is no need for such a "special" language feature as described here
I think the special language feature is the point, not an unfortunate side effect of something lacking in the language design.
The purpose of errors is to pass an indication of control-flow from the called function to the caller. That’s why the language has “catch”, “try”, “errdefer”, etc. This is about the language providing good flow control tools for the error path.
Sure. My point is: if a language has certain other fundamental language features, then there is no or less need for those "specific" error features like errdefer.
But that isn't true. Every language with ergonomic error handling (i.e. something more than C or Go) has special language features for handling errors. What 'fundamental language features' would replace Zig's specific language features like errdefer?
The problem errdefer solves is "resource passthrough", where a function needs to acquire a resource, do some fallible processing, and return the resource.
In case the processing fails, the resource needs to be released, which an unconditional “defer” is not great for (now you need extra flags).
But RAII solves the issue, in case of failure you just don’t return the resource, and it gets released. Likewise linear types, if you don’t return the resource you will have to explicitly release it.
Errdefer only adds a workaround, because now every time you acquire a resource you need to question how it interacts with the call stack and whether you need to defer or errdefer it, and any change to the function requires reconsidering this question.
RAII uses destructors. Destructors are a language feature that is even more ad-hoc than errdefer. Instead of writing general-purpose functions, you need special functions for construction and destruction. That's bad design.
errdefer is not just used for destroying objects either, no more than functions are only used for constructing objects.
That seems loaded enough in your mind that it aliases to untrue. RAII uses some sort of type-associated operation. How that type association is reified is up to the language.
> you need special functions for construction and destruction
That is not true. You don't need any special function for construction, and for destruction you only need a hook. Which is no more special (and arguably less so) than the hook you need for either defer or errdefer.
> Destructors are a language feature that is even more ad-hoc than errdefer.
I'm not sure you understand what the expression "ad-hoc" means. "errdefer" is an ad-hoc feature, it has a very narrow and specific purpose. RAII is a much more generic feature, for instance it subsumes both defer and errdefer, as well as other patterns of resource management.
> errdefer is not just used for destroying objects either
Good news: neither is RAII. Object destruction is the hook onto which RAII attaches, not the purpose.
Exactly! It is RAII (or linear types) with the addition of a "contextual computation" and syntax for it, so that multiple calls can be done in an ergonomic way.
Ah so we need three or four additional language features to do it, and PhDs in category theory.
...or you could accept that not every language needs to be a Haskell clone, and there is more than one good way to design a language. Choices have tradeoffs. The choices Haskell and Rust make are not the only valid choices. They have costs.
Not really, you don't even need a specific language feature to model RAII, except for some basic ones like function lambdas to make it ergonomic to use.
It also doesn't really have anything to do with Haskell, it really is language agnostic.
> If a programming language supports union types and type inference then there is no need for such a "special" language feature
Laughs in Go.
> While Zig's error handling is not bad and miles above Go's
Laughs again in Go. Go has no "foundations" regarding errors, it's just a convention. It has no union types. It doesn't have weird corner cases. It's just a returned value you can handle. Or not.
Of all the error handling paradigms I've seen, Go's requires the least amount of "specialized thinking" (try/catch or whatnot) -- it just becomes another variable.
Go's lack of sum types means that there is no static check for whether the error has actually been handled or not. Go's designers went to all the trouble of having a static type system, but then failed to properly reap the benefits.
Sum types are the mathematical dual of product types. It makes sense for any modern language to include them.
Ever since I learned of sum types, they have ruined my enjoyment of programming languages which don't have them. I sorely miss them in C++ for example (and std::variant is not a worthy alternative).
I don't understand why any new language wouldn't have them.
Pedantic typechecking is like learning to spot improper kerning, you think it’s a good thing but you spend your entire life cringing at the world around you.
std::variant is a good example of much that is bad about the C++ improvement process as a language.
If you just want to pattern match on the type with a visitor, there is "another convenience helper" that you need to bring along, and the result still doesn't look pleasant.
Introduced in C++17, yet even in C++23 you still need to write a std::visit to process it. Committee members waste time on yak shaving like std::print.
In theory the lack of sum types sounds like a drawback for Go error handling, in practice it does not matter at all IMO.
So far I have never worked on a Go project without a strict linter enabled on the pipeline checking that you handled the case when err != nil. I don't care if it is the compiler or the linter doing it; the end result in practice is that there actually is no chance of forgetting to check the error, and works just as well as a stronger type system while also making the code stupidly obvious to read.
> no chance of forgetting to check the error, and works just as well as a stronger type system
A linter-based syntactic check is no substitute for a proper type system. A type system gives a machine checked proof. A heuristic catches some but not all failures to handle errors, it will also give false positives.
Error handling via sum types only enforces the rather weak constraint that you cannot access a non-error return value in the case where the function returns an error. It certainly doesn’t catch all failures to handle errors. For example, in Rust you are perfectly free to call a function which returns a Result and then ignore its return value (hence ignoring the case where an error occurred). Go’s error checking linters impose stricter constraints in some respects than the constraints on error handling imposed by Rust’s type system.
> only enforces the rather weak constraint that you cannot access a non-error return value in the case where the function returns an error.
This "rather weak constraint" as you put it, completely solves Tony Hoare's "billion dollar mistake": null pointer exceptions. Something Go also suffers from due to lack of Sum types. With regard to your Rust example, the compiler will give a warning that can be turned into an error to completely prevent this, if desired.
As the parent said, sum types are "foundational" and have many applications for writing safe statically checked code. Eradicating null pointers and enabling chainable result types are only the tip of the iceberg.
> This "rather weak constraint" as you put it, completely solves Tony Hoare's "billion dollar mistake": null pointer exceptions.
Yes. However, it doesn’t do what you seemed to be suggesting that it does, which is “catch all failures to handle errors”. You correctly note that Go’s linters can’t always do this, but also seem to erroneously suggest that Rust’s type system somehow can. This is backwards. Go’s error linters catch most instances of ignored error values, whereas Rust’s type system doesn’t do anything, in and of itself, to ensure that error values are not ignored. Of course there are compiler warnings to catch unused errors in Rust, but that’s fundamentally the same thing as the warnings you get from Go linters, and has nothing really to do with sum types. Whether or not an error is ‘ignored’ or ‘used’ in any interesting sense is not a formal property of a program that can be formally verified. (Yes, linear types, etc. etc., but you can formally use an error value in that sense while in practice ignoring the error.)
By the way, you don’t have to preach to me (or really anyone on HN) about the virtues of sum types. I’ve written a fair amount of Haskell and Rust code. My issue here is not with the utility of sum types, but with the erroneous claim that they somehow remove the need for linters or compiler warnings that flag unhandled errors.
> but also seem to erroneously suggest that Rust’s type system somehow can.
I didn't bring Rust into this discussion, it is hardly a model implementation of sum types and using them for proofs, but it is certainly a step in the right direction.
> My issue here is not with the utility of sum types, but with the erroneous claim that they somehow remove the need for linters or compiler warnings
You are misrepresenting my posts, I am responding to an erroneous claim that linters are a satisfactory substitute for sum types.
>> no chance of forgetting to check the error, and works just as well as a stronger type system
>A linter-based syntactic check is no substitute for a proper type system. A type system gives a machine checked proof. A heuristic catches some but not all failures to handle errors, it will also give false positives.
Thread B:
> Go's lack of sum types means that there is no static check for whether the error has actually been handled or not.
>>I dunno, my IntelliJ calls out unhandled errors. I imagine go-vet does as well.
>>>A simple syntactic check will only ever work as a heuristic. Heuristics don't work for all cases and can be noisy. The point is, no modern language should need such hacks. This problem was completely solved in the 70s with sum types. [emphasis mine]
What you said in these two threads seemed to suggest that the functionality of Go error linters can be subsumed by a 'proper type system' that includes sum types, and that such a type system would catch all failures to handle errors. But sum types by themselves do nothing to force handling of errors. You could add sum types to Go and you'd still need a linter to perform the exact same error handling checks that Go linters currently perform. Whatever kind of value a fallible function returns, you can always ignore that value as far as the type checker is concerned (unless you add something like linear types to your type system, which are orthogonal to sum types).
Apologies for the confusion re Rust. For Rust just read 'a language that handles errors using sum types'.
> Sum types by themselves do nothing to force handling of errors,
If you have Haskell experience, then have you ever wondered how it is considered "null safe" and does not throw null pointer exceptions? Perhaps it is because optional "Maybe" types (the simplest form of error) must be explicitly unpacked? Yes, Haskell, being an old language without a sound type system, permits "fromJust" and its exceptions (a side effect) are not tracked like other effects. But despite this, are you seriously claiming that sum types "do nothing" to achieve this null safety?
If you want to understand the full proving power of sum types, I do not suggest Rust or Haskell as a model example. Coq, Agda, Idris or ATS will be better examples.
>But despite this, are you seriously claiming that sum types "do nothing" to achieve this null safety?
No. I said nothing about null safety. What I said is that “sum types by themselves do nothing to force handling of errors”. In fact I imagine that’s one of the reasons that Haskell uses exceptions for error handling in the IO monad. If Haskell had a non-raising function like
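    openFile :: FilePath -> IOMode -> IO (Either IOException Handle)
    -- (hypothetical signature; the real System.IO.openFile throws instead)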
then you could of course attempt to open a file without checking for an error:
    main :: IO ()
    main = do
      openFile "/foo/bar" ReadMode
      putStrLn "Did the file open successfully? No idea"
Sum types (by themselves) cannot be used to prove that all errors have been handled. In fact, formal proofs of this property using other type system features (such as linear types) are of fairly limited practical value, given that merely 'using' an error value in some type-theoretic sense doesn't necessarily entail actually taking appropriate action to handle it.
> No. I said nothing about null safety. What I said is that “sum types by themselves do nothing to force handling of errors”.
Maybe (null) types are the simplest form of error type, with null pointer exceptions being the simplest form of unhandled error. They are therefore the easiest example to illustrate my point. You cannot simply choose to ignore them and remain credible. Haskell's broken old IO APIs are hardly a model example. Your Haskell code will at least give a compiler warning for ignoring the output. I would configure the compiler to turn this into an error.
>with null pointer exceptions being the simplest form of unhandled error.
I don't know of any practical language that forces you to handle the Nothing condition of a Maybe. Haskell has fromJust (as you note), Rust has unwrap. I suppose Idris could become practical one day, but it's not there yet. More fundamentally, without something like linear types, nothing in the type system can force you to check that a value of a particular sum type instantiates a particular variant. You're always free to ignore values, which means that you're free to ignore error conditions.
> Haskell's broken old IO APIs are hardly a model example of anything
I don't quite see what you're getting at here. My example function isn't part of Haskell's IO API. It's an example of what Haskell's IO API might look like if it used sum types for error handling rather than throwing exceptions. I fail to see how there can be anything inherently 'broken' about a hypothetical function that opens a file and returns either a file handle or an error.
>Your Haskell code will at least give a compiler warning for ignoring the output. I would configure the compiler to turn this into an error.
So you're saying that you'd configure the compiler to do exactly the same checks that Go error linters do...none of which have anything to do with sum types.
> It's an example of what Haskell's IO API might look like if it used sum types for error handling rather than throwing exceptions.
Apologies I missed that bit, it is indeed a perfectly reasonable API.
> So you're saying that you'd configure the compiler to do exactly the same checks that Go error linters do...none of which have anything to do with sum types.
We are arguing semantics as to what constitutes a "handled error". If a user chooses to explicitly throw away the error and not use the value, then you are arguing it is not handled. I am arguing that it has been handled (and checked as such). Either way sum types are a step in the right direction, despite all the shortcomings and unsound type systems of "practical" languages.
The Rust compiler warns about ignoring the result of functions annotated with #[must_use]. This is optional because if a function has no side effects, then ignoring its return value is not a problem and shouldn't be warned about.
Yes, but that’s just the sort of linting you can also get vía Go tools. It’s not something that’s possible because of sum types.
> if a function has no side effects
A property which of course is not tracked by Rust’s type system. My only point here is that neither sum types in general nor Rust’s specific implementation of them provide any means of ensuring that errors are handled. They do other nice things, just not that.
No language has a 'static check for whether the error has actually been handled or not'. In Rust, for example, you can just 'unwrap' an error. In Haskell you can use 'fromJust'. And in Go you can just ignore 'err' and assume it is 'nil'.
Sum types might be the 'mathematical dual of product types' but programming languages are not mathematics. The possible implementations of sum types are quite varied. It makes sense in low-level languages for the programmer to use what makes sense in the particular situation.
Unwrap and fromJust can be disallowed if need be, they are "unsafe" convenience functions whose use can and should be tracked. Not all languages with sum types will permit them. Rust also has "unsafe" code blocks, should we also claim it is therefore not memory safe? Some would try to do so, but at least this unsafe code is tracked and not idiomatic.
> programming languages are not mathematics
This may be how you choose to view them. But many of us seeking to build safer and more correct software aim to make programming more like mathematics. Mathematics tells us how to compose and tells us how to prove. Both things the software industry is currently failing at.
>Unwrap and fromJust can be disallowed if need be, they are "unsafe" convenience functions whose use can and should be tracked.
And with the same sort of third-party tools you use to 'track' those and ensure they're not used, you can track unused error returns in Go or C.
> Not all languages with sum types will permit them.
All do.
>Rust also has "unsafe" code blocks, should we also claim it is therefore not memory safe? Some would try to do so, but at least this unsafe code is tracked and not idiomatic.
Absolutely we should! Rust fanatics try to claim it is a memory-safe language when it isn't. Real memory-safe languages like Java have existed for far longer.
>This may be how you choose to view them. But many of us seeking to build safer and more correct software aim to make programming more like mathematics. Mathematics tells us how to compose and tells us how to prove. Both things the software industry is currently failing at.
He still has a point. In theory you might need a language like Idris/Agda, but in practice it still makes a big difference.
It is true that you will see that a function can return an error and that you can choose to ignore it. It's also true that you can do the same in many other languages that use sum types.
But it is still different. Because while ignoring an error in Go is as easy as putting an underscore next to the happy case, in languages with sum types that doesn't work.
The equivalent in other languages would be to return a struct and then just access one value and ignore the other one. In that case, the practical implications would be the same.
But when using a sum type, a few things change.
First, you cannot just access the happy-case value; you are (or at least can be) forced to also "access" the unhappy-case value, be it in a pattern match, a fold function, and so on.
You now have to return something, even if it is an empty value, or "escape" by throwing an error.
On top of that, what happens if a function can partially succeed? Take a GraphQL request as a practical example where this is quite common.
With Go's style of error handling, how do you model that? I.e. say you need to refactor a function that previously either succeeded or failed into one that now can partially succeed and fail.
In a language with sum types I would now switch from a sum type Success | Error to the more complex type Success | Error | PartialSuccess, which makes it a breeze to refactor my code because the compiler will tell me all the places where I have to consider the new case and what it is.
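Sketched in TypeScript purely as a stand-in for any language with sum/union types and exhaustiveness checking (all names invented):

    type Data = { rows: string[] };

    type Outcome =
      | { kind: "Success"; data: Data }
      | { kind: "PartialSuccess"; data: Data; errors: string[] }   // the newly added case
      | { kind: "Error"; errors: string[] };

    function report(outcome: Outcome): string {
      switch (outcome.kind) {
        case "Success":
          return `got ${outcome.data.rows.length} rows`;
        case "PartialSuccess":
          return `got ${outcome.data.rows.length} rows, but: ${outcome.errors.join(", ")}`;
        case "Error":
          return `failed: ${outcome.errors.join(", ")}`;
        default:
          // If a variant is added and not handled above, `outcome` is no longer
          // `never` here and this stops compiling - which is exactly the to-do
          // list of places the compiler hands you during the refactoring.
          return unreachable(outcome);
      }
    }

    function unreachable(x: never): never {
      throw new Error("unreachable");
    }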
I'm genuinely curious, how would you model that in Go and what implication would such a refactoring have on existing code?
You can always implement tagged unions in any language with untagged unions, so in a broad sense you can emulate sum types in situations where they make sense but use simpler code elsewhere. I might do that in C. Depends on the situation. It also obviously isn't a proper answer, I am sure you will agree, to just emulate the feature that I am saying is unnecessary. That works in Lisp where you can elegantly add language features with proper macros. In C, you cannot.
I would probably simply do in C the same thing as usual:
    int
    function1(int arg1, int arg2, int *out1, struct foo *out2)
    {
        if (part1(arg1, out1))
            return 1;
        if (part2(arg2, *out1, out2))
            return 1;
        return 0;
    }

    // Oh hmm, some callers can do something useful with a partial result.
    // Assume the internals are more complex, because obviously in this simplified example you
    // could just make them call part1 directly.

    enum { SUCCESS, PARTIAL_SUCCESS, FAILURE }
    function2(int arg1, int arg2, int *out1, struct foo *out2)
    {
        if (part1(arg1, out1))
            return FAILURE;
        if (part2(arg2, *out1, out2))
            return PARTIAL_SUCCESS;
        return SUCCESS;
    }
This is compatible with old callers, even, who treat any nonzero result as failure and any zero result as complete success (the normal pattern in C).
Yes, the caller needs to check the result and avoid looking at out2 if you don't get SUCCESS, and avoid looking at out1 if you get FAILURE. But this sort of thing is de rigueur in C. Your compiler (or a linter, and optional warning flags are essentially linters anyway) will warn you if you ignore the result, and if you switch on the result it will warn you if you ignore a case.
But obviously it is left up to you to avoid the "don't touch X if Y" stuff. Eh, that is in my experience not the hard bit of writing C. The hard bit is anything involving dynamic lifetimes or shared mutable state. The nice thing is that you can avoid this in C! Most people don't. The easy path is calling malloc everywhere and getting yourself into a muddle. The simple path, which is better in the long run, is to use values and sequential, imperative code. And if you do that, you realise that C's design makes way more sense. That is how it was designed to be used. Dynamic lifetimes of objects? It is like trying to use Rust to represent linked lists. People that say 'Rust sucks because doubly linked lists lol' are morally equivalent to people that say 'C sucks because malloc and free lol'; it is like.... yeah, you aren't meant to do that!
> You can always implement tagged unions in any language with untagged unions, so in a broad sense you can emulate sum types in situations where they make sense but use simpler code elsewhere
I'm a bit confused now, since I don't see how this is related to the point I was making. You are right - with the exception that sum types are still more powerful, since you cannot e.g. emulate GADTs with tagged unions - but for most cases in practice, I agree. Still, what's the point?
I also think we have a general misunderstanding, since you are saying:
> That works in Lisp where you can elegantly add language features with proper macros
But Lisp is dynamically typed, so talking about union types is meaningless in a dynamically typed language. That doesn't make any sense to me in this context.
And about C (which is statically typed): C does not have union types (and hence also no tagged union types). What C does have are (untagged) unions, but that's not the same thing. The crucial difference is that union types are ad-hoc, whereas C's unions are statically defined, nominal types. I think it is a bit confusing since C calls them unions - but in the context of this discussion it's important that they are very different things.
E.g., with union types you can do:
    type union1 = string | int
    type union2 = string | boolean
    type union3 = union1 | union2 | float
    // same as type union3 = string | int | boolean | float
The compiler must be able to resolve those things automatically. I hope I'm not completely mistaken here, but I believe there is no way to combine unions like this in C at the type level. You would have to write those out by hand or generate the code. But if there is, please correct me.
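For what it's worth, this is exactly what e.g. TypeScript does at the type level; the compiler flattens the combined union on its own (using number/boolean/bigint here since TypeScript has no separate int/float types):

    type Union1 = string | number;
    type Union2 = string | boolean;
    type Union3 = Union1 | Union2 | bigint;
    // Union3 is the flattened string | number | boolean | bigint;
    // no hand-written mapping between the three types is needed.
    const a: Union3 = "hello";
    const b: Union3 = true;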
> This is compatible with old callers, even, who treat any nonzero result as failure and any zero result as complete success (the normal pattern in C).
The idea or motivation, though, was that in a language with sum types (or tagged unions) the old callers would not be compatible. Trying to compile code against `function2` should fail. But it should not fail in an arbitrary way - it should fail with the compiler saying "hey, look, you handled the error case and the success case, but you also have to handle the partial-success case; and here is how the data you need to handle looks if it is partial-success: ...". That is what sum types give you, and I find this enormously useful in practice. In a language without sum types you will not get this level of support from the compiler - that is the point I was trying to make.
A simple syntactic check will only ever work as a heuristic. Heuristics don't work for all cases and can be noisy. The point is, no modern language should need such hacks. This problem was completely solved in the 70s with sum types.
Cries in Go. I segfaulted Go while learning it in the first 5 minutes. It's a solved problem; Go is unfit for general purpose programming on this problem class alone.
The issue isn't as simple as just having better error unions with payloads. Constructing the error diagnostics object needs to be taken into account, along with the costs it brings (e.g. does it allocate? take up a lot of stack space? waste resources when one doesn't need the extra information?). Such a design choice belongs to the programmer, not the language, as only they know which approach will be more effective, given that they have a clear view of the surrounding context.
An alternative is to allow an additional parameter which optionally points to a diagnostics object to be updated upon error. Returning an error here signals that the object contains additional information while not harming composition as much and giving the choice of where to store such information. This is arguably a better pattern in a language where resource allocation is explicit.
I'm not really experienced with programming language design or with compilers, but it seems to me the design of a systems programming language has to compromise on the side of performance. If the implementation of the design requires additional space or cpu time, it may not be a good fit for the language. As such, it's not orthogonal.
How would errdefer work in the general union setting?
Having errors as a first class construct in the language allows things like errdefer to be very simple and easy to use. It looks needlessly specialised at first, but I think it’s actually a really good design.
That's a very good question! Most advanced languages have some way of defining the concept of a "computation within a context". For example, all languages that support a notion of Monads do have that kind of support. Examples would be Haskell, Scala, F#, ...
In those languages there are (or would be) generally two ways of achieving the same thing as errdefer:
1.) having a common interface/tag for errors
In that case, if you have a return type "success | error1 | error2" then error1 and error2 must implement a common global interface ("error") so that you can "chain" computations that have a return type of the shape of "success | error". "success | error1 | error2" would follow that shape because the type "error1 | error2" is then a subtype of "error".
2.) Having some kind of result type.
This would be similar to how it works in Rust or in the example in the article here. So you would have a sum type like "Result = either success A or failure B", and the errors stored in the result's failure case (B) would then be union types.
The chaining would then just be a function implemented on the result-type. This is personally what I use in Scala for my error handling.
Just to make it clear, this "chaining" is not specific for error-handling but a very general concept.
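For the second approach, a minimal sketch of such a chainable result type, written in TypeScript here (the Scala version looks much the same, just with Either and flatMap); all names are invented:

    type Result<A, E> = { ok: true; value: A } | { ok: false; error: E };

    const ok = <A>(value: A): Result<A, never> => ({ ok: true, value });
    const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

    // The "chaining": run the next step only on success, otherwise
    // short-circuit and pass the failure through unchanged.
    function andThen<A, B, E>(r: Result<A, E>, f: (a: A) => Result<B, E>): Result<B, E> {
      return r.ok ? f(r.value) : r;
    }

    function parseNumber(raw: string): Result<number, string> {
      const n = Number(raw);
      return Number.isInteger(n) ? ok(n) : err(`not a number: ${raw}`);
    }

    // Usage: the callback only runs on success; a failure short-circuits past it.
    const doubled: Result<number, string> = andThen(parseNumber("21"), (n) => ok(n * 2));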
> How would errdefer work in the general union setting?
Well, it could simply not exist, and some would argue that would be better.
But you could also have a blessed type, or a more general concept of "abortion" which any type could be linked to.
Or you could have a different statement for failure returns that way you can distinguish a success and a failure without necessarily imposing a specific representation of those failures.
Sum types are still a poor error handling strategy compared to exceptions, which actually implement the most common error handling pattern automatically for you (add context and bubble up). Having the language treat errors as regular values fails to separate these fundamentally different paths through the code. It also makes the programmer re-implement the same code pattern over and over again.
This cost may be necessary in manual memory management languages, where the vast majority of functions have to do some cleanup. There, having multiple exit points from a function through exceptions (or through early return statements) makes it harder to make sure all resources are properly cleaned up (especially when the cleanup of one resource can itself throw an error).
But in managed memory languages, there's just no reason to manually re-implement this pattern, either through Go style regular values or through sum types. And note that the Result monad is not a good substitute for exceptions, as it doesn't actually add the necessary context of what the code was doing when it hit an error case.
Well, yes and no. What you say also applies if two errors have the same type but you want to differentiate where they come from. In all these cases you can wrap the result or error in something, which essentially means tagging it (but without actually needing sum types as a language concept).
Sum types sure are useful, but I believe that, except for GADTs, you can emulate sum types with union types, but not vice versa.
You can stuff your error payload in an in-out parameter in an options tuple. This way you don't lose errdefer and if you don't want it the code for it doesn't exist (deleted at comptime)
I tend to check the error handling capabilities of a language to see how expressive and easy it is to return errors. I personally prefer result types over optionals (e.g. the mentioned added context instead of a nothing). It seems that Zig shoots for the middle here. I like the implicit errors shown, but also feel that the lack of context isn't great. What is missing for me is chaining and a simple message.
That being said, I'm also not super happy with how Rust manages errors. At the moment I write my tools and crates with "thiserror" and "anyhow", which also illustrates how errors are used most of the time. I say most since it really depends. In libraries I tend to introduce an error type per bigger module, and wrap low-level errors so it results in a chain of errors (similar to how Java does it with its exceptions). This gives me the most context and a kind of breadcrumb. I hate when you execute something and a naked IO error a la "Permission Error" is returned. But this all means a lot of extra code that starts to hide the simple implementation under the hood. For simple apps where any error means "Sorry, I'm out of luck" I use anyhow so I don't have to write custom From implementations.
Again, this all illustrates a bit that I'm not super happy in Rust. But I have actually never encountered an error system which keeps it simple and fast. The Zig examples are nearly there though! Just the missing error context ...
Everyone complains about checked exceptions and blames them on Java, while they actually came up in CLU, Mesa, Modula-3 and C++.
And at the end of the day, it turns out forced error checking is an idea everyone keeps reinventing, because it is actually useful to know what errors can be "thrown".
The hard part of checked exceptions is when they start interacting with higher-order code - in Java this often happens when an interface method wants to throw. Each implementation may want to throw its own set of exceptions, but the code that just works with the interface doesn't care about any of them, and existing languages don't really give you a way to express this.
What you really want here is polymorphism in the exception specification, and a way for consumers to allow these new exceptions types to "tunnel" through out to the caller that knows about them. And for that to work while still providing any guarantees that exceptions are handled, it turns out you need something like a borrow checker, to prevent the interface object from escaping that outer scope!
Not really what I'm talking about here- that issue comes up when you want to combine multiple concrete error types in a single monomorphic call site, rather than when dealing with polymorphism.
Rust (and other languages that use something like `Result` for errors) handles the polymorphic case a bit better than Java, because you can already use the language's usual polymorphism support for the error type. But it's still not quite as smooth as first-class polymorphic checked exceptions would be, since the interface (or trait) and its callers have to do a bit more manual plumbing in some cases.
For example, a simple `impl Fn() -> T` can be instantiated with `T = Result<X, E>`, but only if the caller is just going to return the `T` directly - otherwise the error won't be propagated immediately. A slightly more annoying situation is when you have some `I: Iterator` that can fail - often you fall back to `I: Iterator<Item = Result<X, E>>`, which is not quite right, and expect the consumer to stop calling `next` if it gets an `Err`.
With polymorphic checked exceptions, you could use `I: Iterator<Item = X>`, with an additional annotation that `next` may fail with an `E`. Error-oblivious combinators like `map` or `fold` would continue to work directly with `X` values, but automatically propagate `E`s to the eventual caller that knows the concrete type of `I`.
(And again, crates like anyhow/thiserror don't really address this problem- they're solving a different issue entirely.)
For my work, I usually catch general exceptions and handle their aftermath, like putting the error in an error queue to be manually fixed and replayed or showing an error page. While there are domains like power plants or car software where every error must be meticulously handled, my approach suits my domain.
When I see code making me catch numerous unique exceptions, it often hints at an API design issue. A more refined design might encapsulate such information in a response, maybe through a discriminated union or an enum with a message. If I can, I'd refactor it to match this ideal. If not, I'd use adapters to convert diverse exceptions into standardized errors.
Exceptions should be exceptional: situations like memory shortages, database disconnects, or accessing a disposed resource. For these unforeseen events, control flow is typically the same as they're unexpected and beyond my control.
> And at the end of the day, it turns out forced error checking is an idea everyone keeps reinventing, because it is actually useful to know what errors can be "thrown".
The problem of java's checked exceptions is not that being explicit is bad, it's that java's implementation is terrible.
Because there isn't a one size fits all solution to error handling?
Because before anyhow/thiserror became popular there were 5 other popular crates for handling errors and if Rust added one of those to std we would be stuck with a subpar solution forever?
Because cargo is so easy to use that your preferred error handling solution is one `cargo add` command away?
Because for simple cases the tools the standard library offers are good enough?
I'm not really talking about the "errors" ergonomics, but more about error handling.
The methods like .map, .map_err, and .and_then I feel are way easier to reason about, and also often shorter than what would happen in a control-flow-breaking catch block.
In Rust, before anyhow and thiserror, you'd see some pretty shitty hacks for the inflexible error system, such as making all errors just a string.
It is clear that having all the errors in a list is actually good now, but that doesn’t stop programmers from hating writing boilerplate.
Again, before anyhow, if you did errors properly, your errors.rs had huge swathes of From implementations. errors.rs boilerplate often outstripped your actual code.
The complaints that it’s hard to change interfaces is bad, as it’s difficult to change interface methods regardless.
It’s not partially designed so much as the type system demands it for rust.
Very unfortunately for Rust, making errors not just maddening boilerplate forces you to trade compile time for reasonable errors (although, honestly, anyhow "feels" hacky to me). Compile time is already a place where Rust struggles as it is.
I wouldn’t bank on rust style languages having any semblance of good ergonomics for errors. But at the same time “you can just ignore it” is really not great either.
Zig errors are actually pretty nice to work with, but as is pointed out, they struggle with producing really good messages, or giving more information back. Although, I will say that I nearly never need to send more information back, and there are patterns to help with that.
Still, if there was a language concept for it, that would be nicer. It’s actually not an easy problem for zig and the core foundations of the language. Just like it’s not an easy problem for rust and its core foundations.
Errors are just really shitty and, as yet, I don’t think there exists good ergonomics. I personally haven’t seen a language that does them well.
Not sure what Rust-style languages are supposed to be; however, ML-derived languages, and Swift, do it much more ergonomically, without needing third-party crates.
The trouble with Java exceptions is that the forced error checking is rarely handled at the call site, making the code no longer linear. Calling var a = foo(); bar(a); does not necessarily imply bar() will be called. Using a more functional approach, the exception can be made even more explicit, making it easier to reason about.
I also thought checked exceptions in Java were fantastic. They are a form of statically checked effects. I would like to see more of this sort of thing not less.
Perhaps a controversial opinion, but I think exceptions should only be thrown when something exceptional happens. Languages, like Java and Python, overuse them e.g. failing to open a file is not exceptional, but running out of memory is.
Creating a Result as in this blog post is an anti-pattern, for the reasons pointed out in the article. The best pattern that is available today is to continue to use an error union for the return type, and accept a mutable "diagnostics" parameter where it stores extra useful information about failures.
It's actually rare that I end up needing extra stuff attached to errors in Zig. When I do, I end up thinking of a higher unit of work where I can stick some context data in case of errors.
For example, the article mentions a JSON parser. In Zig I'd end up writing a class with minimal state and a few methods. I'd call parser.parse(...), and if that goes bad I just fill a parser.errdata field before returning an error. This is similar to the old errno.h way of doing it in C, but it's not global.
That's the pattern I use, if I need detailed error information, I make sure to provide a pointer where they can get written. And it turns out I haven't needed that too often...
I was thinking more about what happens if you’re combining already-written code that uses different naming conventions, or even different types. Suppose you get an error in one format and want to pass it on in a different format?
Writing a converter probably isn’t so hard, but it’s tedious, like having multiple kinds of strings.
But it’s probably too soon to worry about that when Zig doesn’t have a package manager yet.
Good points raised. In Go, I've struggled to attach stack traces to errors. I have a plan (a custom error type, and type checking for that type) but it's not "bulletproof". Seems similar. Seems like there's opportunity for some language evolution here.
IIRC, part of the language proposal to make errors wrappable in Go’s stdlib included a stack trace. And indeed github.com/pkg/errors includes that. But it never made it into the stdlib implementation
It's not: because error sets are a built-in and completely bespoke feature, and they get handled via built-in constructs exclusively (try / catch), Zig can actually attach / generate error traces: https://ziglang.org/documentation/0.3.0/#Error-Return-Traces