
> Inlining functions also has the benefit of not making it possible to call the function from other places.

I’ve really gone to town with this in Python.

  def parse_news_email(…):
    def parse_link(…):
      …

    def parse_subject(…):
      …

    …
If you are careful, you can rely on the outer function’s variables being available inside the inner functions as well. Something like a logger or a db connection can be passed in once and then used without having to pass it as an argument all the time:

  # sad
  def f1(x, db, logger): …
  def f2(x, db, logger): …
  def f3(x, db, logger): …
  def g(xs, db, logger):
    for x0 in xs:
      x1 = f1(x0, db, logger)
      x2 = f2(x1, db, logger)
      x3 = f3(x2, db, logger)
      yikes x3


  # happy
  def g(xs, db, logger):
    def f1(x): …
    def f2(x): …
    def f3(x): …
    for x in xs:
      yield f3(f2(f1(x)))
Carmack commented his inline functions as if they were actual functions. Making actual functions enforces this :)

Classes and “constants” can also quite happily live inside a function but those are a bit more jarring to see, and classes usually need to be visible so they can be referred to by the type annotations.
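
To make that concrete, a toy sketch (all names here are invented): the "constant" and the class are invisible outside the function, and note the inner class can't be used in the outer function's own signature annotations.

```python
def summarise(raw: str) -> dict:
    # "Constant" and helper class scoped to this one function.
    MAX_LEN = 10

    class Entry:
        def __init__(self, text: str):
            self.text = text[:MAX_LEN]  # closes over MAX_LEN

    return {"text": Entry(raw).text}

# Entry can't be named in summarise's own annotations,
# which is why classes usually end up at module level.
```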



That's not an improvement, as it screws up the code flow. The point of inline blocks is that you can read the code in the same order as it is executed. No surprises that code might be called twice, or that a function call could be missed or reordered. Adding real functions causes exactly the indirection that one wanted to avoid in the first place. If the block has no name, you know that it will only be executed right where it is written.


Yeah that’s a valid point. I tend to keep in mind that as soon as I pull any of the inner functions out to the publicly visible module level, I can say goodbye to ever stopping people from reusing the code when I don’t really want them to.

For example, if your function has an implicit, undocumented contract such as assuming the DB is only a few milliseconds away, but they then reuse the code for logging to DBs over the internet, then they find it’s slow and speed it up with caching. Now your DB writing code has to suffer their cache logic bugs when it didn’t have to.


Not sure I believe the benefit of this approach outweighs the added difficulty wrt testing, but I certainly agree that Python needs a yikes keyword :-)


What is the benefit of such a yikes? Or do you consider it a yikes language as a whole?

Personally I like that functions can be inside functions, as a trade-off between inlining and functional separation in C++.

The scope reduction makes it easier to track bugs while it has the benefits of separation of concern.


> What is the benefit of such a yikes? Or do you consider it a yikes language as a whole?

None, it was just a simple joke based on the typo in the post I replied to. I like Python, and have in fact been happily using it as my main language for over 20 years.


Ahhh, now I (top level author) get it :)


> Inlining functions also has the benefit of not making it possible to call the function from other places.

Congrats, you've got an untestable unit.


Congratulations, you are writing tests for things that would not need tests if they weren't put behind an under-defined interface. Meanwhile sprint goals are not met and overall product quality is embarrassing, but you have 100% MC/DC coverage of your addNumbersOrThrowIfAbove(a, b, c).


Which is usually a positive. Testing tiny subunits usually just makes refactoring and adding new features hard while not improving test quality.


Testing is a tool that sometimes makes your life easier. IME, many (not all) tiny subunits do actually have better tests when examined at that level. You just want to avoid tests which will need to be updated for unrelated changes, and try to avoid writing code which propagates that sort of minutiae throughout the codebase:

> while not improving test quality

The big wins from fine-grained testing are

1. Knowing _where_ your program is broken

2. Testing "rare" edge cases

Elaborating on (2), your code probably works well enough on some sort of input or you wouldn't ship it. Tests allow you to cheaply test all four Turkish "i"s and some unicode combining marks, test empty inputs, test what happens when a clock ever runs backward or moves too slowly/quickly, .... You'll hit some of those cases eventually in prod, where pressures are high and debugging/triaging is slow, and integration tests won't usually save you. I'm also a huge fan of testing timing-based logic with pure functions operating on state passed in (so it's tested better than an integration test would accomplish, and you never have to wait for anything godawful like an actual futex or sleep or whatever).

> makes refactoring and adding new features hard

What you're describing is a world where accomplishing a single task (refactoring, adding a new feature) has ripple effects through the rest of the system, or else the tests are examining proxy metrics rather than invariants the tiny subunits should actually adhere to. Testing being hard is a symptom of that design, and squashing the symptom (avoiding tests on tiny subunits) won't fix any of the other problems it causes.

If you're stuck in some codebase with that property and without the ability to change it, by all means, don't test every little setup_redis_for_db_payment_handling_special_case_hulu method. Do, however, test things with sensible, time-invariant names -- data structures, algorithms, anything that if you squint a bit looks kind of like parsing or serialization, .... If you have a finicky loop with a bunch of backoff-related state, pull the backoff into its own code unit and test how it behaves with clocks that run backward or other edge cases. The loop itself (or any other confluence of many disparate coding concepts) probably doesn't need to be unit tested for the reasons you mention, but you usually can and should pull out some of the components into testable units.
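
For instance (everything below is an invented sketch, not anyone's actual API), a retry loop's backoff can be a pure function of clock readings passed in as arguments, which makes backward-running clocks a one-line test instead of an integration nightmare:

```python
def next_backoff(attempt: int, now: float, last_attempt: float,
                 base: float = 1.0, cap: float = 30.0) -> float:
    """Pure function: seconds to wait before the next retry.

    Taking the clock readings as arguments (rather than calling
    time.monotonic() inside) is what makes this testable.
    """
    elapsed = max(0.0, now - last_attempt)   # tolerate clocks running backward
    wait = min(cap, base * (2 ** attempt))   # exponential backoff, capped
    return max(0.0, wait - elapsed)

# Edge cases are cheap to check, no sleeping required:
assert next_backoff(0, now=10.0, last_attempt=10.0) == 1.0
assert next_backoff(10, now=0.0, last_attempt=0.0) == 30.0   # hit the cap
assert next_backoff(2, now=5.0, last_attempt=9.0) == 4.0     # backward clock
```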


The problem is there is rarely a clear interface for your subunit. As such you will want to refactor that interface in ways that break tests in the future. If you are writing another string type you can probably come up with a good interface and then write good tests that won't make refactoring hard - but strings should be a solved problem for most of us (unless you are writing a new programming language). Instead we are working on problems that are not as clear, that only our competitors work on, so we can't even learn from others.


Not according to John Carmack. He stated in the intro that he switched to pure functional programming, which basically means all his logic is in the form of unit-testable pure functions.


Nothing about pure functional programming requires unit testing all of your functions. You can decide to unit test larger or smaller units of code, just as you can in any other paradigm.


In pure functional programming a pure function is unit testable by definition of what a pure function is. I never said it requires functions to be tested. Just that it requires functions to be testable.

Other paradigms do not do this. As soon as a module touches IO or state it becomes entangled with that and is NOT unit testable.

Is it still testable? Possibly. But not as a unit.


How do you unit test a local function that is a closure in pure functional code?


Of course you can't unit test things with restricted scope.

f(x) = x + 2 + 4

How do you unit test x + 2 or (+ 4) even if the operation is pure? You can't. Because it's not callable. It's the same thing with the closure.

The only things that are testable are things on unrestricted scope. AKA global scope. Think about what happens if you have a "closure" on global scope.

If you really want to test it then your "unit tests" which typically live on global scope, need to be moved to local scope. That's just the rules of scope.

There is one special case here. If the parent function returns the local function as a value. But even in this case the parent and local function have to be treated as a unit. The unit test will involve first calling the parent, then calling the local. The parent and child function form a "unit" thanks to shared state and the parent is essentially "moving" the local function into global scope.
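
A toy illustration of that special case (names hypothetical): the test has to call the parent first to get a handle on the local function, so parent and child really are exercised as one unit.

```python
def make_counter(start: int = 0):
    count = start  # state shared between parent and local function

    def bump(step: int = 1) -> int:
        nonlocal count
        count += step
        return count

    return bump  # the parent "moves" bump out to the caller's scope

# The unit test first calls the parent, then the returned local:
counter = make_counter(10)
assert counter() == 11
assert counter(5) == 16
```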

Generally best practice is to use combinators if you want to maximize the granularity at which you can modularize your logic. I would even argue that closures straddle the line between pure and impure, so I actually avoid closures whenever possible.


I found this [] article by Carmack. While reading it, I understood there is a large set of gray shades to the pureness of "pure functional" code. He calls being functional a useful abstraction; a function() is never purely functional.

[] https://web.archive.org/web/20120501221535/http://gamasutra....


When people say pure functional programming they never mean the entire program is like this.

Because if it were, your program would have no changing state and no output.

What they mean is that your code is purely functional as much as possible, and that there is high segregation between functional and non-functional code, in the sense that state and IO are segregated as much as possible into very small, very general functionality.
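
This is often summarised as "functional core, imperative shell"; a hypothetical sketch (the `db` interface here is invented):

```python
# Pure core: all the decision logic, trivially unit-testable.
def apply_discount(total_cents: int, loyalty_years: int) -> int:
    rate = min(loyalty_years, 10) * 0.01   # 1% per year, capped at 10%
    return round(total_cents * (1 - rate))

# Thin impure shell: the only place state and IO live.
def checkout(db, customer_id: int) -> None:
    order = db.load_order(customer_id)                 # IO in...
    total = apply_discount(order.total_cents, order.loyalty_years)
    db.save_total(customer_id, total)                  # ...IO out
```

The shell stays so thin it barely needs testing; everything interesting lives in the pure core.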


> pure fp

No he didn’t.


Like most things being talked about here, so much depends on the specifics.

I think developers should generally try and aim for, at every scale, the outputs of a system to be pure functions of the inputs (whether by reducing the scope of the system or expanding the set of things considered inputs). Beyond that there are so many decisions at the margin that are going to be based on personal inclination.


Ideally, you've just moved the unit boundary to where it logically should be instead of many small implementation details that should not be exposed.


The unit here is the email, not the email's links or subject. Those are implementation details.


What do you use unit tests for, other than verifying implementation details?

Perhaps we have a difference in definition. To me, a unit test for a function such as "parse_news_email" would explore variations in parameters and states. Because of combinatorial explosion, that often means at least some white-box testing. I'm not going to generate random subjects and senders, and received-froms, I'm going to target based on internal details. Are we doing smart things with the message ID hostname? Then what happens if two messages come in with the same message ID but from different relays? The objective is that the unit test wrings out the implementation details, and the caller's unit test doesn't need to exhaustively test them.

This white-box testing may require directly poking at or mocking internal functions or at least abusing how they're called. For example, parsing the news item might entail pulling up and modifying conversation thread cache entries or state. For some of the tests you may need hand-crafted cache state, it's not feasible to create unique states for each parameter combination you're testing, and testing a combination will pollute the state for the following combinations. Or maybe the function depends upon an external resource you can't beat to death with a million identical requests. So the least-bad, simplest solution may be to freeze or back out part of the normal state update in the unit test. Which would usually involve directly invoking the internal routines.

Can this lead to fragile, false-positive, to-the-point-of-useless tests? You betcha. That's where entertaining two contrary viewpoints is needed :) Use experience and good judgement about the pros and cons in the particular situation.


Unit tests are for documenting the API contract for the user. You are going to target based on what you are willing to forevermore commit to for those who will use what you have created. Indeed, what happens when two messages come in with the same message ID is something the user needs to be aware of and how it functions needs to remain stable no matter what you do behind the scenes in the future, so you would absolutely want to document that behaviour. How it is implemented is irrelevant, though. The only thing that matters is that, from the public API perspective, it is handled appropriately.

There is a time and place for other types of tests, of course. You are right that unit tests are not the be all and end all. A good testing framework will allow you to mark for future developers which tests are "set in stone" and which are deemed throwaway.


We're in general agreement about the purpose of unit tests. I disagree on a couple of points.

Tests do not document the API. No test is complete, and for that reason alone can't completely document anything. For example, a good API might specify that "the sender must be non-null, and must be valid per RFC blah." There's no way to test that inclusively, to check all possible inputs. You can't use the test cases to deduce "we must meet RFC blah." You might suspect it, but you'd be risking undefined behavior if you stray from the exact inputs in the test cases. And before anyone objects "the API docs can be incomplete too," well, that's true. But the point is that a written API has vastly more descriptive power than a set of test cases. (The same applies to "self-documenting code". Bah humbug.) There's also the objection "but you can't guarantee cases you don't test!" Also true. That's reality. _You can never test all your intended behavior._ You pick your test cases to do the best you can, and change your cases as problems pop up.

The other thing I would shy away from is including throwaway tests in the framework. Throwaways are a thing, developers use them all the time, but don't make them unwanted stepchildren--poorly (incompletely?) designed, slapped together, limited in scope, confusing for another developer (including time-traveling self) to wade through and decide whether this is a real failure or just a bogus test. They're tech debt. Less frequently used tests are another matter. For example, release-engineering tests that only get run against release candidates. But these should be just as much real, set in stone, as any other deliverable.

Which I guess is a third viewpoint nuance difference. I treat tests as being part of the package just as much as any other deliverable. They morph and shift as APIs change, or dependencies mutate, or bugs are found. They aren't something that can be put to the side and left to vegetate.


> There's no way to test that inclusively, to check all possible inputs.

Which means the RFC claim is false and should not be asserted in the first place. The API may incidentally accept valid RFC input, but there is no way to know that it does for sure for all inputs. You might suspect it conforms to the RFC, but to claim that it does with certainty is incorrect. Only what is documented in the tests is known to be true.

Everything else is undefined behaviour. Even if you do happen to conform to an RFC in one version, without testing to verify that continues to hold true, it probably won’t.

This is exactly why unit tests are the expected documentation by users. It prevents you, the author, from making spurious claims. If you try, the computer will catch you in your lies.

> The other thing I would shy away from is including throwaway tests in the framework.

What does that mean? I suspect you are thinking of something completely different, as this doesn't quite make sense with respect to what I said. It probably makes sense in another context, and if I have inferred that context correctly, I'd agree... But, again, unrelated to our discussion.


OK, one more round. An API spec is a contract, not a guarantee of correctness. You, as the client, are free to pass me any data that fits the spec. If my parsing library does the wrong thing, then I've got a bug and need to fix it. My tests are also defective and need to be adjusted.

If you passed 3.974737373 to cos(x), and got back 200.0, would you be mollified if the developers told you "that value clearly isn't in the unit test cases, so you're in undefined behavior"? Of course not. The spec might be "x is a single-float by value, 0.0 <= x < 2.0 * PI, result is the cosine of X as a single-float." That's a contract, an intent--an API.

The same for a mail parser. If my library croaks with a valid (per RFC) address then I've got a problem. If I try to provide some long, custom, set of cases I will or won't support, then my customer developers are going to be rightfully annoyed. What are they supposed to do when they get a valid but unsupported address? Note we're not talking about carving out broad exceptions reasonable in context ("RFC 5322 except we don't support raw IP addresses foo@[1.2.3.4]", "we treat all usernames as case-insensitive"). And we're not talking about "Our spec (intent) is foo, but we've only tested blah blah blah."

Early in my career I would get pretty frustrated by users who were not concerned with arranging their data and procedures the right way, clueless about what they really were doing. OK, so I still get frustrated by stupid :) But it's gradually seeped into my head that what matters is the user's intentions. Specs are an imperfect simplification of those very complex things, APIs are imperfect simplifications of the specs, and our beautiful code and distributed clusters and redundant networks are extremely limited and imperfect implementations of the APIs. Some especially harmful potential flaws get extra attention during architecture, implementation, and testing. When things get too far out we fix them.


> What do you use unit tests for, other than verifying implementation details?

1. Determining when the observable behavior of the program changes.

2. Codifying only the specific behaviors that are known to be relied on by callers.

3. Preventing regressions after bugs are fixed.

Failing tests are alarm bells, when do you want them to grab your attention?


Excellent points, violently agree, my question was poorly worded. The purpose of unit tests is to verify the contracted API is actually being provided by the implementation details. A clearer question might have been "what are unit tests for if not to exercise the implementation details, verifying they adhere to the API?" Unit tests validate implementation details, integration tests validate APIs.

To me, a good unit test beats the stuffing out of the unit. It's as much a part of the unit as the public functions, so should take full advantage of internal details (keeping test fragility in mind); of course that implies the unit test needs ongoing maintenance just as much as the public functions. If you're passing a small set of inputs and checking the outputs, well that's a smoke test, not a unit test.

To answer your last question, I want the alarm bells to ring whenever the implementation details don't hold up. That's whether the function code changed, a code or state dependency changed, or the testing process itself changed. If at all feasible, all the unit tests run every time the complete suite is run, in full meat-grinder mode. "Complete suite" is hand-wavy; e.g. it might be the suite for a major library, but not the end-to-end application.


All that doesn't mean that you have to consider artificial boundaries that you yourself have introduced for convenience when deciding on the proper boundaries for what constitutes a "unit". Not every instance of code reuse makes for a good unit to test.


> What do you use unit tests for, other than verifying implementation details?

You don't need to verify the return of `parse_subject()` directly, since it will be part of the return of `parse_email()`. Verify it there.
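
As a toy sketch (the parsing logic here is invented to echo the example upthread):

```python
def parse_email(raw: str) -> dict:
    # Inner helper: not directly callable from tests...
    def parse_subject(text: str) -> str:
        return text.split("Subject: ", 1)[1].splitlines()[0]

    return {"subject": parse_subject(raw)}

# ...but its behaviour is still fully covered via the outer return value:
assert parse_email("Subject: Hello\nBody here")["subject"] == "Hello"
```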


This is a major insight. Defining a local function isn't a big deal; you can always just copy and paste it out to global scope.

Any time you merge state with function you can no longer move the function. This is the same problem as OOP. Closures can't be modular the same way methods in objects can't be modular.

The smallest testable unit of modularity is the combinator. John Carmack literally mentioned he does pure functional programming now, which basically everyone in this entire thread is completely ignoring.


Yup, and I should have called this out as a downside. Thank you for raising it.

On visibility, one of the patterns I’ve always liked in Java is using package-level visibility to limit functions to that code’s package and that package’s tests, where they are in the same package (but possibly defined elsewhere.)

(This doesn’t help though with the reduction in argument verbosity, of course.)


The latter pattern is very popular in Python web scraping and data parsing niches as the code is quite verbose and specific and I'm very happy with this approach. Easy to read and debug and the maintenance is naturally organized.


Funny enough, the equivalent of your Python example is how Haskell 'fakes' all functions with more than one argument (at least by default).

Imperative blocks of code in Haskell (do-notation) also work like this.


That’s gonna be quite expensive, don’t do this in hot loops. You’re re-defining and re-creating the function object each time the outer function is called.


Good point. I measured it for 10^6 loops:

(1) 40ms for inline code;

(2) 150ms for an inner function with one expression;

(3) 200ms for a slightly more complex inner function; and

(4) 4000ms+ for an inner function and an inner class.

  def f1(n: int) -> int:
      return n * 2

  def f2(n: int) -> int:
      def g():
          return n * 2
  
      return g()
  
  def f3(n: int) -> int:
      def g():
          # NB: range(0) never iterates, so this loop body is dead code
          # that exists only to bloat the code object created per call.
          for _ in range(0):
              try:
                  pass
              except Exception as exc:
                  if isinstance(exc, 1):
                      pass
                  else:
                      while True:
                          pass
                  raise Exception()
          return n * 2
  
      return g()
  
  def f4(n: int) -> int:
      class X:
          def __init__(self, a, b, c):
              pass
  
          def _(self) -> float:
              return 1.23
  
      def g():
          for _ in range(0):
              try:
                  pass
              except Exception as exc:
                  if isinstance(exc, 1):
                      pass
                  else:
                      while True:
                          pass
                  raise Exception()
          return n * 2
  
      return g()
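
For reference, a harness along these lines would reproduce the comparison (f1 and f2 copied from above; absolute numbers are machine-dependent, so only the relative shape is meaningful):

```python
import timeit

def f1(n: int) -> int:
    return n * 2

def f2(n: int) -> int:
    def g():
        return n * 2
    return g()

# Each variant called 10**6 times, as in the figures above.
for fn in (f1, f2):
    ms = timeit.timeit(lambda: fn(21), number=10**6) * 1000
    print(f"{fn.__name__}: {ms:.0f}ms")
```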


It might be a benefit in some cases, but I do feel that f1/f2/f3 are the prime candidates for actual unit testing


It's possible to nest subprograms within subprograms in Ada. I take advantage of this ability to break a large operation into one or more smaller simpler "core" operations, and then in the main body of the procedure write some setup code followed by calls to the core operation(s).


Where is the part, where this is "careful"? This is just how scopes work. I don't see what is special about the inner functions using things in the scope of the outer functions.


Excessive use of external bindings in a closure can make it hard to reason about lifetimes in cases where that matters (e.g. when you find out that a huge object graph is alive solely because some callback somewhere is a lambda that closed over one of the objects in said graph).


So inlining is the `private` of functions without an object. Push it all onto the stack, add arguments, set the function pointer to the instruction start of the inline code, challenge accepted, let's go to..


Remember to `nonlocal xs, db, logger` inside those inner functions. I'm not sure if this is needed for variables that are only read, but I wouldn't ever leave it out.


> I'm not sure if this is needed for variables that are only read

It’s not needed. In fact, you should leave it out for read-only variables. That’s standard practice - if you use `nonlocal` people reading the code will expect to see writes to the variables.
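
A quick demonstration of the rule (reads of outer names need no declaration; only assignment does):

```python
def outer():
    x = 1

    def read():
        return x       # plain read of the enclosing name: no nonlocal needed

    def write():
        nonlocal x     # required: without it, x = 2 would create a new local
        x = 2

    before = read()
    write()
    return before, read()

# outer() returns (1, 2): the write was visible to the reading closure.
```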


> That’s standard practice - if you use `nonlocal` people reading the code will expect to see writes to the variables.

Since when? I was under the impression Python virtually doesn't have lexical scoping at all and that's why `nonlocal` exists. I mean hell, in CPython you can literally access and modify the local variables of your caller (and everything else up the call stack too). I never associated `nonlocal` at all with specifically writes. Just access in general.


> I was under the impression Python virtually doesn't have lexical scoping at all and that's why `nonlocal` exists

Python has had lexical scoping since version 2.2. PEP 227 [0] "describes the addition of statically nested scoping (lexical scoping)" - allowing access to (but not assignment to) names in outer scopes.

`nonlocal` was introduced later, in Python 3.0 [1], specifically to allow assignment to names in outer scopes.

[0] https://peps.python.org/pep-0227/

[1] https://peps.python.org/pep-3104/


You can do this in C++, too, but the syntax is a little uglier.


Not that bad?

    #include <cstdio>

    int main() {
        int a = -1;
        [&] {
            a = 42;
            printf("I'm an uncallable inline block");
        }();

        printf(" ");

        [&] {
            printf("of code\n");
        }();

        [&] {
            printf("Passing state: %d\n", a);
        }();

        return 0;
    }


At this point, why wouldn't you just use a nested block?


It’s not horrible, a little bit verbose though.



