
I find it ironic that the article points out that relying on humans to find errors is something of a hit-or-miss proposition, and then suggests that automating error finding is the appropriate course, rather than making the error less likely to be made in the first place.

For example, I wonder how many errors would have been found if format-string behavior were the default? That is, how many times would people have written something like "hello {previously-defined-variable}" and not meant to substitute the value of that previously defined variable at runtime?



I don't think this makes sense. Plain strings and format strings are not interchangeable, and using one where the other was meant is probably a bug.

Would you expect that a user input like "{secret} please" is interpolated? If so, we hopefully agree that this would blow major security holes into any python script processing untrusted user input. And if not... Why not?
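To make the concern concrete, here is a minimal sketch of what "interpolate by default" would mean for untrusted input; the names `secret` and `user_input` are illustrative, not from the thread:

```python
# What if every string were a format string by default?
secret = "hunter2"
user_input = "{secret} please"  # attacker-controlled text

# Today the developer must opt in to interpolation explicitly:
leaked = user_input.format(secret=secret)
print(leaked)       # → hunter2 please

# With plain strings as the default, the braces stay inert:
print(user_input)   # → {secret} please
```

The explicit .format() call is the only place the substitution can happen, which is exactly the property implicit interpolation would give up.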


>Would you expect that a user input like "{secret} please" is interpolated?

That's basically what the recent log4j security vulnerability was all about. "Helpfully" interpolating logs by default.
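A hypothetical miniature of that log4j pattern, sketched in Python (the `LOOKUPS` table and `${...}` syntax stand in for log4j's lookup mechanism; none of this is real log4j API):

```python
import re

# Stand-in for log4j's environment/JNDI lookups:
LOOKUPS = {"env:AWS_SECRET": "AKIA-example"}

def log(message: str) -> str:
    # "Helpfully" expand ${...} lookups inside every log message.
    # Doing this to attacker-controlled text is the vulnerability.
    return re.sub(r"\$\{([^}]+)\}",
                  lambda m: LOOKUPS.get(m.group(1), m.group(0)),
                  message)

# A request header passed straight into the logger exfiltrates data:
print(log("User-Agent: ${env:AWS_SECRET}"))  # → User-Agent: AKIA-example
```

The fix in log4j's case was the same as the argument here: interpolation of untrusted strings must be opt-in, not the default.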


Look up how this works in Swift. They only have one string. No raw strings or f strings. Yet they have all the power of all three python string types and less syntax. It's very nice.


Swift does have raw strings (the #"extended delimiter"# syntax).


No. Those ALSO have string interpolation!

#"\#(expression)"#

That is exactly the point.


But they’re a distinct string syntax. Your point seemed to be that there was only one. rf"{expression}" works in Python too, note, so either way you want to interpret it, raw strings aren’t a difference.


No, they aren't. You can have any number of #. Including zero. It's ONE syntax.


If you only make it work with string literals (e.g. generate the underlying formatting logic at parse time), it wouldn't allow arbitrary inputs to be treated as f strings.
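That restriction is in fact how Python's existing f-strings behave: the prefix applies only to literals, so interpolation is resolved where the literal appears in source and can never be applied to a runtime string.

```python
name = "world"
print(f"hello {name}")   # → hello world (literal, interpolated at the use site)

s = "hello {name}"       # a runtime value: the braces are just characters
print(s)                 # → hello {name} (no way to "promote" s to an f-string)
```

Making interpolation the default for literals only would preserve that safety property; the danger discussed above arises only if runtime strings get the same treatment.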


The assumption I think they mean is to make formatting the default and unformatted the opt-out, analogous to how "raw" strings are treated: escape sequences are replaced with the corresponding characters by default, unless the string is raw, signified by an 'r' prefix.
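The raw-string analogy in today's Python, for reference:

```python
# Escape sequences are translated by default...
print(len("\n"))    # → 1 (one newline character)

# ...unless the literal opts out with the r prefix:
print(len(r"\n"))   # → 2 (a backslash and the letter n, kept literally)
```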


Adding that behavior would break existing code that uses str.format, and Python tries to avoid breaking code between minor releases.


That’s not really a feasible solution in Python because that change would break a load of existing code.


So what? Raise a deprecation notice, treat it as a fatal error in two or three years and that's it. PHP has been doing this for years now.


As someone who would like to be working on new, interesting things in 2-3 years rather than bringing old code into conformance with breaking changes, this attitude captures a worrisome trend in development.

On the one hand, it's great that we have platforms that innovate and improve and harden over time, but we're also facing a development culture where more and more time is spent servicing package/platform/language/OS changes that have no material impact on our own otherwise-mature projects.

It's worth being judicious about where breaking changes are applied, right?


One person’s “new interesting thing” is another person’s “breaking change”.


We’re not talking about deprecating a feature here, we’re talking about the addition of behaviour that will break existing code, potentially in non-trivial and hard to debug ways, and in ways that could easily introduce security vulnerabilities.


Just bump the major version number from 3 to 4, right? How long could that migration take?


We've done this too many times and we've had enough pain, let's please proceed at a pace where we can worry about delivering our product and not updating formatted strings, thank you.


> treat it as a fatal error

Did you think this through? What would you treat as a fatal error? How would the compiler know whether a particular string is old-style code wanting to print some characters between curly braces, or new-style code wanting to interpolate a variable?
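A sketch of that ambiguity: both strings below are legal today, and nothing in the source says which one "wants" interpolation (the variable names are made up for illustration):

```python
template = "Dear {name},"        # meant for a later .format(name=...) call
docs = "use {braces} literally"  # braces meant as visible characters

# Under interpolate-by-default, both would either raise a NameError or,
# worse, silently capture an unrelated local variable named `name`.
# Today the intent is unambiguous because interpolation is explicit:
print(template.format(name="Ada"))  # → Dear Ada,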
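A sketch of that ambiguity: both strings below are legal today, and nothing in the source says which one "wants" interpolation (the variable names are made up for illustration):

```python
template = "Dear {name},"        # meant for a later .format(name=...) call
docs = "use {braces} literally"  # braces meant as visible characters

# Under interpolate-by-default, both would either raise a NameError or,
# worse, silently capture an unrelated local variable named `name`.
# Today the intent is unambiguous because interpolation is explicit:
print(template.format(name="Ada"))  # → Dear Ada,
```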


Also see: Python 2 => 3 hell. Nobody wants to repeat that.


Right?! Imagine: "We're announcing Python 4. Python 3 was because we handled unicode in a way that turned out to make nearly 0 sense. Python 4 is because you lot can't put your f's in the right place."


Changing something as deeply rooted as the string type?

Python already went through exactly that disaster once before, when they changed the default string type from b""-strings to u""-strings. It took about 20 years for this transition to finally complete.


PHP has also been responsible for the majority of exploited servers and misconfigured applications. Whatever they are doing, I take it as a strong negative signal.


That's not unreasonable considering that PHP is by far the most popular server-side language. It's not like we have many hackers targeting Erlang instead.


It's out of proportion. Take as many Django/Rails/ASP.Net exploited sites that you find and it won't hold a candle to PHP.

Also, want to talk Java? Let's not forget that log4j was exploited precisely because of implicit string interpolation.

Implicit f-strings are a really bad idea.


Python does not do this. A change like that would require a major version number increment and the community would revolt.

Too bad we can't go back in time to 1996 or so.


To be fair, your suggestion might make for a more resilient default, but it's also a great way to leak data and add overhead for the default case. There are tradeoffs.


Not much overhead, I would think. We’re talking about literal strings in source code, not strings in general. It’s not much work to check those.

One thing that it would break is that strings read from files would be treated differently from those in source code, even those read from files that logically “belong” to the application (say, a config file).

I don’t think that’s an issue, though.

Also, in Swift "\(foo)" does string interpolation. I haven’t seen people complain it leaks data or makes Swift slow (but then, it’s not fast at compiling at all because of its rather complicated type inference)


> Also, in Swift "\(foo)" does string interpolation. I haven’t seen people complain it leaks data or makes Swift slow (but then, it’s not fast at compiling at all because of its rather complicated type inference)

I think that the claim is not that this leaks data in an absolute sense, but rather that changing the behaviour after people have come to rely on it will leak data from currently well behaving applications.


>...and suggests that automating error finding is an appropriate course instead of making it less likely to make the error in the first place.

You can't fix the syntax and standard lib of the language. It is what it is. Similarly, how many bugs would you prevent if Python had compiler support to catch those types of syntax (and type) errors?


This is how bash works: any string with a $ in it will be interpolated unless you escape it, and the behavior also depends on whether you use double- or single-quoted strings.

Spaces as the list separator could also fall into this philosophical question of what makes the most sense as a string separator. Sometimes it is super convenient, until you have actual spaces in your string and it becomes a pita.

See also the yaml Norway problem for what happens when implicit goes over explicit.

It generates about the same number of bugs, if not more, and would also end up with a code-review-doctor suggesting you use \$ over $. In the end, regardless of syntax, a human always has to make the final call on whether interpolation is wanted or not.


This is how strings work in swift. It's a much superior system imo.


what's also ironic is I left an easter egg in the code sample for how we downloaded the list of repositories and no one has noticed it yet.



