On a meta-level, does anyone else think that the whole idea of writing a peer-reviewed paper that is just a benchmark of different algorithms should be really rigorously reviewed before being accepted? Writing good benchmarks is hard, and so highly contextual that writing fair comparisons between algorithms (or data structures) is almost impossible unless you're an expert in all of the algorithms involved.
Yeah, I've also seen several academic papers on performance or "optimization" of existing algorithms which just demonstrate a complete lack of knowledge about how those algorithms are implemented in practice.
For example, there was a paper explaining how you could optimize the GJK algorithm by reducing the number of distance checks required, and in turn the number of square-roots... Despite the fact that everyone (including the authors of the original GJK algorithm) knows that you don't actually need to do a square-root to compare distances...
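For anyone who hasn't seen the trick spelled out: squared distances order the same way as distances, so you compare those and only take a square root if you actually need the distance value at the end. A minimal sketch in plain Python (nothing GJK-specific, just the general idea):

    import math

    def closest_point(points, q):
        # Compare squared distances; the ordering is identical, so no sqrt per candidate.
        qx, qy = q
        best, best_d2 = None, float("inf")
        for px, py in points:
            d2 = (px - qx) ** 2 + (py - qy) ** 2
            if d2 < best_d2:
                best, best_d2 = (px, py), d2
        # One sqrt at the very end, and only because the caller wants the actual distance.
        return best, math.sqrt(best_d2)

    print(closest_point([(3, 4), (1, 1), (6, 0)], (0, 0)))  # ((1, 1), 1.414...)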
> Despite the fact that everyone (including the authors of the original GJK algorithm) knows that you don't actually need to do a square-root to compare distances...
Academia's purpose is to produce research, typically measured in publications per unit time. Optimizing one paper leads to a global reduction in the size of the literature by pruning opportunities for subsequent research, harming the overall performance of the system.
The problem is that academics are rarely as expert at programming, or as knowledgeable about computer architectures, as someone in industry. There are various tricks that are never taught in college, so academics often have no idea some of this stuff even exists.
The best example is discrete optimization research (traveling salesman, vehicle routing and its variants, schedule rostering, etc.). Stuff you find in the papers there achieves state-of-the-art results very slowly (using integer linear programming or some rarely optimized heuristics), making you believe these instances of a general NP-hard problem can't be solved quickly.
When you start tinkering, you either find that data structures can be added that reduce the complexity significantly or that there are regularities in instances that, when exploited, support massive speedups.
I would say that TSP research is an exception, but most of the stuff coming out that has a lot of citations is way too slow and is never as brilliantly implemented as the Lin-Kernighan heuristic or other work from the age of insanely slow computers.
> The problem is that academics are rarely as expert at programming, or as knowledgeable about computer architectures, as someone in industry. There are various tricks that are never taught in college, so academics often have no idea some of this stuff even exists.
I want to push back on this generalization a bit. The academics focused on pushing the mathematical boundaries of discrete optimization are focused, no surprise, on only that. If theoretical improvements are not what you want, don't read those papers; read the ones by people more focused on applications. Read literally any databases paper, or stuff like hyperscan and simdjson. I'd argue that a non-trivial amount of this work is vastly /ahead/ of what's standard in industry, but industry is slow to adopt new ideas and catch up because of legacy software and existing codebases. It's a very similar story in the programming languages sphere; it took ages for industry to adopt ideas that had been popular in academia for a long time (e.g. Rust, LINQ). The idea that academia (at least in CS) is an ivory tower, far from the concerns of industry, hasn't really been true recently. There's a lot of cross-pollination and a revolving door of people going back and forth between the two, spreading ideas.
100x +1. The whole field of query optimization has been devoted to improving the efficiency and speed of database systems for decades. [1] Academic results are studied quite closely in industry. Also, "academic" includes people like Mike Stonebraker and Andy Pavlo. They aren't exactly slouches regarding issues of performance.
More generally, major waves of performance innovation in the IT field have been driven by advances in storage, compression, vectorization, virtualization, etc. Academic results have led to entirely new products like Vertica (from C-Store [2]) or VMware (from the Disco system [3]).
I must say that when it comes to discrete optimization, the genetic/ant/simulated annealing/etc. stuff is more popular in academia than in industry (at least the industry that doesn't heavily include academics).
Works like the Lin-Kernighan heuristic are extremely rare, and a bunch of knowledge exists only in industry. Even that heuristic was implemented incorrectly for decades until one individual (K. Helsgaun) came along and demonstrated its superiority.
I mean, most of the code that gets released today doesn't even handle the time-window constraint in the best way possible (optimal updates, constant-time feasibility evaluations, etc.). I believe none of the open source vehicle routing libraries have any data-structure optimizations for handling time windows, task precedence, dynamic workshift starts, etc. These are all fairly simple implementation tricks that just got lost under giant amounts of population-based heuristics and mathematical linear programming approaches.
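For concreteness, here's a rough sketch of the kind of constant-time time-window feasibility check I mean (illustrative Python with a made-up data layout, not taken from any library; it assumes waiting at a stop is allowed and that the current route is already feasible). You precompute the earliest service start going forward and the latest service start going backward; checking an insertion is then O(1) instead of re-simulating the whole route:

    def precompute(route, open_t, close_t, service, travel):
        # Forward pass: earliest feasible service start at each position.
        n = len(route)
        earliest = [0.0] * n
        earliest[0] = open_t[route[0]]
        for k in range(1, n):
            a, b = route[k - 1], route[k]
            earliest[k] = max(open_t[b], earliest[k - 1] + service[a] + travel[a][b])
        # Backward pass: latest service start that keeps the rest of the route feasible.
        latest = [0.0] * n
        latest[-1] = close_t[route[-1]]
        for k in range(n - 2, -1, -1):
            a, b = route[k], route[k + 1]
            latest[k] = min(close_t[a], latest[k + 1] - travel[a][b] - service[a])
        return earliest, latest

    def insertion_feasible(j, p, route, earliest, latest, open_t, close_t, service, travel):
        # O(1): can stop j be inserted between positions p and p+1 of the route?
        a, b = route[p], route[p + 1]
        start_j = max(open_t[j], earliest[p] + service[a] + travel[a][j])
        if start_j > close_t[j]:
            return False
        arrive_b = start_j + service[j] + travel[j][b]
        return arrive_b <= latest[p + 1]

The same kind of precomputation can often be set up for precedence and workshift constraints too; the point is just that the per-move check shouldn't have to walk the whole route.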
Concorde is fine. The LK heuristic was published in 1973. After that, until the mid 90s, no one could outperform the original published results with the same heuristic.
Honestly, I found the algorithm description cryptic. It may just be me not being "in the know" when it comes to the details, with your "bunch of knowledge [that] exists in industry only" being the details the authors didn't care to elaborate on in the algorithm description. Maybe part of the reason for the failure to replicate is that other people found it cryptic as well and didn't understand crucial details.
BTW, in your opinion, is there any readable text on the LK algorithm? I'm afraid everything I've read so far seems to suffer from this problem. I have a very specific optimization need that doesn't seem to be covered even by LKH-3. It seems my problem could be treated either as an asymmetric TSP with time windows on a small number of nodes and a non-metric distance matrix, or, from the other side, as a VRP with asymmetric distances, an unknown number of vehicles, and a bounded trip duration for every vehicle. Either way, it needs an additional constraint that every vehicle trip should get close to the number of working hours in a day, with the exception of one trip which can be shorter; basically, I need an "ATSP with sleeping breaks". So I may need to roll my own implementation of something, and for an outsider there doesn't seem to be a good text on implementing these things realistically without already having some prior knowledge.
Helsgaun's reports are pretty descriptive when it comes to the LK algorithm.
But yes, I do agree that the original write-up, and the way the algorithm is taught in universities, is a little bit cryptic and sometimes describes a completely different algorithm.
A simple explanation of the algorithm:
1. 2-opt swap (A .. B -> .. -> C .. D ===> A .. C <- .. <- B .. D) - constant-time computation of everything needed to check feasibility and the new cost,
2. recursive 2-opt - now within the reversed segment C .. B, with bounds (you can figure out which swaps are inferior and not recurse into them).
That's all there is to the algorithm. Helsgaun improves it a bunch (with preprocessing, deeper and more efficient k-opt moves, and more efficient data structures) but keeps the same idea; a minimal sketch of the basic move is below.
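To make step 1 concrete, here's a tiny Python sketch (symmetric distances, plain lists, emphatically not Helsgaun's LKH code): the gain of a 2-opt swap is computable in constant time from the four endpoints, and applying it is just a segment reversal.

    def two_opt_gain(tour, dist, i, j):
        # Gain of replacing edges (tour[i], tour[i+1]) and (tour[j], tour[j+1])
        # with (tour[i], tour[j]) and (tour[i+1], tour[j+1]).  O(1), no walk of the tour.
        a, b = tour[i], tour[i + 1]
        c, d = tour[j], tour[(j + 1) % len(tour)]
        return (dist[a][b] + dist[c][d]) - (dist[a][c] + dist[b][d])

    def apply_two_opt(tour, i, j):
        # Reverse the segment tour[i+1 .. j]; the tour length drops by the gain above.
        tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])

    # Usage: scan candidate pairs i < j and take any move with positive gain;
    # Lin-Kernighan then recurses on the reversed segment instead of stopping here.
    tour = [0, 3, 1, 2]
    dist = [[0, 1, 2, 9], [1, 0, 1, 9], [2, 1, 0, 1], [9, 9, 1, 0]]
    g = two_opt_gain(tour, dist, 0, 2)
    if g > 0:
        apply_two_opt(tour, 0, 2)
    print(tour, g)  # [0, 1, 3, 2] 8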
There's another very flexible method called "ejection chains", started by Fred Glover, which is very effective if you couple it with efficient data structures, but it is also very cryptic. It easily supports time windows, breaks, flexible work hours, a flexible number of vehicles, etc. Of course, the latest papers on "ejection chains" show no such thing :D
Sometimes you can stumble upon PhDs publishing their code, and the tricks simply aren't there at all.
And there they try to categorize the time complexity of solving these problems, but the tables are just not up to date with industry. Some operations listed as O(n log n) are actually O(n) or even better (depending on the local moves you do).
But it's a good start. A lot of these subproblems can be implemented really fast on a CPU, and I guess the moment they are explicitly solved in a software library is when the heuristics research will improve.
From what I saw (I worked in an academia-consulting partnership to solve scheduling problems in transport), OR researchers very frequently work in association with OR consulting shops, and they're well aware of the difference between academic datasets and industrial datasets. At the conferences I attended, it was not infrequent to see separate tracks for industrial and academic papers, both attended by the same crowd.
The point I do agree with, though, is that this is not reflected in the papers. Academic papers focus on academic instances because they are more general (and usually harder, as you said) and because optimizations for specific instances of the problem are not that useful from an academic point of view.
It's hard to know who works with who and who has experience with what if you're not an insider, though.
> Stuff you find in the papers there achieves state-of-the-art results very slowly (using integer linear programming or some rarely optimized heuristics), making you believe these instances of a general NP-hard problem can't be solved quickly.
Yes, or they're looking for things like mathematical purity, a linear problem statement, etc.
In practice, you don't need the best solution, and you can get a great solution in a multitude of ways.
You do realize there is a whole area of the research field dedicated to heuristic algorithms? They have just as much of a proper academic basis as the “correct” solutions.
On the one hand, peer review takes long enough already. On the other... I saw an influential paper that published obviously-wrong timing data, essentially saying that time(A) + time(B) < time(B). It seems they were including the Python interpreter startup time (~0.1s) in a profile of a ~0.01s algorithm.
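For illustration (a hypothetical micro-benchmark, not the paper's code): timing the run from the shell with "time python bench.py" also counts interpreter startup and module imports, while timing inside the process measures only the algorithm.

    # bench.py
    import random
    import time

    data = [random.random() for _ in range(100_000)]

    t0 = time.perf_counter()
    sorted(data)                                # stand-in for "the algorithm"
    print(f"{time.perf_counter() - t0:.4f} s")  # only the algorithm, not startup

    # Timed from the shell instead, the same script also pays ~0.1s of
    # interpreter startup, which can easily dwarf a ~0.01s algorithm.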
Benchmarking papers are inaccurate when the original algorithms are not open-sourced and the grad student has to rewrite the algorithm from scratch. They can easily get implementation details wrong and wind up with an implementation that's slower than the original.
I do think that the original algorithm authors should have the opportunity to correct the benchmarking code, or to release their original implementation as open source to be benchmarked.
In some sense, the benchmarking paper with a slower implementation is more "correct," since an engineer evaluating which algorithm to use is just as likely to implement the algorithm in a slower way than the original paper did. The incentives are right, too: the original paper's authors should be giving enough detail to recreate their work, and the benchmarker is showing that, as actually published, their algorithm is slow.
It would be nice if pure benchmark papers were a thing. Most of the time, systems papers get accepted for some new idea, and the evaluation section is often biased towards that new idea. Independent benchmarks could fix this.