On a meta-level, does anyone else think that the whole idea of writing a peer-reviewed paper that is just a benchmark of different algorithms should be really rigorously reviewed before being accepted? Writing good benchmarks is hard, and so highly contextual that writing fair comparisons between algorithms (or data structures) is almost impossible unless you're an expert in all of the algorithms involved.
Yeah, I've also seen several academic papers on performance or "optimization" of existing algorithms which just demonstrate a complete lack of knowledge about how those algorithms are implemented in practice.
For example, there was a paper explaining how you could optimize the GJK algorithm by reducing the number of distance checks required, and in turn the number of square-roots... Despite the fact that everyone (including the authors of the original GJK algorithm) knows that you don't actually need to do a square-root to compare distances...
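For anyone who hasn't seen the trick spelled out: squared distances order the same way as distances, so you compare those and only take a square root if you actually need the distance value at the end. A minimal sketch in plain Python (nothing GJK-specific, just the general idea):

    import math

    def closest_point(points, q):
        # Compare squared distances; the ordering is identical, so no sqrt per candidate.
        qx, qy = q
        best, best_d2 = None, float("inf")
        for px, py in points:
            d2 = (px - qx) ** 2 + (py - qy) ** 2
            if d2 < best_d2:
                best, best_d2 = (px, py), d2
        # One sqrt at the very end, and only because the caller wants the actual distance.
        return best, math.sqrt(best_d2)

    print(closest_point([(3, 4), (1, 1), (6, 0)], (0, 0)))  # ((1, 1), 1.414...)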
> Despite the fact that everyone (including the authors of the original GJK algorithm) knows that you don't actually need to do a square-root to compare distances...
Academia's purpose is to produce research, typically measured in publications per unit time. Optimizing one paper leads to a global reduction in the size of the literature by pruning opportunities for subsequent research, harming the overall performance of the system.
The problem is that academics are rarely as expert at programming, or as knowledgeable about computer architectures, as someone in industry. There are various tricks that are never taught in college, so academics often have no idea some of this stuff even exists.
The best example is discrete optimization research (traveling salesman, vehicle routing and its variants, schedule rostering, etc.). Stuff you find in the papers there achieves state-of-the-art results very slowly (using integer linear programming or some rarely optimized heuristics), making you believe these instances of a general NP-hard problem can't be solved quickly.
When you start tinkering, you either find that data structures can be added that reduce the complexity significantly or that there are regularities in instances that, when exploited, support massive speedups.
I would say that TSP research is an exception, but most of the stuff coming out that has a lot of citations is way too slow and is never as brilliantly implemented as the Lin-Kernighan heuristic or other work from the age of insanely slow computers.
> The problem is that academics are rarely as expert at programming, or as knowledgeable about computer architectures, as someone in industry. There are various tricks that are never taught in college, so academics often have no idea some of this stuff even exists.
I want to push back on this generalization a bit. The academics focused on pushing the mathematical boundaries of discrete optimization are focused, no surprise, on only that. If theoretical improvements are not what you want, don't read those papers; read the ones by people more focused on applications. Read literally any databases paper, or stuff like hyperscan and simdjson. I'd argue that a non-trivial amount of this work is vastly /ahead/ of what's standard in industry, but industry is slow to adopt new ideas and catch up because of legacy software and existing codebases. It's a very similar story in the programming languages sphere; it took ages for industry to adopt ideas that had been popular in academia for a long time (e.g. Rust, LINQ). The idea that academia (at least in CS) is an ivory tower, far from the concerns of industry, hasn't really been true recently. There's a lot of cross-pollination and a revolving door of people going back and forth between the two, spreading ideas.
100x +1. The whole field of query optimization has been devoted to improving the efficiency and speed of database systems for decades. [1] Academic results are studied quite closely in industry. Also, "academic" includes people like Mike Stonebraker and Andy Pavlo. They aren't exactly slouches regarding issues of performance.
More generally, major waves of performance innovation in the IT field have been driven by advances in storage, compression, vectorization, virtualization, etc. Academic results have led to entirely new products like Vertica (from C-Store [2]) or VMware (from the Disco system [3]).
I must say that when it comes to discrete optimization, the genetic/ant/simulated annealing/etc. stuff is more popular in academia than in industry (at least the industry that doesn't heavily include academics).
Works like the Lin-Kernighan heuristic are extremely rare, and a bunch of knowledge exists only in industry. Even that heuristic was implemented incorrectly for decades until one individual (K. Helsgaun) came along and demonstrated its superiority.
I mean, most of the code that gets released today doesn't even handle the time-window constraint in the best way possible (optimal updates, constant-time feasibility evaluations, etc.). I believe none of the open source vehicle routing libraries have any data-structure optimizations for handling time windows, task precedence, dynamic workshift starts, etc. These are all fairly simple implementation tricks that just got lost under giant amounts of population-based heuristics and mathematical linear programming approaches.
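For concreteness, here's a rough sketch of the kind of constant-time time-window feasibility check I mean (illustrative Python with a made-up data layout, not taken from any library; it assumes waiting at a stop is allowed and that the current route is already feasible). You precompute the earliest service start going forward and the latest service start going backward; checking an insertion is then O(1) instead of re-simulating the whole route:

    def precompute(route, open_t, close_t, service, travel):
        # Forward pass: earliest feasible service start at each position.
        n = len(route)
        earliest = [0.0] * n
        earliest[0] = open_t[route[0]]
        for k in range(1, n):
            a, b = route[k - 1], route[k]
            earliest[k] = max(open_t[b], earliest[k - 1] + service[a] + travel[a][b])
        # Backward pass: latest service start that keeps the rest of the route feasible.
        latest = [0.0] * n
        latest[-1] = close_t[route[-1]]
        for k in range(n - 2, -1, -1):
            a, b = route[k], route[k + 1]
            latest[k] = min(close_t[a], latest[k + 1] - travel[a][b] - service[a])
        return earliest, latest

    def insertion_feasible(j, p, route, earliest, latest, open_t, close_t, service, travel):
        # O(1): can stop j be inserted between positions p and p+1 of the route?
        a, b = route[p], route[p + 1]
        start_j = max(open_t[j], earliest[p] + service[a] + travel[a][j])
        if start_j > close_t[j]:
            return False
        arrive_b = start_j + service[j] + travel[j][b]
        return arrive_b <= latest[p + 1]

The same kind of precomputation can often be set up for precedence and workshift constraints too; the point is just that the per-move check shouldn't have to walk the whole route.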
Concorde is fine. The LK heuristic was published in 1973. After that, until the mid 90s, no one could outperform the original published results with the same heuristic.
Honestly, I found the algorithm description cryptic. It may just be me not being "in the know" when it comes to the details, with your "bunch of knowledge [that] exists in industry only" being the details the authors didn't care to elaborate on in the algorithm description. Maybe part of the reason for the failure to replicate is that other people found it cryptic as well and didn't understand crucial details.
BTW, in your opinion, is there any readable text on the LK algorithm? I'm afraid everything I've read so far seems to suffer from this problem. I have a very specific optimization need that doesn't seem to be covered even by LKH-3. It seems my problem could be treated either as an asymmetric TSP with time windows on a small number of nodes and a non-metric distance matrix, or, from the other side, as a VRP with asymmetric distances, an unknown number of vehicles, and a bounded trip duration for every vehicle. Either way, it needs an additional constraint that every vehicle trip should get close to the number of working hours in a day, with the exception of one trip which can be shorter; basically, I need an "ATSP with sleeping breaks". So I may need to roll my own implementation of something, and for an outsider there doesn't seem to be a good text on implementing these things realistically without already having some prior knowledge.
Helsgaun's reports are pretty descriptive when it comes to the LK algorithm.
But yes, I do agree that the original write-up, and the way the algorithm is taught in universities, is a little bit cryptic and sometimes describes a completely different algorithm.
A simple explanation of the algorithm:
1. 2-opt swap (A .. B -> .. -> C .. D ===> A .. C <- .. <- B .. D) - constant-time computation of everything needed to check feasibility and the new cost,
2. recursive 2-opt - now within the reversed segment C .. B, with bounds (you can figure out which swaps are inferior and not recurse into them).
That's all there is to the algorithm. Helsgaun improves it a bunch (with preprocessing, deeper and more efficient k-opt moves, and more efficient data structures) but keeps the same idea; a minimal sketch of the basic move is below.
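To make step 1 concrete, here's a tiny Python sketch (symmetric distances, plain lists, emphatically not Helsgaun's LKH code): the gain of a 2-opt swap is computable in constant time from the four endpoints, and applying it is just a segment reversal.

    def two_opt_gain(tour, dist, i, j):
        # Gain of replacing edges (tour[i], tour[i+1]) and (tour[j], tour[j+1])
        # with (tour[i], tour[j]) and (tour[i+1], tour[j+1]).  O(1), no walk of the tour.
        a, b = tour[i], tour[i + 1]
        c, d = tour[j], tour[(j + 1) % len(tour)]
        return (dist[a][b] + dist[c][d]) - (dist[a][c] + dist[b][d])

    def apply_two_opt(tour, i, j):
        # Reverse the segment tour[i+1 .. j]; the tour length drops by the gain above.
        tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])

    # Usage: scan candidate pairs i < j and take any move with positive gain;
    # Lin-Kernighan then recurses on the reversed segment instead of stopping here.
    tour = [0, 3, 1, 2]
    dist = [[0, 1, 2, 9], [1, 0, 1, 9], [2, 1, 0, 1], [9, 9, 1, 0]]
    g = two_opt_gain(tour, dist, 0, 2)
    if g > 0:
        apply_two_opt(tour, 0, 2)
    print(tour, g)  # [0, 1, 3, 2] 8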
There's another very flexible method called "ejection chains", started by Fred Glover, which is very effective if you couple it with efficient data structures, but it is also very cryptic. It easily supports time windows, breaks, flexible work hours, a flexible number of vehicles, etc. Of course, the latest papers on "ejection chains" show no such thing :D
Sometimes you can stumble upon PhDs publishing their code, and the tricks simply aren't there at all.
And there they try to categorize the time complexity of solving these problems, but the tables are just not up to date with industry. Some operations listed as O(n log n) are actually O(n) or even better (depending on the local moves you do).
But it's a good start. A lot of these subproblems can be implemented really fast on a CPU, and I guess the moment they are explicitly solved in a software library is when the heuristics research will improve.
From what I saw (I worked in an academia-consulting partnership to solve scheduling problems in transport), OR researchers very frequently work in association with OR consulting shops, and they're well aware of the difference between academic datasets and industrial datasets. At the conferences I attended, it was not infrequent to see separate tracks for industrial and academic papers, both attended by the same crowd.
The point I do agree with, though, is that this is not reflected in the papers. Academic papers focus on academic instances because they are more general (and usually harder, as you said) and because optimizations for specific instances of the problem are not that useful from an academic point of view.
It's hard to know who works with who and who has experience with what if you're not an insider, though.
> Stuff you find in the papers there achieves state-of-the-art results very slowly (using integer linear programming or some rarely optimized heuristics), making you believe these instances of a general NP-hard problem can't be solved quickly.
Yes, or they're looking for things like mathematical purity, a linear problem statement, etc.
In practice, you don't need the best solution, and you can get a great solution in a multitude of ways.
You do realize there is a whole area of the research field dedicated to heuristic algorithms? They have just as much of a proper academic basis as the “correct” solutions.
On the one hand, peer review takes long enough already. On the other... I saw an influential paper that published obviously-wrong timing data, essentially saying that time(A) + time(B) < time(B). It seems they were including the Python interpreter startup time (~0.1s) in a profile of a ~0.01s algorithm.
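For illustration (a hypothetical micro-benchmark, not the paper's code): timing the run from the shell with "time python bench.py" also counts interpreter startup and module imports, while timing inside the process measures only the algorithm.

    # bench.py
    import random
    import time

    data = [random.random() for _ in range(100_000)]

    t0 = time.perf_counter()
    sorted(data)                                # stand-in for "the algorithm"
    print(f"{time.perf_counter() - t0:.4f} s")  # only the algorithm, not startup

    # Timed from the shell instead, the same script also pays ~0.1s of
    # interpreter startup, which can easily dwarf a ~0.01s algorithm.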
Benchmarking papers are inaccurate when the original algorithms are not open-sourced and the grad student has to rewrite the algorithm from scratch. They can easily get implementation details wrong and wind up with an implementation that's slower than the original.
I do think that the original algorithm authors should have the opportunity to correct the benchmarking code, or to release their original implementation as open source to be benchmarked.
In some sense, the benchmarking paper with a slower implementation is more "correct," since an engineer evaluating which algorithm to use is just as likely to implement the algorithm in a slower way than the original paper did. The incentives are right, too: the original paper's authors should be giving enough detail to recreate their work, and the benchmarker is showing that, as actually published, their algorithm is slow.
It would be nice if pure benchmark papers were a thing. Most of the time, systems papers get accepted for some new idea, and the evaluation section is often biased towards that new idea. Independent benchmarks could fix this.