So in this case, with some modest performance engineering, Golang is surprisingly fast out of the box, and Rust requires more effort and increasingly less idiomatic code to reach the same result.
I haven't dug into the code, but it's clear that the Golang library has had a ton of optimization work put into it by very knowledgeable people. Techniques like object pooling are highly error prone, and certainly not "out of the box" Golang.
The Rust code and the blog post, on the other hand, seem to be written by someone less familiar with Rust and high-performance parsing. I think they would have avoided all their problems if they had just used lifetimes to safely avoid copying from the start, instead of relying on increasingly elaborate workarounds like the `bytes` crate. Apart from one "forces one to deal with the contagion of lifetimes" comment in the conclusion, they never mention why they didn't do this, even though it's clearly the idiomatic Rust solution. Maybe they had technical reasons for not using lifetimes, but to me it just looks like unfamiliarity with Rust.
> Techniques like object pooling are highly error prone, and certainly not "out of the box" Golang.
This one actually is out of the box in Golang: it's called sync.Pool and lives in the standard library. It's very easy to use and not error-prone; I've used it many times without any issues.
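For readers who haven't used it, a minimal sketch of the usual sync.Pool pattern (the buffer type and the `process` helper here are illustrative, not from the article):

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers; New is only called when the pool is empty.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

func process(data string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	defer func() {
		buf.Reset() // reset before returning, or stale data leaks into the next use
		bufPool.Put(buf)
	}()
	buf.WriteString("processed: ")
	buf.WriteString(data)
	return buf.String() // String copies, so it's safe to recycle buf afterwards
}

func main() {
	fmt.Println(process("a"))
	fmt.Println(process("b"))
}
```

The only real rule is the one in the comment: reset state before putting an object back, since Get may return any previously pooled value.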
But the creator of VictoriaMetrics is indeed someone very knowledgeable, and well known in the Golang community for optimization work.
I’m unfamiliar with all of the libraries used here; however, the solutions from the blog post struck me as quite odd. serde::Deserializer solves a similar problem and makes it quite straightforward and safe to borrow from the input where possible. Obviously, the input needs to outlive the borrowed values.
> Apart from one "forces one to deal with the contagion of lifetimes" comment in the conclusion
I guess that must be the logical consequence of the “async functions are contagious” meme… I wonder if at some point we'll end up with people arguing that dynamic typing is obviously better because it avoids “contagion with types”.
For a language designed so that a fresh-out-of-college engineer can pick it up in a few weeks and be effective, it is very easy to squeeze a lot of performance out of it.
* Built-in profiler.
* Built-in escape analysis tool.
* It's easy to pass pointers instead of copying data.
* []byte is sub-slice-able, with a shared backing array. This does throw people off occasionally, but the trade-off is performance.
* Go lets you have real arrays of structs, optimizing CPU cache usage.
* Built-in memory pools.
And more.
And if you look at "non-idiomatic" performance code, it is surprisingly legible to said fresh engineer. It's as if the designers didn't want to give up the usual C performance tricks while making a Java/Python kind of friendly language, and it shows.
Of course Go can only go so far, due to the built-in runtime and GC. But it gets very far. Much farther than a first glance, or the second glance a language snob would give it, suggests.
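The sub-slicing point above is worth illustrating, since it is both the performance win and the occasional surprise: slicing never copies, so sub-slices alias the same backing array (a minimal sketch, not from the article):

```go
package main

import "fmt"

func main() {
	buf := []byte("hello, world")

	// Sub-slicing is O(1): the new slice header just points into buf's
	// backing array. No bytes are copied.
	word := buf[7:12]
	fmt.Println(string(word)) // world

	// The flip side that "throws people off": writes through one slice are
	// visible through the other, because both share the backing array.
	word[0] = 'W'
	fmt.Println(string(buf)) // hello, World
}
```

This is exactly what makes zero-copy parsers cheap in Go, and exactly why holding a sub-slice can keep a large buffer alive or expose it to mutation.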
While it’s a nice bedtime story if you like Go, the reality is no, it’s not typical.
Go has a good perf story, but typically Rust or C++ will be faster after heavy optimisation, and should be more or less on par for typical applications. This isn’t a critique of Go, and shouldn’t surprise anyone.
Typically go also has unexpected optimisation hoops to jump through and problems related to the heavy use of channels (see the well documented answer here: https://stackoverflow.com/questions/47312029/when-should-you...), so you would generally expect it to be slower…
…but, naive implementations are always slower, and really, it’s probably much of a muchness out the box for most day to day uses.
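A minimal sketch of the trade-off the linked Stack Overflow answer discusses: the same shared counter can be guarded by a mutex or serialized through a channel, and for fine-grained work like this the channel version carries extra scheduling overhead (the counter itself is illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

// mutexCount increments a shared counter n times under a mutex.
func mutexCount(n int) int {
	var mu sync.Mutex
	var wg sync.WaitGroup
	count := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			count++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return count
}

// channelCount funnels every increment through a channel to a single
// owner goroutine, which is the "share memory by communicating" style.
func channelCount(n int) int {
	inc := make(chan struct{})
	done := make(chan int)
	go func() {
		count := 0
		for range inc {
			count++
		}
		done <- count
	}()
	for i := 0; i < n; i++ {
		inc <- struct{}{}
	}
	close(inc)
	return <-done
}

func main() {
	fmt.Println(mutexCount(1000), channelCount(1000))
}
```

Both are correct; the point from the linked answer is that channels are not free, so hot paths often end up with mutexes or atomics instead.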
In almost all situations (even python or Java) you can get great performance if you invest time and effort in it.
But is idiomatic Go code typically faster than Rust? No, not really.
It's really easy to improve the performance of a Go implementation compared to Python or Java. There are lots of built-in tools to help you (like the profiler), and the resulting code is very legible even to fresh college grads.
This is based on my first hand experience, but YMMV.
While I am not a big Go fan, most people evaluating Go's performance conflate the language and the implementation, and forget that gccgo also exists, sharing many optimizations offered by GCC's backend.
Unfortunately, it seems stuck at Go 1.18, without generics support, and with no roadmap for moving forward.
Given the Go folks' long-standing stance on generics, a lot of Go code still compiles with gccgo.
The Go code is already hyper-optimized by experts, just not by the blog authors, so you don't read about it here. As someone who has tried to write high-performance Go code on occasion, I can assure you that a ton of digging would have been required on that side as well.
I don't see why you labeled Aliaksandr Valialkin, the author, an "expert". I mean, he's no dummy, but what exactly makes him an expert on optimizing Go code?
As someone who also writes Go, I don't see any "hyper optimizations" in the code. It just decodes Protocol Buffers bytes using straightforward code that I would expect a competent developer to write.
It really is just: read bytes from memory ([]byte) and interpret them according to the PB spec.
There's only one trick there: unsafeBytesToString() that does no-allocation conversion of []byte to string. This is unsafe in general but safe in their specific case. And I've seen this trick before so it's not some secret, expert-only knowledge.
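For reference, a sketch of that trick using the Go 1.20+ `unsafe` helpers (older code did the same thing with reflect headers); as the comment notes, it is only safe if the byte slice is never mutated afterwards, since Go strings are assumed immutable:

```go
package main

import (
	"fmt"
	"unsafe"
)

// unsafeBytesToString reinterprets b as a string without allocating or copying.
// Safe only if b is never mutated after the call: the returned string aliases
// b's backing array, and Go strings must be immutable.
func unsafeBytesToString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	b := []byte("hello")
	s := unsafeBytesToString(b)
	fmt.Println(s)
}
```

The safe equivalent, `string(b)`, copies the bytes; skipping that copy is the entire point of the trick in a hot decoding path.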
Most comments here are like bad LLMs: hallucinating opinions without bothering to spend even a few minutes acquiring the data to base those opinions on.
He’s probably most well known in the golang community for fasthttp which is a widely used and highly optimized golang replacement for the golang stdlib. I’m a long term golang developer and I think calling him a golang optimization “expert” is fair.
That said, I agree with your assessment about this particular code. It’s fairly straightforward idiomatic go.
> I don't see why did you label Aliaksandr Valialkin, the author, an "expert". I mean, he's no dummy but what exactly makes him an expert on optimizing Go code?
I was trying to convey the meaning of "far more experienced than the blog post authors", but without having to insult the authors. It's a good writeup after all, and I'm glad they took the time.
We must have some different interpretations of what "optimized" means. This is the very first piece of code in the file you linked:
func (fc *FieldContext) NextField(src []byte) ([]byte, error) {
    if len(src) >= 2 {
        n := uint16(src[0])<<8 | uint16(src[1])
        if (n&0x8080 == 0) && (n&0x0700 == (uint16(wireTypeLen) << 8)) {
            // Fast path - read message with the length smaller than 0x80 bytes.
            msgLen := int(n & 0xff)
            src = src[2:]
            if len(src) < msgLen {
                return src, fmt.Errorf("cannot read field for from %d bytes; need at least %d bytes", len(src), msgLen)
            }
            fc.FieldNum = uint32(n >> (8 + 3))
            fc.wireType = wireTypeLen
            fc.data = src[:msgLen]
            src = src[msgLen:]
            return src, nil
        }
    }
    // ... function continues beyond this point
As far as I can tell, this entire codepath exists solely as an optimization. I spent many years working on a chess engine for fun, so I'm pretty well versed in bit twiddling, but I'm seriously struggling with this. Like, is it doing `(n&0x8080 == 0)` to check whether the length is less than 0x80? Is that even correct?
I think "hyper optimized" is a completely fair characterization. But we clearly work in different industries.
Protobuf uses a bunch of variable length encodings. Here it's decoding a TLV format, but the length is itself a variable length integer (and seems to be a kind of tag-value encoding?) where you basically get 7 bits per byte telling you the value, and the leftmost bit tells you whether there's another byte. So if you mask with 0x8080 and get zero, then it was a 1 byte (7 bit) integer.
If neither 0x8080 bit is set, then the tag-value record is 2 bytes: the left byte has the tag, the right the value. Then they mask with 0x0700 to get the type of record, which should be LEN.
So if it's a single byte LEN record, they can take that single byte as the length (they mask with 0x00ff, but really it's 0x007f. They already know the 0x80 bit is zero, and the value is contained in the least significant 7 bits). Otherwise they have to do some fiddly logic to decode the variable length integer to figure out the length (length here being the L in TLV).
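The masks check out on a concrete input. Field number 1 with wire type LEN (2) encodes its tag byte as (1<<3)|2 = 0x0A, so for a field with a 3-byte payload the first two bytes are {0x0A, 0x03}, and each fast-path condition from the quoted code holds (a sketch reusing the same constant name):

```go
package main

import "fmt"

const wireTypeLen = 2 // Protobuf wire type for length-delimited (LEN) fields

func main() {
	// Field number 1, wire type LEN, payload length 3:
	// tag byte = (1 << 3) | 2 = 0x0A, length byte = 0x03.
	src := []byte{0x0A, 0x03}
	n := uint16(src[0])<<8 | uint16(src[1])

	// 0x8080 masks the varint continuation bit of both bytes: if neither
	// is set, the tag and the length are each single-byte varints.
	fmt.Println(n&0x8080 == 0) // true

	// 0x0700 masks the wire-type bits of the tag byte.
	fmt.Println(n&0x0700 == uint16(wireTypeLen)<<8) // true

	// Field number sits above the 3 wire-type bits of the tag byte;
	// the length is the low byte (its 0x80 bit is already known to be 0).
	fmt.Println(n>>(8+3), n&0xff) // 1 3
}
```

So `(n&0x8080 == 0)` isn't checking "length < 0x80" directly; it's checking that both varints fit in one byte each, which implies the length is at most 0x7f.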
I'm not sure the presence of bit unpacking code in a decoder for a bit packed protocol is sufficient to call it hyper optimized. That seems like the nature of the problem.
> I don't see why did you label Aliaksandr Valialkin, the author, an "expert". I mean, he's no dummy but what exactly makes him an expert on optimizing Go code?
https://victoriametrics.com/team/ - Let's see. Author of multiple performance-optimized libraries with a masters degree in computer software engineering and a background in highly scalable systems (as needed for adtech). Sounds pretty much like an expert for optimizing code to me.
> And I've seen this trick before so it's not some secret, expert-only knowledge.
So if you know it, it's not expert knowledge? Or what's your argument here?
Yes, and this is what I was curious about, because I haven't seen the performance cost of this discussed a lot.
Worded differently: does the strict concept of ownership/lifetimes in Rust bias a default (naive) implementation towards lower performance (e.g. due to required copying) when compared to a naive Golang (or even Java) implementation?
I have no doubts that after heavy optimization, Rust beats languages such as Go & Java.
Using clone or Arc with boxing everywhere, to avoid using references with lifetimes at all, will lead to code that's slower than Go/Java, yes. The exceptions: you're only cloning small objects that don't internally use heap allocations, or your algorithm only needs moves rather than sharing. (Such code will likely use less RAM, which in some situations may still make it faster overall.) But such "newbie" code will probably still call into existing code that uses references internally, which speeds up those parts. Also, the difficulty of using references varies with how long, or how indirectly, they are held; in many places references are easy to deal with even for a beginner. So it becomes a question of how much the code relies on clone and reference counting.
When I learned Rust, I actually never followed the "use clone or Arc to make your life easier while learning" recommendation; I always used references and learned how to use lifetime declarations and program design to go as far with them as reasonable. To be fair, I already had experience with C and C++. But once you're reasonably experienced in Rust (after a year?), your code should be fast most of the time as you write it on the first try, without needing optimization work.
> In Go, a string is just a simple wrapper around []byte, and deserializing a string field can be done by simply assigning the original buffer's pointer and length to the string field. However, Rust's PROST, when deserializing String type fields, needs to copy the data from the original buffer into the String
It's interesting that the protobuf code generator for Go seems to allow this direct access. For C++, you also need a copy (and potentially a heap allocation) since `string` fields are returned via `const std::string&`. Protobuf support for `std::string_view` has been years in the making.
FlatBuffers does the trick as a better replacement for Protobuf when running on resource-constrained devices. We were working on a project that leverages Arrow IPC (internally FlatBuffers) and shared memory to collect metrics on edge devices with limited CPU and memory; hopefully we can open source it soon.
With indexing patterns like `self.vec[..self.len]` and setting `self.len = 0` to clear, you avoid the cost of dropping all items at once (as Vec::clear does). However, you still have to drop items eventually, so with this solution the cost is amortised across `RepeatedField::push` and other methods that do `self.vec[i] = new_item`.
> It's designed to avoid the drop overhead
That isn't true. You don't avoid the overhead; at most you delay/amortise it.
WriteRequest::timeseries is a vector (https://github.com/prometheus/prometheus/blob/main/prompb/re...) and the repeated fields `Timeseries::labels` and `Timeseries::samples` are reused across different timeseries. You don't have to allocate a new vector for the labels and samples of each new timeseries instance.
That would be true if you used `Vec::clear` too, it doesn't allocate a new vector. My point was that you still end up running Drop implementations with RepeatedField<T>, just not all at once. See https://play.rust-lang.org/?version=stable&mode=debug&editio...
prost is the most widely used Protobuf implementation in Rust, maintained by the Tokio organization. prost generates structs and serialization/deserialization code for you.
easyproto, according to GitHub search, is used by only two projects. easyproto provides primitives for serializing and deserializing Protobuf, and requires hand-writing code for both.
A fair comparison would be prost vs google.golang.org/protobuf, or easyproto vs parts of quick-protobuf.
In most cases you can make Go as fast as Rust, but in my experience writing performance-sensitive code in Go requires a significantly larger time investment and deeper language expertise. Pebble (a RocksDB replacement in Go by CockroachDB) is a good example of this: the codebase is littered with hand-inlined[1] functions and hand-unrolled loops, and it isn't[2] even using Go memory management for performance-critical parts; it uses the C memory allocator and manual memory management.
Yes, but their code appears to actually be `unsafe` (in the Rust terminology sense) without saying so in their function declarations. They use `unsafe` inside their `slice` function, but return a value that is unsafe to use, hence `slice` should be marked `unsafe`, as should `copy_to_bytes` and then `merge_bytes`. Same for PromLabel::merge_field and PromTimeSeries::merge_field as far as I can see, and maybe higher up in their actual app. This is definitely not how Rust code is supposed to work: if a function isn't marked unsafe, it should not be able to introduce UB, and they violate that. Security-wise, this approach is only on par with C/C++ code if programmers are aware of the pitfalls, which normally isn't the case here, since Rust programmers expect non-unsafe functions to be safe (i.e. not to require additional care to avoid undefined behaviour).
They either need to mark their functions `unsafe`, or use lifetimes (which may require changes in some APIs, which may be the reason they didn't).
I was looking at the main branch, and described the situation there. They have a different branch for the optimization work; in that branch, they do mark those functions as `unsafe` (and already did when I posted).
In the image at the top of the article, why is the Rust crab altered to have "angry" eyes and to hold a knife aimed at the Go gopher? Aside from the joke of "don't bring a knife to a protobuf fight", the implication of violence sucks and lessens the spirit of friendly competition and "all in good fun". I don't know if Rust has a code of conduct or rules for use of its mascot, but I bet this doesn't follow them.
Congratulations! You've won the Poe's Law Post of the Day Award!
"Poe's law is an adage of Internet culture which says that, without a clear indicator of the author's intent, any parodic or sarcastic expression of extreme views can be mistaken by some readers for a sincere expression of those views."
But it does appear to be the author's intent, considering their Twitter account has a photo of the crab using the gopher's carcass (with dead eyes) as a carpet, referencing the same article (if you translate the Chinese). Also, the author went out of their way to use a CrabLang logo (not a Rust logo) to add the knife.
https://x.com/ratuthomm/status/1775183479858483439
https://imgur.com/a/txMb4Kw
What intent? Have you asked the author what their intent is? The linked Twitter post does not have a knife in sight. What is wrong with using the CrabLang logo when he was using valid CrabLang?
Maybe you're not thinking about it enough? Do you know anyone who has been almost fatally stabbed (attacked by gangs) with knives? I do! It likely violates a code-of-conduct and is unprofessional.
Isn’t it tiring to think about every possible offence in the world and how someone somewhere will be offended by (knives, tires, cars, pens, saws, sharks, planes, needles, ropes, speakers, games, … you get the point) while writing an article about a programming language?
At some point one needs to exercise common sense and learn to live in a public society where people speak and not everyone is out to offend you.
I have no association with any of these communities, but the crab holding a knife was a somewhat well-known meme[1].
I guess it can also be viewed as a play on words, given that crablang is a fork.
Given that the creators of crablang explicitly say it was a "lighthearted response"[2] to some of Rust's changes, it makes sense that they'd use a meme for a logo.
That said, seems you're not alone in wanting a different logo[3].
Ah, I think I remember that image meme from a long time ago (I never would have connected those dots). Thanks for the context here, it actually helps take the edge off!
That's the challenge with these inside jokes. If you don't get the context (and I didn't recall it instantly either), the interpretation will usually be wildly different.
Exactly, then why does it matter that the author had anything in their post as a figure of speech or analogy?
Is it wrong to post a meme of a dog sitting near fire - https://knowyourmeme.com/memes/this-is-fine
As a joke from SREs who handle firefighting calls?
Does it offend dog lovers, people who are scared of fire?
When I see a knife, it never suggests a weapon, unless it is the kind of knife that is really a weapon, like a double-edged stiletto, or it is wielded by someone who has obvious intentions to use it as a weapon.
The knife from that logo does not look like a weapon, but just like a standard utility knife. Moreover, a crab cannot move a knife in the way in which it is used as a weapon, e.g. for stabbing, but only in a way similar to a human who eats using a knife. Therefore a crab does not suggest someone who uses a knife for violent purposes.
For people like myself, who do not buy industrially-made food, there is no other tool more important than knives. Without using knives every day, I would starve to death.
So you may be offended by seeing a knife that in your mind looks like a weapon, but I am offended when someone claims that one of the most essential, if not the most essential tool of the humans suggests violence or other bad things.
Some people may use knives seldom or never, but then their lives are completely dependent on the work of other humans who use knives to produce the things that sustain the lives of those who do not use knives.
Thank you. Sometimes these trains of sensitivity in thought threads can be annoying.
I definitely appreciate your contextual perspective and the ability to express it much more clearly and calmly than my own initial reaction.
I submit that we, as a society, need to see more Tex Avery cartoons as children.
I genuinely feel that either a new civil war in the US or a global conflict is imminent, and the young of today are largely ill-prepared, to say the least.
It’s simple. You ignore fire as a bad thing and don’t even consider it an equivalent comparison to a knife. But someone who was affected by arson would probably say “why include fire in a blog post?!”. My point is, it is not on the author to think about all these effects when they write/speak. As a mature society, we should learn not to expect every source to filter their thoughts, and instead expect the consumer to filter out what they don’t want.
By not giving fire/arson the same level of concern as the knife (which is important to you), you just validated the core of the problem.
How about the image of the dog / fire viewed by someone who was orphaned and horribly scarred for life as a child in a household fire?
Trauma exists in all forms in our world. Sometimes the extremity of trauma is used in jest simply because it’s so extreme and at odds with the situation. That’s a form of absurdist humor that absolutely runs the risk of triggering someone that the extreme situation is personal to, but was never intended to hurt anyone.
I think almost everyone considers that situation, someone triggered by personal trauma on seeing a cartoon crab attack a cartoon gopher with a knife on a blog post comparing the performance of two programming languages, sad, and has empathy for those suffering from relived trauma. That must be debilitating in life, and no one is insensitive to that level of embodied suffering.
However, likewise, almost no one feels sympathy for the person who pulls out a code of conduct to kill any cartoon humor not designed for the Sunday serialization of a national newspaper.
Is this typical?