The correct answer to parsing JSON is... don't. We experimented last hack day with building Netflix on TVs without using JSON serialization (Netflix is very heavy on JSON payloads) by packing the bytes by hand to get a sense of how much the "easy to read" abstraction was costing us, and the results were staggering. On low-end hardware, performance was visibly better, and data access was lightning fast.
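To give a rough sense of the idea, here's a minimal sketch with a made-up two-field record (this isn't our packer, and our real layouts were much bigger):

  // Hypothetical record: uint32 movieId + uint8 rating, little-endian,
  // written at fixed offsets instead of going through JSON.stringify.
  function packRow(movieId, rating) {
    var buf = new ArrayBuffer(5);
    var view = new DataView(buf);
    view.setUint32(0, movieId, true); // true = little-endian
    view.setUint8(4, rating);
    return new Uint8Array(buf);
  }

  // The receiver reads the same offsets back -- no parse step at all.
  function unpackRow(bytes) {
    var view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    return { movieId: view.getUint32(0, true), rating: view.getUint8(4) };
  }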

Michael Paulson, a member of the team, just gave a talk about how to use flatbuffers to accomplish the same sort of thing ("JSOFF: A World Without JSON"), linked in this thread: https://news.ycombinator.com/item?id=12799904



Not sure what your point is (or the point of that presentation, for that matter).

Of course there are binary serialization formats that are faster than XML or JSON, and of course they're less error-prone. This has been known for about 40 years now.

JSON/XML are used precisely because people want a human-readable interchange format. For high-performance uses, consider Google's Protocol Buffers or Boost.Serialization. You're acting like you just hackathoned the biggest thing since sliced bread, but that's exactly how payloads have been sent since the inception of the Internet (until high bandwidth made us all lazy).


From experience, I think the whole "human-readable" idea is a bit overrated. All it means is that the format is entirely or mostly ASCII. But if you have a hex editor, as all good programmers should, binary formats are no less human-readable (or writable), nor any more difficult to work with; for some, even a text editor with CP437 or some other distinctive SBCS will suffice after a while. It's somewhat like learning a language; and if you are the one developing the format, it's a language you create.

Then again, I grew up working with computers at a time when writing entire apps in Asm/machine language was pretty normal, along with other things that would be considered horribly impossible by many developers of the newer generation, and I can mentally assemble/disassemble x86 to/from ASCII, so my perspective may be skewed... just a tiny little bit. ;-)


But the phrase is "human-readable" and not "programmer-readable".


A minor gripe with your comment, but as a programmer presumably must be human, both conditions are satisfied when a programmer is capable of reading it.


I thought my point was clear - don't get involved in parsing JSON; I agree with the OP that parsing JSON is a minefield. I went further by implying that it is also unnecessary when ease of reading isn't needed, and called out some alternatives. I find it amusing that you mentioned Protocol Buffers - were you aware that the flatbuffers I mentioned were built in response to performance inefficiencies in the very Protocol Buffers you suggest?

We didn't just "hackathon the biggest thing since sliced bread," btw; we took a real-world example of exchanging a human-readable format for a human-with-tools-readable one and saw a significant win. High bandwidth also isn't as prevalent as you think, and yes, you're generally paying performance-wise, and occasionally monetarily, for the laziness you mention. But then, if you've known this for 40 years and don't know how to measure it, there's not much I can do for you in a comment.


I believe that is the point: choose the serialization strategy that fits the job. Most projects default to JSON regardless of how suitable it is. At some scale that should be revisited, since the human-readability/performance trade-off can change.


I want to use something like flatbuffers in NodeJS for optimizing websocket traffic and implementing an FS database, but I can't find much for it in JavaScript. Do you (de)serialize the flatbuffers, or use them directly by abstracting get/set, for example via Object.defineProperty?


Google included a complete example of how to (de)serialize data with them, as well as generating the accessor functions for JS. See https://google.github.io/flatbuffers/flatbuffers_guide_use_j...
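Roughly, the round trip looks like this (a minimal sketch assuming the guide's Monster schema has already been compiled with flatc --js; your schema and namespace will differ):

  var flatbuffers = require('flatbuffers').flatbuffers;
  var MyGame = require('./monster_generated').MyGame; // flatc --js output

  // Serialize: strings/vectors first, then the table, then finish.
  var builder = new flatbuffers.Builder(1024);
  var name = builder.createString('orc');
  MyGame.Example.Monster.startMonster(builder);
  MyGame.Example.Monster.addName(builder, name);
  var orc = MyGame.Example.Monster.endMonster(builder);
  builder.finish(orc);
  var bytes = builder.asUint8Array();

  // "Deserialize": no parse step; accessors read straight from the bytes.
  var buf = new flatbuffers.ByteBuffer(bytes);
  var monster = MyGame.Example.Monster.getRootAsMonster(buf);
  console.log(monster.name()); // 'orc'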

No, I don't try to decode them directly if I can avoid it. We hand-rolled a C-based byte packer for our honeybadger project only because function access is relatively slow on the interpreters we have on low-end TVs, but reading blocks through a Uint8Array is pretty fast. Write-up is here: http://alifetodo.blogspot.com/2016/05/project-honeybadger-pi... . I can push the C packer to GitHub if there's interest, but since you mention NodeJS, you might have better luck with Paulson's NodeJS benchmark example: https://github.com/michaelbpaulson/flatbuffers-benchmarks
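For what it's worth, the Object.defineProperty approach you describe would look something like the sketch below (hypothetical fixed layout, not our honeybadger format). Just be aware that the getter-call overhead is exactly what bites on slow TV interpreters, which is why we read offsets directly instead:

  // Hypothetical layout: uint32 id at offset 0, uint16 score at offset 4.
  function wrapRecord(bytes) { // bytes: Uint8Array over the packed payload
    var view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    var record = {};
    Object.defineProperty(record, 'id', {
      get: function () { return view.getUint32(0, true); } // little-endian
    });
    Object.defineProperty(record, 'score', {
      get: function () { return view.getUint16(4, true); }
    });
    return record;
  }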


On consideration, I don't think flatbuffers are worth the added complexity. Using what are basically C structs is certainly faster, and suitable for low-end devices and clients written in C. But JSON has many advantages.


You may want to take a look at Protocol Buffers or Amazon's new Ion format; the latter can seamlessly switch between binary and human-readable representations.


Yup, I'm a fan of protobufs and the more recent flatbufs. I definitely like the tooling around being able to seamlessly switch for readability. I'll check out Ion.


Unfortunately one often has to interact with APIs which require JSON, but I completely agree that pure binary formats are far simpler to work with.



