Author here. This was self-posted, so hope you find it useful. Please ask me any...

Ozzie_osman · on Dec 31, 2020

Great writeup. One thing that could be expanded on is the "append-only log" approach to message queues, which is the abstraction Kafka uses. The paradigm itself is extremely intuitive and performant: producers write to the end of the log, and consumers maintain an offset of how far they've gotten.

That pattern is really powerful and simple, even if the main tool using it (Kafka) ends up being difficult to operate.
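
To make the pattern concrete, here is a minimal in-memory sketch of an append-only log with per-consumer offsets. It is only an illustration of the idea, not Kafka's API: the names `AppendOnlyLog`, `Consumer`, `append`, `read`, and `poll` are made up for this example, and there is no persistence, partitioning, or retention.

```python
class AppendOnlyLog:
    """In-memory append-only log; records are never modified or removed."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Producer side: write to the end of the log and return the record's offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Return up to max_records records starting at the given offset."""
        return self._records[offset:offset + max_records]


class Consumer:
    """Each consumer tracks its own offset; the log keeps no per-consumer state."""

    def __init__(self, log, offset=0):
        self.log = log
        self.offset = offset

    def poll(self, max_records=10):
        batch = self.log.read(self.offset, max_records)
        self.offset += len(batch)  # advance only past what was actually read
        return batch


log = AppendOnlyLog()
for event in ["signup", "login", "purchase"]:
    log.append(event)

fast = Consumer(log)
slow = Consumer(log)
print(fast.poll())               # ['signup', 'login', 'purchase']
print(slow.poll(max_records=1))  # ['signup']
log.append("logout")
print(fast.poll())               # ['logout']
print(slow.poll())               # ['login', 'purchase', 'logout']
```

Because reading never removes anything, a new consumer can start at offset 0 and replay the whole history, and consumers at different speeds do not interfere with each other.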

heipei · on Dec 31, 2020

Thanks for the guide, as others have said this is good introductory reading for new engineers. I know I really only learned about dedicated message queues 1-2 years into my career and felt really dumb for not having known/thought about message queuing before.

One thing you might consider adding is more examples of different popular queuing systems and how they differ from one another. The software I always reach for is nsq (https://nsq.io/) because it's meant to run co-located with the message producers, and readers are supposed to connect to multiple instances where the messages are produced (using a lookup daemon). This is quite different from the queues on your list, so much so that I'd consider adding it just because it works so differently.

sudhirj · on Dec 31, 2020

Will add NSQ notes. Researched it once for distributed web socket pub sub, did find it pretty interesting.

sudhirj · on Dec 31, 2020

Added NSQ notes.

alexbouchard · on Dec 31, 2020

Great write up. I think you have clearly laid out the things that a newcomer to the space should know about queues. Bookmarked!

Talking to a bunch of engineering teams, I found that some use cases for queues are very generic (almost identical use case and implementation across teams). Specifically, webhook handling is something that keeps coming up. We've been working for a few months on a queue that's specifically for ingesting and delivering webhooks. Do you see a future for use-case-specific queueing systems instead of defaulting to a general-purpose queue?

In our case we abstract the actual implementation and behave more like you would expect a standard webhook.

For reference, it's https://hookdeck.io

sudhirj · on Dec 31, 2020

The way I’ve handled incoming webhooks is to run a Lambda+API Gateway to put them into an SQS queue, and the inverse, SQS to Lambda, for sending webhooks out.

This was possible only because AWS provides these services, of course. If you’re offering an infinitely scalable HTTP endpoint to soak up webhooks and allow me to query them at my leisure, or put them into a queue for me, that would be useful.

I haven’t looked into hookdeck in detail yet, will post again once I do.

alexbouchard · on Dec 31, 2020

That's essentially it.

We've heard from teams having issues dealing with large uncontrollable spikes from their webhook providers, and we can smooth that out. There are additional benefits that can be introduced before it gets to your own infra, such as verifying signatures, filtering events, etc.
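
As a rough illustration of what smoothing a spike means (a generic sketch, not a description of how hookdeck is implemented): incoming events go into a buffer, and a worker forwards at most a fixed number per tick. The arrival counts and the limit of 10 per tick are made-up numbers, and `drain` is a hypothetical helper.

```python
from collections import deque


def drain(queue, max_per_tick):
    """Forward at most max_per_tick buffered events and return the forwarded batch."""
    batch = []
    while queue and len(batch) < max_per_tick:
        batch.append(queue.popleft())
    return batch


buffer = deque()
arrivals = [50, 0, 0, 5, 0, 0]  # hypothetical number of webhooks arriving per tick
delivered_per_tick = []
next_id = 0

for count in arrivals:
    # The provider sends a burst; everything is accepted into the buffer.
    for _ in range(count):
        buffer.append(next_id)
        next_id += 1
    # Downstream only ever sees up to 10 events per tick.
    delivered_per_tick.append(len(drain(buffer, max_per_tick=10)))

print(delivered_per_tick)  # [10, 10, 10, 10, 10, 5]
```

The burst of 50 is spread over several ticks, so the downstream service sees a steady rate instead of the spike.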

API Gateway + SQS + Lambda is definitely a common and good approach. My understanding is that you often start running into other problems. Hitting DB connection limits from serverless invocations is a recurring one! I'm hoping we can make the troubleshooting / replayability easier as well.

Thanks for sharing your approach and opinion! Hoping to hear more!

wdb · on Dec 31, 2020

Are you aware of any nice articles or open source projects for setting up the ability to dispatch webhook events? I am currently thinking of using GCP PubSub for it and having it consumed by a consumer (GCP Cloud Function?) which does the network request and requeues the message back to the topic when it gets a non-200 response. If it keeps failing 10 times, it will get sent to a dead letter queue.

sudhirj · on Dec 31, 2020

Not off the top of my head, no. Your plan sounds good except for the "and requeues it back" part. Ideally you should just ignore failure (don't acknowledge/delete) and have the queue control plane decide when and how to retry—unless you have special logic (exponential backoff?) around that. If you do need to re-queue, just make sure you re-queue before you delete/ack the current message, otherwise you might lose jobs.
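
A toy sketch of the difference, using an in-memory stand-in rather than the real Pub/Sub or SQS client: `TinyQueue`, `expire_in_flight`, `deliver`, and `requeue_then_ack` are hypothetical names, and the expiry step only imitates what a managed queue's redelivery and dead-letter policy would do for you.

```python
from collections import deque


class TinyQueue:
    """Toy queue: a received message stays 'in flight' until it is acked."""

    def __init__(self, max_receives=10):
        self.ready = deque()
        self.in_flight = {}
        self.dead_letters = []
        self.max_receives = max_receives
        self._next_id = 0

    def send(self, body):
        self.ready.append({"id": self._next_id, "body": body, "receives": 0})
        self._next_id += 1

    def receive(self):
        if not self.ready:
            return None
        msg = self.ready.popleft()
        msg["receives"] += 1
        self.in_flight[msg["id"]] = msg
        return msg

    def ack(self, msg):
        self.in_flight.pop(msg["id"], None)

    def expire_in_flight(self):
        """Stand-in for an ack deadline: unacked messages are retried or dead-lettered."""
        for msg in list(self.in_flight.values()):
            del self.in_flight[msg["id"]]
            if msg["receives"] >= self.max_receives:
                self.dead_letters.append(msg)
            else:
                self.ready.append(msg)


def deliver(body):
    """Hypothetical webhook call; returns an HTTP status code."""
    return 200 if body == "ok-event" else 500


def requeue_then_ack(queue, msg):
    """Only for custom retry logic: enqueue the new copy first, then ack the old one."""
    queue.send(msg["body"])
    queue.ack(msg)


queue = TinyQueue(max_receives=10)
queue.send("ok-event")
queue.send("bad-event")

while True:
    msg = queue.receive()
    if msg is None:
        queue.expire_in_flight()
        if not queue.ready:
            break
        continue
    if deliver(msg["body"]) == 200:
        queue.ack(msg)  # success: the message is deleted
    # failure: do nothing, so the queue itself redelivers the message later

print([m["body"] for m in queue.dead_letters])  # ['bad-event']
```

The consumer never re-queues anything; the failing message simply goes unacked ten times and ends up in the dead-letter list. If the order in `requeue_then_ack` were reversed and the process crashed between the two calls, the job would be gone.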

alexbouchard · on Dec 31, 2020

Is that something you would be interested in a hosted solution for? We've built the infrastructure to deal with incoming webhooks but we've been exploring the idea to also leverage it for dispatching. Turns out the same infrastructure works both ways!

mindentropy · on Jan 1, 2021

I am from an embedded/operating system background. I would like to know whether the ring buffers/circular queues in drivers or stacks suffer from the same problems people are discussing here?

bambataa · on Dec 31, 2020

Thank you for putting this together. I am reading it with interest.

I spotted one small typo and thought you would like to know:

“like transferring information form your software into an email or an SMS on the cellphone network.”

sudhirj · on Dec 31, 2020

Thanks, fixing.

yogevyuval · on Dec 31, 2020

What web framework is used for this blog? Really liked it

sudhirj · on Dec 31, 2020

It’s Ghost (Ghost.org). The theme is Dawn (it’s on the official theme marketplace). My pandemic project is to offer Ghost hosting (https://ghosting.dev) - let me know if you want a free personal setup.

RedShift1 · on Dec 31, 2020

No mention of MQTT?

sudhirj · on Dec 31, 2020

As a protocol? Not sure how it would deserve more mention than JSON or some other wire protocol. Almost every example I’ve given will use a client SDK, which may or may not use MQTT under the hood.

Anything specific you think is worth pointing out?

RedShift1 · on Dec 31, 2020

Not per se as a protocol, but as one of the implementations, like Mosquitto, in the same way RabbitMQ, Kafka, etc. were covered.