
> Your main resource contention there would be network throughput, then broker CPU time

The default consumer read sizes are so small that you will hit broker CPU and worker thread limits long before saturating network throughput. (Both consumer batch sizes and broker thread counts can be increased trivially, but there's not much documentation on when to do so.)
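For illustration, a minimal sketch of the fetch-batching knobs under discussion, using the standard Java-client consumer config names in a plain dict (the broker address, group id, and chosen values are placeholders, not recommendations):

```python
# Hypothetical consumer config illustrating the fetch-size settings.
# Keys are the standard Kafka (Java client) consumer config names.
consumer_config = {
    "bootstrap.servers": "localhost:9092",  # placeholder broker address
    "group.id": "example-group",            # placeholder consumer group
    # Default fetch.min.bytes=1 means the broker replies as soon as any
    # data exists, so polls can return tiny batches under light load.
    "fetch.min.bytes": 1_048_576,   # wait for ~1 MiB of data...
    "fetch.max.wait.ms": 500,       # ...or 500 ms, whichever comes first
    "fetch.max.bytes": 52_428_800,  # default overall fetch cap, 50 MiB
}

# Sanity check: the minimum fetch size must not exceed the maximum.
assert consumer_config["fetch.min.bytes"] <= consumer_config["fetch.max.bytes"]
print(consumer_config["fetch.min.bytes"])
```

Raising fetch.min.bytes together with fetch.max.wait.ms trades a little latency for fewer, larger fetch requests, which is what relieves broker CPU and request-handler threads.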



The default fetch.min.bytes is 1, that's true, but the default fetch.max.bytes is 50 MiB.

And fetching small amounts of data repeatedly doesn't impose much overhead unless you're deliberately disconnecting and reconnecting between polls.

And I assure you, you can definitely bottleneck the network before CPU and/or network threads; Kafka was literally designed for very large numbers of consumers.

And as for tuning, Kafka: The Definitive Guide is pretty much what the name suggests. I've been recommending it very strongly for years, especially the chapters on monitoring and cluster replication.

You can download a free draft copy of the 2nd edition from Confluent, check it out :)


> but the default fetch.max.bytes is 50 MiB.

But max.partition.fetch.bytes is only 1 MiB.


Yep, it's aligned with the default max.message.bytes, the largest batch size the broker will accept.

1 MiB is allegedly the ideal batch size for throughput.
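A back-of-the-envelope sketch of how the per-partition cap interacts with the overall fetch cap, using the default values mentioned above (roughly: one fetch returns at most max.partition.fetch.bytes per partition, bounded overall by fetch.max.bytes):

```python
# Default Java-client values, illustrative only.
fetch_max_bytes = 50 * 1024 * 1024           # fetch.max.bytes, 50 MiB
max_partition_fetch_bytes = 1 * 1024 * 1024  # max.partition.fetch.bytes, 1 MiB

def max_fetch(partitions: int) -> int:
    """Rough upper bound on bytes one fetch can return from N partitions."""
    return min(fetch_max_bytes, partitions * max_partition_fetch_bytes)

print(max_fetch(8))   # 8 partitions: 8 MiB, well under the overall cap
print(max_fetch(64))  # 64 partitions: clamped to the 50 MiB overall cap
```

So with the defaults, a consumer reading few partitions is limited by the 1 MiB per-partition cap long before fetch.max.bytes matters.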



