Hacker News

Not just Wikipedia. My home server (hosting a number of not particularly noteworthy things, such as my personal gitea instance) has been absolutely hammered in recent months, to the extent of periodically bringing down the server for hours with thrashing.

The worst part is that every single sociopathic company in the world seems to have simultaneously unleashed their own fleet of crawlers.

Most of the bots downright ignore robots.txt, and some of the crawlers hit the site simultaneously from several IPs. I've been trying to lure the bots into a nepenthes tarpit, which somewhat helps, but ultimately I find myself having to firewall entire IP ranges.
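Firewalling ranges can be done efficiently with an nftables set, so lookups stay O(1) even with many blocked CIDRs. A minimal sketch, assuming nftables is available; the CIDRs below are documentation placeholders, not real crawler ranges:

```
# /etc/nftables.conf — drop all traffic from listed ranges
table inet filter {
    set badnets {
        type ipv4_addr
        flags interval            # allows CIDR ranges as elements
        elements = { 192.0.2.0/24, 198.51.100.0/24 }
    }
    chain input {
        type filter hook input priority 0; policy accept;
        ip saddr @badnets drop
    }
}
```

Ranges can then be added at runtime with `nft add element inet filter badnets { ... }` without reloading the whole ruleset.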



Why not just rate limit? IP range based blocking will likely hit far more legitimate users than you think.
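For a site behind nginx, per-IP rate limiting is a few lines of config. A sketch with illustrative numbers; the upstream address and zone name are assumptions:

```
# nginx: allow ~10 requests/second per client IP, with small bursts
limit_req_zone $binary_remote_addr zone=perip:10m rate=10r/s;

server {
    listen 80;
    location / {
        limit_req zone=perip burst=20 nodelay;   # excess requests get 503
        proxy_pass http://127.0.0.1:3000;        # hypothetical upstream, e.g. gitea
    }
}
```

Rejected requests are answered cheaply before they reach the application, so this usually costs far less than serving the pages would.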


I know how to block IPs in my router, but I'm not sure how to rate limit. And I don't want to rate limit at the server level, because that itself places load on the server.
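If the router runs Linux, nftables can rate limit per source IP in the kernel, before anything reaches the server. A sketch assuming nftables and illustrative limits:

```
# nftables meter: drop sources exceeding ~30 requests/minute to ports 80/443
table inet filter {
    chain forward {
        type filter hook forward priority 0; policy accept;
        tcp dport { 80, 443 } ct state new \
            meter crawlers { ip saddr limit rate over 30/minute } drop
    }
}
```

Because the check is a single hash lookup per new connection, the overhead is negligible compared to serving the request.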



