Hacker News

That's what robots.txt does.

However, you'd have to delist yourself from search engines to fully prevent AIs from reading the content on your website.
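For reference, the opt-out mechanism being discussed is just a plain-text file served at the site root (`/robots.txt`). A sketch of one that disallows a couple of well-known AI-training user agents while leaving ordinary search crawling alone (GPTBot is OpenAI's published crawler token; Google-Extended is Google's AI-training token — the exact set of tokens you'd block is up to you):

```
# Disallow AI-training crawlers by their published user-agent tokens
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

# Everything else, including regular search indexing, stays allowed
User-agent: *
Allow: /
```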



> That's what robots.txt does

It most certainly does not. robots.txt is almost totally worthless against genAI crawlers. Even being unindexed from search engines doesn't keep you safe.


This is factually false.

There's ample documentation of crawlers straight-up ignoring robots.txt.

It's not a legal control, but a technical one - and a voluntary one, which means that it's trivial to ignore.

And there's obviously nowhere to put a robots.txt for a book that you've published.


The biggest, best, most reputable organizations — e.g. Google, Bing, Yahoo, Yandex, Baidu, DuckDuckGo, OpenAI, and Anthropic — have all publicly promised to respect your robots.txt file. If they lie, you can make them hurt, which gives you some reason to believe they're telling the truth. There are some people out there who don't respect robots.txt, like Archive Team, but they're more likely to be treated as folk heroes here on Hacker News than to trigger AI training fears.


That's a naive statement about robots.txt; nothing about it is binding or enforceable. It is a request that well-behaved crawlers heed. Other crawlers treat the Disallow section as a list of targets.
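To make the "voluntary" point concrete: robots.txt compliance lives entirely in the crawler's own code. Python's stdlib parser, for instance, only does anything if the crawler bothers to consult it before fetching. A minimal sketch (the URL and the non-GPTBot agent name are just illustrative):

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that disallows one crawler and allows everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# A well-behaved crawler asks before fetching...
print(rp.can_fetch("GPTBot", "https://example.com/article"))    # False
print(rp.can_fetch("SomeOtherBot", "https://example.com/article"))  # True

# ...but nothing stops a crawler from skipping the check entirely
# and fetching the URL anyway. The file is a request, not a gate.
```

Note there's no enforcement anywhere in that flow: `can_fetch` is advice the client can ignore, which is exactly why a Disallow list can double as a target list for bad actors.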



