jsheard 5 hours ago

> "Google-Extended" is their scraper

FYI Google-Extended isn't a dedicated scraper. You'll never see requests coming from that user agent, so that rewrite rule won't do anything. When Googlebot parses robots.txt it looks for Google-Extended rules, and those rules determine whether the data Googlebot scrapes can be used for training. Just throw this robots.txt on your site in addition to those rewrite rules to cover all bases.

https://raw.githubusercontent.com/ai-robots-txt/ai.robots.tx...
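
To make the "token, not crawler" point concrete, here's a rough sketch using Python's standard-library robots.txt parser. This is not how Google evaluates anything internally; the robots.txt body, the example.com URL and the `allowed` helper are made up for illustration. It only shows that a Google-Extended group is matched by name in robots.txt and doesn't touch ordinary Googlebot crawling:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body: opts out of the Google-Extended token only.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if ROBOTS_TXT lets user_agent fetch url, per Python's parser."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/page"  # placeholder URL, not a real site
    print("Google-Extended:", allowed("Google-Extended", url))
    print("Googlebot:", allowed("Googlebot", url))
```

With only that group present, the Google-Extended token is disallowed while Googlebot itself is still free to crawl, which is why the opt-out has to live in robots.txt rather than in a user-agent rewrite rule.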