VioletPixel

🚫 Perplexity is using stealth, undeclared crawlers to evade website no-crawl directives

Cloudflare:

We are observing stealth crawling behavior from Perplexity, an AI-powered answer engine. Although Perplexity initially crawls from their declared user agent, when they are presented with a network block, they appear to obscure their crawling identity in an attempt to circumvent the website’s preferences. We see continued evidence that Perplexity is repeatedly modifying their user agent and changing their source ASNs to hide their crawling activity, as well as ignoring — or sometimes failing to even fetch — robots.txt files.

This is my shocked face.

I've set up a robots.txt file on this site which tells all the generative AI bots (including Google) to fuck off, but I'm under no illusions that any of them are obeying it.

That's fine, though. There's more than one way to discourage an unwelcome bot or scraper.