Cloudflare turns AI against itself with endless maze of irrelevant facts

Instead of simply blocking bots, Cloudflare’s new system lures them into a “maze” of realistic-looking but irrelevant pages, wasting the crawler’s computing resources. The approach is a notable shift from the standard block-and-defend strategy used by most website protection services. Cloudflare says blocking bots sometimes backfires because it alerts the crawler’s operators that they’ve been detected.

One question that’s been on my mind lately: what will this actually do? A couple of years ago, we worried about model collapse, but if anything, things have gone the other way since. Claude, for example, is trained in part on data it generated itself, filtered and shaped by researchers.

The effect may simply be to separate the shops that can afford advanced scrapers and wasted compute from the ones that can’t, and the veterans (who have already built up their datasets) from the newcomers.

So continues the training data arms race, I suppose.