2024-06-05 UTC
# capjamesg[d] I can't remember exactly what I did, but I have mentally noted things like exponential back-off, strong testing for canonicalisation, respecting 429s / higher incidence rates of 500s, respecting Retry-After, crawling multiple sites at once rather than crawling each one sequentially (and thus moving all your crawl capacity to one site at once).