2023-10-12 UTC
# [snarfed] interesting. for a while now, I'd accepted as gospel that the fediverse blocks web crawlers in robots.txt. out of curiosity, I looked at a few robots.txt files today, and evidently some servers do block by default, but many don't. eg Mastodon and Lemmy allow web crawlers by default: https://mastodon.social/robots.txt , https://lemmy.ml/robots.txt
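
a rough sketch of how this kind of check could be scripted, using Python's standard `urllib.robotparser`. the helper name `allows_crawler` and the two sample robots.txt bodies are made up for illustration; they are not copied from any real server.

```python
from urllib.robotparser import RobotFileParser


def allows_crawler(robots_txt: str, user_agent: str = "*", path: str = "/") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch path.

    Hypothetical helper: it parses robots.txt content passed in as a
    string, so it needs no network access.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# Hypothetical samples, NOT the contents of any real server's robots.txt.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_MOST = "User-agent: *\nDisallow: /media_proxy/\n"

if __name__ == "__main__":
    print(allows_crawler(BLOCK_ALL))   # server that blocks all crawlers
    print(allows_crawler(ALLOW_MOST))  # server that only blocks one path
```

for a live check, `RobotFileParser` also has `set_url()` and `read()` to fetch a robots.txt directly, though results depend on what each server returns at the time.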