• #dev 2023-10-12
2023-10-12 UTC
# 13:42
[snarfed]
interesting. for a while now, I'd accepted the gospel that the fediverse blocks web crawlers in robots.txt. out of curiosity, I looked at a few robots.txt files today, and evidently some servers do block by default, but many don't. eg Mastodon and Lemmy seem to allow web crawlers by default: https://mastodon.social/robots.txt , https://lemmy.ml/robots.txt
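For anyone who wants to repeat this kind of check, here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `allows_crawler` and the two sample robots.txt bodies are made up for illustration; they are not the actual contents of any server's file.

```python
from urllib.robotparser import RobotFileParser


def allows_crawler(robots_txt: str, user_agent: str = "*", path: str = "/") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch path.

    Hypothetical helper: it parses text you already have, so it runs offline.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# Made-up sample: a server that blocks every crawler from everything.
BLOCK_ALL = """User-agent: *
Disallow: /
"""

# Made-up sample: a server that only blocks one path prefix.
ALLOW_MOST = """User-agent: *
Disallow: /media_proxy/
"""

if __name__ == "__main__":
    print("block-all allows /:", allows_crawler(BLOCK_ALL))
    print("allow-most allows /:", allows_crawler(ALLOW_MOST))
    print(
        "allow-most allows /media_proxy/x:",
        allows_crawler(ALLOW_MOST, path="/media_proxy/x"),
    )
```

To check a live server instead, `RobotFileParser` can fetch the file itself: pass the robots.txt URL to the constructor, call `read()`, then call `can_fetch()`. Results will reflect whatever the server returns at that moment, which can differ between instances and over time.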