#dev 2022-11-26
2022-11-26 UTC
# IWDiscordRelay <jacky#7226> The concept of TOFU is helpful in some contexts
geoffo joined the channel
# IWDiscordRelay <jacky#7226> https://infosec.exchange/@atax1a/109400028861061670
# IWDiscordRelay <corlaez#3273> Hi, it has been a while but I have been pondering something. mf2 allows parsing of a limited set of web-relevant data types... but is there something similar in spirit (as in attributes or CSS in HTML) that lets you parse generic data (extract data from HTML)? What are the most popular standards for that? I am sure there must be some
geoffo joined the channel
neceve and gxt joined the channel
gxt and mro joined the channel
# [KevinMarks] Corlaez: depends what you're trying to do. Beautiful Soup will do a good job of letting you pull specific bits out of a page, but you are going to have to maintain a lot of mappings. There's granary.io which parses a lot of places into a common format, and there's https://github.com/postlight/parser which works on a lot of news sites. There's https://indieweb.org/XRay too.
# [KevinMarks] If you want previews rather than full content then parsing for Facebook metadata can work. Some libraries fall back on that if there's no mf2 support.
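A minimal sketch of the "Facebook metadata" fallback mentioned above, assuming that it refers to Open Graph `og:*` meta tags. It uses only Python's standard `html.parser`; the names `OpenGraphParser`, `extract_preview`, and `SAMPLE_HTML` are hypothetical, and this is not how any particular library in the conversation implements its fallback.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..."> values from an HTML document."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        content = attr_map.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep the first value seen for each property.
            self.properties.setdefault(prop, content)


def extract_preview(html):
    """Build a small preview dict from Open Graph tags (hypothetical helper)."""
    parser = OpenGraphParser()
    parser.feed(html)
    og = parser.properties
    return {
        "title": og.get("og:title"),
        "summary": og.get("og:description"),
        "image": og.get("og:image"),
        "url": og.get("og:url"),
    }


# Made-up sample page, for illustration only.
SAMPLE_HTML = """
<html><head>
  <meta property="og:title" content="Example post">
  <meta property="og:description" content="A short summary.">
  <meta property="og:url" content="https://example.com/post">
  <meta name="viewport" content="width=device-width">
</head><body><p>Hello</p></body></html>
"""

preview = extract_preview(SAMPLE_HTML)
print(preview)
```

A consumer would typically try mf2 first and call something like `extract_preview` only when no h-entry is found; missing properties come back as `None`, so the caller can decide whether the preview is usable.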
barnaby and mro joined the channel
# ash[m] gRegor: I'm hoping I've fixed my markup for reposts so it's parsing correctly for you now?
# ash[m] oh hmm nope I'm looking at it now and it's still weird
# barnaby hmm s/the test suite/monocle isn’t picking up you as being the reposter though https://monocle.p3k.io/preview?url=https://acegiak.net
# Loqi authorship is how to indicate who the author is for a post, and an algorithm that determines the author of a post https://indieweb.org/authorship
# ash[m] I do cause otherwise monocle picks up the avatar alt text
# ash[m] think I've got it pretty good now. including marking up the attachments in the reposts!
mro joined the channel
# ash[m] the alt-text on my avatar is "ash mcallan's avatar", which is why I then put the p-name separately?
# ash[m] it's not great alt text but if I just put "ash mcallan" as the alt text then I assume screen readers would be like "ash mcallan ash mcallan"
# ash[m] right? I don't know
# ash[m] yeah fair
# ash[m] I might remove the alt then
mro joined the channel
# ash[m] https://acegiak.net/o/6c1b8f5515414e149ba1ce62fdedfa14 we'll see how that goes
# ash[m] Ahhh that explains that
# ash[m] Cool
# ash[m] ok I've removed both the explicit p-name and the alt-text
# ash[m] yeah i read the h-card page on the wiki but only in adhd mode so i don't know if it's explained in there but in a hidden way?
# ash[m] Now I just have to make sure I don't break it next time I pull from main 😝
mro and oenone joined the channel
# ash[m] Pushed a patch at least. Ball is in tsileo's court now
mro and MerlinStar joined the channel
# @endgameviable ↩️ Incidentally the more I work on this, the more I find other work on the Internet that already does blog-to-activitypub interop e.g. I just learned about https://fed.brid.gy/ (twitter.com/_/status/1596546391127597061)
mro, chenghiz_, gxt, geoffo and mro_ joined the channel
# [snarfed] yup, age old Mastodon problem. https://github.com/mastodon/mastodon/issues/4486#issuecomment-395076695
# Loqi [snarfed] i noticed this recently too. >1k requests in <45s, >25qps. not a disaster, my site handled it fine, but still, noticeable. small thread on it here: https://mastodon.technology/@snarfed/100119606571241751 , cc @ashfurrow @neekz0r.
# [snarfed] unrelated, whoa, I just noticed that huffduffer regularly crawls rel-me links and shows them on your profile. eg I only gave it my web site, but it found everything else: https://huffduffer.com/snarfed
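A rough sketch of what collecting rel-me links from a profile page could look like, using only Python's standard library. How Huffduffer actually crawls is not stated here; `RelMeParser`, `find_rel_me`, and the sample markup are hypothetical, and a real crawler would also fetch each discovered URL and repeat the process.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class RelMeParser(HTMLParser):
    """Collect href values of <a> and <link> elements whose rel includes "me"."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        # rel is a space-separated token list, e.g. rel="me nofollow".
        rels = (attr_map.get("rel") or "").lower().split()
        href = attr_map.get("href")
        if "me" in rels and href:
            url = urljoin(self.base_url, href)  # resolve relative links
            if url not in self.links:
                self.links.append(url)


def find_rel_me(html, base_url):
    """Return rel-me URLs found in html, in document order (hypothetical helper)."""
    parser = RelMeParser(base_url)
    parser.feed(html)
    return parser.links


# Made-up sample markup, for illustration only.
SAMPLE_PAGE = """
<a rel="me" href="https://github.com/example">GitHub</a>
<a rel="me nofollow" href="/about">About</a>
<a rel="nofollow" href="https://other.example/">Not mine</a>
<a rel="me" href="https://github.com/example">GitHub again</a>
"""

rel_me_links = find_rel_me(SAMPLE_PAGE, "https://example.com/")
print(rel_me_links)
```

The sketch only gathers candidate links; verifying that each target links back with its own rel-me is a separate step.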
[jacky], geoffo and sp1ff joined the channel
# Loqi It looks like we don't have a page for "thundering herd" yet. Would you like to create it? (Or just say "thundering herd is ____", a sentence describing the term)
gxt joined the channel
mro and [snarfed] joined the channel
gxt joined the channel
# sknebel https://simonwillison.net/2022/Nov/26/productivity/ this by simonw is quite interesting about how he makes it easier for himself to go back to old code. definitely a problem I recognize e.g. from my site code :D