#dev 2021-07-29

2021-07-29 UTC
angelo joined the channel
#
GWG
aaronpk: Microsub doesn't have a location property except for check-ins
#
GWG
It only allows for it in h-cards
#
vikanezrimaya
one more reason to hate JF2
#
vikanezrimaya
less expressiveness
#
GWG
vikanezrimaya: JF2 allows for it, Microsub has a limited vocabulary
#
vikanezrimaya
Microsub, granary.io and webmention.io are basically the three places that use JF2, and I'm not even sure about Granary
#
vikanezrimaya
AFAIK it's not used anywhere else
#
GWG
I use it
#
vikanezrimaya
where do you use it btw?
#
[snarfed]
no jf2 in granary yet. pull requests welcome! https://github.com/snarfed/granary/issues/109
#
GWG
I simplify to jf2 when parsing
#
GWG
I like the structure
[jeremycherfas], hendursaga and capjamesg joined the channel
#
petermolnar
following the publish-in-pdf article, I decided to see if there's a simple way to pre-generate prints for my site and I found: https://weasyprint.org/ - this thing is beautiful, it parses print CSS as well, except for srcset rules, sadly.
#
petermolnar
What I didn't realize is that I applied this to my category pages as well, so now I have a certain 35MB file with 425 pages (!) for the whole of my photo category page.
hendursa1 and capjamesg joined the channel
#
capjamesg
Morning everyone!
#
capjamesg
Today is "deploy my webmention endpoint on digital ocean" day :D
nekr0z, reed, calebjasik, lasr[m], Abhas[m], SamWilson[m], vikanezrimaya, Lohn, astralbijection[, LaBcasse[m], cambridgeport90[, benatkin, capjamesg and ben_thatmustbeme joined the channel
#
sknebel
anyone happen to know if there is a tool that can easily scrape the JSON-LD islands out of a list of URLs?
hendursaga joined the channel
#
petermolnar
doesn't render on lynx?
#
sknebel
I want the data inside the islands :D
#
sknebel
and before I get to parsing html myself ...
#
sknebel
(I guess as a first stage some aggressive grep use would kinda help)
#
petermolnar
the more I think about it the more I'd recommend beautifulsoup -> filter <script> with type="application/ld+json"
#
petermolnar
anything else will be super messy
#
petermolnar
(and keep in mind that I like regex)
#
capjamesg
I can vouch for bs4 for HTML parsing.
#
capjamesg
I use it for my search engine crawler and it's really easy to use.
#
Zegnat
Just make sure to turn on the html5 parser in bs4, or however you do it, because HTML parsing is hard
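A minimal sketch of the approach suggested above: parse each page with BeautifulSoup, keep only `<script>` elements whose `type` is `application/ld+json`, and decode their contents as JSON. The function names `extract_json_ld` and `scrape` are hypothetical, picked for illustration. The sketch assumes `bs4` is installed, prefers the `html5lib` parser when it is available and falls back to the standard-library `html.parser` otherwise, and silently skips islands that are not valid JSON.

```python
import json
from urllib.request import urlopen

from bs4 import BeautifulSoup, FeatureNotFound


def extract_json_ld(html):
    """Return the decoded JSON-LD islands found in an HTML string."""
    try:
        # Prefer an HTML5-compliant parser (needs the html5lib package).
        soup = BeautifulSoup(html, "html5lib")
    except FeatureNotFound:
        # Fall back to the parser that ships with Python.
        soup = BeautifulSoup(html, "html.parser")

    items = []
    for tag in soup.find_all("script", attrs={"type": "application/ld+json"}):
        raw = tag.string  # None when the element is empty
        if not raw or not raw.strip():
            continue
        try:
            items.append(json.loads(raw))
        except json.JSONDecodeError:
            # Assumption: malformed islands are skipped rather than reported.
            continue
    return items


def scrape(urls):
    """Yield (url, islands) pairs for each URL in the list (hypothetical helper)."""
    for url in urls:
        with urlopen(url) as response:
            # Pass raw bytes so BeautifulSoup can detect the encoding itself.
            yield url, extract_json_ld(response.read())
```

Calling `dict(scrape(list_of_urls))` would give a mapping from each URL to its list of islands; error handling for failed requests is left out to keep the sketch short.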
capjamesg, KartikPrabhu, cadeyrn[d], Lohn, Seirdy, [jgmac1106] and [cleverdevil] joined the channel