#microformats 2023-07-17

2023-07-17 UTC
angelo joined the channel
#
[aciccarello]
I saw a verge article with at least 3 different titles.
ur5us, milkii, btrem, vladimyr and [Jo] joined the channel
#
[tantek]
aciccarello, that sounds reasonable about only returning metaformats if there are no microformats (mf2 I'd say) on the page. I'm expecting that theoretically, there's some chance that invisible OGP *might* be more accurate than broken classic hentry (e.g. WordPress) on the page, though I do wonder if we should special-case WordPress for that.
#
[aciccarello]
Cool, I'll keep iterating on that and do testing on different sites to see when it makes sense to use the fallback.
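
The whole-page fallback described here (return metaformats only when the page has no mf2) could be sketched roughly as below. This is a minimal sketch, not the actual parser code: `pick_items` and `metaformats_items` are hypothetical names, and `parsed_mf2` is assumed to be a dict in the usual parsed-mf2 JSON shape with an `items` list.

```python
def pick_items(parsed_mf2: dict, metaformats_items: list) -> list:
    """Return mf2 items if the page has any, otherwise the metaformats fallback."""
    # Assumption: parsed_mf2 follows the parsed-mf2 JSON shape, {"items": [...], ...}.
    items = parsed_mf2.get("items") or []
    # Metaformats are consulted only when no mf2 item was found at all.
    return items if items else metaformats_items
```

With this rule a page with even one (possibly broken) mf2 item never reaches the fallback, which is the case the WordPress concern above is about.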
#
GWG
The way [pfefferle] just changed it, it uses microformats then metaformats for any missing elements. But microformats always prevail
#
[tantek]
GWG, mf2 or even backcompat?
#
[tantek]
I'm open to that
#
[tantek]
(to both mf2 and mf1 / all)
#
GWG
mf2, in none of my versions do I consider if it is backcompat.
#
GWG
php microformats enables it by default
#
[tantek]
got it, so in the parsed form it "just" looks like mf2 whether it came from actual mf2 or backcompat mf1
#
GWG
Exactly
gRegor, [jacky] and [pfefferle] joined the channel
#
[pfefferle]
and then it falls back to WordPress-API, Meta-Header (including OGP and twitter-cards) and finally JSON-LD: https://github.com/pfefferle/wordpress-webmention/blob/main/includes/class-handler.php#L22
#
[tantek]
makes sense that JSON-LD is the lowest: least quality / usage expected.
#
[tantek]
mf2 > metaformats > JSON-LD
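
The per-property precedence discussed above (higher-priority sources always win, lower-priority ones only fill in what is missing) might look something like this. It is a sketch under stated assumptions, not the wordpress-webmention handler: `merge_by_priority` is a hypothetical helper, and each source is assumed to have already been flattened into a plain dict of property name to value.

```python
from typing import Mapping, Sequence


def merge_by_priority(sources: Sequence[Mapping[str, object]]) -> dict:
    """Merge property dicts; earlier sources win, later ones fill gaps only."""
    merged: dict = {}
    for source in sources:  # ordered highest priority first, e.g. mf2, meta, JSON-LD
        for key, value in source.items():
            # Treat None and empty values as "missing" so a lower-priority
            # source can still supply them.
            if key not in merged and value not in (None, "", [], {}):
                merged[key] = value
    return merged
```

Calling it as `merge_by_priority([mf2_props, meta_props, jsonld_props])` (all three names hypothetical) keeps every mf2 value and only takes a meta or JSON-LD value for a property mf2 did not provide.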
#
[tantek]
presumably nothing else, like no generic RDFa or eRDF or microdata?
#
[pfefferle]
we have some dublin core (eRDF) https://github.com/pfefferle/wordpress-webmention/blob/main/includes/Handler/class-meta.php#L83 but there is no real Microdata/RDFa parser implemented yet.
#
[pfefferle]
but could be easily added
#
[pfefferle]
oh, there is a lib by [jkphl] but it's quite complex https://github.com/jkphl/rdfa-lite-microdata
#
Loqi
[preview] [jkphl] rdfa-lite-microdata: RDFa Lite 1.1 and HTML Microdata parser for web documents (HTML, SVG, XML)
#
[tantek]
yes, point being: what, if any, marginal benefit would there be to taking on the ongoing tech maintenance/debt?
#
[tantek]
like can you name any sites that would get better results because you added the code? if not, it's not worth adding, that's the point
[schmarty] joined the channel
#
[pfefferle]
I think most site owners would use JSON-LD instead of Microdata or RDFa if they implement http://schema.org
#
[pfefferle]
The bigger problem is sites without any data, like gackerndes for example
#
[pfefferle]
Hacker-News
#
sknebel
German auto-carrot is fun :D
#
sknebel
(I was wondering if that was what you intended to write ...)
#
sknebel
(actually, if I'm going to be slightly mean, that's not a bad label, instead of "orange site" etc...)
#
[tantek]
wow the Google translation is not an inaccurate reference for HN
#
[tantek]
[pfefferle] see #indieweb-chat 🙂
cobypear, [pfefferle] and [jacky] joined the channel
#
[aciccarello]
I was wondering about JSON-LD but I figured it would be a pain to parse. Pretty much everyone has some basic meta tags for social sharing.
#
[aciccarello]
I need to look at that WordPress code
aciccarello joined the channel
#
aciccarello
Just found http://microformats.org/wiki/link-preview-brainstorming which should probably be linked to metaformats
#
[tantek]
ok will do
#
[tantek]
only 6-8 year gap between those pages and /metaformats lol
#
aciccarello
I see the WordPress code has special parsing for site_name. Is there any mf2 representation for that?
#
aciccarello
I see some thoughts on https://indieweb.org/Site_name but otherwise it doesn't look like it.
#
gRegor
I think h-app.p-name might be the closest explicit mf2 corresponding to og:site_name
#
gRegor
But that's mostly used on app-type sites
#
gRegor
Just looked at CNN for an example, title element is "Breaking News, Latest News and Videos | CNN", og:site_name is "CNN"
sebbu2 joined the channel
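
To make the og:site_name versus title distinction concrete, here is a small sketch that pulls both out of a page using only Python's standard library. `SiteNameParser` and `site_name_and_title` are hypothetical names for illustration; this is not how the WordPress plugin or any mf2 parser does it.

```python
from html.parser import HTMLParser


class SiteNameParser(HTMLParser):
    """Collect the <title> text and the og:site_name meta value, if present."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.site_name = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attr.get("property") == "og:site_name":
            self.site_name = attr.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def site_name_and_title(html: str):
    parser = SiteNameParser()
    parser.feed(html)
    parser.close()
    return parser.site_name, parser.title.strip()
```

For markup like the CNN example, this would return "CNN" as the site name alongside the full title string; mapping that value onto something like h-app p-name is the open question from the discussion, not something the sketch decides.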