#microformats 2023-10-29

2023-10-29 UTC
[0x3b0b], eitilt, IWSlackGateway, eitilt1, [tantek], bjoern, Sophie, [calumryan], jkphl, Sonja, [jeremycherfas], dervondenbergen, [sebsel], [bjoern], plantroon, tom and barnaby joined the channel
#
Zegnat
New mf2 parsing change proposal, because my h-card is growing, and apparently nested h- on dt- properties is iffy: https://github.com/microformats/microformats2-parsing/issues/71
#
Loqi
[preview] [Zegnat] #71 Use dt-* property from a nested microformat as the value for a parent dt-* property
tom joined the channel
#
Zegnat
PR launched for the typescript microformats parser: https://github.com/microformats/microformats-parser/pull/279
#
Loqi
[preview] [Zegnat] #279 Fix/nested within dt
IWSlackGateway and [tantek] joined the channel
#
[tantek]
!tell barnaby if you're still around, can you file the issue on https://github.com/microformats/microformats2-parsing/issues requesting that the 'lang' attribute on the <html> element in particular be added to the "parse a document for microformats" section? https://microformats.org/wiki/microformats2-parsing#parse_a_document_for_microformats
#
Loqi
Ok, I'll tell them that when I see them next
[KevinMarks] joined the channel
#
barnaby
[tantek]: are you aware of any specification or guidance that the <html> (vs e.g. <body>) element is the best place to put the lang attribute to indicate the page-wide language? browsing a few large sites it definitely seems to be commonly used, but I can’t find an “official” source for it
#
Loqi
barnaby: [tantek] left you a message 46 minutes ago: if you're still around, can you file the issue on https://github.com/microformats/microformats2-parsing/issues requesting that the 'lang' attribute on the <html> element in particular be added to the "parse a document for microformats" section? https://microformats.org/wiki/microformats2-parsing#parse_a_document_for_microformats
#
barnaby
thanks [KevinMarks]!
#
Loqi
[preview] [barnabywalters] #72 Parse document language from <html lang=""> attribute
#
jkingweb
Note that the HTML specification does require user agents to consider language from <meta> (though its use by authors is invalid), and if all else fails HTTP. See "Otherwise" step in algorithm after note here: https://html.spec.whatwg.org/multipage/dom.html#the-lang-and-xml:lang-attributes
[calumryan] joined the channel
#
barnaby
I also updated #3 with a summary and example of the new proposal https://github.com/microformats/microformats2-parsing/issues/3#issuecomment-1784148470
#
Loqi
[preview] [barnabywalters] As discussed at the [2023 Nürnberg mf2 parsing issues session](https://indieweb.org/2023/Nuremberg/mf2): this proposal should be expanded to apply to all properties, not just h-* and e-*. So, the following (contrived) HTML ```html <article class...
#
barnaby
good point jkingweb, I added a comment linking there
#
barnaby
looking at the values of HTTP headers is outside the scope of the microformats parsing spec (although parsers which implement fetching could offer it as an additional non-standard feature), but looking for a language in <meta content> is quite reasonable IMO
eitilt joined the channel
#
[tantek]
we don't have to do all the stuff that is invalid for authors
#
[tantek]
because all that stuff is for back compat for old pages
#
[tantek]
if an author is publishing mf2, then they can be expected to be using better markup in general
#
[tantek]
barnaby, re: "any specification or guidance that the <html> (vs e.g. <body>) element is the best place to put the lang attribute" — the W3C Validator will give you a *warning* if your <html> tag lacks an explicit 'lang' attribute as well
#
[tantek]
I would be opposed to looking for lang in meta
#
barnaby
yep that’s a good point, no need for back-compat in a new mf2 parser feature
#
barnaby
ah and it’s mentioned in the HTML spec on the <html> element page, I only looked at the lang attribute page https://html.spec.whatwg.org/#the-html-element
#
[KevinMarks]
IIRC, there are a fair few web pages that declare the lang as 'en' but actually contain other languages, because of CMS defaults. This was one reason Google started guessing language from content in Chrome etc.
#
barnaby
sounds likely, but that sounds out of the scope of the mf2 parsing algorithm. parsers just need to make the marked-up information available, it’s up to the consumer to decide what to actually do with it
#
[tantek]
Agreed Barnaby
neceve, [jacky], angelo and mustastum joined the channel
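
Editorial sketch of where the discussion above landed: read the document language only from the lang attribute on the <html> element, ignore <meta> and HTTP headers, and hand the raw value to the consumer without second-guessing it. This is a minimal illustration using Python's standard-library html.parser, not the algorithm text from the microformats2 parsing spec or from issue #72. The names HtmlLangFinder and document_lang are hypothetical, and treating an empty lang="" as "nothing declared" is an assumption made here.

```python
from html.parser import HTMLParser
from typing import Optional


class HtmlLangFinder(HTMLParser):
    """Records the lang attribute of the first <html> start tag, if any."""

    def __init__(self) -> None:
        super().__init__()
        self.seen_html = False
        self.lang: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag != "html" or self.seen_html:
            return
        self.seen_html = True
        for name, value in attrs:
            if name == "lang":
                # Assumption: an empty or whitespace-only value counts
                # as "no language declared".
                cleaned = (value or "").strip()
                self.lang = cleaned or None
                break  # only the first lang attribute is considered


def document_lang(html: str) -> Optional[str]:
    """Return the <html lang> value, or None when it is absent.

    Deliberately does not look at <body lang>, <meta>, or HTTP headers.
    """
    finder = HtmlLangFinder()
    finder.feed(html)
    return finder.lang


if __name__ == "__main__":
    page = '<html lang="sv"><body><p class="h-card">Example</p></body></html>'
    print(document_lang(page))
```

A consumer that distrusts the declared value, for example because of CMS defaults, can still run its own language detection on the content; the sketch only reports what the markup says.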