#microformats 2023-10-19

2023-10-19 UTC
[tw2113] and jeremycherfas joined the channel
#
btrem
[tantek] There are issues specifically on alt="" and missing alt in the living spec.
#
btrem
There are several cases where authors should or must use alt="". There are a few cases -- maybe not quite edge cases, but certainly unusual -- where the author can omit the alt attribute.
#
btrem
The PR for the MDN `img` page was already merged. Several hours ago, in fact, and maybe just one hour after I created it.
#
btrem
I am quite impressed with MDN.
btrem, gRegor and gRegorLove_ joined the channel; strk left the channel
#
[tantek]
btrem++ that's great!
#
Loqi
btrem has 7 karma in this channel over the last year (10 in all channels)
[catgirlinspace], [Niklas_Siefke], [snarfed], eitilt and [jacky] joined the channel
#
[jacky]
sigh got stuck with some more yak shaving (this time the mf2 lib)
#
[jacky]
regarding rebuilding trees, I now wish that there was some GUID system for the DOM (that works on XML specifically)
#
[jacky]
I _almost_ considered hashing every node's outer HTML and using that for a `data-mf-hash` property for a client side debugger
#
[jacky]
and a compatible parser could _generate_ those hashes independently (a client-side parser that uses a hash algo compatible with a non-browser based one)
#
[schmarty]
possibly heavy but that seems like it would be effective!!
#
[schmarty]
... unless there were multiple identical chunks (like repeated `h-entry`) in a page 🤔
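
A minimal sketch of the hashing idea discussed above, assuming the debugger already has each element's outer HTML as a string. The function name `node_hashes` and the occurrence suffix are illustrative assumptions, not part of any mf2 library. The suffix is one possible way to keep repeated identical chunks (such as duplicate `h-entry` blocks) distinguishable. Both sides would also need to serialize the markup identically for the hashes to match, which this sketch does not address.

```python
import hashlib
from collections import Counter


def node_hashes(outer_html_chunks):
    """Return one ID per chunk, in document order.

    Identical chunks share a digest, so an occurrence counter is
    appended to tell the first copy from the second, and so on.
    """
    seen = Counter()
    ids = []
    for chunk in outer_html_chunks:
        # Short SHA-256 prefix; the length is an arbitrary choice here.
        digest = hashlib.sha256(chunk.encode("utf-8")).hexdigest()[:12]
        ids.append(f"{digest}-{seen[digest]}")
        seen[digest] += 1
    return ids


chunks = [
    '<div class="h-entry">hi</div>',
    '<div class="h-entry">hi</div>',
    "<p>x</p>",
]
ids = node_hashes(chunks)
```

Any parser that walks the document in the same order and hashes the same serialization would arrive at the same IDs, which is the property the `data-mf-hash` idea relies on.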
#
aaronpk
Also hashing HTML/XML 😬
#
[jacky]
lol I _know_
#
[tantek]
lol I first read that as hashtagging HTML/XML 😂
#
[jacky]
there's also the "inject a random ID to every noticeable element" (or just leaning on XPath)
#
[jacky]
this is kinda holding me back from moving forward with my projects, I want/need the parser to be compliant as much as possible 😭
#
[jacky]
I can say the tests for `h-adr`, `h-geo`, `h-product`, `h-recipe` and `rel` for MF2 all pass
#
[jacky]
well lol
#
[jacky]
those seem to be the easy ones
#
[jacky]
there's something about h-event->dates that keeps throwing off the parser (namely about colons and when to keep them)
#
[tantek]
dates are hard
#
sknebel
[jacky]: specifically vcp?
#
[jacky]
that I need to check, hold on
#
[jacky]
each of the colons in the TZ offset get lost
#
[jacky]
the line that I think I'm following too strictly is:
#
[jacky]
> However the colons ":" separating the hours and minutes of any timezone offset are optional and discouraged. If the offset uses XX:YY format, remove the colon so it is XXYY format. Omitting the colon makes it less likely that a timezone offset will be confused for a time.
#
sknebel
right, for non-VCP you should just pass through the string value from the attribute
#
j​kingweb
Exactly.
#
sknebel
so you are joining the two data paths too early
#
sknebel
(for vcp vs non-vcp)
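
A rough sketch of keeping the two data paths separate, as described above: only values assembled via the value class pattern have the colon stripped from the timezone offset, while attribute values pass through untouched. The names `strip_offset_colon` and `dt_value` and the `from_vcp` flag are hypothetical, chosen for illustration only.

```python
import re

# A trailing offset in XX:YY form, e.g. "-07:00" or "+05:30".
_OFFSET_WITH_COLON = re.compile(r"([+-]\d{2}):(\d{2})$")


def strip_offset_colon(value):
    """Turn a trailing "+XX:YY" offset into "+XXYY"; leave the rest alone."""
    return _OFFSET_WITH_COLON.sub(r"\1\2", value)


def dt_value(raw, from_vcp):
    """Normalize only on the VCP path; otherwise return the string as-is."""
    return strip_offset_colon(raw) if from_vcp else raw
```

With this split, `dt_value("2023-10-19 12:00-07:00", from_vcp=True)` drops the offset colon, while the same string with `from_vcp=False` is returned unchanged.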
#
j​kingweb
(Personally my parser does normalization everywhere and I just monkey-patch the tests to expect normalized output)
eitilt joined the channel
#
[jacky]
see, I think I'm in favor of normalization
#
[jacky]
only to standardize how that's meant to show
#
[jacky]
but I can't necessarily _enforce_ my will here
#
[tantek]
Given the consistent support for this change, I believe there is the possibility of parsers taking the initiative with normalizing datetimes: https://github.com/microformats/microformats2-parsing/issues/12
#
[jacky]
and there's value in showing it as the user intended
#
Loqi
[preview] [tantek] #12 should dt-* parsing do date and time parsing for all values?
#
[jacky]
that weird feeling when you've posted on that before lol
#
[tantek]
the normalization in vcp datetimes attempts to preserve user intent of the values therein
#
[jacky]
goes to throw a wrench in this
#
[tantek]
"I think I'm in favor of normalization" <-- hopefully to comment something like this on the issue?
#
Loqi
[preview] I mentioned before how this is a upstream blocker to get the Rust library fully compatible. That's changed but normalization would simplify the act of parsing (and testing) date values, thus me throwing my vote in favor of it and curious to hear if a...
#
[jacky]
hm I also need to do some sniffing when it's being parsed from JSON into the normalized format
#
[jacky]
but that's easy
#
[tantek]
jacky++ sounds good
#
Loqi
jacky has 6 karma in this channel over the last year (50 in all channels)
#
[tantek]
checks the change control process
#
[tantek]
alright I've reviewed the comments on #4 #8 #12 and believe there are no objections except one from Zegnat which I think is mistaken about the impact of the changes described therein (all implied dates and timezones are scoped to properties in the same object. e.g. a timezone (or date) on a dt-* property in one object cannot affect / imply a timezone (or date) on a dt-* property in another object, and no one is mixing floating timezone and fixed timezone times in the same object; there is no use-case that is being harmed)
#
[tantek]
which means next step for these issues per the mf2 parser change control is "Encourage and get 1+ implementation(s). Encourage, get, and document 1+ implementation(s) of implementation affecting aspects of a proposed resolution, preferably with a test case if applicable."
#
[jacky]
rushes to make changes
#
[jacky]
I feel like the php or python ones would be more likely to have change here
#
[jacky]
I'd have to update the rust website as well, I figure, to match this too
#
[jacky]
side-note: that'd be a great place to consider adding a "debugger" as a side page 🤔
#
[tantek]
to be more clear, we already have "Encourage and get 1+ implementation(s)." on #4 and #8 so for those I need to do the next step "Resolve by implementation verified rough consensus" and write up a resolution on those and share that in #microformats
#
[tantek]
what we need is "Encourage and get 1+ implementation(s)." for #12 and yes jacky, the Rust implementation would absolutely count for that.
#
[tantek]
(#4 and #8 are implemented by php mf2)
#
j​kingweb
For what it's worth my parser (https://packagist.org/packages/mensbeam/microformats) already does universal date normalization (optionally, on by default), so there's an implementation already.
#
[tantek]
jkingweb++ great! can you note that in a comment on #12 ?
#
Loqi
jkingweb has 5 karma over the last year
#
[tantek]
also, how did/does it handle the implied question in this comment from gRegor? https://github.com/microformats/microformats2-parsing/issues/12#issuecomment-1175632145
#
Loqi
[preview] [gRegorLove] I found some more edge cases that this spec update should cover: > * if the value has a specific ISO8601 date, time, and timezone, use those and stop looking for "value" elements. ```html <div class="h-event"> <span class="dt-start"> <...
#
gRegor
jkingweb++ nice, didn't realize you had a parser. Add it to https://microformats.org/wiki/microformats2#Parsers! Let me know if you get any edit issues about adding external links
#
Loqi
jkingweb has 6 karma over the last year
#
j​kingweb
I had been struggling to understand how to fix some problems in the PHP parser, and in my quest to understand I ended up just writing a whole parser. 😛
#
[tantek]
then I need to answer Zegnat's question/issue/concern, then wait to see if others agree (rough consensus), and then I think we can declare a Resolution on all three (#4 #8 #12) all at once and I can edit the spec accordingly!
#
[tantek]
Appreciate the nudging by y'all. Let's get this fixed.
#
gRegor
I'm excited for #12
#
[tantek]
gRegor do you have a pref for whether the "T" is normalized to a space or not or the ":" removed from a timezone in the cases you noted in your comment I just linked ^ ?
#
j​kingweb
[tantek] Were you asking how I handle ISO 8601 date-times with "T" separator? The date is parsed into components and output with a space separator instead. I thought the VCP steps were clear in saying that you stop _looking_ for dates, not that you stop processing. The normalization steps are separate from the parsing steps (they're not in the bullet list), so they should be done in all cases.
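
A minimal sketch of the "parse into components, output with a space" behaviour described above. This is an illustrative assumption about how such a step might look, not the code of any existing parser; `space_separated` is a hypothetical name.

```python
import re

# Date, a "T" or space separator, a time, then whatever follows (offset or "Z").
_ISO_DATETIME = re.compile(
    r"^(\d{4}-\d{2}-\d{2})[Tt ](\d{2}:\d{2}(?::\d{2})?)(.*)$"
)


def space_separated(value):
    """Rejoin the date and time with a single space; pass other values through."""
    m = _ISO_DATETIME.match(value)
    if m is None:
        return value  # date-only, time-only, or unrecognized input
    date, time, rest = m.groups()
    return f"{date} {time}{rest}"
```

Running this as a separate step after the value has been found matches the reading that the VCP steps stop the search for values but not the normalization.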
#
[tantek]
^ that helps jkingweb. Thank you. gRegor WDYT of that interpretation in the context of your comment?
#
gRegor
I'm in favor of no "T" and no colon in TZ offset, so no change to this section https://microformats.org/wiki/value-class-pattern##If+by+parsing+the+%22value%22+element(s)
#
[tantek]
Sounds good. Going with implementation initiative there per jkingweb is my pref too
#
j​kingweb
I'm in favour of omitting colons in zone offsets, too, by the way. My parser normalizes with colons currently, but only because it made more tests pass.
eitilt and [KevinMarks] joined the channel
#
j​kingweb
(I'd also be in favour of transforming Z into +0000, but I expect that might be an unpopular opinion)
#
j​kingweb
Rationale: it would cut the number of date/time formats downstream consumers of the JSON have to deal with from seven to five, with no loss of information.
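
As an illustration of the "Z into +0000" suggestion, a small sketch might look like the following. The function name `z_to_offset` is hypothetical, and the rewrite is applied only when a time precedes the "Z".

```python
import re

# A value ending in a time (HH:MM or HH:MM:SS) immediately followed by "Z".
_ENDS_WITH_Z = re.compile(r"^(.*\d{2}:\d{2}(?::\d{2})?)[Zz]$")


def z_to_offset(value):
    """Replace a trailing UTC designator "Z" with the numeric offset "+0000"."""
    m = _ENDS_WITH_Z.match(value)
    return f"{m.group(1)}+0000" if m else value
```

Consumers of the JSON would then only need to handle numeric offsets, at the cost of no longer seeing which form the author originally wrote.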
#
[tantek]
Those sound good. Can you add them as comments to #12?
#
j​kingweb
Sure thing.