#microformats 2023-08-25

2023-08-25 UTC
eitilt, angelo, Loqi, GWG, ehmry, plantroon, gRegor, btrem and [jacky] joined the channel
ah gotcha!
[tantek], btrem, SigmundurM, [david], barnaby and eitilt joined the channel
gets into the plumbing of implying a name for an item
By the way I wrote a bunch of tests for that, if it helps.
nice! I did see your PR about that (gregorLove brought them up)
I've been using the main branch but I'm down to flip over the Rust library to experiment with this once the current suite is fixed 🎉
(I also have even more tests in another branch)
Is "Mon May 16 20:41:45 2022" a valid timestamp to use in MF2 parsing?!
this has been tripping up the parser as it thinks it's not a timestamp and uses it as its _name_ 😆
mf2 datetime parsing can be a pain in the ass
it’s pretty much the only part of the spec where you have to care exactly what the values you’re dealing with are
especially when parsing value-class
I'm tempted to do a little hack and check if the potential name matches an associated dt name value 😆
it's so picky! barnaby
but this is good
I _can_ do something to catch this format, which isn't a big deal but I'd want to put it behind a parsing flag
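A minimal sketch of what catching that format behind a flag could look like. The `parse_dt_value` helper and its `allow_asctime` flag are hypothetical names, not part of any existing parser, and Python's `datetime.fromisoformat` is only a stand-in for the stricter datetime handling in the mf2 parsing spec:

```python
from datetime import datetime
from typing import Optional

# C asctime()-style layout, e.g. "Mon May 16 20:41:45 2022".
# Note: %a and %b are locale-dependent; this assumes an English/C locale.
ASCTIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_dt_value(value: str, allow_asctime: bool = False) -> Optional[datetime]:
    """Hypothetical helper: return a datetime, or None if the value is not a timestamp."""
    text = value.strip()
    try:
        # Stand-in for spec-compliant parsing; accepts ISO 8601-like values only.
        return datetime.fromisoformat(text)
    except ValueError:
        pass
    if allow_asctime:
        # Opt-in fallback, so the lenient behaviour never surprises strict callers.
        try:
            return datetime.strptime(text, ASCTIME_FORMAT)
        except ValueError:
            pass
    return None


strict = parse_dt_value("Mon May 16 20:41:45 2022")
lenient = parse_dt_value("Mon May 16 20:41:45 2022", allow_asctime=True)
```

With the flag off the value is rejected (and a caller would fall back to whatever it does for non-timestamps); with the flag on it parses to 16 May 2022, 20:41:45.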
okay it's not but the parser's using the inner text of the element _as_ the name (great)
[jacky], What's the HTML you're testing?
but looks like this is a parser issue
been narrowing down where this is happening
Ah. yeah it should use the `datetime` attribute on that, not the innertext
For some reason, it's not skipping the `time` element when extracting the text 🤔
Guessing I should also ignore any children that have a property class name in their class list
Is this for implied name, or something else?
implied name, yeah!
I haven’t looked at the parsing spec for a while, but last I remember names are not implied for items which have explicit properties
That microformat should have no implied name, because it has an e-content property.
🔔 then that means this is doing this too early like vika mentioned earlier
> if no explicit "name" property, and no other p-* or e-* properties, and no nested microformats,
ah but nothing about no other `dt-` properties
which _shouldn't_ be an issue
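The quoted condition can be sketched as a small check. This is only an illustration of the rule as quoted above; `should_imply_name` and the shape of its `properties` argument (prefixed class names mapped to values) are assumptions, not any parser's real data model:

```python
from typing import Dict, List


def should_imply_name(properties: Dict[str, List[str]], has_nested_microformats: bool) -> bool:
    """Hypothetical check: may an item get an implied name?

    `properties` maps prefixed class names such as "p-author", "e-content"
    or "dt-published" to their parsed values (an assumed shape).
    """
    if has_nested_microformats:
        return False
    for class_name in properties:
        prefix, _, name = class_name.partition("-")
        if name == "name":
            # An explicit "name" property always wins.
            return False
        if prefix in ("p", "e"):
            # Any other p-* or e-* property blocks the implied name.
            return False
    # dt-* (and u-*) properties are not mentioned in the quoted rule,
    # so they do not block the implied name here.
    return True
```

Under this reading an item with only `dt-published` still gets an implied name, while one with `e-content` does not.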
probably, yep. last time I worked on the PHP parser, deriving implied properties was one of the last stages
but I think it's the parser pulling it out still
hmm that’s true. I’m not sure if that’s just a parsing spec oversight, or if we ever made a specific decision about it. I’d err on the side of leaving the text content of child dt-* elements in, as the name will probably make more sense with them than without
I wonder what use cases there are for implied name on items with explicit dates
I can see the case of `<div class="h-entry">soon it will be <time datetime="$TOMORROW_DATE" class="dt-published">tomorrow</time></div>` needing to be parsed as "soon it will be tomorrow"
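A rough sketch of that case using only Python's standard `html.parser`: the `dt-published` value comes from the `datetime` attribute, while the text inside `<time>` stays in the implied name. `MinimalEntryParser` and `parse_minimal_entry` are hypothetical names, the date is a placeholder, and the real implied-name rules (image `alt` text, dropping `script`/`style`, nested items) are left out:

```python
from html.parser import HTMLParser


class MinimalEntryParser(HTMLParser):
    """Hypothetical, heavily simplified: one h-entry, one dt-published."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.published = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if tag == "time" and "dt-published" in classes:
            # The property value comes from the attribute, not the inner text.
            self.published = attr.get("datetime")

    def handle_data(self, data):
        # Keep all text, including text inside the dt-* element.
        self.text_parts.append(data)


def parse_minimal_entry(html: str) -> dict:
    parser = MinimalEntryParser()
    parser.feed(html)
    parser.close()
    return {"name": "".join(parser.text_parts).strip(), "published": parser.published}


# "2023-08-26" is a placeholder standing in for $TOMORROW_DATE.
markup = (
    '<div class="h-entry">soon it will be '
    '<time datetime="2023-08-26" class="dt-published">tomorrow</time></div>'
)
result = parse_minimal_entry(markup)
```

Here `result["name"]` is "soon it will be tomorrow" and `result["published"]` is the attribute value.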
hm okay this is an order issue thing, that I can shift
true, it could be useful for minimal event markup
seems reasonable
that fixed it for vika's test case
(the names match the files used in `microformats/tests` for ease of discovery)
whew this is a bit of whack a mole lol
going to need _way_ more tests for this
good thing I got an hour to lollygag for a bit
Yeah, that minimal case should get the text inside <time> as part of textContent: http://php.microformats.io/?id=20230825232021587
that worked!