#microformats 2023-04-11

2023-04-11 UTC
mouse[d], IWDiscordRelay, [timothy_chambe], btrem and corlaez joined the channel
#
aaronpk
Loqi: why no preview
#
Loqi
pong
#
IWDiscordRelay
<capjamesg#4492> 🤦
#
sknebel
oof, seems indeed endless and not just stack explosion
IWSlackGateway joined the channel
#
sknebel
don't have time to dig into it right now though :/
IWSlackGateway and [KevinMarks] joined the channel
#
[KevinMarks]
That is pathological markup. Python parsed it but it makes no sense https://gist.github.com/kevinmarks/7e6bab548e943e1264ac35c797d7e543
#
sknebel
of course it doesn't make sense as mf2, but you can only know that after parsing
#
[KevinMarks]
What benighted template language generated that?
#
sknebel
my first guess is "colibri page builder" for WP
#
sknebel
but even if it were an artificial example, endless loop is not the right thing to do :D
#
capjamesg
That is... Eek.
#
capjamesg
Has there been a h-section-fluid-container all this time that I didn't know about?
#
capjamesg
The close brackets only just fit on my laptop screen.
gRegor and [tantek] joined the channel
#
[tantek]
no that's a made-up h-*
#
[tantek]
sounds like a good "evil" test case to add once it's been fixed
#
aaronpk
i'm really curious what is the minimum markup that can cause this bug
[schmarty] joined the channel
[chrisbergr] joined the channel
btrem and [jacky] joined the channel
#
[jacky]
that seems to parse okay with the Rust one
#
[jacky]
this is one of those kind of scenarios where it _might_ help to have a limit on what classes to expect for `h-` objects, perhaps?
#
[jacky]
though the output looks wild https://imgur.com/a/kEumEr3
[marksuth] and ur5us joined the channel
#
sknebel
I *think* one could do limiting after parsing. would be interesting to experiment with what "works"
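A minimal sketch of what limiting after parsing might look like, assuming the usual mf2 JSON shape (`type`, `properties`, `children`). The function name `limit_types` and the `KNOWN_TYPES` subset are hypothetical, and hoisting the children of a dropped wrapper is only one possible policy. Items nested inside property values are not handled here.

```python
# Illustrative subset only; a real allow list would need to be agreed on.
KNOWN_TYPES = {"h-entry", "h-feed", "h-card", "h-event", "h-cite"}


def limit_types(items, known=KNOWN_TYPES):
    """Drop items whose types are all unknown, hoisting their children."""
    kept = []
    for item in items:
        # Filter the children first so unknown wrappers at any depth are removed.
        children = limit_types(item.get("children", []), known)
        types = [t for t in item.get("type", []) if t in known]
        if types:
            new_item = dict(item, type=types)
            if children:
                new_item["children"] = children
            else:
                new_item.pop("children", None)
            kept.append(new_item)
        else:
            # Unknown root such as a made-up h-*: keep only what was inside it.
            kept.extend(children)
    return kept


parsed = [{
    "type": ["h-section-fluid-container"],
    "properties": {},
    "children": [{"type": ["h-entry"], "properties": {"name": ["Hi"]}}],
}]
print(limit_types(parsed))
```

This runs on parser output, so it would not help with a parser that never terminates; it only tidies the result once parsing succeeds.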
#
sknebel
although we don't have a good collection of real examples sadly
#
[KevinMarks]
An h-* item with no properties, only children, could be stripped?
#
aaronpk
pretty sure there are some h-feeds with no properties in the wild
#
[KevinMarks]
Is there a case for items that only contain children?
ur5us and [snarfed] joined the channel
#
[tantek]
a page which is a list or collection can be represented as an h-feed, no properties, and a bunch of h-entry children
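A rough sketch of the "strip items with no properties, only children" idea, to show why the h-feed case is a problem for it. The function name `strip_childonly_items` is hypothetical and the input assumes the usual mf2 JSON shape.

```python
def strip_childonly_items(items):
    """Remove items that have no properties but do have children, hoisting the children."""
    out = []
    for item in items:
        children = strip_childonly_items(item.get("children", []))
        if not item.get("properties") and item.get("children"):
            # No properties, only children: drop the wrapper itself.
            out.extend(children)
            continue
        kept = dict(item)
        if "children" in item:
            kept["children"] = children
        out.append(kept)
    return out


# A property-less h-feed wrapping two h-entry children.
feed = {
    "type": ["h-feed"],
    "properties": {},
    "children": [
        {"type": ["h-entry"], "properties": {"name": ["A"]}},
        {"type": ["h-entry"], "properties": {"name": ["B"]}},
    ],
}
print(strip_childonly_items([feed]))
```

With this input the h-feed wrapper disappears and only the two h-entry items remain, so the heuristic would lose exactly the kind of legitimate feed described above unless it exempted known types.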
ur5us and gRegor joined the channel
#
[jacky]
aaronpk: do you know if the PHP parser does DOM node traversal or is it doing some selector logic?
#
[jacky]
The Rust parser takes the DOM walking approach (it's harder IMO but I think it prevents this looping case)
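A toy illustration of the DOM walking idea, not the Rust or PHP parser's actual code. It assumes well-formed markup so that `xml.etree.ElementTree` can stand in for an HTML DOM, treats any class starting with `h-` as a root class (a simplification), and ignores properties entirely. The names `root_types` and `walk` are hypothetical.

```python
import xml.etree.ElementTree as ET


def root_types(el):
    """Return the h-* class names on an element (simplified matching)."""
    return [c for c in el.get("class", "").split() if c.startswith("h-")]


def walk(el):
    """Collect items at or under el, visiting each element exactly once."""
    child_items = []
    for child in el:
        child_items.extend(walk(child))
    types = root_types(el)
    if types:
        item = {"type": types, "properties": {}}
        if child_items:
            item["children"] = child_items
        return [item]
    # Not a root itself: pass any nested items up to the nearest ancestor root.
    return child_items


doc = ET.fromstring(
    '<div class="h-feed">'
    '<div><article class="h-entry"/></div>'
    '<article class="h-entry"/>'
    '</div>'
)
print(walk(doc))
```

Because every element is entered once and the recursion only ever descends, the walk ends when the tree does. Recursion depth still grows with nesting depth, so very deeply nested markup could hit Python's recursion limit.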
#
aaronpk
no idea
#
[jacky]
2k lines for the parser
#
[jacky]
(including tests tho, actually)
gRegor joined the channel