#microformats 2023-04-11

2023-04-11 UTC
mouse[d], IWDiscordRelay, [timothy_chambe], btrem and corlaez joined the channel
Loqi: why no preview
<capjamesg#4492> 🤦
oof, seems indeed endless and not just stack explosion
IWSlackGateway joined the channel
don't have time to dig into it right now though :/
IWSlackGateway and [KevinMarks] joined the channel
That is pathological markup. Python parsed it, but the result makes no sense: https://gist.github.com/kevinmarks/7e6bab548e943e1264ac35c797d7e543
of course it doesn't make sense as mf2, but you can only know that after parsing
What benighted template language generated that?
my first guess is "colibri page builder" for WP
but even if it were an artificial example, endless loop is not the right thing to do :D
That is... Eek.
Has there been a h-section-fluid-container all this time that I didn't know about?
The close brackets only just fit on my laptop screen.
gRegor and [tantek] joined the channel
no that's a made-up h-*
sounds like a good "evil" test case to add once it's been fixed
i'm really curious what is the minimum markup that can cause this bug
[schmarty] joined the channel
[chrisbergr] joined the channel
btrem and [jacky] joined the channel
that seems to parse okay with the Rust one
this is one of those scenarios where it _might_ help to have a limit on what classes to expect for `h-` objects, perhaps?
though the output looks wild https://imgur.com/a/kEumEr3
[marksuth] and ur5us joined the channel
I *think* one could do limiting after parsing. would be interesting to experiment with what "works"
although we don't have a good collection of real examples sadly
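
To make the "limiting after parsing" idea concrete, here is a minimal sketch that filters already-parsed mf2 JSON against an allowlist of `h-*` types. The allowlist contents and the `keep_known` helper are assumptions for illustration, not part of any parser; the sketch only looks at top-level items and their `children`, not at items nested inside properties.

```python
# Hypothetical allowlist; which vocabularies to accept is a judgment call.
KNOWN_TYPES = {"h-entry", "h-feed", "h-card", "h-event", "h-cite", "h-adr", "h-geo"}


def keep_known(items):
    """Drop items with no recognised type, promoting their recognised children."""
    kept = []
    for item in items:
        children = keep_known(item.get("children", []))
        if KNOWN_TYPES & set(item.get("type", [])):
            pruned = dict(item)  # shallow copy so the input is not mutated
            if children:
                pruned["children"] = children
            else:
                pruned.pop("children", None)
            kept.append(pruned)
        else:
            # Unknown wrapper (e.g. a made-up h-section-fluid-container):
            # discard it and hoist whatever recognised items it contained.
            kept.extend(children)
    return kept


parsed = [
    {
        "type": ["h-section-fluid-container"],
        "properties": {},
        "children": [{"type": ["h-entry"], "properties": {"name": ["Hi"]}}],
    }
]
filtered = keep_known(parsed)
```

Whether hoisting children out of an unknown wrapper is the right behaviour is exactly the kind of thing that would need real-world examples to decide.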
An h-* item with no properties, only children, could be stripped?
pretty sure there are some h-feeds with no properties in the wild
Is there a case for items that only contain children?
ur5us and [snarfed] joined the channel
a page which is a list or collection can be represented as an h-feed, no properties, and a bunch of h-entry children
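
A small sketch of why that matters for the stripping idea: the dictionary below has the general shape of parsed mf2 JSON for such an h-feed, and `strip_propertyless` is a hypothetical, deliberately naive rule that drops any item with empty properties. The entry names are made up for illustration.

```python
# An h-feed with no properties of its own, only h-entry children.
feed = {
    "type": ["h-feed"],
    "properties": {},
    "children": [
        {"type": ["h-entry"], "properties": {"name": ["First post"]}},
        {"type": ["h-entry"], "properties": {"name": ["Second post"]}},
    ],
}


def strip_propertyless(items):
    """Naive rule (hypothetical): drop any item whose properties are empty."""
    return [item for item in items if item.get("properties")]


def entry_names(items):
    """Collect the names of all h-entry items, descending into children."""
    names = []
    for item in items:
        if "h-entry" in item.get("type", []):
            names.extend(item["properties"].get("name", []))
        names.extend(entry_names(item.get("children", [])))
    return names


before = entry_names([feed])
after = entry_names(strip_propertyless([feed]))
```

With the naive rule applied, the feed and both of its entries disappear, so any stripping would need to keep items that have children.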
ur5us and gRegor joined the channel
aaronpk: do you know if the PHP parser does DOM node traversal or is it doing some selector logic?
The Rust parser takes the DOM walking approach (it's harder IMO but I think it prevents this looping case)
no idea
2k lines for the parser
(including tests tho, actually)
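
For reference, the DOM walking approach mentioned above can be sketched roughly as follows. This is not how the Rust or PHP parser is implemented; it is a simplified illustration that ignores properties entirely, uses Python's `xml.etree.ElementTree` on well-formed markup, and adds an assumed `max_depth` guard. Because each element of a finite tree is visited exactly once, the walk always terminates.

```python
import xml.etree.ElementTree as ET


def root_types(el):
    """Return the h-* class names on an element."""
    return [c for c in el.get("class", "").split() if c.startswith("h-")]


def walk(el, max_depth=500, depth=0):
    """Return nested {'type', 'children'} dicts for h-* elements under el."""
    if depth > max_depth:
        # Assumed safety limit for pathologically deep nesting.
        raise ValueError("markup nested too deeply")
    found = []
    for child in el:
        nested = walk(child, max_depth, depth + 1)
        types = root_types(child)
        if types:
            found.append({"type": types, "children": nested})
        else:
            # Plain wrapper element: pass its nested items up unchanged.
            found.extend(nested)
    return found


doc = ET.fromstring(
    '<div><div class="h-feed"><div class="wrap">'
    '<p class="h-entry">a</p></div></div></div>'
)
items = walk(doc)
```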
gRegor joined the channel