#microformats 2018-01-13

2018-01-13 UTC
KartikPrabhu, iwaim___ and [keithjgrant] joined the channel
#
[keithjgrant]
looks like I have some weird <p> tag formatting inside it. that could be part of it.
#
aaronpk
that is weird
#
aaronpk
hah found the bug
tantek joined the channel
#
gRegorLove
Closed two of those php-mf2 issues. Commented on the third; need some feedback/direction: https://github.com/indieweb/php-mf2/issues/60#issuecomment-357400495
#
Loqi
[gRegorLove] @aaronpk Let's leave this open for now. I read https://github.com/indieweb/php-mf2/issues/60#issuecomment-228587873 as "authoring issue, ignore." However, I just tested the original HTML with #131 and it parses a space character for `updated`. I'...
#
gRegorLove
tantek, Zegnat, sknebel ^
#
gRegorLove
aaronpk ^^
#
gRegorLove
Node parser supports YYYYMMDD format, despite spec
#
KartikPrabhu
I propose a MYYDYMYD format
#
gRegorLove
mf2py returns empty array
#
tantek
more like "beyond spec"
#
gRegorLove
I propose you shu-- jk, KartikPrabhu ;) :D
#
gRegorLove
Typo. Meant: you're super
#
gRegorLove
tantek: aye, that's better phrasing
#
gRegorLove
Ruby parses "2014-12-04 00:00:00", adding time precision not authored.
#
tantek
gRegorLove: I'm not sure I understand the state of the issue
#
gRegorLove
Hehe, and Go gets the string inside the span: "4 December 2014"
#
gRegorLove
Just a moment
#
tantek
re: supporting YYYYMMDD - that's likely harmless
#
tantek
and if multiple parsers decide to adopt it, then that makes a case for adding it to the spec to reflect implementation reality
#
gRegorLove
My main question is: if parser doesn't support YYYYMMDD, should the property be ignored, or parsed as an empty string?
#
KartikPrabhu
good question
#
gRegorLove
Agreed it seems like a harmless enhancement to support the format. php-mf2 doesn't currently
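(Editor's sketch, not from the chat: one way a parser might treat the compact YYYYMMDD form as an opt-in extension. The function name `normalize_date` and the `allow_compact` flag are hypothetical, and this is not how php-mf2 or any other parser mentioned here is implemented. Returning `None` only marks "not a recognized date"; whether the property should then be dropped or emitted as an empty string is exactly the open question above and is left to the caller.)

```python
import re
from typing import Optional

# YYYY-MM-DD, the hyphenated form discussed above
ISO_DATE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")
# YYYYMMDD, the compact form the Node parser is said to accept
COMPACT_DATE = re.compile(r"^(\d{4})(\d{2})(\d{2})$")


def normalize_date(text: str, allow_compact: bool = False) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if the text is not recognized.

    Hypothetical helper: only the shape of the string is checked,
    not whether the month and day are valid calendar values.
    """
    text = text.strip()
    match = ISO_DATE.match(text)
    if match is None and allow_compact:
        match = COMPACT_DATE.match(text)
    if match is None:
        return None  # caller decides: ignore the property or emit ""
    year, month, day = match.groups()
    return f"{year}-{month}-{day}"
```

With the flag off, `normalize_date("20141204")` is not recognized; with `allow_compact=True` it is normalized to the hyphenated form.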
#
Loqi
[gRegorLove] #76 Don't add time precision unless authored
[miklb], tantek, chrisaldrich, gRegorLove, nitot, [kevinmarks] and [mrkrndvs] joined the channel
#
Zegnat
Procrastination of the day, 2 issues filed on value-class-pattern :)
#
gRegorLove
If I could get a review and some thumbs up (and no objections) on https://github.com/microformats/microformats2-parsing/issues/17, I'll update the spec
#
Loqi
[Zegnat] #17 Define removal of SCRIPT and STYLE elements everywhere textContent is requested.
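(Editor's sketch, not from the chat: a rough illustration of the idea in the issue title, collecting text content while skipping anything inside SCRIPT and STYLE elements. It uses Python's standard `html.parser`; the class and function names are hypothetical, and it does not reproduce the wording of the issue or any parser's actual code.)

```python
from html.parser import HTMLParser


class TextWithoutScriptStyle(HTMLParser):
    """Collects character data, ignoring data inside <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def text_content(html: str) -> str:
    """Hypothetical helper: text of the fragment minus script/style content."""
    parser = TextWithoutScriptStyle()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```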
#
gRegorLove
I'm probably in favor of deprecating value-title parsing, but YYYY isn't a valid format according to VCP date and time parsing.
#
gRegorLove
Which does seem a bit odd, but I don't know the history of VCP
[miklb] joined the channel
#
gRegorLove
Haha "Why might someone write some markup like this? Ask a consulting psychologist." in the VCP questions section.
#
Loqi
ahahahaha
#
Zegnat
Oh, I guess I should change the dt- in the test to p- then and rerun the parsers.
#
Zegnat
I just assumed this one was allowed without rechecking vcp date time parsing
#
Zegnat
Honestly I was just rewriting vcp for myself so I would know how I would want to implement it, and these 2 issues came up. See https://wiki.zegnat.net/media/mf2dom.html
KartikPrabhu joined the channel
#
gRegorLove
So close on this php-mf2 backcompat update, just one test giving me grief.
#
gRegorLove
.h-entry > .hfeed > .e-content is a pain, heh
#
Zegnat
Alright, why is that specific one a problem?
#
gRegorLove
Hard to explain. Method I'm working on is recursive, digging down to each mf root. It parses them for properties correctly based on backcompat or mf2. Upgraded backcompat elements are kept track of.
#
gRegorLove
.e-content gets ignored in the hfeed correctly, but gets caught by h-entry later, so the content property shows up.
#
gRegorLove
I need to retrace my steps; this test was passing the other day. I changed something to break it :)
#
Loqi
[sknebel] We just encountered this difference in handling between `mf2py` and `php-mf2` while trying to help debug Bridgy on a WordPress site with the following structure: ```html <body class="h-entry"> <div id="page" class="hfeed site wrap"> <h1 cla...
#
Zegnat
Hmm, I’d have to look at your current parsing code to comment I think. Not in a very clear code mood tonight though.
#
gRegorLove
No worries, I'll find it.
#
gRegorLove
yay, I think I just did.
#
gRegorLove
hard is recursion
#
Zegnat
One of the reasons I tried to write down the DOM based algo for VCP today :P
#
Zegnat
But I think you said you were doing XPaths?
#
gRegorLove
Yeah, php-mf2 has always used xpaths. I haven't deviated too much from that
#
gRegorLove
But now it's more precise: drill down to the most-nested mf, run backcompat to upgrade elements as needed, then use xpaths to extract the p-, e-, dt- etc. and parse them accordingly.
#
gRegorLove
Before it was more like, find a root mf, if it's an mf1 root run backcompat on anything inside it.
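(Editor's sketch, not from the chat: a toy model of the scoping problem described above, where a property nested inside an inner root must not be picked up again by the outer root. It works on a hand-built tree instead of XPath, and every name in it (`Node`, `parse`, `own_properties`, the two small mapping tables) is hypothetical. The mapping tables are a tiny illustrative subset, and real backcompat rules are defined per root type, which this simplification ignores. It is not php-mf2's implementation.)

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Tiny illustrative subsets of the backcompat mappings (assumption: simplified)
BACKCOMPAT_ROOTS = {"hentry": "h-entry", "hfeed": "h-feed"}
BACKCOMPAT_PROPERTIES = {"entry-title": "p-name", "entry-content": "e-content"}
PROPERTY_PREFIXES = ("p-", "u-", "dt-", "e-")


@dataclass
class Node:
    classes: List[str]
    children: List["Node"] = field(default_factory=list)


def root_types(node: Node) -> List[str]:
    """mf2 root classes win; otherwise upgrade any known mf1 root classes."""
    mf2 = [c for c in node.classes if c.startswith("h-")]
    if mf2:
        return mf2
    return [BACKCOMPAT_ROOTS[c] for c in node.classes if c in BACKCOMPAT_ROOTS]


def is_backcompat_root(node: Node) -> bool:
    return not any(c.startswith("h-") for c in node.classes) and bool(root_types(node))


def own_properties(node: Node, backcompat: bool) -> List[str]:
    """Property classes in this root's scope; never descends into nested roots."""
    found: List[str] = []
    for child in node.children:
        if root_types(child):
            continue  # a nested root owns everything below it
        if backcompat:
            # Inside an mf1 root only upgraded mf1 classes count; mf2 ones are ignored
            found += [BACKCOMPAT_PROPERTIES[c] for c in child.classes
                      if c in BACKCOMPAT_PROPERTIES]
        else:
            found += [c for c in child.classes if c.startswith(PROPERTY_PREFIXES)]
        found += own_properties(child, backcompat)
    return found


def parse(node: Node) -> List[Dict]:
    """Recurse to the most-nested roots first, then build this root around them."""
    nested = [item for child in node.children for item in parse(child)]
    types = root_types(node)
    if not types:
        return nested
    return [{"type": types,
             "properties": own_properties(node, is_backcompat_root(node)),
             "children": nested}]
```

For a tree shaped like `.h-entry > .hfeed > .e-content`, the `e-content` class is ignored inside the backcompat `hfeed` and is never reached by the outer `h-entry`, because `own_properties` stops at nested roots.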
#
Zegnat
Interesting, so you go through the entire DOM at least twice? Once for upgrading backcompat, then for actual mf2 parsing?
#
Zegnat
Heh, I just realised my DOM algo for VCP can’t be literally copied into PHP as PHP’s DOM doesn’t seem to have a TreeWalker.
#
gRegorLove
Probably more than that, though it's not traversing every element. It's using xpaths a lot. If you have an hentry, it runs an xpath for each of hentry's properties to upgrade those elements.
[cleverdevil] and tantek joined the channel
#
Zegnat
Hmm, interesting. I wonder if a pure DOM traversing lib can be faster. That’s a project for next month though.
#
gRegorLove
It could probably be more efficient, for sure. No noticeable slowdown, though. Hoping this should fix 99.9% of weird mf2/mf1 combinations; definitely giving more correct results so far.
#
Zegnat
gRegorLove++
#
Loqi
gregorlove has 17 karma in this channel (210 overall)
[chrisaldrich], [stefp], [miklb], KartikPrabhu and ben_thatmustbeme joined the channel