#microformats 2018-03-24

2018-03-24 UTC
[mrkrndvs] joined the channel
#
tantek
hey parser devs, do any of the parsers support h-entry backcompat handling of rel-tag? e.g. http://www.manton.org/2018/03/indieweb-generation-4-and-hosted-domains.html has a bunch of rel=tag that should be picked up as "p-category" but I'm not seeing them e.g. in PHP or Python parsers - but interestingly I'm seeing it in Ruby parser results!
#
Loqi
[manton] IndieWeb generation 4 and hosted domains
#
tantek
edited /h-entry (+75) "follow-up on proposed parser back-compat properties in GH issue #7"
#
tantek
created /specifications (+37) "r"
#
tantek
edited /Category:Specifications (+118) "ask questions on IRC"
#
tantek
created /rel-feed (+759) "stub with dfn, to be expanded to a spec, related spec materials, see also"
#
tantek
edited /Category:Draft_Specifications (+49) "IRC not mailing list"
#
tantek
edited /admin-to-do (+337) "/* wiki */ more spec styling"
#
tantek
created /Category:Stub_Specifications (+50) "link to related admin-to-do"
[snarfed] joined the channel
#
tantek
edited /rel-feed (+70) "HTML rel-alternate processing model"
[kevinmarks] joined the channel
#
Loqi
[manton] IndieWeb generation 4 and hosted domains
#
tantek
using which parser?
#
gRegorLove
mf2py does rel-tag backcompat
#
gRegorLove
php-mf2 has an issue filed for it and I'll have a PR this weekend
#
tantek
gRegorLove: is it possible the mf2py at microformats.io is out of date?
#
gRegorLove
Likely. I've been using KartikPrabhu's dev version: https://kartikprabhu.com/connection/mfparser#response
#
tantek
that begs the question of *.microformats.io maintenance
#
tantek
now that we're rapidly iterating on parsers
#
tantek
does it get autodeployed perhaps when a new version is "marked" or whatever?
#
gRegorLove
php.microformats.io is also behind
#
gRegorLove
I don't think so
#
tantek
hmm - that's worth investigating then
#
tantek
and preferably automating
#
tantek
maybe upon official versions being tagged?
#
gRegorLove
That would be nice
#
monquie25
edited /get-started (+1910) "/* your blog */"
#
monquie25
edited /get-started (-294) "/* events */"
#
KartikPrabhu
yes rel-tag in hreview backcompat is only in the new version of mf2py 1.1.0 which is not released to pypi yet
#
monquie25
edited /get-started (-317) "/* your website */"
#
monquie25
edited /get-started (+98) "/* products */"
#
tantek
uh oh
#
tantek
on it
#
tantek
lol hair product? seriously?
#
tantek
edited /get-started () "(-1376) Reverted edits by [[Special:Contributions/Monquie25|Monquie25]] ([[User talk:Monquie25|Talk]]) to last version by [[User:Zegnat|Zegnat]]"
#
tantek
edited /Special:Log/block () "blocked [[User:Monquie25]] with an expiry time of infinite (account creation disabled): Y THO"
#
aaronpk
I don't actually know how to update php.microformats.io
#
KartikPrabhu
veganstraightedge controls that
tantek and [jeremycherfas] joined the channel
#
gRegorLove
I suppose we could have php.microformats.io redirect to pin13.net?
[unoabraham], [kevinmarks], tantek and Garbee joined the channel
#
Zegnat
Microdata, no microformats :( ^^^
tantek joined the channel
#
aaronpk
I'd be happy to run php.microformats.io
#
aaronpk
I think it's on heroku right now? Not sure if there is any benefit to that
#
Zegnat
3 parsing errors in a single test case (implied-title) ... Alright, time to fok mf2/tests ...
#
Zegnat
s/fok/fork/
[kevinmarks] joined the channel
#
[kevinmarks]
You can make it automatically deploy on check-in that way
#
[kevinmarks]
As long as you send a pr to fix them
#
KartikPrabhu
aaronpk: you could reach out to veganstraightedge on github to update php.microformats.io
#
Zegnat
[kevinmarks], I will be sending a PR, yes. Since php-mf2 pulls the official repo
#
KartikPrabhu
[kevinmarks]: would also be good to figure out how to push the new mf2py to pypi
#
Zegnat
[kevinmarks], it looks like you have commit access to the tests? Could you look into merging the two outstanding PRs?
#
aaronpk
I have sites that automatically deploy from github on my own servers too
[kim_landwehr] joined the channel
#
Zegnat
Why do the tests have a special datetime format? This seems wrong. Surely the mf2 spec provides us with the exact strings that should be used, so we should be testing for those?
#
sknebel
what do you mean?
#
KartikPrabhu
Zegnat: yes they shouldn't
#
KartikPrabhu
sknebel: the test repo uses a "normalised" datetime for the output
#
Loqi
[microformats] tests: Microformats test suite
#
sknebel
never noticed that before
#
Zegnat
E.g. I see a failing test because php-mf2 is correctly reporting the exact string used in the HTML, but the test case has stripped the T delimiter and replaced it with a space
#
Zegnat
is very tempted to start a test repo from scratch
#
sknebel
there's also an open bug about some tests having that, I think because at some point there was some work towards date normalization that never got anywhere?
#
sknebel
yeah, these were added by glennjones who I think also started normalization work in the JS parser
#
sknebel
so I'd guess that's the reason
#
sknebel
to have a way to test the normalization as well
#
sknebel
IMHO, fix the test case and remove the section from the readme
#
KartikPrabhu
in vcp datetime parsing http://microformats.org/wiki/value-class-pattern#Date_and_time_parsing the last step should also allow ordinal dates right?
#
Loqi
Value Class Pattern
#
Zegnat
You mean at the concatenating step?
#
KartikPrabhu
that is the ordinal date should not be "normalised" to YYYY-MM-DD
#
KartikPrabhu
I really don't want to write that normalising code
#
sknebel
I don't think so, no. e.g. am/pm timestamps aren't normalized either, but not listed as an example there, right?
#
sknebel
or are they normalized? haven't looked into that part too much
#
KartikPrabhu
mf2py normalises them
#
KartikPrabhu
i think for that reason
#
Zegnat
I think ordinal dates and am/pm needs to be normalised, yes.
#
KartikPrabhu
<sigh> holds off on adding ordinal dates to mf2py
#
sknebel
yes, on re-reading it says what to do with them
#
Zegnat
I think all the possible outputs listed in that last step are HTML5 valid date time strings
#
Zegnat
And ordinal is not valid in HTML5
#
sknebel
whereas the ordinal date part doesn't say anything on converting it
#
KartikPrabhu
what if the date is not valid at all i.e. 2013-400
#
KartikPrabhu
or the am/pm time
#
Zegnat
No clue KartikPrabhu, haha
#
Loqi
nice
#
KartikPrabhu
13pm becomes what?
#
Zegnat
Nothing
#
Zegnat
pm “suffix to add 12 to HH value less than 12”
#
Zegnat
13 is not less than 12, should not add 12
#
sknebel
yeah, it sounds like that doesn't match as a valid time string then
#
Zegnat
I think all the am/pm normalisation is covered in the vcp
#
Zegnat
am: treat an HH value of 12 as 00.
#
Zegnat
pm: add 12 to HH value, if less than 12.
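The two am/pm rules quoted above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `normalise_ampm` and its regex are hypothetical and not taken from mf2py or php-mf2, a bare "HHam" is padded to HH:00, and hours outside 1 to 12 are treated as "does not match", following the reading of "13pm" in this conversation.

```python
import re

# Hypothetical helper: the name and the regex are illustrative only,
# not taken from mf2py or php-mf2.
_AMPM_RE = re.compile(
    r"^(\d{1,2})(?::(\d{2})(?::(\d{2}))?)?\s*([ap])\.?m\.?$",
    re.IGNORECASE,  # "am"/"pm" are matched case-insensitively
)


def normalise_ampm(value):
    """Return a 24-hour HH:MM[:SS] string, or None if value does not match."""
    match = _AMPM_RE.match(value.strip())
    if match is None:
        return None
    hour = int(match.group(1))
    minute = match.group(2) or "00"  # assumption: pad a bare "HHam" to HH:00
    second = match.group(3)
    meridiem = match.group(4).lower()
    # Assumption: on a 12-hour clock only 1..12 make sense, so "13pm" is
    # rejected instead of being passed through unchanged.
    if not 1 <= hour <= 12:
        return None
    if meridiem == "a" and hour == 12:
        hour = 0  # am: treat an HH value of 12 as 00
    elif meridiem == "p" and hour < 12:
        hour += 12  # pm: add 12 to HH value, if less than 12
    result = f"{hour:02d}:{minute}"
    if second:
        result += f":{second}"
    return result
```

The sketch does not range-check minutes or seconds; a real parser would probably want to.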
#
Zegnat
But I don’t know about invalid dates. And to be honest, we accept invalid dates through the datetime attribute anyway, as no normalisation is done.
#
KartikPrabhu
decides to back away from this mess
#
Zegnat
YYYY = YYYY + floor(DDD/365), DDD = DDD % 365
#
Zegnat
For normalisation? :P
#
KartikPrabhu
yeah that won't work to convert to MM-DD
#
KartikPrabhu
I don't use vcp so I'll just leave that
#
KartikPrabhu
just noticed "am" is case insensitive!
#
Zegnat
still thinks we should normalise all dt-* output
#
Zegnat
But I am not writing a spec proposal for that :P I like to steer clear of vcp
[kevinmarks] joined the channel
#
[kevinmarks]
The difficulty is not adding midnight to dates without times
#
[kevinmarks]
And also the birthday without year case
#
[kevinmarks]
Adding tests for these cases is probably worth it
#
[kevinmarks]
I ran a Facebook to contacts importer a while back, and it normalised birthdays without years to 2000, so they're all turning 18 this year
#
Zegnat
That’s true [kevinmarks]. But those are all supported now, I believe, so not a problem
#
Zegnat
Although dates without years are not supported in vcp
#
Zegnat
(I think)
#
Zegnat
I am now in a loop of "composer update mf2/tests" and "./vendor/bin/phpunit"
#
KartikPrabhu
fixed AM PM conversion in mf2py experimental (still case sensitive though)
#
KartikPrabhu
<sigh> ordinal date conversion to MM-DD needs leap years
#
sknebel
KartikPrabhu: use datetime.strptime?
#
sknebel
or first of year + a timedelta
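A minimal sketch of the "first of year + a timedelta" idea, assuming the input is a strict YYYY-DDD string. The function name `ordinal_to_calendar` is hypothetical. Leap years are handled by Python's `datetime` arithmetic, and an out-of-range day such as 2013-400 is detected by checking that the result is still in the same year.

```python
import re
from datetime import date, timedelta

# Hypothetical helper, not part of mf2py: converts "YYYY-DDD" to "YYYY-MM-DD".
_ORDINAL_RE = re.compile(r"^(\d{4})-(\d{3})$")


def ordinal_to_calendar(value):
    """Return the YYYY-MM-DD form of an ordinal date, or None if invalid."""
    match = _ORDINAL_RE.match(value)
    if match is None:
        return None
    year = int(match.group(1))
    day_of_year = int(match.group(2))
    if day_of_year < 1:
        return None
    try:
        # Day 001 is January 1st, so add (DDD - 1) days to the first of the year.
        result = date(year, 1, 1) + timedelta(days=day_of_year - 1)
    except (ValueError, OverflowError):
        return None  # e.g. year 0000, which datetime.date cannot represent
    # If the day count spilled into the next year (366 in a non-leap year,
    # or something like 400), the ordinal date was not valid.
    if result.year != year:
        return None
    return result.isoformat()
```

The other option mentioned, `datetime.strptime` with the `%j` directive, would need the same same-year check added on top.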
#
KartikPrabhu
now wonders if this whole thing can be done with strptime
#
Zegnat
I wonder where some of these tests are coming from. This one just missed an entire line of text in its `value`: https://github.com/Zegnat/microformats-tests/commit/853792b80848ff0ddfb0f762b3acef8d82bb828e
#
Zegnat
And of course, as this is about e-* parsing an entire blob with lots of whitespace, php-mf2 is still failing at it and I have no idea if it is the test or php-mf2 right now.
#
Zegnat
Because e-* parsing is incredibly hard to do by hand
#
Zegnat
Let's see if I can get PHPUnit to escape the whitespace, that might make it easier to catch these
[kevinmarks] joined the channel
#
[kevinmarks]
ok, merged those 2
#
[kevinmarks]
I think will's is out of date re whitespace.
KartikPrabhu joined the channel
#
Zegnat
Could be. It is just very hard to manually check what the right result should be for something like https://github.com/Zegnat/microformats-tests/blob/zegnat-master/tests/microformats-v2/h-entry/urlincontent.html
#
gRegorLove
KartikPrabhu: There's an issue to clarify the ordinal dates in final output https://github.com/microformats/microformats2-parsing/issues/27
#
Loqi
[gRegorLove] #27 vcp: Clarify ordinal dates in parsed result
#
gRegorLove
Zegnat, did you change your opinion on that?
#
Loqi
[Zegnat] I think ordinal dates and am/pm needs to be normalised, yes.
#
Zegnat
I just think that dt values should be normalised in general. But as a first step, no I have not changed my mind. If the spec allows YYYY-DDD for date, it should also allow it in the last concat step
#
Zegnat
Normalising everything to valid HTML5 datetime values seems like a future project.
#
gRegorLove
Ok, cool. We're in agreement.
#
Zegnat
Oh, I locally merged my rel-parsing PR with the central-tests PR, and I am happy to report that the rel-parsing PR makes php-mf2 pass some tests it didn’t before! :)
#
Zegnat
So some tests are useful! Haha
#
Loqi
haha
#
Zegnat
Maybe add your thoughts on the ordinal date on the issue as well, gRegorLove?
#
Zegnat
Oh, I mean, KartikPrabhu
#
Zegnat
Too many implementers. This is a good thing. Haha
#
Zegnat
Oooh, this is interesting: https://indieweb.org/IndieArchive
#
Zegnat
Now I kinda want to add folders using that storage structure to the tests
gRegorLove_ joined the channel
#
gRegorLove_
rel-tag backcompat question for this example: <a href="http://fberriman.com/tag/conferences/" rel="tag">conferences</a>
#
gRegorLove_
"take the last path segment of their "href" value as a value for a p-category property"
#
gRegorLove_
Is that inclusive of the trailing slash?
#
gRegorLove_
E.g. the last path segment is "conferences" not "" right?
#
gRegorLove_
I know "conferences" is the intended p-category
#
Zegnat
Oh, hmm, that sounds like a spec mistake. I am pretty sure "" is the last path segment
#
Zegnat
So intent is probably “last non-empty path segment”
#
Zegnat
Possibly after resolving the URL (so you have taken out single-dot and double-dot path segments)
#
Zegnat
Although I might be berated for being too technical and theoretical on that resolving point again ;)
#
gRegorLove_
mf2py is correctly getting "conferences"
#
gRegorLove_
So I think as a minimum I will remove the trailing slash, then get the last segment.
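The "last non-empty path segment" reading could look something like this in Python. It is a sketch only: `category_from_rel_tag` is a hypothetical name, not the mf2py or php-mf2 implementation, and it does not resolve single-dot or double-dot segments or percent-decode the result.

```python
from urllib.parse import urlsplit


def category_from_rel_tag(href):
    """Return the last non-empty path segment of href, or None if there is none.

    Hypothetical helper for illustration. Dropping empty segments means a
    trailing slash no longer produces "" as the category.
    """
    path = urlsplit(href).path
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        return None
    return segments[-1]
```

With the example above, `category_from_rel_tag("http://fberriman.com/tag/conferences/")` gives "conferences" with or without the trailing slash.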
#
Zegnat
conferences seems correct to me too. If the spec says last segment, the spec is wrong.
#
gRegorLove_
Can always iterate on it more later. :) This adds rel-tag to php-mf2 hentry parsing, so it's a good step forward.
#
KartikPrabhu
gRegorLove_: yes no trailing slash would be my thought too
#
Zegnat
Mind opening an issue? I don’t know where in the spec this is ...
#
gRegorLove_
Sure, this is on h-entry and now h-review.
#
gRegorLove_
In a little bit; working on the code currently.
#
Zegnat
Same here, trying to bring my plain text parsing to php-mf2 :D
#
gRegorLove
Hm, PHP's DOMDocument doesn't let you create HTML5 elements like <data> it appears?
#
gRegorLove
Trying to do something like mf2py, create the <data class="p-category"> and append it.
#
gRegorLove
Yeah, that's what I'm using.
#
KartikPrabhu
huh. so it can create <a> but not <data> ?
#
KartikPrabhu
gRegorLove: another way would be to create a <abbr> and put the value in the title attribute
#
KartikPrabhu
or link>title
#
gRegorLove
nvm, appears to be working. Issue is with my appending.
#
Zegnat
You are not the only one struggling with DOMDocument tonight, gRegorLove.
#
Zegnat
I just spent 15 minutes tearing my hair out :P
#
gRegorLove
TIL "C14N() returns an empty string if the node is not included in the document tree"
#
gRegorLove
Because, of course.
#
gRegorLove
Anyway, progress! Think I've got hentry rel-tag working.
#
Zegnat
tagName property isn’t uppercased in PHP, but it should be for HTML per DOM spec. But of course, PHP DOM doesn’t actually know HTML so it doesn’t conform.
#
Zegnat
That was mine
#
Zegnat
tests++
#
Loqi
tests has 3 karma in this channel (18 overall)
#
Zegnat
Already found 1 mistake in my algo
nitot, [jeremycherfas], [snarfed] and [kevinmarks] joined the channel
#
Loqi
[gRegorLove] #164 rel=tag hentry/hreview backcompat
#
gRegorLove
Heading out to enjoy some sun! o/
#
Zegnat
Feel free to tag me as reviewer whenever, gRegorLove. Or do I need to be added to the repo for that? (Which you can also do.)
#
Zegnat
Will check it out in a bit. Enjoy the sun!
[cleverdevil], [gerwitz] and [jjdelc] joined the channel
#
Zegnat
Weirdest bug I've seen today, php-mf2 adds a \n into the html property during e-* parsing: http://pin13.net/mf2/?id=20180324221557272
#
Zegnat
Test and PR follow tomorrow.
uf-wiki-visitor joined the channel
#
uf-wiki-visitor
How
#
uf-wiki-visitor
What's up
#
uf-wiki-visitor
Love you
[kevinmarks], [kimberlyhirsh] and [miklb] joined the channel
#
gRegorLove
I'm not sure how to add you to the indieweb org
[snarfed] joined the channel
#
gRegorLove
Looks like I don't have permission. aaronpk or tantek might need to.