#microformats 2014-07-08

2014-07-08 UTC
Rastus_Vernon and gRegor` joined the channel
#
bret
say i have an html table… is there a microformat (or another tool) that lets me parse that table into json or csv or something?
#
tantek
bret - I believe you're looking for the DOM?
#
tantek
or the Table OM ?
#
bret
hrmmm maybe
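
A minimal sketch of what tantek's DOM suggestion could look like for bret's question, assuming Python and only its standard library. The names `TableExtractor`, `table_to_rows`, `rows_to_csv` and `rows_to_json` are made up for this illustration, and `html.parser` is a lenient tokenizer rather than a full HTML5 DOM, so nested tables and `colspan`/`rowspan` are not handled.

```python
import csv
import io
import json
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row being built, or None when outside a <tr>
        self._cell = None  # text chunks of the open cell, or None

    def _finish_cell(self):
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def _finish_row(self):
        self._finish_cell()
        if self._row is not None:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._finish_row()   # tolerate an omitted </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._finish_cell()  # tolerate an omitted </td> or </th>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._finish_cell()
        elif tag in ("tr", "table"):
            self._finish_row()


def table_to_rows(html):
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


def rows_to_json(rows):
    """Treat the first row as headers and emit a JSON list of objects."""
    header, *body = rows
    return json.dumps([dict(zip(header, row)) for row in body])
```

Calling `table_to_rows(html)` on a page's markup gives a list of rows that either helper can serialize. No microformat is involved, which is the point of the reply above.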
robmorrissey, statonjr, gRegor`, tantek, JohnBeales, hongpong, brianloveswords, eschnou, shaners_, ChiefRA, pfefferle, chiui, RCheesley_, adactio, pfefferle_ and TallTed joined the channel
tantek joined the channel
#
tommorris
hey tantek - check out this bit of silliness. http://jsonresume.org/
#
tantek
tommorris - lol - perhaps add to microformats.org/wiki/resume-formats ? ;)
#
tantek
hey at least it's not humans.txt ;)
#
tommorris
I’m just writing a particularly gnarly bug report for them.
#
tommorris
Basically blasting them with all the issues around names they haven’t thought about.
#
tantek
tommorris - look they're on #jsonresume ;)
JohnBeales and brianloveswords joined the channel
#
tommorris
oh, wow, they have an open issue for internationalisation - for countries where there are multiple official languages like Canada with en_CA and fr_CA
#
tommorris
if only there were some way of representing that in HTML
#
tommorris
oh, and their attempt at representing addresses is a complete mess too.
#
tantek
tommorris: it's like some college interns decided to do a summer project or something
#
tantek
tommorris - hahahaha they're renaming names of naming fields. yo dawg....
#
tantek
I heard you like renaming...
#
tantek
however, nice to see other folks using hCard as a point of design in that issue
#
tommorris
as terrible standards go, this is definitely top 5 for me.
#
tantek
tommorris: that URL for me "We're sorry, but something went wrong."
#
tantek
<title>We're sorry, but something went wrong (500)</title>
#
tommorris
Hmm, odd.
#
tommorris
investigates.
#
tantek
tommorris - you must be bored
#
tantek
you could instead open an issue saying they should close their effort and "just" use h-resume
#
tantek
if they want a JSON version, just use the parsed output of mf2py or phpmf2
#
tantek
done and done
#
tommorris
tantek: can you hit refresh on that post of mine?
#
tantek
400 Bad Request / nginx/1.4.6
#
tantek
3rd refresh it worked
#
tantek
funny that you remember such hilarity from 2008
#
tantek
this is definitely good for some LOLs: http://www.w3.org/2005/Incubator/cwl/XGR-cwl/
#
tantek
"a unambiguous formal language playing the same role of natural languages for humans." - because so many natural languages have an unambiguous role. lol.
#
tantek
editors mostly from "Institute of Semantic Computing" - sounds legit.
#
tommorris
I wonder if I could break CWL by feeding it innuendo and double entendre.
eschnou joined the channel
#
tommorris
tantek: all this silliness reminds me, need to find time to go through and make sure the microformats-2 versions of specs have as detailed examples as the original specs
#
tommorris
given people will want to copypasta them into their websites.
#
tantek
I tend to think of the examples priority as somewhat like:
#
tantek
1. ultra minimal simple example - for someone to get how simple it can be
#
tantek
2. fairly "complete" example that makes use of all (or as much sensibly as possible) of the features in a single example
#
tantek
3. examples in between those based on real world use-cases
#
tantek
what do you think of those?
#
tommorris
I was generally impressed with most of the examples on the classic pages
#
tantek
I wasn't
#
tantek
(even though I tried to do a good job on them :/ )
#
tantek
trying to be more deliberate / methodical with h-* examples
#
tommorris
perhaps my bar for being impressed has been set very low by reading too many W3C specs. ;-)
dwayhs and hober joined the channel
#
tantek
edited /web-sign-in (+429) "susequent sign-ins, input type=url"
JohnBeales, caseorganic, caseorga_, chiui, statonjr_ and jgarber joined the channel
#
jgarber
Hello, all. A question about microformats-2 parsing rules. I'm using the <template> element which includes some h-entry class names: https://github.com/jgarber623/sixtwothree.org/blob/master/src/_layouts/post.html#L54-L66. By their nature, <template> elements are hidden and (particularly in my case) contain incomplete markup. Yet, microformats parsers include this in the output (see: http://news.indiewebcamp.com/parse?source=http://sixtwothree.org/blog/now-accepting-webmentions/ ). Is this the desired behavior? Or, should parsers ignore content in <template> elements?
#
tantek
microformats parsers do not treat <template> in any special way
#
tantek
that is - it's like <span>
#
tantek
so what should happen?
#
tantek
and who supports <template> now?
#
jgarber
And with some polyfilling, reasonably modern IE and Safari.
#
jgarber
I think the desired behavior would be to ignore content within <template> tags given the likelihood that the markup within is incomplete (as in my use case).
eschnou joined the channel
#
tantek
so a "must ignore" then?
#
tantek
like comments?
KartikPrabhu joined the channel
#
jgarber
I believe so, yes.
#
tommorris
wonders what the hell <template> is for.
pfefferle joined the channel
#
tommorris
these here HTML5 elements are popping up like crazy.
#
jgarber
tommorris It was new to me, too. According to MDN:
#
jgarber
The HTML template element <template> is a mechanism for holding client-side content that is not to be rendered when a page is loaded but may subsequently be instantiated during runtime using JavaScript. Think of a template as a content fragment that is being stored for subsequent use in the document. The parser does process the content of the <template> element during the page load to ensure that it is valid, however.
#
jgarber
It's a weird intersection of content and behavior and I'm not 100% sure how I feel about it. But... it's in the spec.
#
tommorris
Odd. I was looking for something similar recently, but that’s not it. ;-)
#
KartikPrabhu
why are people pushing templating into HTML?
#
Hixie
microformat parsers that parse HTML according to the HTML parsing rules shouldn't see the contents of <template> elements
#
Hixie
the microformat spec shouldn't need to mention them
#
tommorris
wonders what the HTML parsing rules are… this week. ;-)
#
Hixie
(stuff between <template> tags doesn't end up in the DOM)
#
Hixie
(they end up in a side document)
#
Hixie
tommorris: hey, we have them now. this is a significant step up from a few years ago.
#
tommorris
true dat.
#
tantek
Hixie, thanks for the clarification. I'll make an informative note to that effect then. Could be worth a test case.
#
jgarber
tantek: You could run http://sixtwothree.org/blog/now-accepting-webmentions/ through any of the existing parsers and observe an incomplete h-entry pulled from a <template> element.
#
jgarber
A lone example, for sure, though.
#
tommorris
tantek: definitely worth a test case, and given that a fair few parsers are probably simply using some kind of permissive-tagsoup parse rather than a full browser-style page parse, templates with microformats classes potentially could break them.
#
tantek
thanks jgarber - a lone example will suffice for now
#
jgarber
Sure thing! Happy to help out.
#
tantek
edited /microformats2-parsing (+416) "explicit reference to HTML parsing rules at start, add note HTML parsing rules section and template example, and jgarber test case in the wild"
jgarber joined the channel
#
tantek
tommorris - could you file that on phpmf2 as well?
caseorganic joined the channel
#
KartikPrabhu
is there a list of tags that are supposed to be ignored in HTML parsing?
#
tommorris
KartikPrabhu: well, ideally you should follow the HTML parsing rules in the spec.
#
KartikPrabhu
yes... to follow the parsing rules, I want a list of tags to be ignored
#
Loqi
gives KartikPrabhu a list of tags to be ignored
#
KartikPrabhu
unless I trust my HTML parser to do the right thing which has been tricky
#
tommorris
I’ll probably hack the template ignore rule into mf2py until html5parser for python gets updated to support it.
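
A rough sketch of what such a template ignore rule could look like, assuming Python's standard-library `html.parser` as a stand-in for a tag-soup parser that does not know about <template>. This is not mf2py's code: the class `RootClassFinder` and the function `find_roots` are hypothetical, and the sketch only reports h-* class names rather than doing a real microformats2 parse.

```python
from html.parser import HTMLParser


class RootClassFinder(HTMLParser):
    """Record h-* class names, skipping everything inside <template>."""

    def __init__(self):
        super().__init__()
        self.roots = []            # (tag, [h-* class names]) seen outside templates
        self._template_depth = 0   # > 0 while inside one or more <template> elements

    def handle_starttag(self, tag, attrs):
        if tag == "template":
            self._template_depth += 1
            return
        if self._template_depth:
            return  # template contents are not part of the main document
        classes = (dict(attrs).get("class") or "").split()
        h_classes = [name for name in classes if name.startswith("h-")]
        if h_classes:
            self.roots.append((tag, h_classes))

    def handle_endtag(self, tag):
        if tag == "template" and self._template_depth:
            self._template_depth -= 1


def find_roots(html):
    finder = RootClassFinder()
    finder.feed(html)
    finder.close()
    return finder.roots
```

Counting depth instead of using a flag keeps nested <template> elements from re-enabling collection too early. A parser that follows the HTML parsing rules should not need any of this, since template contents end up in a side document rather than the DOM.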
#
KartikPrabhu
tommorris: yeah my thoughts exactly :)
#
tantek
tommorris++
#
Loqi
tommorris has 28 karma
#
KartikPrabhu
tommorris++ will pull that; will also send a PR for some datetime parsing improvements kylewm made to my fork
#
Loqi
tommorris has 29 karma
#
tommorris
KartikPrabhu: if you’ve got those, am happy to merge in and maybe put out a new release this week
#
KartikPrabhu
cool will send PR tonight
caseorganic, jgarber, pfefferle, adactio and encolpe joined the channel
#
tantek
ok now it's bugging me that we look at <link> elements for rel values but not for h-* microformats properties, specifically u-* properties
#
tantek
!tell tommorris, Kartikprabhu what do you think of changing http://microformats.org/wiki/microformats2-parsing##a.u-x[href]%20or%20area.u-x[href] to "a.u-x[href] or area.u-x[href] or link.u-x[href]" ?
#
Loqi
Ok, I'll tell them that when I see them next
Rastus_Vernon joined the channel
#
gRegor`
I like it, tantek, as someone who doesn't necessarily want to alter their design to add a photo / h-card on the homepage. Although my shortcut icons are just the "g" currently, not my photo.
#
tantek
but you do *have* a shortcut icon at least
#
tantek
even if it is minimal
#
tantek
do you not show the icon on the page?
KartikPrabhu joined the channel
#
Loqi
KartikPrabhu: tantek left you a message 55 minutes ago: what do you think of changing http://microformats.org/wiki/microformats2-parsing##a.u-x[href]%20or%20area.u-x[href] to "a.u-x[href] or area.u-x[href] or link.u-x[href]" ?
#
KartikPrabhu
tantek: re link.u-x[href] if there are enough use-cases then it wouldn't be hard to change mf2py code
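
To make the proposal concrete, here is a small sketch of the difference between "a.u-x[href] or area.u-x[href]" and the same rule extended with link.u-x[href]. It assumes Python's standard library only, it is not mf2py's code, and the names `UrlPropertyFinder`, `u_properties` and `include_link` are invented for illustration. It also ignores h-* root scoping and simply collects every u-* property in the document.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class UrlPropertyFinder(HTMLParser):
    """Collect u-* property values from the href of selected elements."""

    def __init__(self, base_url, tags):
        super().__init__()
        self.base_url = base_url
        self.tags = frozenset(tags)
        self.properties = {}  # property name without the "u-" prefix -> list of URLs

    def handle_starttag(self, tag, attrs):
        if tag not in self.tags:
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        for name in (attributes.get("class") or "").split():
            if name.startswith("u-"):
                url = urljoin(self.base_url, href)  # resolve relative URLs
                self.properties.setdefault(name[2:], []).append(url)


def u_properties(html, base_url, include_link=False):
    """include_link=True models the proposed link.u-x[href] rule."""
    tags = ("a", "area", "link") if include_link else ("a", "area")
    finder = UrlPropertyFinder(base_url, tags)
    finder.feed(html)
    finder.close()
    return finder.properties
```

With `include_link=True`, a shortcut icon marked up as `<link rel="icon" class="u-photo" href="...">` would contribute a photo property without anything being added to the visible page, which is the use case gRegor` describes.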
#
gRegor`
No, I don't show the shortcut icon on the homepage. Just the logo at the top.