#microformats 2014-06-10

2014-06-10 UTC
brianloveswords, kylewm, tantek, philipashlock, caseorganic, pfefferle, chiui, eschnou, KartikPrabhu, krendil, iSRAELiWORK, barnabywalters, Phae, iwaim and tobiastom joined the channel
#
barnabywalters
waves to tobiastom
#
tobiastom
:)
#
barnabywalters
feel free to expand on use cases for your nested references idea, if you want
#
tobiastom
it’s not really about my use case. I can for sure build some search, or reuse yours, but still: I think the goal of the JSON format would be to have the data available in an easy-to-use way.
#
tobiastom
if you require some tools, or your own implementation, to get easy stuff done, like getting all h-cards, I think it makes it much harder for everyone.
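(Aside, not part of the log: a minimal sketch of the "get all h-cards" task mentioned here, assuming the canonical mf2 JSON layout of a top-level `items` list whose entries carry `type`, `properties` and optional `children`. The helper name `find_by_type` and the sample data are hypothetical, not taken from any parser.)

```python
def find_by_type(parsed, mf_type):
    """Collect every microformat of the given type, however deeply nested."""
    found = []

    def walk(item):
        if mf_type in item.get("type", []):
            found.append(item)
        # Nested microformats can appear as property values...
        for values in item.get("properties", {}).values():
            for value in values:
                if isinstance(value, dict) and "type" in value:
                    walk(value)
        # ...or as children that are not attached to any property.
        for child in item.get("children", []):
            walk(child)

    for item in parsed.get("items", []):
        walk(item)
    return found


# Hypothetical parser output: an h-entry with an author h-card,
# plus a second h-card as a plain child.
example = {
    "items": [
        {
            "type": ["h-entry"],
            "properties": {
                "name": ["A post"],
                "author": [
                    {
                        "type": ["h-card"],
                        "properties": {"name": ["Alice"]},
                        "value": "Alice",
                    }
                ],
            },
            "children": [
                {"type": ["h-card"], "properties": {"name": ["Bob"]}}
            ],
        }
    ]
}

cards = find_by_type(example, "h-card")
names = [card["properties"]["name"][0] for card in cards]
```

The walk is short, but it is exactly the kind of helper every consumer has to write or borrow when the raw JSON keeps nested items in place.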
#
barnabywalters
tobiastom: agreed, and the current form is the most straightforward for the most common use cases
#
barnabywalters
realistically there’s always going to be a whole bunch of work which needs doing to raw canonical JSON before it can actually be used
#
tobiastom
my use case is not really related to that, I’m just trying to get a summary for a URL right now, which will support microformats, og and others, but this just came to my mind while implementing it.
#
barnabywalters
authorship data, filling in missing data, normalising things, making sure datetimes aren’t in the future, etc
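(Aside, not part of the log: a rough sketch of one post-parse check from this list, rejecting datetimes that lie in the future. It assumes canonical mf2 JSON where `published` is a list of ISO 8601 strings, and it treats values without a UTC offset as UTC, which is an assumption rather than anything the parsing rules say. The function name `published_is_plausible` is hypothetical.)

```python
from datetime import datetime, timezone


def published_is_plausible(item, now):
    """Return False if the item's first 'published' value is unparseable or after `now`."""
    values = item.get("properties", {}).get("published", [])
    if not values:
        return True  # nothing to check
    try:
        published = datetime.fromisoformat(values[0])
    except (TypeError, ValueError):
        return False  # not a string, or not a format fromisoformat accepts
    if published.tzinfo is None:
        # Assumption: treat offset-less values as UTC.
        published = published.replace(tzinfo=timezone.utc)
    return published <= now


now = datetime(2014, 6, 10, 12, 0, tzinfo=timezone.utc)
past = {"type": ["h-entry"], "properties": {"published": ["2014-06-09T10:00:00+00:00"]}}
future = {"type": ["h-entry"], "properties": {"published": ["2030-01-01T00:00:00+00:00"]}}

checks = [published_is_plausible(past, now), published_is_plausible(future, now)]
```

Because the check only reads the canonical structure, it does not care which parser produced the item.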
#
tobiastom
hm. maybe you are right.
#
barnabywalters
so the canonical JSON is a first step towards that, allowing us to make parsers interoperable, and build tools which enhance that canonical data
#
tobiastom
not sure, will think about it :)
#
barnabywalters
e.g. aaronpk’s php-comment-presentation
#
barnabywalters
and my mf-cleaner
#
tobiastom
do you have a link to the comment-presentation thing?
#
barnabywalters
one of the things I’ll be extracting from my feed reader work into its own package is a toolkit which will do as much of this normalizing work as possible
#
tobiastom
yeah, I see your valid point about correcting stuff and so on, but I think having an optimized thing, and not needing an optimizer in the first place would be a better start.
#
tobiastom
we are generating the JSON from a source anyways.
#
tobiastom
thanks for the link.
#
barnabywalters
tobiastom: in theory that’s correct, however it requires the parsers to try to understand the semantics of what they’re parsing, not just the structure
#
barnabywalters
once improved toolkits are made, users will not have to worry about the canonical JSON stage
#
barnabywalters
basically only parser developers should have to worry about it
#
barnabywalters
but it’s necessary a) as a way of deterministically comparing and testing parsers, and b) as a standard representation on top of which better tools can be created
#
tobiastom
agreed on the parser developer point. but I think the main question is: do we want to keep the space for optimizers open in the future, or should the parsers generate that in the first place?
#
tobiastom
if we want them to generate the optimized version, why create an intermediate format in the first place? we just need to test the optimized one.
#
tobiastom
if I would like to implement a parser now, I would be required to create the JSON structure now, and then the optimized version. That’s more work for me, so why do it?
#
barnabywalters
tobiastom: because at the moment parsing is deterministic and standardized, but optimising is still open for experimentation
#
tobiastom
I see.
#
barnabywalters
often because optimisation is somewhat context-sensitive, and programming-language specific
#
barnabywalters
e.g. the level of HTML sanitization required in different cases will vary
#
barnabywalters
if it was to be standardised, we would have to figure out each use case, and name each level of sanitization, and document it thoroughly
#
tobiastom
Yeah, I think you are right. My intention is just too generic right now. For me the best case would be if all the different solutions (like og, rdf, whatever) would sit at a table and talk about the optimized format, but I assume that will not happen. :)
#
barnabywalters
also, it’s actually *less* work over all to have the canonical “raw” JSON format, because then parser developers only need to worry about making that, and other people can work on normalizers — and one normalizer will work with any parser
#
barnabywalters
tobiastom: TBH I don’t think that “optimised format” exists
#
barnabywalters
or could exist
#
barnabywalters
most of the progress we’ve made in the #microformats and #indiewebcamp communities has been the result of focusing on specific use cases
tantek joined the channel
#
tobiastom
yeah, I think I’m just too… impatient.
#
barnabywalters
tobiastom: it’s understandable. I’ve spent a *lot* of time yak shaving on these sort of enabling projects and am just now getting to the stage where I can come up with an idea for a project, and start working on business logic right away
#
barnabywalters
without having to build yet more parsers and normalisation tools :)
#
barnabywalters
thanks, raising issue!
#
tobiastom
still not sure if I should put up a pull request for my restructuring. :)
#
tobiastom
as it would break the project, possibly not. :)
#
barnabywalters
tobiastom: I suspect your suite is going to be objectively easier to use, so once it’s in use by a couple of parsers (certainly php-mf2, point the mf2py devs at it as well) if it works then you can suggest making it the canonical test suite
#
barnabywalters
having the tests as microformats themselves is a cool idea but awkward to use in practice. simple files are much easier
#
tobiastom
agreed.
#
tantek
barnabywalters: you'll have to bring that up with glennjones - I think he has an explanation about why the tests marked up with microformats are a good thing
#
barnabywalters
tantek: AFAIK currently tobiastom’s tests are generated from glennjones’s tests (as they are impressively thorough), so the two could coexist
#
barnabywalters
but if one is significantly easier both to maintain and for developers to use, as I suspect is the case, then IMO that should be the test suite we point parser developers towards
#
tobiastom
yeah, I can update them anytime with changes from glennjones’s tests.
#
tobiastom
again, I only created them in the first place to use the great tests from php and automate stuff.
pfefferle and dwayhs joined the channel
#
tantek
barnabywalters: sounds like a good discussion between you guys and glennjones
#
tantek
reads the background in #indiewebcamp
Atamido_, jonnybarnes, caseorganic, TallTed, pfefferle, brianloveswords, caseorga_, elux, globbot, chiui, tantek, philipashlock, encolpe and robmorrissey joined the channel
#
kylewm
!tell tantek ICYMI KartikPrabhu had an interesting question about whether implying 00 minutes/seconds is a case of artificial precision http://logs.glob.uno/?c=freenode%23microformats&s=9Jun+2014&e=9+Jun+2014#c72332
#
Loqi
Ok, I'll tell him that when I see him next
#
kylewm
huh, that's the first time I've seen Loqi use "him" instead of "them"
barnabywalters joined the channel
#
@jalbertbowdenii
@pazguille right....i don't see the advantage in this case. no semantics. iframed third party content. sorry. how is that better than hcard?
(twitter.com/_/status/476421259866415104)
tantek joined the channel
#
Loqi
tantek: kylewm left you a message 38 minutes ago: ICYMI KartikPrabhu had an interesting question about whether implying 00 minutes/seconds is a case of artificial precision http://logs.glob.uno/?c=freenode%23microformats&s=9Jun+2014&e=9+Jun+2014#c72332
#
tantek
kylewm good point
eschnou, shaners and KartikPrabhu joined the channel
KartikPrabhu, barnabywalters, caseorga_, tantek, Loqi, brianloveswords, krendil, waterbaby999 and caseorganic joined the channel
tantek, kylewm, netweb and tobyink joined the channel