#microformats 2017-04-27
2017-04-27 UTC
[cleverdevil] and [eddie] joined the channel
# ben_thatmustbeme Woo, making great progress on my rewrite of the parser
# ben_thatmustbeme Super basic parsing is already working.
# gRegorLove ben_thatmustbeme++
tantek joined the channel
# ben_thatmustbeme It's actually pretty interesting as I'm learning little edge cases of microformats I didn't know about
# gRegorLove I think I need some clarification on the implied URL parsing related to: https://github.com/indieweb/php-mf2/issues/110
# gRegorLove "else if .h-x>a[href]:only-of-type:not[.h-*], then use that [href] for url" from http://microformats.org/wiki/microformats2-parsing##if+no+explicit+%22url%22+property
# gRegorLove ".h-x > a[href]:only-of-type" means .h-x has only one direct child <a>, correct?
# gRegorLove Meaning, :only-of-type doesn't restrict sibling elements from having <a> as children
# gRegorLove See the github issue. The second link is inside a sibling <b>
# gRegorLove Maybe a product of weird MediaWiki formatting
# gRegorLove (Speaking of edge cases, ben_thatmustbeme. Heh)
# gRegorLove tantek: So is the parser technically correct in this example?
# ben_thatmustbeme I haven't gotten to much of the implied properties part yet. May get messy, not sure yet
# gRegorLove selectoracle is your friend when you get there: http://tux.theopalgroup.com/cgi-bin/css3explainer/selectoracle.py
# gRegorLove mf2py also returns the implied URL for that HTML
# gRegorLove And microformat-shiv
# ben_thatmustbeme I suppose it would, assuming the > means direct descendant in the html
# ben_thatmustbeme I suppose it would be correct
# ben_thatmustbeme And it doesn't mean descendants that are not inside sub [h,p,e,dt,u]-*
# gRegorLove Yeah, it's direct descendant afaik.
# gRegorLove Reasoning probably being to prevent really unexpected implied values
# gRegorLove Yeah, moving the </b> to the end gives no implied URL
# ben_thatmustbeme So the conclusion is, stop using <b> tags already
# ben_thatmustbeme Also if it weren't direct descendants the parsing would get way more messy
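For illustration, a minimal Python sketch of the single selector quoted above, `.h-x>a[href]:only-of-type:not[.h-*]`. It assumes well-formed markup that the standard library's `xml.etree.ElementTree` can read. The function names and the sample markup are hypothetical (they are not taken from the GitHub issue or from any existing parser), and the other implied-url rules in the spec are not covered.

```python
import xml.etree.ElementTree as ET


def has_root_class(el):
    """True if the element carries any h-* (microformat root) class."""
    return any(c.startswith("h-") for c in el.get("class", "").split())


def implied_url(h_x):
    """Sketch of `.h-x>a[href]:only-of-type:not[.h-*]` only.

    Iterating an Element yields its direct children, which matches `>`.
    `:only-of-type` counts every <a> sibling, whether or not it has an href.
    """
    anchors = [child for child in h_x if child.tag == "a"]
    if len(anchors) != 1:
        return None
    a = anchors[0]
    if a.get("href") is None or has_root_class(a):
        return None
    return a.get("href")


# Hypothetical markup: the second link sits inside a sibling <b>,
# so the first link is still the only direct-child <a>.
nested = ET.fromstring(
    '<div class="h-x"><a href="/one">one</a> <b><a href="/two">two</a></b></div>'
)

# Hypothetical markup: both links are inside the <b>,
# so the root has no direct-child <a> at all.
wrapped = ET.fromstring(
    '<div class="h-x"><b><a href="/one">one</a> <a href="/two">two</a></b></div>'
)

print(implied_url(nested))
print(implied_url(wrapped))
```

Under this one rule, the first snippet yields an implied URL and the second does not, which is the behavior described in the conversation.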
# KartikPrabhu gRegorLove: is mf2py giving the correct implied URL, not the one in the <b>?
# KartikPrabhu and so is pin13
# KartikPrabhu so they seem to be playing by the parsing rules
# KartikPrabhu if you put a u-url on the /2017/Bellingham link then they both return that link as expected
[chrisaldrich], nitot, [tamaracks], [eddie], tantek and [jeremycherfas] joined the channel
# gRegorLove KartikPrabhu: The HTML's already been fixed to get the desired u-url explicitly. the issue appeared to be php-mf2 not following the implied u-url algorithm correctly.
# KartikPrabhu aah ok. I was wondering if mf2py is doing it right, and I think it is
# gRegorLove But after review, it appears it is parsing correctly, just the weird HTML didn't give the desired u-url as a result
# KartikPrabhu yeah
# gRegorLove All of the parsers are doing it, and it appears all it takes is moving the </b> to the end, then no implied u-url
# KartikPrabhu yeah, that is what the parsing-algo says atm
# gRegorLove So pretty sure there's no parsing bug. Will await tantek's confirmation to be sure.
# KartikPrabhu also, traversing down children of h-* is going to be very annoying
# gRegorLove Yeah, the more I looked at it, the reasoning for the very strict implied algo makes sense
# gRegorLove short version: if you really want the property, add it explicitly :)
# KartikPrabhu yeah I think that is true for more complex markup
# KartikPrabhu but implied-properties are cool too :P
# gRegorLove !tell tantek summarized the conversation on github: https://github.com/indieweb/php-mf2/issues/110
[johnholdun], [kevinmarks], nitot, [colinwalker], [jeremycherfas], [pfefferle] and tantek joined the channel
# @rashidnoorani http://schema.org for all types of researched predefined #schemas. #gids17 #microformats. (twitter.com/_/status/857514688665579520)
# Loqi tantek: gRegorLove left you a message 3 hours, 34 minutes ago: summarized the conversation on github: https://github.com/indieweb/php-mf2/issues/110
nitot, adactio, rodolfojcj, barpthewire, KartikPrabhu and tantek joined the channel
# ben_thatmustbeme hmm, noticed a difference between pin13 and unmung as far as stripping whitespace
# ben_thatmustbeme specifically the html:
# KartikPrabhu before the <p> tag?
# KartikPrabhu that might be due to the HTML parsers used and not the mf2 part
# KartikPrabhu in fact pin13 removes the newline \n in the value and unmung does not
# ben_thatmustbeme that too
# KartikPrabhu ben_thatmustbeme: what is your HTML so I can try it on my mf2py
# ben_thatmustbeme thanks loqi
# ben_thatmustbeme hands loqi the dictionary entry on sarcasm
# KartikPrabhu interesting, my mf2py installation preserves the space before <p> in html property and keeps the \n in the value property
# KartikPrabhu ben_thatmustbeme: try it here https://kartikprabhu.com/connection/mfparser
# ben_thatmustbeme likely some of this is due to what is considered whitespace by the language
# ben_thatmustbeme though some don't try to strip at all, others do
# ben_thatmustbeme or rather what the stripping function considers whitespace
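As a small illustration of that point, here is a hedged Python sketch showing how two stripping functions can disagree on the same string. It is not taken from any of the parsers mentioned. `strip_html_ws` is a hypothetical helper that trims only the five ASCII whitespace characters HTML defines, while Python's bare `str.strip()` also removes other Unicode whitespace such as a no-break space.

```python
# ASCII whitespace as defined by HTML: tab, line feed, form feed,
# carriage return, space.
HTML_WHITESPACE = " \t\n\f\r"


def strip_html_ws(text):
    """Hypothetical helper: trim only HTML's ASCII whitespace."""
    return text.strip(HTML_WHITESPACE)


# Sample value starting with a no-break space (U+00A0).
sample = "\u00a0\n  Hello \n"

unicode_stripped = sample.strip()      # treats U+00A0 as whitespace
html_stripped = strip_html_ws(sample)  # leaves U+00A0 and what follows it

print(repr(unicode_stripped))
print(repr(html_stripped))
```

Two parsers built on different stripping functions could therefore return different values for identical markup, independent of the microformats logic itself.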
# ben_thatmustbeme trying to understand the .e-*.h-* interaction in my parser, making me rethink a few things
# ben_thatmustbeme would that be the only time you can have anything other than type, properties, children and value?
# ben_thatmustbeme is having an html as well
[chrisaldrich], [kevinmarks], [jeremycherfas], rodolfojcj and nitot joined the channel
# ben_thatmustbeme i'm confused what the difference is between the name and photo sections for example
# ben_thatmustbeme .h-x>img:only-child[alt]:not([alt=""]):not[.h-*]
# ben_thatmustbeme vs .h-x>img[src]:only-of-type:not[.h-*]
# ben_thatmustbeme just getting lost in them a bit
gRegorLove, rodolfojcj and [kevinmarks] joined the channel
# gRegorLove ben_thatmustbeme: First one means: .h-x with an img[src] as its only child where the alt is not empty and the img does not have an .h-x
# gRegorLove Second is: .h-x with only one img as a child and the img does not have .h-x
# ben_thatmustbeme "with an img[src]" means with an image with a src attribute
# gRegorLove Right
# ben_thatmustbeme okay
# ben_thatmustbeme dang, i just wrote this as only-of-type instead of only-child
# ben_thatmustbeme i think it was the difference in ordering that was confusing me
# ben_thatmustbeme img:only-child[alt] vs img[src]:only-of-type
KartikPrabhu joined the channel
# ben_thatmustbeme last questions gRegorLove to make sure i have this right,
# ben_thatmustbeme .h-x>img:only-child[alt]:not([alt=""]):not[.h-*]
# ben_thatmustbeme if it has more than one img tag, say 4, one has h-*, one has no alt, one has an empty alt, one has a non-empty alt and no h-*....
# ben_thatmustbeme oh wait, only, ONLY CHILD, basically cuts that all
# ben_thatmustbeme i guess that's a question for only-of-type
# ben_thatmustbeme but i'm just going to assume it's actually only of that type, not only of that type with that attribute ...
# gRegorLove Correct, I'm pretty sure only-of-type applies only to the selector it comes after, not the following attributes
# KartikPrabhu yes, that's how it works in CSS too
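The difference between the two selectors can be sketched in Python with the standard library's `xml.etree.ElementTree`, assuming well-formed markup. The helper names and sample snippets are hypothetical, and only the two quoted selectors are modelled, not the full implied name and photo rules.

```python
import xml.etree.ElementTree as ET


def has_root_class(el):
    """True if the element carries any h-* (microformat root) class."""
    return any(c.startswith("h-") for c in el.get("class", "").split())


def only_child(parent, tag):
    """:only-child, the element is the sole element child of its parent."""
    children = list(parent)
    if len(children) == 1 and children[0].tag == tag:
        return children[0]
    return None


def only_of_type(parent, tag):
    """:only-of-type, no sibling shares the tag; attributes are irrelevant."""
    same = [child for child in parent if child.tag == tag]
    return same[0] if len(same) == 1 else None


def implied_name_from_img(h_x):
    """Sketch of `.h-x>img:only-child[alt]:not([alt=""]):not[.h-*]`."""
    img = only_child(h_x, "img")
    if img is None or has_root_class(img) or not img.get("alt"):
        return None
    return img.get("alt")


def implied_photo_from_img(h_x):
    """Sketch of `.h-x>img[src]:only-of-type:not[.h-*]`."""
    img = only_of_type(h_x, "img")
    if img is None or has_root_class(img) or img.get("src") is None:
        return None
    return img.get("src")


single = ET.fromstring('<div class="h-x"><img src="a.png" alt="Ann"/></div>')
with_span = ET.fromstring(
    '<div class="h-x"><img src="a.png" alt="Ann"/><span>hi</span></div>'
)
two_imgs = ET.fromstring(
    '<div class="h-x"><img src="a.png" alt="Ann"/><img alt="no src"/></div>'
)

for root in (single, with_span, two_imgs):
    print(implied_name_from_img(root), implied_photo_from_img(root))
```

In the last snippet the second `<img>` has no `src`, yet it still prevents the first from being the only one of its type, which is the assumption agreed on above.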
# gRegorLove Are you using xpath in the parser?
# ben_thatmustbeme it's using nokogiri and i'm descending the tree myself
# ben_thatmustbeme though i suppose that might make more sense huh
# gRegorLove Maybe, not sure. Was just going to suggest php-mf2 has several of them, like in parseImpliedPhoto()
# ben_thatmustbeme i sort of don't want to look directly at other parsers, lest it confuse me more
# gRegorLove Haha, fair enough.
# KartikPrabhu ben_thatmustbeme: that is actually a good idea. independently written parser might find inconsistencies in the already existing ones
# ben_thatmustbeme *write a big pile of code to handle implied properties* *rerun tests* *number changes from 56 failures to 55 failures* *SIGH*
# ben_thatmustbeme yeah, that was the other reason
# KartikPrabhu ben_thatmustbeme: also please document the "space collapsing" difference you found.
# ben_thatmustbeme sure, where?
# KartikPrabhu err good point :P
# KartikPrabhu ben_thatmustbeme: maybe see http://microformats.org/wiki/microformats2-parsing-issues#whitespace_collapsing_revisited ?
# gRegorLove May be related to https://github.com/indieweb/php-mf2/issues/69? Haven't checked the HTML you're referring to
# KartikPrabhu gRegorLove: https://ben.thatmustbe.me/static/test1.html
[colinwalker], rodolfojcj, [chrisaldrich], [eddie], [cleverdevil], tantek and [manton] joined the channel
# ben_thatmustbeme \me wipes brow, failing on 43 of the 92 tests now but i'm only testing the v2 folder so far
# ben_thatmustbeme pretty good progress though
# ben_thatmustbeme https://raw.githubusercontent.com/microformats/tests/master/tests/microformats-v2/h-card/nested.html curious on this one, I don't see why the child h-card h-org has a value attribute
# KartikPrabhu ben_thatmustbeme: all h-* get at least a value
# KartikPrabhu so people can use value as fallback text representation for any h-* in case they don't understand the particular vocabulary
# ben_thatmustbeme except for those in items[] ?
# KartikPrabhu I think all h-* get a value
# KartikPrabhu do you have an example?
# ben_thatmustbeme the parsing for that one
# ben_thatmustbeme also, not finding the part in the parsing spec of where it gets that value from
# ben_thatmustbeme i see it for if .p-*.h-* etc
# KartikPrabhu oops maybe I misspoke
# KartikPrabhu mf2py does not give value for that markup in any h-*
# KartikPrabhu value is for e-* things I think, so you have html property and a value property for plaintext representation
# KartikPrabhu strange pin13 i.e. php-mf2 does give a value just like the tests!
# ben_thatmustbeme so value is used if p-*.h-*, e-*, u-*.h-*
# ben_thatmustbeme that section under value: is not terribly clear
# KartikPrabhu but that is only if the child microformat is also a property
# ben_thatmustbeme yeah
# KartikPrabhu in this example markup it isn't
# ben_thatmustbeme i don't see anywhere that value: should be set for children
# KartikPrabhu right
# KartikPrabhu might be a bug in the tests, maybe leave a !tell to tantek to confirm
nitot joined the channel
# KartikPrabhu but then either php-mf2 is wrong or mf2py is
# KartikPrabhu ben_thatmustbeme++ for thorough checking of mf2 tests
# ben_thatmustbeme not sure what unmung uses
# KartikPrabhu mf2py i am guessing
# KartikPrabhu so it does not have the "value"
# ben_thatmustbeme i'm basing all of this parser on the tests, so if it doesn't pass things, i'll know
# KartikPrabhu yes, that is good. you are simultaneously checking the tests, the spec and other parsers :P
# KartikPrabhu I think I did something like this while writing code for mf2py :P
# KartikPrabhu but now have forgotten everything
# ben_thatmustbeme !tell tantek hitting what is either an error in the mf2 tests and a bug in php-mf2 or something missing in the spec and a bug in mf2py. children elements seem to be getting a value: set, but not sure why. https://github.com/microformats/tests/blob/master/tests/microformats-v2/h-card/nested.html
# ben_thatmustbeme !tell tantek h-card/nested.html parses without value for child h-org h-card in mf2py and with one via php-mf2
# ben_thatmustbeme this might actually answer a LOT of my non-passing tests
# ben_thatmustbeme just looking through
# ben_thatmustbeme my only real points left to add are proper date parsing, and backcompat... i think
# KartikPrabhu nice
# ben_thatmustbeme this one is wrong in the other direction, p-affiliation h-card should have a value
# ben_thatmustbeme at least the parsers seem to all agree on that one, pretty clear that's a bug in the test
# @e_service_store SEO: Making use of microformats code. #seo #microformati #microformats #query https://www.e-service-online.com/eservice/seo-sfruttare-microformati-microformats-esempi/ (twitter.com/_/status/857694394471915520)
tantek joined the channel
# KartikPrabhu ben_thatmustbeme: yup that one does seem like a bug
# KartikPrabhu and now I recall the logic
# KartikPrabhu if some h-* that you understand has a property which is a h-*2 that you don't understand, then you can use the "value" directly
# KartikPrabhu which is also why the "value" is generated depending on the property type
tantek, [chrisaldrich], [ianmjones] and nitot joined the channel
# gRegorLove Yeah, looks like php-mf2 is incorrectly always setting the 'value' for a nested h-*: https://github.com/indieweb/php-mf2/blob/master/Mf2/Parser.php#L870
# KartikPrabhu gRegorLove: so do you agree this is a problem in the tests and php-mf2 and the mf2py seems to be following the spec?
[mko] joined the channel
# gRegorLove Need to wrap it in a conditional check for mf property classes
# gRegorLove mf2py (unmung) seems to be following the algo, no 'value' in the child.
# KartikPrabhu ok, could you file bug on both php-mf2 and spec?
# gRegorLove php-mf2 appears to have a bug, always adding the 'value' regardless if it's a property
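The conditional being described can be sketched roughly as follows. This is a simplified Python illustration, not the php-mf2 or mf2py code: `build_nested_item` and its parameters are hypothetical, and the plaintext value is passed in by the caller rather than derived from the property prefix as the spec describes.

```python
# Class prefixes that mark an element as a property of its parent microformat.
PROPERTY_PREFIXES = ("p-", "u-", "dt-", "e-")


def build_nested_item(classes, properties, plain_value):
    """Hypothetical helper: build the parsed structure for a nested h-* element.

    The "value" key is added only when the element is also a property
    (for example `p-affiliation h-card`), not for a plain child microformat.
    """
    item = {
        "type": sorted(c for c in classes if c.startswith("h-")),
        "properties": properties,
    }
    if any(c.startswith(PROPERTY_PREFIXES) for c in classes):
        item["value"] = plain_value
    return item


# A child microformat that is not a property: no "value" key.
child = build_nested_item(
    ["h-card", "h-org"], {"name": ["Example Org"]}, "Example Org"
)

# A nested microformat that is also a property: "value" is present.
prop = build_nested_item(
    ["p-affiliation", "h-card"], {"name": ["Example Org"]}, "Example Org"
)

print(child)
print(prop)
```

With a check like this, a consumer that does not understand the nested vocabulary can still fall back to `value` for properties, while plain children stay free of it.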
# gRegorLove Don't think there's a spec issue
# KartikPrabhu sorry tests not spec
# KartikPrabhu 4/5-letter wrdos are hrad
# gRegorLove quiet
# gRegorLove :)
# KartikPrabhu :P
# gRegorLove Aha, already an issue. I thought this sounded familiar. https://github.com/indieweb/php-mf2/issues/98
# KartikPrabhu aah is there an issue on the tests?
# gRegorLove Tests issue: https://github.com/microformats/tests/issues/58
sknebel_ and [ianmjones] joined the channel
# KartikPrabhu cool, I thumbs-upped it
# gRegorLove Looks like there was a similar fix for another test: https://github.com/microformats/tests/pull/53
# gRegorLove Woo, down to 8 open issues in php-mf2
edsu joined the channel
# gRegorLove Once I add rel-urls I can start using the test suite more seriously
rodolfojcj and tantek joined the channel