#microformats 2017-04-27
2017-04-27 UTC
[cleverdevil] and [eddie] joined the channel
#
ben_thatmustbeme Woo, making great progress on my rewrite of the parser

#
ben_thatmustbeme Super basic parsing is already working.

#
gRegorLove ben_thatmustbeme++

tantek joined the channel
#
ben_thatmustbeme It's actually pretty interesting as I'm learning little edge cases of microformats I didn't know about

#
gRegorLove I think I need some clarification on the implied URL parsing related to: https://github.com/indieweb/php-mf2/issues/110

#
gRegorLove "else if .h-x>a[href]:only-of-type:not[.h-*], then use that [href] for url" from http://microformats.org/wiki/microformats2-parsing##if+no+explicit+%22url%22+property

#
gRegorLove ".h-x > a[href]:only-of-type" means .h-x has only one direct child <a>, correct?

#
gRegorLove Meaning, :only-of-type doesn't restrict sibling elements from having <a> as children

#
gRegorLove See the github issue. The second link is inside a sibling <b>

#
gRegorLove Maybe a product of weird MediaWiki formatting

#
gRegorLove (Speaking of edge cases, ben_thatmustbeme. Heh)

#
gRegorLove tantek: So is the parser technically correct in this example?

#
ben_thatmustbeme I haven't gotten to much of the implied properties part yet. May get messy, not sure yet

#
gRegorLove selectoracle is your friend when you get there: http://tux.theopalgroup.com/cgi-bin/css3explainer/selectoracle.py

#
gRegorLove mf2py also returns the implied URL for that HTML

#
gRegorLove And microformat-shiv

#
ben_thatmustbeme I suppose it would, assuming the > means direct descendant in the html

#
ben_thatmustbeme I suppose it would be correct

#
ben_thatmustbeme And it doesn't mean descendants that are not inside sub [h,p,e,dt,u]-*

#
gRegorLove Yeah, it's direct descendant afaik.

#
gRegorLove Reasoning probably being to prevent really unexpected implied values

#
gRegorLove Yeah, moving the </b> to the end gives no implied URL

#
ben_thatmustbeme So the conclusion is, stop using <b> tags already

#
ben_thatmustbeme Also if it weren't direct descendants the parsing would get way more messy
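
The direct-child reading of ".h-x>a[href]:only-of-type:not[.h-*]" discussed above can be sketched in a few lines. This is a minimal illustration, not code from any of the parsers mentioned: the markup, the hrefs and the function names are hypothetical, and it only covers this one rule of the implied URL algorithm.

```python
import xml.etree.ElementTree as ET


def has_root_class(el):
    """True if the element carries any h-* root class name."""
    return any(c.startswith("h-") for c in el.get("class", "").split())


def implied_url(root):
    """Return the href matched by .h-x>a[href]:only-of-type:not[.h-*], or None."""
    # Iterating an Element yields its direct element children only.
    anchors = [child for child in root if child.tag == "a"]
    if len(anchors) != 1:
        return None  # zero or several direct <a> children: no match
    a = anchors[0]
    if a.get("href") is None or has_root_class(a):
        return None
    return a.get("href")


# Hypothetical markup: the second link sits inside a sibling <b>.
sibling_b = ET.fromstring(
    '<div class="h-x"><a href="/first">first</a> '
    '<b><a href="/second">second</a></b></div>'
)
# Hypothetical markup: a single <b> wraps both links.
wrapping_b = ET.fromstring(
    '<div class="h-x"><b><a href="/first">first</a> '
    '<a href="/second">second</a></b></div>'
)

print(implied_url(sibling_b))   # /first
print(implied_url(wrapping_b))  # None
```

In the first snippet the link inside <b> is not a direct child, so it does not stop the other link from being the only <a> of its type. In the second snippet the root has no direct <a> child at all, so nothing is implied.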

#
KartikPrabhu gRegorLove: mf2py is giving the correct implied URL, not the one in the <b>

#
KartikPrabhu and so is pin13

#
KartikPrabhu so they seem to be playing by the parsing rules

#
KartikPrabhu if you put a u-url on the /2017/Bellingham link then they both return that link as expected

[chrisaldrich], nitot, [tamaracks], [eddie], tantek and [jeremycherfas] joined the channel
#
gRegorLove KartikPrabhu: The HTML's already been fixed to get the desired u-url explicitly. the issue appeared to be php-mf2 not following the implied u-url algorithm correctly.

#
KartikPrabhu aah ok. I was wondering if mf2py is doing it right, and I think it is

#
gRegorLove But after review, it appears it is parsing correctly, just the weird HTML didn't give the desired u-url as a result

#
KartikPrabhu yeah

#
gRegorLove All of the parsers are doing it, and it appears all it takes is moving the </b> to the end, then no implied u-url

#
KartikPrabhu yeah, that is what the parsing-algo says atm

#
gRegorLove So pretty sure there's no parsing bug. Will await tantek's confirmation to be sure.

#
KartikPrabhu also, traversing down children of h-* is going to be very annoying

#
gRegorLove Yeah, the more I looked at it, the reasoning for the very strict implied algo makes sense

#
gRegorLove short version: if you really want the property, add it explicitly :)

#
KartikPrabhu yeah I think that is true for more complex markup

#
KartikPrabhu but implied-properties are cool too :P

#
gRegorLove !tell tantek summarized the conversation on github: https://github.com/indieweb/php-mf2/issues/110

[johnholdun], [kevinmarks], nitot, [colinwalker], [jeremycherfas], [pfefferle] and tantek joined the channel
#
@rashidnoorani http://schema.org for all types of researched predefined #schemas. #gids17 #microformats. (twitter.com/_/status/857514688665579520)
#
Loqi tantek: gRegorLove left you a message 3 hours, 34 minutes ago: summarized the conversation on github: https://github.com/indieweb/php-mf2/issues/110

nitot, adactio, rodolfojcj, barpthewire, KartikPrabhu and tantek joined the channel
#
ben_thatmustbeme hmm, noticed a difference between pin13 and unmung as far as stripping whitespace

#
ben_thatmustbeme specifically the html:

#
KartikPrabhu before the <p> tag?

#
KartikPrabhu that might be due to the HTML parsers used and not the mf2 part

#
KartikPrabhu in fact pin13 removes the next line \n in the value and unmung does not

#
ben_thatmustbeme that too

#
KartikPrabhu ben_thatmustbeme: what is your HTML so I can try it on my mf2py

#
ben_thatmustbeme thanks loqi

#
ben_thatmustbeme hands loqi the dictionary entry on sarcasm

#
KartikPrabhu interesting, my mf2py installation preserves the space before <p> in html property and keeps the \n in the value property

#
KartikPrabhu ben_thatmustbeme: try it here https://kartikprabhu.com/connection/mfparser

#
ben_thatmustbeme likely some of this is due to what is considered whitespace by the language

#
ben_thatmustbeme though some don't try to strip at all, others do

#
ben_thatmustbeme or rather what the stripping function considers whitespace
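
To make the point about stripping functions concrete, here is a small Python sketch. It is not taken from pin13 or unmung, and whether this is the actual cause of the difference seen above is not established in the log; the input string and the ASCII_WHITESPACE set are assumptions chosen for illustration.

```python
# Assumed set of "ASCII whitespace" characters for the comparison.
ASCII_WHITESPACE = " \t\n\r\f"

# Hypothetical property text: starts with a no-break space (U+00A0).
raw = "\u00a0 Hello world \n"

# str.strip() with no argument removes every character str.isspace() accepts,
# which includes U+00A0.
unicode_stripped = raw.strip()

# Passing an explicit set strips only those characters, so U+00A0 stays
# and also shields the ordinary space that follows it.
ascii_stripped = raw.strip(ASCII_WHITESPACE)

print(repr(unicode_stripped))  # 'Hello world'
print(repr(ascii_stripped))    # '\xa0 Hello world'
```

Two parsers that both "strip whitespace" can therefore return different values for the same markup, depending on which definition their language or helper uses.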

#
ben_thatmustbeme trying to understand the .e-*.h-* interaction in my parser, making me rethink a few things

#
ben_thatmustbeme would that be the only time you can have anything other than type, properties, children and value?

#
ben_thatmustbeme is having an html as well

[chrisaldrich], [kevinmarks], [jeremycherfas], rodolfojcj and nitot joined the channel
#
ben_thatmustbeme i'm confused what the difference is between the name and photo sections for example

#
ben_thatmustbeme .h-x>img:only-child[alt]:not([alt=""]):not[.h-*]

#
ben_thatmustbeme vs .h-x>img[src]:only-of-type:not[.h-*]

#
ben_thatmustbeme just getting lost in them a bit

gRegorLove, rodolfojcj and [kevinmarks] joined the channel
#
gRegorLove ben_thatmustbeme: First one means: .h-x with an img[src] as its only child where the alt is not empty and the img does not have an .h-x

#
gRegorLove Second is: .h-x with only one img as a child and the img does not have .h-x

#
ben_thatmustbeme "with an img[src]" means with an image with a src attribute

#
gRegorLove Right

#
ben_thatmustbeme okay

#
ben_thatmustbeme dang, i just wrote this as only-of-type instead of only-child

#
ben_thatmustbeme i think it was the difference in ordering that was confusing me

#
ben_thatmustbeme img:only-child[alt] vs img[src]:only-of-type

KartikPrabhu joined the channel
#
ben_thatmustbeme last questions gRegorLove to make sure i have this right,

#
ben_thatmustbeme .h-x>img:only-child[alt]:not([alt=""]):not[.h-*]

#
ben_thatmustbeme if it has more than one img tag, say 4, one has h-*, one has no alt, one has an empty alt, one has a non-empty alt and no h-*....

#
ben_thatmustbeme oh wait, only, ONLY CHILD, basically cuts that all

#
ben_thatmustbeme i guess thats a question for only-of-type

#
ben_thatmustbeme but i'm just going to assume its actually only of that type, not only of that type with that attribute ...

#
gRegorLove Correct, I'm pretty sure only-of-type applies only to the selector it comes after, not the following attributes

#
KartikPrabhu yes, that's how it works in CSS too
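
The difference between the two img selectors compared above can be sketched as follows. This is an illustration under assumptions, not code from any existing parser: the helper names and the markup are hypothetical, and only the img-based rules for implied name and implied photo are covered.

```python
import xml.etree.ElementTree as ET


def is_root(el):
    """True if the element carries any h-* root class name."""
    return any(c.startswith("h-") for c in el.get("class", "").split())


def only_child(parent, tag):
    """tag:only-child: the parent's single element child, if it has that tag."""
    children = list(parent)
    if len(children) == 1 and children[0].tag == tag:
        return children[0]
    return None


def only_of_type(parent, tag):
    """tag:only-of-type: the single child with that tag, whatever else is there."""
    same = [c for c in parent if c.tag == tag]
    return same[0] if len(same) == 1 else None


def implied_name_from_img(root):
    """.h-x>img:only-child[alt]:not([alt=""]):not[.h-*]"""
    img = only_child(root, "img")
    if img is not None and img.get("alt") and not is_root(img):
        return img.get("alt")
    return None


def implied_photo_from_img(root):
    """.h-x>img[src]:only-of-type:not[.h-*]"""
    img = only_of_type(root, "img")
    if img is not None and img.get("src") is not None and not is_root(img):
        return img.get("src")
    return None


# Hypothetical markup: one <img> plus a sibling <span>.
card = ET.fromstring(
    '<div class="h-card"><img src="/me.jpg" alt="Ben"/><span>hello</span></div>'
)
print(implied_name_from_img(card))   # None
print(implied_photo_from_img(card))  # /me.jpg
```

The count in only_of_type looks at the tag alone, so an <img> without src or with an h-* class still counts against the match, which is the reading agreed on above.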

#
gRegorLove Are you using xpath in the parser?

#
ben_thatmustbeme its using nokogiri and i'm descending the tree myself

#
ben_thatmustbeme though i suppose that might make more sense huh

#
gRegorLove Maybe, not sure. Was just going to suggest php-mf2 has several of them, like in parseImpliedPhoto()

#
ben_thatmustbeme i sort of don't want to look directly at other parsers, lest it confuse me more

#
gRegorLove Haha, fair enough.

#
KartikPrabhu ben_thatmustbeme: that is actually a good idea. independently written parser might find inconsistencies in the already existing ones

#
ben_thatmustbeme *write a big pile of code to handle implied properties* *rerun tests* *number changes from 56 failures to 55 failures* *SIGH*

#
ben_thatmustbeme yeah, that was the other reason

#
KartikPrabhu ben_thatmustbeme: also please document the "space collapsing" difference you found.

#
ben_thatmustbeme sure, where?

#
KartikPrabhu err good point :P

#
KartikPrabhu ben_thatmustbeme: maybe see http://microformats.org/wiki/microformats2-parsing-issues#whitespace_collapsing_revisited ?

#
gRegorLove May be related to https://github.com/indieweb/php-mf2/issues/69? Haven't checked the HTML you're referring to

#
KartikPrabhu gRegorLove: https://ben.thatmustbe.me/static/test1.html

[colinwalker], rodolfojcj, [chrisaldrich], [eddie], [cleverdevil], tantek and [manton] joined the channel
#
ben_thatmustbeme /me wipes brow, failing on 43 of the 92 tests now but i'm only testing the v2 folder so far

#
ben_thatmustbeme pretty good progress though

#
ben_thatmustbeme https://raw.githubusercontent.com/microformats/tests/master/tests/microformats-v2/h-card/nested.html curious on this one, I don't see why the child h-card h-org has a value attribute

#
KartikPrabhu ben_thatmustbeme: all h-* get at least a value

#
KartikPrabhu so people can use value as fallback text representation for any h-* in case they don't understand the particular vocabulary

#
ben_thatmustbeme except for those in items[] ?

#
KartikPrabhu I think all h-* get a value

#
KartikPrabhu do you have an example?

#
ben_thatmustbeme the parsing for that one

#
ben_thatmustbeme also, not finding the part in the parsing spec of where it gets that value from

#
ben_thatmustbeme i see it for if .p-*.h-* etc

#
KartikPrabhu oops maybe I misspoke

#
KartikPrabhu mf2py does not give value for that markup in any h-*

#
KartikPrabhu value is for e-* things I think, so you have html property and a value property for plaintext representation

#
KartikPrabhu strange pin13 i.e. php-mf2 does give a value just like the tests!

#
ben_thatmustbeme so value is used if p-*.h-*, e-*, u-*.h-*

#
ben_thatmustbeme that section under value: is not terribly clear

#
KartikPrabhu but that is only if the child microformat is also a property

#
ben_thatmustbeme yeah

#
KartikPrabhu in this example markup it isn't

#
ben_thatmustbeme i don't see anywhere that value: should be set for children

#
KartikPrabhu right
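
The distinction reached above (a nested h-* only gets a "value" when it is also a property, while a plain child does not) can be sketched like this. It is a hedged illustration, not the code of php-mf2 or mf2py: attach_nested, property_names and the sample data are hypothetical, and the value is passed in rather than computed.

```python
PROPERTY_PREFIXES = ("p-", "u-", "dt-", "e-")


def property_names(class_attr):
    """Property names (prefix removed) found in a class attribute."""
    names = []
    for cls in class_attr.split():
        for prefix in PROPERTY_PREFIXES:
            if cls.startswith(prefix) and len(cls) > len(prefix):
                names.append(cls[len(prefix):])
    return names


def attach_nested(parent, nested, class_attr, value):
    """Attach a parsed nested h-* item to its parent item."""
    names = property_names(class_attr)
    if not names:
        # Plain child microformat: goes to "children", no "value" key added.
        parent.setdefault("children", []).append(nested)
        return
    for name in names:
        # Nested microformat that is also a property: add "value" to a copy.
        parent["properties"].setdefault(name, []).append(dict(nested, value=value))


# Hypothetical parsed item for a nested organisation card.
org = {"type": ["h-card", "h-org"], "properties": {"name": ["Example Org"]}}

as_child = {"type": ["h-card"], "properties": {}}
attach_nested(as_child, org, "h-card h-org", "Example Org")

as_property = {"type": ["h-card"], "properties": {}}
attach_nested(as_property, org, "p-affiliation h-card", "Example Org")

print("value" in as_child["children"][0])                   # False
print(as_property["properties"]["affiliation"][0]["value"])  # Example Org
```

Checking for property class names before adding "value" is the conditional that separates the two cases.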

#
KartikPrabhu might be a bug in the tests, maybe leave a !tell to tantek to confirm

nitot joined the channel
#
KartikPrabhu but then either php-mf2 is wrong or mf2py is

#
KartikPrabhu ben_thatmustbeme++ for thorough checking of mf2 tests

#
ben_thatmustbeme not sure what unmung uses

#
KartikPrabhu mf2py i am guessing

#
KartikPrabhu so it does not have the "value"

#
ben_thatmustbeme i'm basing all of this parser on the tests, so if it doesn't pass things, i'll know

#
KartikPrabhu yes, that is good. you are simultaneously checking the tests, the spec and other parsers :P

#
KartikPrabhu I think I did something like this while writing code for mf2py :P

#
KartikPrabhu but now have forgotten everything

#
ben_thatmustbeme !tell tantek hitting what is either an error in the mf2 tests and a bug in php-mf2 or something missing in the spec and a bug in mf2py. children elements seem to be getting a value: set, but not sure why. https://github.com/microformats/tests/blob/master/tests/microformats-v2/h-card/nested.html

#
ben_thatmustbeme !tell tantek h-card/nested.html parses without value for child h-org h-card in mf2py and with one via php-mf2

#
ben_thatmustbeme this might actually answer a LOT of my non-passing tests

#
ben_thatmustbeme just looking through

#
ben_thatmustbeme my only real points left to add are proper date parsing, and backcompat... i think

#
KartikPrabhu nice

#
ben_thatmustbeme this one is wrong in the other direction, p-affiliation h-card should have a value

#
ben_thatmustbeme at least the parsers seem to all agree on that one, pretty clear thats a bug in the test

#
@e_service_store SEO: Making use of microformats code. #seo #microformati #microformats #query https://www.e-service-online.com/eservice/seo-sfruttare-microformati-microformats-esempi/ (twitter.com/_/status/857694394471915520)
tantek joined the channel
#
KartikPrabhu ben_thatmustbeme: yup that one does seem like a bug

#
KartikPrabhu and now I recall the logic

#
KartikPrabhu if some h-* that you understand has a property which is a h-*2 that you don't understand, then you can use the "value" directly

#
KartikPrabhu which is also why the "value" is generated depending on the property type
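
A rough sketch of how the "value" could depend on the property type, as described above. This reflects one reading of the parsing spec's value rules and should be treated as an assumption: nested_value and the sample data are hypothetical, the fallback to the ordinary parsed value is a simplification, and the e-* case (which also carries html) is not modelled.

```python
def nested_value(prefix, nested, parsed_value):
    """Pick the "value" for a nested h-* that is also a property.

    prefix is the property prefix ("p", "u", "dt" or "e"); parsed_value is what
    the element would yield if parsed as an ordinary property of that type.
    """
    props = nested.get("properties", {})
    if prefix == "p" and props.get("name"):
        return props["name"][0]  # first name of the nested item
    if prefix == "u" and props.get("url"):
        return props["url"][0]   # first url of the nested item
    return parsed_value          # assumed fallback for every other case


# Hypothetical nested item.
nested = {
    "type": ["h-card"],
    "properties": {"name": ["Example Org"], "url": ["https://example.org/"]},
}

print(nested_value("p", nested, "Example Org (link text)"))  # Example Org
print(nested_value("u", nested, "Example Org (link text)"))  # https://example.org/
print(nested_value("dt", nested, "2017-04-27"))              # 2017-04-27
```

A consumer that understands the outer h-* but not the nested one can then read "value" as if the property were a plain string.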

tantek, [chrisaldrich], [ianmjones] and nitot joined the channel
#
gRegorLove Yeah, looks like php-mf2 is incorrectly always setting the 'value' for a nested h-*: https://github.com/indieweb/php-mf2/blob/master/Mf2/Parser.php#L870

#
KartikPrabhu gRegorLove: so do you agree this is a problem in the tests and php-mf2 and the mf2py seems to be following the spec?

[mko] joined the channel
#
gRegorLove Need to wrap it in a conditional check for mf property classes

#
gRegorLove mf2py (unmung) seems to be following the algo, no 'value' in the child.

#
KartikPrabhu ok, could you file bug on both php-mf2 and spec?

#
gRegorLove php-mf2 appears to have a bug, always adding the 'value' regardless if it's a property

#
gRegorLove Don't think there's a spec issue

#
KartikPrabhu sorry tests not spec

#
KartikPrabhu 4/5-letter wrdos are hrad

#
gRegorLove quiet

#
gRegorLove :)

#
KartikPrabhu :P

#
gRegorLove Aha, already an issue. I thought this sounded familiar. https://github.com/indieweb/php-mf2/issues/98

#
KartikPrabhu aah is there an issue on the tests?

#
gRegorLove Tests issue: https://github.com/microformats/tests/issues/58

sknebel_ and [ianmjones] joined the channel
#
KartikPrabhu cool, I thumbs-upped it

#
gRegorLove Looks like there was a similar fix for another test: https://github.com/microformats/tests/pull/53

#
gRegorLove Woo, down to 8 open issues in php-mf2

edsu joined the channel
#
gRegorLove Once I add rel-urls I can start using the test suite more seriously

rodolfojcj and tantek joined the channel