#microformats 2018-05-29

2018-05-29 UTC
pniedzielski[m], KartikPrabhu, nitot, globbot, ivc, gRegorLove, MeanderingCode, edsu, reidab, sknebel, aaronpk, GWG, ben_thatmustbeme, [Natris1979], wakest, schmarty, [tantek] and tantek_ joined the channel
KartikPrabhu, sorry, I had gone to sleep right before you posted the new proposal. I am wondering if whitespace should be trimmed on the alt. Thoughts?
I would think it should be as authored
do we trim whitespace from alt elsewhere? Maybe in p-name, no?
but here alt is supposed to be used to regenerate the image, so I would think no trimming
Gotcha. Yes. That makes sense.
gRegorLove, enjikaka, barpthewire and [jgmac1106] joined the channel
@fredrin Fred! Fred! Fred, LOOKIT! http://html5doctor.com/microformats/ http://html5doctor.com/microdata/ With HTML5 you can add extra machine readable semantics (microformats) and lightweight semantic meta-syntax! So, for webcomic page you can agree on a vocabulary and then markup
A #Call To #Search #Engines to #Reduce #Dependence on Microformats https://goo.gl/5vve6Z https://t.co/FD5xwLZaeU
[kevinmarks] joined the channel
A call to SEO handwavers to reduce dependence on hashtags
That tweet comes a little late, Google has already stopped using rel-author for search results IIRC
[cjwillcock], hober, [tantek], [pfefferle], tantek_, [schmarty] and [cleverdevil] joined the channel
Zegnat, I'll try to look into https://github.com/indieweb/php-mf2/issues/176 later today
[kartikprabhu] #176 funny parsing of 'u-photo h-cite'
barpthewire and [pfefferle] joined the channel
gRegorLove, yeah, I didn’t have much more time to look into it. I am guessing the h-cite is parsed later than the h-entry, therefore at the time the h-entry is being parsed there is no u-photo yet and it gets implied. But I don’t really know how the recursive h-* parsing is implemented
Hopefully that’ll mean something more to you than it does to me ;)
The h-cite should be parsed before the h-entry, but yeah something weird is going on.
recursion hard is
I wonder if it would be easier if the h-* parsing also triggered h-* parsing recursively. Rather than having some other method find all h-* and parse them. But I didn’t feel like refactoring between two forms of recursion right now
the recursion steps are stated in order in the parsing spec I think
[keithjgrant] and vivus joined the channel; vivus left the channel
Yes, but that’s not actually how it is implemented in PHP, KartikPrabhu. The PHP implementation doesn’t really “walk the DOM tree”. Instead it uses XPath to immediately request the nodes we are interested in.
It is a valid way of querying DOM. I just haven’t looked into the older parts of how the parser works enough to know exactly how it is firing these queries and tying them together
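As a rough illustration of the difference described here (walking the tree versus querying directly for the nodes of interest), the sketch below uses Python's standard `xml.etree.ElementTree` and is not the PHP parser's actual XPath code. The sample markup and the helper names `has_root_class`, `find_by_walking` and `find_by_query` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, loosely modelled on the 'u-photo h-cite' case.
DOC = (
    '<div class="h-entry">'
    '<p class="p-name">Hi</p>'
    '<div class="u-photo h-cite"><img src="a.jpg" /></div>'
    '</div>'
)


def has_root_class(el):
    """True if the element carries at least one h-* class name."""
    return any(c.startswith("h-") for c in el.get("class", "").split())


def find_by_walking(root):
    """Visit every element in document order and keep the h-* ones."""
    return [el for el in root.iter() if has_root_class(el)]


def find_by_query(root):
    """Ask for elements with a class attribute up front, then filter.

    ElementTree only supports a small XPath subset, so the class-token
    test is done in Python rather than in the query itself.
    """
    candidates = root.findall(".//*[@class]")  # descendants only
    if root.get("class"):
        candidates = [root] + candidates
    return [el for el in candidates if has_root_class(el)]


root = ET.fromstring(DOC)
walked = [el.get("class") for el in find_by_walking(root)]
queried = [el.get("class") for el in find_by_query(root)]
```

Both approaches find the same h-* elements here. The difference that matters for this discussion is that a query hands back a flat list, so parent/child order has to be re-established by whatever code ties the results together.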
right. I think in this case the order of the parsing matters
Zegnat, Not sure I follow. When an h-* is found, it parses it recursively. Sounds like what you're describing?
Does it? I didn’t see the hparsing method being called within the hparsing method?
It's in parse_recursive().
Yeah, so I have no idea how the h-cite parsed inside parse_recursive() makes it into the h-entry’s properties.photo array. Which is where I got stuck ;)
getRootMF() finds the root MF, then loops through them and calls itself again, so the h-cite is the first processed, then the h-entry
Ah, interesting
That process is a little inefficient I guess, since getRootMF will return the h-cite and it doesn't need to, but that method accounts for elements that have already been parsed, so they don't appear twice in the parsed results.
I expected parseH() to call parseH() when it encountered a nested microformat. Which is what I might have tried refactoring it into, so order can be relied upon. When I saw there was a completely separate logic for finding and parsing h-*, and then somehow merging them together, I decided I would file this as “gRegorLove’s problem” ;)
Or, in the worst case, a “weekend problem”
Yeah, it might be better for parseH to be recursive. I'm not sure. parseH has existed since before my time and I think I was concerned about changing it massively.
I think the important part is to figure out at what step the “merging” of parsed h-* happens. Currently it seems to happen too late, as the h-entry parser doesn’t detect a photo property and moves on to implying one.
At least I believe that is what my testing showed was going on
parseH did formerly parse a sub-mf, but it didn't account for some of the edge cases correctly iirc
and it wasn't recursion
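The alternative floated above (an h-* parser that calls itself when it meets a nested microformat, so the nested item is already attached before any property is implied) could look roughly like the sketch below. It is not how php-mf2 is implemented: `Node`, `parse_h`, `_collect` and `_implied_photo` are hypothetical names, property values are reduced to the `src` attribute, and the implied-photo rule is heavily simplified.

```python
from dataclasses import dataclass, field

PROPERTY_PREFIXES = ("p-", "u-", "e-", "dt-")


@dataclass
class Node:
    """Toy stand-in for a DOM element (hypothetical, for illustration)."""
    tag: str
    classes: list = field(default_factory=list)
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def root_types(node):
    return [c for c in node.classes if c.startswith("h-")]


def property_names(node):
    # "u-photo" -> "photo"
    return [c.split("-", 1)[1] for c in node.classes
            if c.startswith(PROPERTY_PREFIXES)]


def parse_h(node):
    """Parse one h-* element, recursing into nested h-* elements first."""
    item = {"type": root_types(node), "properties": {}}
    _collect(node, item)
    # Implying happens only after explicit and nested properties are in place.
    if "photo" not in item["properties"]:
        implied = _implied_photo(node)
        if implied is not None:
            item["properties"]["photo"] = [implied]
    return item


def _collect(node, item):
    for child in node.children:
        names = property_names(child)
        if root_types(child):
            nested = parse_h(child)  # the recursive step
            for name in names:
                item["properties"].setdefault(name, []).append(nested)
            continue  # the nested item owns its own descendants
        for name in names:
            # Simplification: every property value is just the src attribute.
            item["properties"].setdefault(name, []).append(
                child.attrs.get("src", ""))
        _collect(child, item)


def _implied_photo(node):
    # Simplified: imply from a sole direct img child, if there is one.
    imgs = [c for c in node.children if c.tag == "img"]
    return imgs[0].attrs.get("src") if len(imgs) == 1 else None


entry = Node("div", ["h-entry"], children=[
    Node("img", ["u-photo", "h-cite"], attrs={"src": "cite.jpg"}),
])
result = parse_h(entry)
```

In this toy version the h-cite ends up in the h-entry's `photo` property before the implied-photo check runs, so nothing is implied on top of it.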
[jgmac1106], tantek_ and [cleverdevil] joined the channel
↩️ I don't see a check for whether the license link uses the correct rel="license" tag to facilitate indexing. I understand: #microformats emerged in 2005, publishers still have to join us in the 21st century.
tantek_: Zegnat left you a message 1 day, 12 hours ago: Issue #6 is still open because I resolved it in the spec but I can't close the issue. Not enough rights on the repo. You'll have to close it yourself, tantek, as the person who opened it.
oops :)
done, thanks Zegnat
tantek_++ for bookkeeping on the issues :D
tantek has 15 karma in this channel (435 overall)
tantek_: almost done implementing https://github.com/microformats/microformats2-parsing/issues/2#issuecomment-392608361 in mf2py, adding tests now. Might be good to get some other devs to agree to implement? cc: gRegorLove
[kartikprabhu] Here are the proposed changes to the spec to account for `alt` attribute. Add a new section 1.5 with title "parse an `img` element for `src` and `alt`" with the steps - if `img[alt]` - return a new `{}` structure with - `value`: the `src`...
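A minimal sketch of the part of the proposal that is visible in the quoted summary, using only Python's standard `html.parser`. The summary is truncated, so several things here are assumptions rather than the proposal's wording: the `alt` key in the returned structure (inferred from the section title), the fallback to a plain `src` string when there is no `alt`, and the names `parse_img` and `ImgCollector`. Following the earlier discussion, `alt` is kept as authored with no whitespace trimming. URL resolution of `src` is left out.

```python
from html.parser import HTMLParser


def parse_img(attrs):
    """Return {'value': src, 'alt': alt} for img[alt], else just src.

    `attrs` is a plain dict of the img element's attributes.
    The 'alt' key and the plain-string fallback are assumptions.
    """
    src = attrs.get("src")
    if "alt" in attrs:
        # Keep alt exactly as authored; a valueless alt becomes "".
        return {"value": src, "alt": attrs["alt"] or ""}
    return src


class ImgCollector(HTMLParser):
    """Collects the parsed result for every img element it sees."""

    def __init__(self):
        super().__init__()
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.results.append(parse_img(dict(attrs)))


collector = ImgCollector()
collector.feed('<img src="a.jpg" alt="  a cat  "><img src="b.jpg">')
results = collector.results
```

Returning a structure only when `alt` is present keeps the output unchanged for images without `alt`, which is one reason an implementation might gate the behaviour behind a flag while it is experimental.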
Hey, slow down now. ;) jk
snarfed has been hankering for this one ;)
[jeremycherfas], [jz], tantek_, chrisaldrich and [schmarty] joined the channel
[kartikprabhu] experimental mf2py now implements the [above algorithm](https://github.com/microformats/microformats2-parsing/issues/2#issuecomment-392608361) under the flag `img_with_alt`. Feel free to try it out at https://kartikprabhu.com/connection/mfparser ...
chrisaldrich joined the channel