#microformats 2018-03-22

2018-03-22 UTC
[tantek], sebsel, webchat52, tantek, webchat249, webchat140, j12t and [unoabraham] joined the channel; Myth0s left the channel
# 06:06 
gRegorLove aaronpk: vendor prefix fix! https://github.com/indieweb/php-mf2/pull/161
# 06:06 
Loqi [gRegorLove] #161 Add failing tests and fixes for #158, #160
# 06:08 
aaronpk ooh
# 06:08 
aaronpk gRegorLove++
# 06:08 
Loqi gregorlove has 27 karma in this channel (227 overall)
KartikPrabhu joined the channel
# 06:12 
aaronpk way too late for me to look at this right now but I will review tomorrow!
tantek, [unoabraham], edsu_, nitot, voxpelli, echarlie and [pfefferle] joined the channel
# 07:42 
Zegnat Hmm, I might just have time to file a PR for the rel parsing changes today, gRegorLove! :D
# 07:43 
Zegnat About time I get my name on the contributors board for the PHP code
# 07:44 
gRegorLove Nice!
# 07:48 
Zegnat All the rel parsing seems to be inside a single method, so it is a good way to get my toes wet with the php-mf2 project
# 07:52 
Zegnat Oh, gRegorLove, you didn’t need to escape the hyphens in the regex for the new classes.
# 07:53 
Zegnat Hyphens only have a special meaning within character classes ([]). It doesn’t hurt to escape them outside one, but it is unnecessary.
# 07:54 
Zegnat (The regex looked like the one I wrote for KartikPrabhu, except for the ^ $ and the hyphen escapes, which is why I spotted them in the first place.)
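
A minimal sketch of the hyphen point, written in Python rather than PHP. The class-name pattern below is a hypothetical stand-in, not the regex from the php-mf2 pull request; it only illustrates that an escaped and an unescaped hyphen behave the same outside a character class, while inside one the hyphen can form a range.

```python
import re

# Hypothetical class-name patterns (assumption: NOT the actual php-mf2 regex).
escaped = re.compile(r"^h\-[a-z]+(\-[a-z]+)*$")   # hyphens escaped
unescaped = re.compile(r"^h-[a-z]+(-[a-z]+)*$")   # hyphens left bare

samples = ["h-card", "h-x-custom", "h-", "hcard", "h-Card"]

# Outside a character class both spellings accept and reject the same inputs.
same_behaviour = all(
    bool(escaped.match(s)) == bool(unescaped.match(s)) for s in samples
)

# Inside a character class a hyphen between two characters forms a range,
# so escaping it there does change the meaning.
range_class = re.compile(r"[a-c]")     # matches a, b, or c
literal_class = re.compile(r"[a\-c]")  # matches a, -, or c
```
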
tantek, [kevinmarks] and nitot_ joined the channel
# 09:28 
Zegnat !tell gRegorLove,aaronpk reviews welcomed: https://github.com/indieweb/php-mf2/pull/162
# 09:28 
Loqi Ok, I'll tell them that when I see them next
# 09:28 
Loqi [Zegnat] #162 Improve rel parsing
nitot joined the channel
# 09:53 
Zegnat When I read things like this, I feel like tantek missed his calling as a rapper:
# 09:53 
Zegnat if url is not in the array of the key rel-value in the rels hash then add url to the array
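
The quoted spec sentence can be sketched roughly as follows. This is an illustrative Python version, not code from php-mf2 or mf2py; the helper name `add_rel` and the example URLs are assumptions made for the sketch.

```python
def add_rel(rels, rel_value, url):
    """Add url to the array stored under rel_value, unless it is already there."""
    urls = rels.setdefault(rel_value, [])
    if url not in urls:
        urls.append(url)


# Hypothetical (rel attribute, resolved URL) pairs in document order.
links = [
    ("me author", "https://example.com/"),
    ("me", "https://example.com/"),   # duplicate for "me", so it is skipped
    ("me", "https://example.org/"),
]

rels = {}
for rel_attr, url in links:
    for rel_value in rel_attr.split():
        add_rel(rels, rel_value, url)
```
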
# 09:58 
Zegnat Alright, just had to push one more fix. I think it now matches Python’s output for all the things and adheres to the spec better.
# 10:15 
Zegnat Hmm, I can’t request reviewers?
[kevinmarks] joined the channel
# 11:25 
Zegnat Spec question: http://microformats.org/wiki/microformats2-parsing#parse_a_hyperlink_element_for_rel_microformats
# 11:25 
Zegnat “ "text": the text content of the element if any ” - does “any” include empty strings or not?
# 11:25 
Loqi [Tantek Çelik] microformats2 parsing specification
# 11:25 
Zegnat sknebel, maybe an idea? ^^^
# 11:26 
sknebel mh. I don't see the value in including empty strings here
# 11:27 
Zegnat Neither do I, but e.g. empty title-attributes do get included as there is no such check on there
# 11:27 
sknebel so I'd probably not allow empty strings here
# 11:28 
Zegnat PHP and Python both support empty string for title. Python accepts an empty string for text, but PHP does not.
# 11:30 
Zegnat So expected behaviour is “the textContent of the element unless that is an empty string”?
# 11:36 
sknebel I'd think so. a <link> doesn't have textContent anyways, so there can be cases where there is no content right?
# 11:44 
Zegnat I think textContent is an empty string on link elements
# 11:44 
Zegnat Pretty sure of that, actually. I think I made them clarify that in the DOM spec :P
# 11:46 
Zegnat Double checked. And yes: textContent of a link element will be an empty string. So discarding empty strings there sounds right.
# 12:08 
ahliqiu edited /get-started () "(-3477)"
barpthewire joined the channel
# 12:15 
Zegnat sknebel, I would love a review on https://github.com/microformats/microformats2-parsing/issues/32
# 12:15 
Loqi [Zegnat] #32 Clarify attribute properties added to objects in rel-urls.
# 12:40 
sknebel Zegnat: sorry for bugging you about it, don't have my wiki login handy: spam to kill ^^^
# 12:41 
zegnat edited /get-started (+3477) "Undo revision 66729 by [[Special:Contributions/Ahliqiu|Ahliqiu]] ([[User talk:Ahliqiu|Talk]]) - spam"
# 12:42 
Zegnat Wow. Only the word “SHUTDOWN” is enough to pass the captcha now!
# 12:42 
sknebel I don't think it checks at all for those
# 12:42 
Zegnat I couldn’t submit with an empty field
# 12:42 
Zegnat I didn’t try a single character only
# 12:50 
Zegnat GitHub confirms it, this is my first PR on the mf2 PHP parser. I almost can’t believe that. Guess my contributions so far have just been bickering about the spec itself.
[kevinmarks] and Garbee joined the channel
# 14:47 
KartikPrabhu Zegnat: I think textContent should be specified more clearly in almost all places it occurs
# 14:48 
Zegnat Yes, although for the parsing of values on hyperlinks, I think it is fine to at least specify “text content” as being the true textContent property of the element
# 14:49 
KartikPrabhu sure but whether empty strings are allowed or not (and before/after leading space stripping) should be mentioned
# 14:50 
Zegnat Ah, I didn’t bring up space stripping on that issue I believe
# 14:50 
Zegnat Feel free to comment with that!
# 14:50 
KartikPrabhu also not sure if the "remove <style> and <script>" and "replace img" is relevant
# 14:51 
KartikPrabhu looking for issue
# 14:52 
Zegnat https://github.com/microformats/microformats2-parsing/issues/32
# 14:52 
Loqi [Zegnat] #32 Clarify attribute properties added to objects in rel-urls.
[manton] joined the channel
# 14:52 
Zegnat Since we are making some good progress on the rel parsing now, it would be nice to get that clarified (and shipped) before the next stable versions
# 14:57 
KartikPrabhu https://github.com/microformats/microformats2-parsing/issues/32#issuecomment-375336366
# 14:57 
Loqi [kartikprabhu] `text` should also specify the following
1. Is empty string checked before/after stripping leading and trailing spaces i.e. is `text: "   "` considered valid?
2. Should child `<style>` and `<script>` elements be dropped before?
3. Should child `...
# 15:05 
Zegnat Thanks KartikPrabhu! Updated my proposal :)
# 15:09 
KartikPrabhu Zegnat: your alt and src rule is missing <img>
# 15:10 
KartikPrabhu it might be better in the spec to specify textContent in one place and refer it from others
# 15:10 
Zegnat Huh, that must have been a weird GitHub & HTML bug. I copied this from #17. Will fix in a second
# 15:10 
Zegnat That’s probably true
# 15:13 
Zegnat I think bringing textContent out into its own chapter is something that must be done for #17, so I am willing to wait with that if we can get rel parsing clarified sooner
# 15:14 
Zegnat Ha, when I click edit on my comment, the <img> shows! Well done GitHub Markdown. I’ll go put ` around it
# 15:15 
Zegnat (updated)
[kevinmarks] joined the channel
# 15:45 
Zegnat Who has commit access to microformats/tests ? [kevinmarks], you are in the org right? Can you check?
# 15:48 
Zegnat would also like to submit a motion to grant more people access
[eddie], [cb], [tantek], KartikPrabhu, [colinwalker], j12t, nitot and tantek joined the channel
# 18:09 
Zegnat Working on the aaronpk-plaintext-whitespace-variant again :D
# 18:10 
aaronpk the what now
# 18:10 
Loqi aaronpk: Zegnat left you a message 8 hours, 42 minutes ago: reviews welcomed: https://github.com/indieweb/php-mf2/pull/162
# 18:11 
Zegnat https://pin13.net/mf2/whitespace.html
# 18:11 
aaronpk ah nice
# 18:12 
Zegnat I have that testing JS implementation. But actually writing it down into an implementable (and comprehensible) spec, rather than just pointing at that code, is a different matter
# 18:28 
Zegnat Hmm, I am still not 100% clear on the process. Should I implement https://github.com/microformats/microformats2-parsing/issues/32 in my rel patch for php-mf2 so the spec can be updated to reflect the parser, or do I wait for more reactions on the parsing spec issue?
# 18:28 
Loqi [Zegnat] #32 Clarify attribute properties added to objects in rel-urls.
nitot, Kyle-K and KartikPrabhu joined the channel
# 20:11 
KartikPrabhu Zegnat: I have +1 ed it. Maybe get gRegorLove's thoughts too and we will satisfy "change control"
# 20:11 
Zegnat Yep, hoping gRegorLove will find the time to review my PR against php-mf2 anyway :)
# 20:22 
KartikPrabhu Zegnat: maybe add a test example with expected output that verifies your change. I can put this change in experimental mf2py too
# 20:23 
Zegnat I am not sure there are any proper tests for current expected output. So would need multiple tests to be sure to cover all cases.
# 20:24 
Zegnat Currently (still) busy writing up and testing a textContent algo, so those tests will have to wait a minute
# 20:24 
KartikPrabhu sure no worries
hurdygurd, tantek and nitot joined the channel
# 21:15 
tantek edited /Special:Log/block () "blocked [[User:Ahliqiu]] with an expiry time of infinite (account creation disabled): Spamming links to external sites: vandalism"
# 21:15 
tantek edited /Special:Log/block () "blocked [[User:Dagototo]] with an expiry time of infinite (account creation disabled): Spamming links to external sites"
# 21:16 
tantek edited /Special:Log/block () "blocked [[User:Parlaybola]] with an expiry time of infinite (account creation disabled): Spamming links to external sites"
# 21:18 
gRegorLove Zegnat: Re #32, do we have any real world examples of markup like `<a href="#a" rel="a" hreflang=""></a> <a href="#a" rel="a" hreflang="en"></a>`?
# 21:18 
Loqi gRegorLove: Zegnat left you a message 11 hours, 49 minutes ago: reviews welcomed: https://github.com/indieweb/php-mf2/pull/162
# 21:19 
gRegorLove I'm mostly +1 on that issue, but not sure about the "and not an empty string" part
# 21:19 
Zegnat No. That’s a synthetic example of why we would not want to keep empty strings when the document may later provide an actual value
[jeremycherfas] joined the channel
# 21:20 
Zegnat Actually, it looks like Python may already be overwriting empty values, so could be we already have parser implementation on that
# 21:21 
gRegorLove I'll defer to others on that, but my inclination is to keep it if it's authored.
# 21:21 
Zegnat But both values are authored.
# 21:22 
Zegnat Basically I am saying “keep the first non-empty value”, rather than “keep the first value”.
# 21:22 
Zegnat Please do comment with hesitations though! All is important :)
# 21:23 
gRegorLove Understood. But it's discarding the first (empty) authored one. It "feels" like the parser's trying to fix a publisher mistake.
chrisaldrich joined the channel
# 21:25 
KartikPrabhu Zegnat: which mf2py are you looking at for that?
# 21:26 
Zegnat I always look at your dev version these days KartikPrabhu
# 21:27 
KartikPrabhu hmm it really shouldn't discard the empty values!
# 21:28 
tantek indeed, consider the author perspective before any theoretical / academic / purity perspectives
# 21:28 
tantek if an author explicitly provides an empty attribute, they went to some work to do so, therefore there is likely some intent there
# 21:29 
Zegnat I was considering an empty title for a URL, followed in the same document by a specified title for the same URL, to be likely not worth keeping. Instead keeping the authored value.
# 21:29 
tantek every title attribute is an authored value
# 21:29 
KartikPrabhu yeah it seems mf2py is discarding empty values. I thought I fixed that
# 21:29 
Zegnat Yes, but we keep only one of them. Even though all of the ones authored are valid ones.
# 21:30 
Zegnat So if all of the authored ones are valid, it made sense to me to at least keep the first non-empty one as the one we (already arbitrarily) pick
# 21:30 
gRegorLove Is a better question: should these rel attributes store multiple values? array?
# 21:31 
tantek example of keeping only one?
# 21:32 
gRegorLove Eh, backing up on that a bit -- would want examples before asking my question.
# 21:32 
Zegnat <a href="#" rel="author">Author</a><a href="#" rel="bookmark">Permalink too!</a>
# 21:32 
Zegnat For that we will only set the `text` property for `#` to "Author". And skip "Permalink too!".
# 21:32 
Zegnat We arbitrarily decide that the first one must be the one the author *really* wanted.
# 21:33 
KartikPrabhu ha! it seems mf2py always gets the last attribute for hreflang and others which is definitely a bug
# 21:33 
tantek Zegnat: no, not arbitrary, rather, it's a simple and predictable model
# 21:34 
Zegnat Yes, always last value is definitely a bug, KartikPrabhu :)
# 21:35 
Zegnat I would say first non-empty is just as simple and predictable. Especially for e.g.:
# 21:35 
Zegnat <link href="#" rel="author"><a href="#" rel="bookmark">Permalink</a>
# 21:35 
Zegnat Where `text` will be "" and not "Permalink" if we keep empty values.
# 21:36 
KartikPrabhu it should be first value current spec right?
# 21:36 
Zegnat Yes
# 21:36 
KartikPrabhu ok
# 21:36 
Zegnat Once you set the key, you never overwrite it
# 21:36 
Zegnat You only add additional keys if they are defined by properties later in the document
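
A rough Python sketch of the "first value wins" rule as described in this exchange: once a key is set for a URL it is never overwritten, later links only contribute keys that are still missing, and rel values are combined across elements. The function name `merge_link` and the dict layout are assumptions for illustration, not the mf2py or php-mf2 implementation.

```python
def merge_link(rel_urls, url, rel_values, attrs):
    """Fold one hyperlink element into the rel-urls structure."""
    entry = rel_urls.setdefault(url, {"rels": []})
    # rel values are collected from every element pointing at the URL.
    for rel in rel_values:
        if rel not in entry["rels"]:
            entry["rels"].append(rel)
    # Other properties (text, title, hreflang, media, type) keep the first
    # value seen in document order, even when that value is an empty string.
    for key, value in attrs.items():
        if key not in entry:
            entry[key] = value
    return rel_urls


rel_urls = {}
# <a href="#" rel="author">Author</a><a href="#" rel="bookmark">Permalink too!</a>
merge_link(rel_urls, "#", ["author"], {"text": "Author"})
merge_link(rel_urls, "#", ["bookmark"], {"text": "Permalink too!"})
# <a href="#a" rel="a" hreflang=""></a> <a href="#a" rel="a" hreflang="en"></a>
merge_link(rel_urls, "#a", ["a"], {"hreflang": ""})
merge_link(rel_urls, "#a", ["a"], {"hreflang": "en"})
```

Under this rule the explicitly authored empty `hreflang` is kept, which is the behaviour the "first non-empty value" proposal would change.
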
# 21:37 
KartikPrabhu right, that check is missing from mf2py
# 21:37 
gRegorLove php-mf2 doesn't get rel-urls.rels "author": http://pin13.net/mf2/?id=20180322213622296 Does that work in your latest PR, Zegnat?
# 21:37 
tantek Zegnat, except then you remove the ability for the author to force a blank value explicitly
# 21:37 
tantek which we currently have
# 21:37 
tantek I think you're overthinking it with a theoretical example
# 21:37 
tantek and IMO that's enough to reject that line of thinking
# 21:37 
tantek that is, a theoretical example is insufficient to justify a change
# 21:38 
tantek whereas I've just given you a theoretical *feature* that exists in the status quo. you provide no reason to remove it.
# 21:39 
tantek you cannot make assumptions like "Clearly the empty string adds no information about the URL"
# 21:39 
tantek the author's explicit authoring was considered, without any opinion about whether it adds information or not
# 21:41 
Zegnat Except it does not add any information in this specific case. An empty string for `media` means nothing at all. While if the author specifies a correct media value later in the page we should probably tell our consumer about that useful information.
# 21:41 
KartikPrabhu ok bug filed https://github.com/kartikprabhu/mf2py/issues/65
# 21:41 
Loqi [kartikprabhu] #65 rel-urls use last value instead of first one
# 21:41 
Zegnat But please comment on the issue so the spec change proposal can be adjusted :)
# 21:43 
Zegnat This isn’t arbitrary text like a p- property, we are talking about metadata on URLs. Empty meta data has no meaning. If the author doesn’t have the correct meta data to provide, just leaving the HTML attributes off is perfectly fine.
# 21:43 
Zegnat But I am happy to revise the proposal again if I am the only one who thinks this
# 21:43 
KartikPrabhu actually from a debugging HTML point of view I would think an empty value would be more of a red flag
# 21:43 
Zegnat And, yes, gRegorLove. I do believe my patch will get `author` there
# 21:44 
tantek again you cannot assume this: "Empty meta data has no meaning." it's like saying zero has no meaning.
# 21:45 
tantek Zegnat "if I am the only one who thinks this" does not matter how many people agree on a theoretical
# 21:45 
KartikPrabhu no, now I am on tantek's side
# 21:45 
tantek you should challenge yourself to propose changes for empirical reasons
# 21:45 
KartikPrabhu i think the empty check should be there
# 21:45 
tantek and to reject changes for theoretical reasons
# 21:46 
KartikPrabhu should not*
# 21:47 
Zegnat I can say that because we are talking about very specific metadata here. What exactly does it mean when you say:
# 21:47 
Zegnat The resource at URL is written in language "" and of type "".
# 21:47 
Zegnat When the HTML provides us with a specific language and type later on the page, I would want to know about these. Because they only mean something when they are defined.
# 21:47 
Zegnat That’s the point I am trying to make. We are talking about very specific metadata about URLs with specific relationships.
# 21:47 
KartikPrabhu Zegnat: mf2 consumers are free to neglect that as meaningless but I think the mf2 parsed output should be closer to authored HTML
# 21:48 
Zegnat Then I propose it returns all the things it finds, if you want it to reflect everything authored in the HTML.
# 21:48 
KartikPrabhu as in not just the first value?
# 21:49 
KartikPrabhu that does seem fine to me
# 21:49 
Zegnat Yeah, return an array, like we usually do with multiple values
# 21:49 
Zegnat We aren’t keeping only the first rel-value we find for a URL either. We compound that from all the different elements as well.
# 21:49 
KartikPrabhu this time doc ordered array
# 21:50 
Zegnat source order definitely makes sense for title and text. I am wondering if hreflang, media, and type are unordered sets like rel or not.
# 21:50 
KartikPrabhu what was the consuming side use of the rel-urls properties again? They were added later
# 21:51 
Zegnat No clue. I only ever use the rels property. Do not think I have ever consumed rel-urls. I was just working on the PHP parser for it and wanted to iron out some things I thought were inconsistent.
nitot joined the channel
# 21:57 
KartikPrabhu !tell aaronpk snarfed: do you guys consume the rel-urls from the parsed mf2?
# 21:57 
Loqi Ok, I'll tell them that when I see them next
nitot joined the channel
# 22:01 
KartikPrabhu http://microformats.org/wiki/microformats2-parsing-brainstorming#more_information_for_rel-based_formats
# 22:01 
Loqi microformats2 parsing brainstorming
# 22:05 
KartikPrabhu so [kevinmarks]  would be the one to ask
# 22:05 
Zegnat “no need for array for "name"/textContent - since there is always only one at most” - I don’t understand this argument
# 22:06 
Zegnat The spec incorporated adding more properties from links encountered later in the document. So it was already known that there could be multiple values.
# 22:07 
Zegnat Should add all of this to the issue later...
[kevinmarks] joined the channel
# 22:14 
Zegnat Pfff. Done. I can finally go to bed:
# 22:14 
Zegnat https://wiki.zegnat.net/media/textparsing.html - describes and implements an algorithm that extracts a plain text value from an element. Removes STYLE/SCRIPT and replaces IMG per customary mf2 rules, but also adds \n for P and BR elements per https://pin13.net/mf2/whitespace.html
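
A very rough Python sketch of the kind of extraction described in that message, using only the standard library. It is not the algorithm published at the linked page: it assumes that STYLE and SCRIPT content is dropped, that an IMG is replaced by its `alt` (or its `src` when no `alt` attribute is present, without resolving the URL), and that BR and P boundaries become `\n`. Whitespace collapsing and the other details of the real proposal are left out, and the names `PlainText` and `plain_text` are made up for the example.

```python
from html.parser import HTMLParser


class PlainText(HTMLParser):
    """Collects a plain-text rendering of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
        elif self.skip_depth:
            return
        elif tag == "img":
            attributes = dict(attrs)
            if "alt" in attributes:
                self.parts.append(attributes["alt"] or "")
            else:
                self.parts.append(attributes.get("src") or "")
        elif tag in ("br", "p"):
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag == "p" and not self.skip_depth:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def plain_text(html):
    """Return the extracted text with leading and trailing whitespace removed."""
    parser = PlainText()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


example = plain_text(
    '<p>Hello <img alt="World" src="w.png"></p>'
    "<script>x()</script><style>p{}</style>"
    "<p>Line<br>two</p>"
)
```
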
# 22:15 
Zegnat !tell aaronpk Sneak peek of text content extraction that (should) match all of your whitespace examples: https://wiki.zegnat.net/media/textparsing.html
# 22:15 
Loqi Ok, I'll tell them that when I see them next
# 22:24 
[kevinmarks] The rel-urls use case was xfn where you have <a href="http://tantek.com" rel="friend colleague met" >t</a> and you don't want to have to walk the rels and collate urls yourself
# 22:24 
Loqi Tantek Çelik
# 22:25 
KartikPrabhu [kevinmarks]: yes. but why only get the first value? why not the array of all of them
# 22:25 
KartikPrabhu sorry first value of stuff like "text" and "media"
# 22:26 
gRegorLove Zegnat++ for textparsing algorithm!
# 22:26 
Loqi zegnat has 15 karma in this channel (187 overall)
# 22:27 
[kevinmarks] I think empirically we didn't find multiple ones
# 22:28 
KartikPrabhu hmm
# 22:29 
KartikPrabhu it is still a departure from mf2 conventions where almost everything is an array
# 22:29 
KartikPrabhu or dictionary
# 22:29 
KartikPrabhu is surprised he never noticed that
[cb] and chrisaldrich joined the channel
# 22:53 
@ChrisAldrich @kaushalmodi Also, on your site I'm seeing rel=me instead of the rel="me" with the proper quotes around me. See http://microformats.org/wiki/rel-me for examples. (twitter.com/_/status/976954987778539522)
kaushalmodi and KartikPrabhu joined the channel
# 23:35 
@ChrisAldrich @huby plain old semantic HTML with microformats in combination with the webmention protocol allow one to post "likes" to one's own website and send them to others. Here's a simple example: http://boffosocko.com/2018/01/11/1-million-webmentions/ (twitter.com/_/status/976965497429352450)
[tantek] joined the channel
# 23:35 
[tantek] KartikPrabhu the parsed rels were added afterwards, as kevinmarks notes, for specific use cases and real world examples
# 23:35 
[tantek] So I’m going to oppose all theoretical proposed changes to anything real related, especially if reasoned from a “consistencies in the [parser] code” perspective. That’s very bad plumbing-centric reasoning
# 23:36 
[tantek] rel* related
# 23:37 
[tantek] Seriously if you don’t have a real world example that you need to consume and the current parser spec is failing you, please stop proposing such changes. It’s a waste of time to pursue theoretical purity
# 23:37 
[tantek] That’s math/philosophy, not science. And here we are doing science
# 23:38 
[tantek] (As any spec / code that deals with actual human published data / content should )
# 23:44 
KartikPrabhu ok I am not very invested in this anyway. not my hill