2018-03-22 UTC
[tantek], sebsel, webchat52, tantek, webchat249, webchat140, j12t and [unoabraham] joined the channel; Myth0s left the channel
# 06:06 Loqi [gRegorLove] #161 Add failing tests and fixes for #158, #160
# 06:08 Loqi gregorlove has 27 karma in this channel (227 overall)
KartikPrabhu joined the channel
# 06:12 aaronpk way too late for me to look at this right now but I will review tomorrow!
tantek, [unoabraham], edsu_, nitot, voxpelli, echarlie and [pfefferle] joined the channel
# 07:42 Zegnat Hmm, I might just have time to file a PR for the rel parsing changes today, gRegorLove! :D
# 07:43 Zegnat About time I get my name on the contributors board for the PHP code
# 07:48 Zegnat All the rel parsing seems to be inside a single method, so it is a good way to get my toes wet with the php-mf2 project
# 07:52 Zegnat Oh, gRegorLove, you didn’t need to escape the hyphens in the regex for the new classes.
# 07:53 Zegnat Hyphens only have a special meaning within character classes ([]). Though it doesn’t hurt to escape them outside, it is unnecessary.
# 07:54 Zegnat (The regex looked like the one I wrote for KartikPrabhu, except for the ^ $ and the hyphen escapes, which is why I spotted them in the first place.)
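A minimal sketch of the point about hyphens, written in Python rather than PHP. The patterns are hypothetical stand-ins for illustration and are not the actual php-mf2 regex; the hyphen rule shown is the same in both regex flavours.

```python
import re

# Hypothetical class-name patterns (assumption: not the real php-mf2 regex).
escaped = re.compile(r"^h\-[a-z]+(\-[a-z]+)*$")   # hyphens escaped
unescaped = re.compile(r"^h-[a-z]+(-[a-z]+)*$")   # hyphens left bare

# Outside a character class both spellings accept and reject the same input.
samples = ["h-card", "h-x-custom", "hcard", "h-"]
same_everywhere = all(
    bool(escaped.match(s)) == bool(unescaped.match(s)) for s in samples
)

# Inside a character class a hyphen between two characters forms a range,
# so it has to be escaped (or placed first/last) to be taken literally.
range_class = re.compile(r"^[a-c]$")     # matches a, b or c
literal_class = re.compile(r"^[a\-c]$")  # matches a, - or c
```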
tantek, [kevinmarks] and nitot_ joined the channel
# 09:28 Loqi Ok, I'll tell them that when I see them next
nitot joined the channel
# 09:53 Zegnat When I read things like this, I feel like tantek missed his calling as a rapper:
# 09:53 Zegnat if url is not in the array of the key rel-value in the rels hash then add url to the array
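The quoted spec sentence can be sketched as follows. This is only an illustration in Python; `add_rel` and the example URLs are hypothetical names, not part of php-mf2 or mf2py.

```python
def add_rel(rels, rel_value, url):
    """Add url to the array under rel_value unless it is already there."""
    urls = rels.setdefault(rel_value, [])  # create the array on first use
    if url not in urls:
        urls.append(url)
    return rels


rels = {}
# Hypothetical (rel-value, url) pairs as a parser might encounter them.
for rel_value, url in [
    ("author", "https://example.com/a"),
    ("me", "https://example.com/a"),
    ("author", "https://example.com/a"),  # duplicate, must not be added twice
]:
    add_rel(rels, rel_value, url)
```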
# 09:58 Zegnat Alright, just had to push one more fix. I think it now matches Python’s output for all the things and adheres to the spec better.
[kevinmarks] joined the channel
# 11:25 Zegnat “ "text": the text content of the element if any ” - does “any” include empty strings or not?
# 11:25 Loqi [Tantek Çelik] microformats2 parsing specification
# 11:26 sknebel mh. I don't see the value in including empty strings here
# 11:27 Zegnat Neither do I, but e.g. empty title-attributes do get included as there is no such check on there
# 11:28 Zegnat PHP and Python both support empty string for title. Python accepts an empty string for text, but PHP does not.
# 11:30 Zegnat So expected behaviour is “the textContent of the element unless that is an empty string”?
# 11:36 sknebel I'd think so. a <link> doesn't have textContent anyways, so there can be cases where there is no content right?
# 11:44 Zegnat I think textContent is an empty string on link elements
# 11:44 Zegnat Pretty sure of that, actually. I think I made them clarify that in the DOM spec :P
# 11:46 Zegnat Double checked. And yes: textContent of a link element will be an empty string. So discarding empty strings there sounds right.
barpthewire joined the channel
# 12:15 Loqi [Zegnat] #32 Clarify attribute properties added to objects in rel-urls.
# 12:40 sknebel Zegnat: sorry for bugging you about it, don't have my wiki login handy: spam to kill ^^^
# 12:42 Zegnat Wow. Only the word “SHUTDOWN” is enough to pass the captcha now!
# 12:50 Zegnat GitHub confirms it, this is my first PR on the mf2 PHP parser. I almost can’t believe that. Guess my contributions so far have just been bickering about the spec itself.
[kevinmarks] and Garbee joined the channel
# 14:47 KartikPrabhu Zegnat: I think textContent should be specified more clearly in almost all places it occurs
# 14:48 Zegnat Yes, although for the parsing of values on hyperlinks, I think it is fine to at least specify “text content” as being the true textContent property of the element
# 14:49 KartikPrabhu sure but whether empty strings are allowed or not (and before/after leading space stripping) should be mentioned
# 14:50 Zegnat Ah, I didn’t bring up space stripping on that issue I believe
# 14:50 KartikPrabhu also not sure if the "remove <style> and <script>" and "replace img" is relevant
# 14:52 Loqi [Zegnat] #32 Clarify attribute properties added to objects in rel-urls.
[manton] joined the channel
# 14:52 Zegnat Since we are making some good progress on the rel parsing now, it would be nice to get that clarified (and shipped) before the next stable versions
# 14:57 Loqi [kartikprabhu] `text` should also specify the following
1. Is empty string checked before/after stripping leading and trailing spaces i.e. is `text: " "` considered valid?
2. Should child `<style>` and `<script>` elements be dropped before?
3. Should child `...
# 15:10 KartikPrabhu it might be better in the spec to specify textContent in one place and refer it from others
# 15:10 Zegnat Huh, that must have been a weird GitHub & HTML bug. I copied this from #17. Will fix in a second
# 15:13 Zegnat I think bringing textContent out into its own chapter is something that must be done for #17, so I am willing to wait with that if we can get rel parsing clarified sooner
# 15:14 Zegnat Ha, when I click edit on my comment, the <img> shows! Well done GitHub Markdown. I’ll go put ` around it
[kevinmarks] joined the channel
# 15:45 Zegnat Who has commit access to microformats/tests ? [kevinmarks], you are in the org right? Can you check?
# 15:48 Zegnat would also like to submit a motion to grant more people access
[eddie], [cb], [tantek], KartikPrabhu, [colinwalker], j12t, nitot and tantek joined the channel
# 18:09 Zegnat Working on the aaronpk-plaintext-whitespace-variant again :D
# 18:12 Zegnat I have that testing JS implementation. But actually writing it down into an implementable (and comprehensible) spec, rather than just pointing at that code, is a different matter
# 18:28 Loqi [Zegnat] #32 Clarify attribute properties added to objects in rel-urls.
nitot, Kyle-K and KartikPrabhu joined the channel
# 20:11 KartikPrabhu Zegnat: I have +1 ed it. Maybe get gRegorLove's thoughts too and we will satisfy "change control"
# 20:11 Zegnat Yep, hoping gRegorLove will find the time to review my PR against php-mf2 anyway :)
# 20:22 KartikPrabhu Zegnat: maybe add a test example with expected output that verifies your change. I can put this change in experimental mf2py too
# 20:23 Zegnat I am not sure there are any proper tests for current expected output. So would need multiple tests to be sure to cover all cases.
# 20:24 Zegnat Currently (still) busy writing up and testing a textContent algo, so those tests will have to wait a minute
hurdygurd, tantek and nitot joined the channel
# 21:18 gRegorLove Zegnat: Re #32, do we have any real world examples of markup like `<a href="#a" rel="a" hreflang=""></a>`?
# 21:19 gRegorLove I'm mostly +1 on that issue, but not sure about the "and not an empty string" part
# 21:19 Zegnat No. That’s a synthetic example of why we would not want to keep empty strings when the document may later provide an actual value
[jeremycherfas] joined the channel
# 21:20 Zegnat Actually, it looks like Python may already be overwriting empty values, so could be we already have parser implementation on that
# 21:21 gRegorLove I'll defer to others on that, but my inclination is to keep it if it's authored.
# 21:22 Zegnat Basically I am saying “keep the first non-empty value”, rather than “keep the first value”.
# 21:22 Zegnat Please do comment with hesitations though! All is important :)
# 21:23 gRegorLove Understood. But it's discarding the first (empty) authored one. It "feels" like the parser's trying to fix a publisher mistake.
chrisaldrich joined the channel
# 21:26 Zegnat I always look at your dev version these days KartikPrabhu
# 21:28 tantek indeed, consider the author perspective before any theoretical / academic / purity perspectives
# 21:28 tantek if an author explicitly provides an empty attribute, they went to some work to do so, therefore there is likely some intent there
# 21:29 Zegnat I was considering an empty title for a URL, followed in the same document by a specified title for the same URL, to be likely not worth keeping. Instead keeping the authored value.
# 21:29 KartikPrabhu yeah it seems mf2py is discarding empty values. I thought I fixed that
# 21:29 Zegnat Yes, but we keep only one of them. Even though all of the ones authored are valid ones.
# 21:30 Zegnat So if all of the authored ones are valid, it made sense to me to at least keep the first non-empty one as the one we (already arbitrarily) pick
# 21:30 gRegorLove Is a better question: should these rel attributes store multiple values? array?
# 21:32 gRegorLove Eh, backing up on that a bit -- would want examples before asking my question.
# 21:32 Zegnat For that we will only set the `text` property for `#` to "Author". And skip "Permalink too!".
# 21:32 Zegnat <a href="#" rel="author">Author</a><a href="#" rel="bookmark">Permalink too!</a>
# 21:32 Zegnat We arbitrarily decide that the first one must be the one the author *really* wanted.
# 21:33 KartikPrabhu ha! it seems mf2py always gets the last attribute for hreflang and others which is definitely a bug
# 21:33 tantek Zegnat: no, not arbitrary, rather, it's a simple and predictable model
# 21:34 Zegnat Yes, always last value is definitely a bug, KartikPrabhu :)
# 21:35 Zegnat Where `text` will be "" and not "Permalink" if we keep empty values.
# 21:35 Zegnat <link href="#" rel="author"><a href="#" rel="bookmark">Permalink</a>
# 21:35 Zegnat I would say first non-empty is just as simple and predictable. Especially for e.g.:
# 21:36 Zegnat You only add additional keys if they are defined by properties later in the document
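The two policies under discussion ("keep the first value" versus "keep the first non-empty value") can be compared with a small Python sketch. The function names and the pre-extracted attribute dictionaries are hypothetical; the sketch also assumes the empty textContent of the `<link>` counts as a value under the first policy, which is exactly the point being debated.

```python
def merge_first(entry, attrs):
    """Keep the first value seen for each key, even an empty string."""
    for key, value in attrs.items():
        entry.setdefault(key, value)
    return entry


def merge_first_non_empty(entry, attrs):
    """Keep the first non-empty value seen for each key."""
    for key, value in attrs.items():
        if value != "" and key not in entry:
            entry[key] = value
    return entry


# Stand-in for: <link href="#" rel="author"><a href="#" rel="bookmark">Permalink</a>
# Both elements point at the same URL; only their text differs.
elements = [{"text": ""}, {"text": "Permalink"}]

first, first_non_empty = {}, {}
for attrs in elements:
    merge_first(first, attrs)
    merge_first_non_empty(first_non_empty, attrs)
```

Under the first policy `text` ends up as the empty string from the `<link>`; under the second it ends up as "Permalink".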
# 21:37 tantek Zegnat, except then you remove the ability for the author to force a blank value explicitly
# 21:37 tantek I think you're overthinking it with a theoretical example
# 21:37 tantek and IMO that's enough to reject that line of thinking
# 21:37 tantek that is, a theoretical example is insufficient to justify a change
# 21:38 tantek whereas I've just given you a theoretical *feature* that exists in the status quo. you provide no reason to remove it.
# 21:39 tantek you cannot make assumptions like "Clearly the empty string adds no information about the URL"
# 21:39 tantek the author's explicit authoring was considered, without any opinion about whether it adds information or not
# 21:41 Zegnat Except it does not add any information in this specific case. An empty string for `media` means nothing at all. While if the author specifies a correct media value later in the page we should probably tell our consumer about that useful information.
# 21:41 Loqi [kartikprabhu] #65 rel-urls use last value instead of first one
# 21:41 Zegnat But please comment on the issue so the spec change proposal can be adjusted :)
# 21:43 Zegnat This isn’t arbitrary text like a p- property, we are talking about metadata on URLs. Empty meta data has no meaning. If the author doesn’t have the correct meta data to provide, just leaving the HTML attributes off is perfectly fine.
# 21:43 Zegnat But I am happy to revise the proposal again if I am the only one who thinks this
# 21:43 KartikPrabhu actually from a debugging HTML point of view I would think an empty value would be more of a red flag
# 21:43 Zegnat And, yes, gRegorLove. I do believe my patch will get `author` there
# 21:44 tantek again you cannot assume this: "Empty meta data has no meaning." it's like saying zero has no meaning.
# 21:45 tantek Zegnat "if I am the only one who thinks this" does not matter how many people agree on a theoretical
# 21:45 tantek you should challenge yourself to propose changes for empirical reasons
# 21:45 tantek and to reject changes for theoretical reasons
# 21:47 Zegnat The resource at URL is written in language "" and of type "".
# 21:47 Zegnat I can say that because we are talking about very specific metadata here. What exactly does it mean when you say:
# 21:47 Zegnat When the HTML provides us with a specific language and type later on the page, I would want to know about these. Because they only mean something when they are defined.
# 21:47 Zegnat That’s the point I am trying to make. We are talking about very specific metadata about URLs with specific relationships.
# 21:47 KartikPrabhu Zegnat: mf2 consumers are free to neglect that as meaningless but I think the mf2 parsed output should be closer to authored HTML
# 21:48 Zegnat Then I propose it returns all the things it finds, if you want it to reflect everything authored in the HTML.
# 21:49 Zegnat Yeah, return an array, like we usually do with multiple values
# 21:49 Zegnat We aren’t keeping only the first rel-value we find for a URL either. We compound that from all the different elements as well.
# 21:50 Zegnat source order definitely makes sense for title and text. I am wondering if hreflang, media, and type are unordered sets like rel or not.
# 21:50 KartikPrabhu what was the consuming side use of the rel-urls properties again? They were added later
# 21:51 Zegnat No clue. I only ever use the rels property. Do not think I have ever consumed rel-urls. I was just working on the PHP parser for it and wanted to iron out some things I thought were inconsistent.
nitot joined the channel
# 21:57 KartikPrabhu !tell aaronpk snarfed: do you guys consume the rel-urls from the parsed mf2?
# 21:57 Loqi Ok, I'll tell them that when I see them next
nitot joined the channel
# 22:05 Zegnat “no need for array for "name"/textContent - since there is always only one at most” - I don’t understand this argument
# 22:06 Zegnat The spec incorporated adding more properties from links encountered later in the document. So it was already known that there could be multiple values.
[kevinmarks] joined the channel
# 22:15 Loqi Ok, I'll tell them that when I see them next
# 22:24 [kevinmarks] The rel-urls use case was xfn where you have <a href="http://tantek.com" rel="friend colleague met" >t</a> and you don't want to have to walk the rels and collate urls yourself
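A rough Python sketch of that use case: collating rel values per URL once, so a consumer can look a URL up directly. `build_rel_urls` and its tuple input are hypothetical, and the ordering and de-duplication of values are assumptions here, not a statement of what the parsing spec requires.

```python
def build_rel_urls(links):
    """Collate rel values (and the first text seen) per URL."""
    rel_urls = {}
    for url, rel_attr, text in links:
        entry = rel_urls.setdefault(url, {"rels": []})
        for value in rel_attr.split():
            if value not in entry["rels"]:
                entry["rels"].append(value)
        entry.setdefault("text", text)  # assumption: first text wins
    return rel_urls


# Stand-in for: <a href="http://tantek.com" rel="friend colleague met">t</a>
links = [("http://tantek.com", "friend colleague met", "t")]
rel_urls = build_rel_urls(links)

# A consumer can now ask "how am I related to this URL?" in one lookup.
relationship = rel_urls["http://tantek.com"]["rels"]
```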
# 22:25 KartikPrabhu [kevinmarks]: yes. but why only get the first value? why not the array of all of them
# 22:26 Loqi zegnat has 15 karma in this channel (187 overall)
# 22:29 KartikPrabhu it is still a departure from mf2 conventions where almost everything is an array
[cb] and chrisaldrich joined the channel
kaushalmodi and KartikPrabhu joined the channel
[tantek] joined the channel
# 23:35 [tantek] KartikPrabhu the parsed rels were added afterwards, as kevinmarks notes, for specific use cases and real world examples
# 23:35 [tantek] So I’m going to oppose all theoretical proposed changes to anything rel related, especially if reasoned from a “consistencies in the [parser] code” perspective. That’s very bad plumbing-centric reasoning
# 23:37 [tantek] Seriously if you don’t have a real world example that you need to consume and the current parser spec is failing you, please stop proposing such changes. It’s a waste of time to pursue theoretical purity
# 23:37 [tantek] That’s math/philosophy, not science. And here we are doing science
# 23:38 [tantek] (As any spec / code that deals with actual human published data / content should )