#microformats 2017-06-27

2017-06-27 UTC
KartikPrabhu, j_juran, [miklb], tantek, [grantcodes], gRegorLove, AngeloGladding, barpthewire, [pfefferle], nitot and nitot_ joined the channel
# 08:49 
britishdress edited /get-started () "(-4014) Britishdress is a large website for dress sale online" (view diff)
# 08:55 
Zegnat shouts loudly at the mf wiki for throwing captchas in his face but letting bots do whatever they want
# 08:55 
zegnat edited /get-started (+4014) "Undo revision 66445 by [[Special:Contributions/Britishdress|Britishdress]] ([[User talk:Britishdress|Talk]])" (view diff)
j_juran, hober2, tommorris_, cheim_, [kevinmarks], Garbee, adactio, AngeloGladding, sebsel, barpthewire, tantek, [markmhendrickso, [miklb], TallTed, [eddie], gRegorLove, nitot, [cleverdevil] and nitot_ joined the channel
# 19:15 
sknebel http://microformats.org/wiki/microformats2-parsing#parsing_a_u-_property has a step that says "if there is a gotten value, return the normalized absolute URL of it"
# 19:16 
Loqi [Tantek Çelik] microformats2 parsing specification
# 19:16 
sknebel what's supposed to happen with a value that is not a valid URL?
# 19:17 
sknebel (https://github.com/tommorris/mf2py/issues/79 is a crash on a url of "http://www.southside.de]")
# 19:17 
Loqi [snarfed] #79 ValueError: Invalid IPv6 URL
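A minimal sketch of how the reported crash can be reproduced with the Python standard library alone, assuming the parser resolves URLs through `urllib.parse.urljoin` (an assumption; the log does not show mf2py's internals). `BASE_URL` and `try_resolve` are hypothetical names used only for illustration.

```python
from urllib.parse import urljoin

BASE_URL = "http://example.com/"        # hypothetical URL of the containing page
BAD_HREF = "http://www.southside.de]"   # the value quoted in the issue report


def try_resolve(base, href):
    """Return (resolved_url, error_message); exactly one of the two is None."""
    try:
        return urljoin(base, href), None
    except ValueError as exc:
        # urllib.parse rejects a netloc with an unmatched "[" or "]"
        return None, str(exc)


resolved, error = try_resolve(BASE_URL, BAD_HREF)
print(resolved, error)
```

The stray `]` in the host part is what trips the standard library's bracket check, so any parser that calls `urljoin` without catching `ValueError` would abort on this input.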
# 19:18 
gRegorLove I don't think the parser should do anything additional in that case
[pfefferle] joined the channel
# 19:25 
Zegnat It should do what it says there, normalise http://www.southside.de] the way the HTML spec wants you to, right? So how does HTML handle these faulty URLs?
# 19:27 
KartikPrabhu Zegnat: I don't think there is any URL validation inside HTML
# 19:27 
sknebel Zegnat: that's what I'm trying to figure out. I think it returns an error case, which has to be handled upstream somehow, but I'm not all that familiar with this stuff
# 19:28 
sknebel gRegorLove: not doing anything additional means what? return the url unmodified? abort processing the property?
# 19:28 
KartikPrabhu sknebel: I would return it as is without doing anything
# 19:28 
Zegnat It wouldn’t surprise me if HTML says that it isn’t relative (because it has a scheme) and just returns it as is, as an already absolute URL that needs no processing
# 19:28 
gRegorLove Yeah, I would return "http://www.southside.de]"
# 19:29 
Zegnat But I am not 100% sure what the HTML spec says. And I can’t play D&D and read the spec at the same time.
# 19:30 
gRegorLove The HTML has <a href="http://www.southside.de]"> so the URL normalization shouldn't be a factor
# 19:31 
gRegorLove I mean, every URL is being run through the normalization method I'm sure; mf2py just needs to not die if it's an invalid URL
# 19:32 
sknebel that's the question. what's the "normalized absolute URL" of something that isn't a URL
# 19:33 
Zegnat Per spec the question is what HTML resolves "http://www.southside.de]" to. I imagine HTML resolves it to that literal string, so the parser should return the literal string as well.
# 19:34 
gRegorLove mf2py should see it starts with http:// or https:// and if it does, just return the value
# 19:34 
gRegorLove Otherwise, normalize
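The suggestion above could be sketched roughly as follows. This is only an illustration, not mf2py's actual code: `resolve_u_value` is a hypothetical helper, the prefix check is made case-insensitive as an extra assumption, and `urllib.parse.urljoin` stands in for "normalize".

```python
from urllib.parse import urljoin


def resolve_u_value(value, base_url):
    """Pass http(s) values through untouched; resolve everything else.

    Hypothetical helper illustrating the proposal from the chat.
    """
    # Assumption: compare case-insensitively so "HTTP://..." also counts.
    if value.lower().startswith(("http://", "https://")):
        return value
    # Otherwise treat the value as relative to the containing document.
    return urljoin(base_url, value)


print(resolve_u_value("http://www.southside.de]", "http://example.com/"))
print(resolve_u_value("/about", "http://example.com/post/1"))
```

Because the malformed value never reaches `urljoin`, it comes back unchanged instead of raising.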
# 19:36 
sknebel normalization is more than just resolving relative links
# 19:37 
sknebel and the html5 specification (which relies on the whatwg url spec) seems to say "return an error"
# 19:37 
sknebel ( https://w3c.github.io/html/infrastructure.html#infrastructure-urls )
# 19:39 
sknebel but ok, general votes towards "just pass crap through", I guess a consumer has to expect that anyways since other steps don't care about the URL-ness of things
[miklb] joined the channel
# 19:39 
gRegorLove Good point, hadn't considered other normalization, like making sure scheme is lowercase.
# 19:39 
Zegnat yeah, this is the opposite of dt-. vcp of u- never gets normalised, vcp of dt- normalises more than other ways to provide the value ...
# 19:42 
Zegnat Did we find out that vcp for dt- validation returned "" (empty string) if no values were found? Would that be expected for failed u- validation?
# 19:43 
gRegorLove mf2py and php-mf2 don't normalize scheme capitalization. That could be a nice-to-have.
# 19:43 
gRegorLove Not sure about the others here: https://en.wikipedia.org/wiki/URL_normalization#Normalizations_that_preserve_semantics
# 19:44 
gRegorLove I don't think the *lack* of those causes problems for consumers, though
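For illustration, scheme-case normalization on its own can be done without fully parsing the URL, which keeps malformed values from raising. A minimal sketch, assuming the scheme syntax from RFC 3986 (a letter followed by letters, digits, `+`, `-` or `.`); `lowercase_scheme` and `SCHEME_RE` are hypothetical names, not part of mf2py or php-mf2.

```python
import re

# RFC 3986 scheme: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), followed by ":"
SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")


def lowercase_scheme(url):
    """Lowercase only the scheme; leave the rest of the string untouched."""
    match = SCHEME_RE.match(url)
    if match is None:
        return url  # no scheme found, e.g. a relative reference
    return match.group(1).lower() + url[match.end(1):]


print(lowercase_scheme("HTTP://Example.com/Path"))
print(lowercase_scheme("HTTPS://www.southside.de]"))
```

Only the scheme is touched here; the other semantics-preserving normalizations on the linked Wikipedia page are left out.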
# 19:47 
sknebel yeah. I think since you can't rely on the output, just passing it through and letting the consumer blow up if it doesn't handle it probably is acceptable
# 19:51 
Zegnat I wonder if that should get clarified in the mf2 spec. Change away from “return the normalized absolute URL of it” to something like “if the URL does not start with a scheme, apply the containing document's language's rules for resolving relative URLs, else return the gotten value”?
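One way to read the proposed wording, as a hedged sketch rather than spec text: detect any scheme (not just http/https), return such values as gotten, and resolve the rest against the document's base URL. `resolve_gotten_value` and `SCHEME_RE` are hypothetical names, and the `ValueError` fallback is an added assumption in line with the "pass it through" consensus in the chat.

```python
import re
from urllib.parse import urljoin

# Assumed scheme syntax, per RFC 3986: a letter, then letters/digits/"+"/"-"/"."
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def resolve_gotten_value(value, base_url):
    """Return values that start with a scheme as-is; resolve relative ones."""
    if SCHEME_RE.match(value):
        return value
    try:
        return urljoin(base_url, value)
    except ValueError:
        # Assumption: a relative value that cannot be resolved is passed through.
        return value


base = "http://example.com/a/b/c"  # hypothetical document URL
for gotten in ("http://www.southside.de]", "mailto:a@example.com", "../x", "//host]/x"):
    print(gotten, "->", resolve_gotten_value(gotten, base))
```

Unlike a plain http/https prefix check, this also leaves `mailto:` and other non-HTTP schemes alone.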
# 19:58 
sknebel possibly. gotta compare what exactly the various parsers do and don't do normalization-wise. and of course read the HTML stuff again when I'm more awake
# 19:59 
Zegnat I don’t think the original intent of the mf2 spec was to have to study the HTML spec though. So might be easier to clarify within mf2 spec.
# 20:06 
gRegorLove Agreed, clarification between normalization and absolutizing would be good for the mf2 spec.
# 20:06 
gRegorLove pages tantek
# 20:06 
sknebel I can try to summarize it in a GH issue tomorrow
# 20:07 
Zegnat That would be nice sknebel :)
# 20:07 
Zegnat returns to D&D
tantek, [eddie], Garbee, [shanehudson], KartikPrabhu, [cleverdevil], [manton], [chrisaldrich] and [miklb] joined the channel