#microformats 2017-06-27

2017-06-27 UTC
KartikPrabhu, j_juran, [miklb], tantek, [grantcodes], gRegorLove, AngeloGladding, barpthewire, [pfefferle], nitot and nitot_ joined the channel
#
britishdress
edited /get-started () "(-4014) Britishdress is a large website for dress sale online"
#
Zegnat
shouts loudly at the mf wiki for throwing captchas in his face but letting bots do whatever they want
j_juran, hober2, tommorris_, cheim_, [kevinmarks], Garbee, adactio, AngeloGladding, sebsel, barpthewire, tantek, [markmhendrickso, [miklb], TallTed, [eddie], gRegorLove, nitot, [cleverdevil] and nitot_ joined the channel
#
sknebel
http://microformats.org/wiki/microformats2-parsing#parsing_a_u-_property has a step that says "if there is a gotten value, return the normalized absolute URL of it"
#
Loqi
[Tantek Çelik] microformats2 parsing specification
#
sknebel
what's supposed to happen with a value that is not a valid URL?
#
sknebel
(https://github.com/tommorris/mf2py/issues/79 is a crash on a url of "http://www.southside.de]")
#
Loqi
[snarfed] #79 ValueError: Invalid IPv6 URL
#
gRegorLove
I don't think the parser should do anything additional in that case
[pfefferle] joined the channel
#
Zegnat
It should do what it says there, normalise http://www.southside.de] the way the HTML spec wants you to, right? So how does HTML handle these faulty URLs?
#
KartikPrabhu
Zegnat: I don't think there is any URL validation inside HTML
#
sknebel
Zegnat: that's what I'm trying to figure out. I think it returns an error case, which has to be handled upstream somehow, but I'm not all that familiar with this stuff
#
sknebel
gRegorLove: not doing anything additional means what? return the url unmodified? abort processing the property?
#
KartikPrabhu
sknebel: I would return it as is without doing anything
#
Zegnat
It wouldn’t surprise me if HTML says that it isn’t relative (because it has a scheme) and just returns it as is, as an already absolute URL that needs no processing
#
gRegorLove
Yeah, I would return "http://www.southside.de]"
#
Zegnat
But I am not 100% sure what the HTML spec says. And can’t D&D and read spec at the same time.
#
gRegorLove
The HTML has <a href="http://www.southside.de]" so the URL normalization shouldn't be a factor
#
gRegorLove
I mean, every URL is being run through the normalization method I'm sure; mf2py just needs to not die if it's an invalid URL
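
A minimal sketch of the "just needs to not die" behaviour described above, assuming a Python parser that resolves URLs with the standard library's `urllib.parse.urljoin`. The function name `safe_absolutize` and the example base URL are hypothetical; this is not mf2py's actual code.

```python
from urllib.parse import urljoin


def safe_absolutize(value, base_url):
    """Resolve value against base_url; return it unchanged if it cannot be parsed."""
    try:
        return urljoin(base_url, value)
    except ValueError:
        # urllib.parse can raise ValueError on malformed input
        # (for example an unmatched "]" in the host part).
        return value


# Hypothetical inputs: one relative URL, and the invalid URL from the issue.
print(safe_absolutize("/about", "http://example.com/page"))
print(safe_absolutize("http://www.southside.de]", "http://example.com/page"))
```

With this approach the invalid value reaches the consumer as the literal string from the `href`, and valid relative URLs are still resolved.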
#
sknebel
that's the question. what's the "normalized absolute URL" of something that isn't a URL
#
Zegnat
Per spec the question is what HTML resolves "http://www.southside.de]" to. I imagine HTML resolves it to that literal string, so the parser should return the literal string as well.
#
gRegorLove
mf2py should see it starts with http:// or https:// and if it does, just return the value
#
gRegorLove
Otherwise, normalize
#
sknebel
normalization is more than just resolving relative links
#
sknebel
and the html5 specification (which relies on the whatwg url spec) seems to say "return an error"
#
sknebel
but ok, general votes towards "just pass crap through", I guess a consumer has to expect that anyways since other steps don't care about the URL-ness of things
[miklb] joined the channel
#
gRegorLove
Good point, hadn't considered other normalization, like making sure scheme is lowercase.
#
Zegnat
yeah, this is the opposite of dt-. vcp of u- never gets normalised, vcp of dt- normalises more than other ways to provide the value ...
#
Zegnat
Did we find out that vcp for dt- validation returned "" (empty string) if no values were found? Would that be expected for failed u- validation?
#
gRegorLove
mf2py and php-mf2 don't normalize scheme capitalization. That could be a nice-to-have.
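
As an illustration of the nice-to-have mentioned above, scheme lowercasing can be done without fully parsing the URL, so it cannot fail on invalid input. This is only a sketch: `lowercase_scheme` and `SCHEME_RE` are hypothetical names, and neither mf2py nor php-mf2 is claimed to work this way.

```python
import re

# RFC 3986 scheme: a letter followed by letters, digits, "+", "-" or ".".
SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")


def lowercase_scheme(url):
    """Lowercase only the scheme; leave the rest of the string untouched."""
    return SCHEME_RE.sub(lambda m: m.group(1).lower() + ":", url, count=1)


print(lowercase_scheme("HTTP://Example.com/Path"))
```

Values without a scheme, and values that are not valid URLs at all, pass through unchanged.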
#
gRegorLove
I don't think the *lack* of those causes problems for consumers, though
#
sknebel
yeah. I think since you can't rely on the output, just passing it through and letting the consumer blow up if it doesn't handle it probably is acceptable
#
Zegnat
I wonder if that should get clarified in the mf2 spec. Change away from “return the normalized absolute URL of it” to something like “if the URL does not start with a scheme, apply the containing document's language's rules for resolving relative URLs, else return the gotten value”?
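
The wording proposed above could be sketched roughly as follows. This assumes "starts with a scheme" means the RFC 3986 scheme syntax and uses `urllib.parse.urljoin` as a stand-in for the document language's rules for resolving relative URLs; `resolve_u_value` and `SCHEME_RE` are hypothetical names, not part of any existing parser.

```python
import re
from urllib.parse import urljoin

# RFC 3986 scheme syntax followed by ":" (assumed meaning of "starts with a scheme").
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def resolve_u_value(value, base_url):
    """Return values that already have a scheme as is; resolve the rest."""
    if SCHEME_RE.match(value):
        # Already absolute: no validation and no normalization.
        return value
    return urljoin(base_url, value)


print(resolve_u_value("photo.jpg", "http://example.com/notes/1"))
print(resolve_u_value("http://www.southside.de]", "http://example.com/notes/1"))
```

Under this rule the parser only absolutizes; any further normalization, such as scheme case, is left to the consumer.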
#
sknebel
possibly. gotta compare what exactly the various parsers do and don't do normalization wise. and of course read the HTML stuff again when I'm more awake
#
Zegnat
I don’t think the original intent of the mf2 spec was to have to study the HTML spec though. So might be easier to clarify within mf2 spec.
#
gRegorLove
Agreed, clarification between normalization and absolutizing would be good for the mf2 spec.
#
gRegorLove
pages tantek
#
sknebel
I can try to summarize it in a GH issue tomorrow
#
Zegnat
That would be nice sknebel :)
#
Zegnat
returns to D&D
tantek, [eddie], Garbee, [shanehudson], KartikPrabhu, [cleverdevil], [manton], [chrisaldrich] and [miklb] joined the channel