#social 2018-04-30
2018-04-30 UTC
xmpp-social, fr33domlover, vasilakisfil, timbl, Loqi, mahmudov and cdchapman joined the channel
# fr33domlover Hi, where do I ask stuff about URIs?
# fr33domlover I found something weird
# fr33domlover And Idk what to do about it
cdchapman, prydt and cwebber2 joined the channel
# fr33domlover csarven, hmm ok i guess i'll just ask here
# fr33domlover suppose there is a base URI scheme:/p/a/t/h
# fr33domlover Now I want to resolve the relative URI against the base
# fr33domlover The result is: scheme://y
# fr33domlover In other words 'y' used to be a path component but now 'y' is the domain name
# fr33domlover Is this correct? Is this a problem? The meaning of the URI changed
# fr33domlover Ah oops I did it incorrectly :p
# fr33domlover the base URI would be scheme:
# fr33domlover resolve "scheme:" "/x/..//y" ===> "scheme://y"
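A minimal sketch of the resolution step described above, assuming RFC 3986 sections 5.2.4 (remove_dot_segments) and 5.3 (recomposition). The function names are illustrative and not taken from any library; only the case of a reference that starts with "/" and a base without an authority is covered.

```python
def remove_dot_segments(path: str) -> str:
    """Sketch of the remove_dot_segments routine from RFC 3986 section 5.2.4."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]          # "/./" is replaced by "/"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]          # "/../" is replaced by "/"
            if output:
                output.pop()         # drop the last output segment
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any) to the output.
            cut = path.find("/", 1)
            if cut == -1:
                output.append(path)
                path = ""
            else:
                output.append(path[:cut])
                path = path[cut:]
    return "".join(output)


def recompose(scheme, authority, path):
    """Sketch of RFC 3986 section 5.3; authority is None when undefined."""
    result = scheme + ":"
    if authority is not None:
        result += "//" + authority
    return result + path


# Base "scheme:" has no authority; the reference "/x/..//y" starts with "/",
# so the target path is just the reference path with dot segments removed.
target_path = remove_dot_segments("/x/..//y")
resolved = recompose("scheme", None, target_path)
print(target_path)  # //y
print(resolved)     # scheme://y
```

The empty segment between the two slashes survives dot-segment removal, which is what lets the path "//y" line up with the authority position after recomposition.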
# aaronpk here are a bunch of test cases if you want to test your relative URL resolver https://github.com/indieweb/php-mf2/blob/master/tests/Mf2/URLTest.php
# fr33domlover aaronpk, I resolved it manually using the URI spec
# fr33domlover And that's what I got
# fr33domlover It's weird
# fr33domlover aaronpk, if you have a moment to double check me please do :)
# fr33domlover aaronpk, a base URI is not required to have a host, and neither is a relative URI
# aaronpk i believe the situation you're encountering is this: "If a URI does not contain an authority component, then the path cannot begin with two slash characters" https://tools.ietf.org/html/rfc3986#section-3.3
# fr33domlover aaronpk, in the algorithm it doesn't have a host
# fr33domlover aaronpk, in the algorithm you take "scheme:" and append a path "//y" to it, there's no domain name
# fr33domlover The problem is,
# fr33domlover That after you append,
# fr33domlover scheme://y just happens to look like y is a domain name
# aaronpk "host" is part of the "authority" http://tantek.com/2011/238/b1/many-ways-slice-url-name-pieces
# fr33domlover and the spec doesn't address that case
# fr33domlover aaronpk, scheme://y is constructed without a host, but, when you *parse* this URI now, 'y' gets parsed as the host
# fr33domlover aaronpk, nope if you parse it, scheme://y has a scheme "scheme" and authority "y" and empty path and no query or fragment
# fr33domlover it's like https://fsf.org
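To illustrate the parsing side of this, here is a small sketch that splits a URI with the regular expression given in RFC 3986 Appendix B. The helper name `split_uri` is made up for this example.

```python
import re

# Regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri: str) -> dict:
    """Split a URI reference into its five components (None when absent)."""
    m = URI_RE.match(uri)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


parts = split_uri("scheme://y")
print(parts["scheme"])     # scheme
print(parts["authority"])  # y
print(repr(parts["path"]))  # ''
```

A parser has no way to tell that "//y" was meant as a path: the string is indistinguishable from one built with "y" as the authority.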
# fr33domlover aaronpk, I agree it's weird, but that's what the URI spec makes my code do, so I'm wondering whether to throw some error there instead of following the spec
# fr33domlover aaronpk, real problem
# fr33domlover aaronpk, I don't actually have empty segments in my paths but my code should handle them because the spec allows them, I want my code to correctly handle all test cases :p
# fr33domlover aaronpk, the RFC says empty path segments are allowed, and the ABNF allows them too, and indeed at the same time, it doesn't contain a single example involving empty path segments
# fr33domlover I'm with you, it should be an error to have URIs like that, the spec should say a word about it
# fr33domlover Updating the ABNF for that would be totally super ugly, if even sanely possible, but at least it should recommend or require that applications refuse to fetch resources when a URI is resolved in that way
# fr33domlover and throw an error instead
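One way the "throw an error" idea could look, as a sketch only: a recomposition step that refuses to build a URI whose path starts with "//" when there is no authority, following the RFC 3986 section 3.3 sentence quoted earlier. The name `recompose_strict` and the choice of `ValueError` are assumptions for this example, not something the spec prescribes.

```python
def recompose_strict(scheme, authority, path):
    """Recompose scheme, authority and path, rejecting the ambiguous case.

    authority is None when the URI has no authority component.
    """
    if authority is None and path.startswith("//"):
        # RFC 3986 section 3.3: without an authority, the path cannot
        # begin with two slash characters.
        raise ValueError("path %r would be parsed as an authority" % path)
    result = scheme + ":"
    if authority is not None:
        result += "//" + authority
    return result + path


print(recompose_strict("scheme", None, "/y"))  # scheme:/y
try:
    recompose_strict("scheme", None, "//y")
except ValueError as err:
    print("rejected:", err)
```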
# fr33domlover Idk why the URI spec even allows empty path segments
# fr33domlover When are they ever even useful
# fr33domlover That's not the only weird case though
# fr33domlover removal of dot segments from arbitrary URIs can cause problems too, e.g. "./https:/y" becomes "https:/y" which is an invalid relative URI because the first path segment is not allowed to contain a colon
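RFC 3986 section 4.2 notes that a first segment containing a colon must be preceded by a dot segment such as "./". A sketch of a guard that restores that prefix after normalization, with the hypothetical name `guard_relative_path`:

```python
def guard_relative_path(path: str) -> str:
    """Prefix "./" when the first segment of a relative path contains ":".

    Without the prefix, the text before the colon would be read as a scheme.
    """
    first_segment = path.split("/", 1)[0]
    if ":" in first_segment:
        return "./" + path
    return path


print(guard_relative_path("https:/y"))  # ./https:/y
print(guard_relative_path("a/b:c"))     # a/b:c
```

A path that starts with "/" has an empty first segment, so it passes through unchanged.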
# aaronpk probably worth reading through https://url.spec.whatwg.org/ as that reflects modern usage of URLs in browsers and such
# fr33domlover That spec contains a parsing algorithm instead of syntax rules