#social 2018-04-30

2018-04-30 UTC
xmpp-social, fr33domlover, vasilakisfil, timbl, Loqi, mahmudov and cdchapman joined the channel
#
fr33domlover
Hi, where do I ask stuff about URIs?
#
fr33domlover
I found something weird
#
fr33domlover
And Idk what to do about it
#
csarven
just ask?
#
csarven
Possibly also in freenode #swig
cdchapman, prydt and cwebber2 joined the channel
#
fr33domlover
csarven, hmm ok i guess i'll just ask here
#
fr33domlover
suppose there is a base URI scheme:/p/a/t/h
#
fr33domlover
and there's a relative URI /x/..//y
#
fr33domlover
Now I want to resolve the relative URI against the base
#
fr33domlover
The result is: scheme://y
#
fr33domlover
In other words 'y' used to be a path component but now 'y' is the domain name
#
fr33domlover
Is this correct? Is this a problem? The meaning of the URI changed
#
fr33domlover
Ah oops I did it incorrectly :p
#
fr33domlover
the base URI would be scheme:
#
fr33domlover
resolve "scheme:" "/x/..//y" ===> "scheme://y"
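
The resolution above can be reproduced with a minimal sketch of the RFC 3986 reference-resolution steps (sections 5.2.2 to 5.2.4, 5.3, and the Appendix B regular expression). The names `split_uri`, `remove_dot_segments`, `merge`, and `resolve` are hypothetical helpers written for this illustration; they are not from any library and are not fr33domlover's code.

```python
import re

# Component-splitting regex from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri):
    """Return (scheme, authority, path, query, fragment); absent parts are None."""
    m = URI_RE.match(uri)
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)


def remove_dot_segments(path):
    """Sketch of RFC 3986 section 5.2.4."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]
            if output:
                output.pop()
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any) to the output.
            start = 1 if path.startswith("/") else 0
            end = path.find("/", start)
            if end == -1:
                end = len(path)
            output.append(path[:end])
            path = path[end:]
    return "".join(output)


def merge(base_authority, base_path, ref_path):
    """Sketch of RFC 3986 section 5.2.3."""
    if base_authority is not None and base_path == "":
        return "/" + ref_path
    return base_path[: base_path.rfind("/") + 1] + ref_path


def resolve(base, ref):
    """Sketch of RFC 3986 section 5.2.2 plus recomposition (section 5.3)."""
    b_scheme, b_auth, b_path, b_query, _ = split_uri(base)
    scheme, auth, path, query, frag = split_uri(ref)
    if scheme is not None:
        path = remove_dot_segments(path)
    else:
        if auth is not None:
            path = remove_dot_segments(path)
        else:
            if path == "":
                path = b_path
                if query is None:
                    query = b_query
            elif path.startswith("/"):
                path = remove_dot_segments(path)
            else:
                path = remove_dot_segments(merge(b_auth, b_path, path))
            auth = b_auth
        scheme = b_scheme
    result = ""
    if scheme is not None:
        result += scheme + ":"
    if auth is not None:
        result += "//" + auth
    result += path
    if query is not None:
        result += "?" + query
    if frag is not None:
        result += "#" + frag
    return result


resolved = resolve("scheme:", "/x/..//y")
print(resolved)             # scheme://y
print(split_uri(resolved))  # 'y' now parses as the authority, with an empty path
```

In this sketch the target has no authority and the path "//y" at recomposition time, yet re-parsing the resulting string reports "y" as the authority.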
#
aaronpk
that sounds like incorrect relative URL resolution
#
aaronpk
the hostname part of the URL doesn't change when you resolve a relative URL that contains just a path
#
aaronpk
here are a bunch of test cases if you want to test your relative URL resolver https://github.com/indieweb/php-mf2/blob/master/tests/Mf2/URLTest.php
#
fr33domlover
aaronpk, I resolved it manually using the URI spec
#
fr33domlover
And that's what I got
#
fr33domlover
It's weird
#
aaronpk
i don't believe you : )
#
aaronpk
looks like you missed the host component
#
fr33domlover
aaronpk, if you have a moment to double check me please do :)
#
fr33domlover
aaronpk, a base URI is not required to have a host, and neither is a relative URI
#
aaronpk
then the resolved URI also wouldn't have a host
#
aaronpk
i believe the situation you're encountering is this: "If a URI does not contain an authority component, then the path cannot begin with two slash characters" https://tools.ietf.org/html/rfc3986#section-3.3
#
fr33domlover
aaronpk, in the algorithm it doesn't have a host
#
fr33domlover
aaronpk, in the algorithm you take "scheme:" and append a path "//y" to it, there's no domain name
#
fr33domlover
The problem is,
#
fr33domlover
That after you append,
#
fr33domlover
scheme://y just happens to look like y is a domain name
#
Loqi
[Tantek Çelik] How many ways can you slice a URL and name the pieces?
#
fr33domlover
and the spec doesn't address that case
#
fr33domlover
aaronpk, scheme://y is constructed without a host, but, when you *parse* this URI now, 'y' gets parsed as the host
#
aaronpk
"scheme://y" isn't a valid URI in your case because the path begins with two slashes
#
fr33domlover
the problem is that a path in a URI without an authority is not allowed to start with // but if you have a URI /x/..// and you resolve the .. segment it ends up starting with //
#
fr33domlover
aaronpk, nope if you parse it, scheme://y has a scheme "scheme" and authority "y" and empty path and no query or fragment
#
fr33domlover
it's like https://fsf.org
#
aaronpk
what i'm saying is you should never have created that in the first place since it's not valid
#
aaronpk
anyway is this an actual problem or just a brain teaser?
#
fr33domlover
aaronpk, I agree it's weird, but that's what the URI spec makes my code do, so I'm wondering whether to throw some error there instead of following the spec
#
fr33domlover
aaronpk, real problem
#
aaronpk
my understanding is it should throw an error since you can't create a URI with no host where the path begins with two slashes
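
One way to act on this is a guard at recomposition time, sketched below. It assumes the components are kept separate until the final string is built; `recompose_checked` is a hypothetical helper, and raising `ValueError` is only one possible policy, not something the RFC prescribes.

```python
def recompose_checked(scheme, authority, path, query=None, fragment=None):
    """Build a URI string, refusing the case quoted from RFC 3986 section 3.3.

    Hypothetical helper: authority is None when the component is absent.
    """
    if authority is None and path.startswith("//"):
        raise ValueError(
            "path %r begins with '//' but the URI has no authority" % path
        )
    result = ""
    if scheme is not None:
        result += scheme + ":"
    if authority is not None:
        result += "//" + authority
    result += path
    if query is not None:
        result += "?" + query
    if fragment is not None:
        result += "#" + fragment
    return result


try:
    recompose_checked("scheme", None, "//y")
except ValueError as err:
    print("rejected:", err)
```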
#
fr33domlover
aaronpk, I don't actually have empty segments in my paths but my code should handle them because the spec allows them, I want my code to correctly handle all test cases :p
#
aaronpk
did you find this example in a test suite somewhere?
#
aaronpk
it isn't listed in the tests in the RFC
#
fr33domlover
aaronpk, the RFC says empty path segments are allowed, and the ABNF allows them too, and indeed at the same time, it doesn't contain a single example involving empty path segments
#
aaronpk
hm, there have been a few updates to that RFC, i wonder if any of them have addressed it
#
fr33domlover
I'm with you, it should be an error to have URIs like that, the spec should say a word about it
#
fr33domlover
Updating the ABNF for that would be totally super ugly, if even sanely possible, but at least it should recommend or require that applications refuse to fetch resources when a URI is resolved in that way
#
fr33domlover
and throw an error instead
#
fr33domlover
Idk why the URI spec even allows empty path segments
#
fr33domlover
When are they ever even useful
#
fr33domlover
That's not the only weird case though
#
fr33domlover
removal of dot segments from arbitrary URIs can cause problems too, eg. "./https:/y" becomes "https:/y" which is an invalid relative URI because the first path segment is not allowed to contain a colon
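
A small sketch of this second case, using the RFC 3986 Appendix B regular expression: once the leading "./" is gone, "https:" is read as a scheme. `scheme_and_path` and `protect_first_segment` are hypothetical helpers for illustration; putting "./" back in front follows the hint in RFC 3986 section 4.2 that such a segment must be preceded by a dot-segment.

```python
import re

# Component-splitting regex from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def scheme_and_path(ref):
    """Return (scheme, path) as parsed by the Appendix B regex."""
    m = URI_RE.match(ref)
    return m.group(2), m.group(5)


def protect_first_segment(path):
    """Prefix './' when the first segment of a relative path contains a colon."""
    first = path.split("/", 1)[0]
    if ":" in first:
        return "./" + path
    return path


print(scheme_and_path("./https:/y"))      # (None, './https:/y')
print(scheme_and_path("https:/y"))        # ('https', '/y')
print(protect_first_segment("https:/y"))  # ./https:/y
```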
#
aaronpk
probably worth reading through https://url.spec.whatwg.org/ as that reflects modern usage of URLs in browsers and such
#
fr33domlover
That spec contains a parsing algorithm instead of syntax rules