2016-11-02 UTC
# 00:11 tantek hey so I'm filling out the Webmention implementation report and I have a silly question
# 00:14 aaronpk loopback address is an IP address that corresponds to the same machine that is making the request, also known as localhost. Requests made to this address bypass the network hardware, so are often used when testing websites while offline. The IPv4 space reserves all 127.*.*.* IP addresses as loopback addresses.
# 00:19 tantek ok I think I'll submit without that and then add support
# 00:21 aaronpk yeah, i did the same. telegraph doesn't take that into account yet
# 00:33 tantek all 21 discovery tests (still) pass! implementation report submitted!
KevinMarks_, KevinMarks and tantek joined the channel
# 04:37 tantek hmm - I'm wondering how early to reject loopback
# 04:38 tantek I'm wondering if the intent there was ignore that endpoint, rather than not send
# 04:38 tantek i.e. if there's a second rel=webmention, why not use that?
# 04:40 tantek I want to interpret that test as: "During the discovery step, if the sender discovers the endpoint is localhost or a loopback IP address (127.0.0.0/8), it SHOULD NOT send the Webmention to that endpoint." - note the addition of "to that endpoint"
# 04:49 tantek well the implementation report says "The sender avoids sending a Webmention to a loopback address (SHOULD)"
# 04:50 tantek which means my interpretation of the spec would be consistent with what the implementation report is expecting
# 04:51 tantek I'm thinking loopback should be rejected for *any* rel value discovery
# 04:52 KartikPrabhu oh dang! I don't even know how to detect loopback URLs in python. Might steal bear's kaku code
# 04:52 tantek what? how about host is an IP address that starts with 127. ?
KevinMarks joined the channel
# 04:53 KartikPrabhu yeah I don't know how to do that! :P I am really a n00b masquerading around here
# 04:53 KartikPrabhu i just search the Web for solutions and then implement them in my code
# 04:53 tantek anyway I'm considering rejecting any 127.* IP host as a rel= href
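A minimal sketch of the "host is an IP address that starts with 127." check discussed above, assuming Python's standard `ipaddress` and `urllib.parse` modules. The function name `host_is_loopback_literal` is hypothetical and not taken from any of the projects mentioned. It only looks at IP literals in the URL, so it does not catch names such as `localhost`.

```python
import ipaddress
from urllib.parse import urlsplit


def host_is_loopback_literal(url: str) -> bool:
    """Return True if the URL's host is an IP literal in a loopback range."""
    # .hostname drops the port and any IPv6 square brackets.
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        # is_loopback covers 127.0.0.0/8 for IPv4 and ::1 for IPv6.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a DNS name); this check cannot decide.
        return False
```

A sender could run this on each discovered endpoint URL and skip only the ones for which it returns True, which matches the "to that endpoint" reading.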
# 04:54 KartikPrabhu do you consume rel-tags? I thought those were deprecated for u-category?
# 04:55 tantek I don't think it ever makes sense for a site to imply rel target that is 127.
# 04:56 tantek but yea, u-category is much better for "part of a microformat object" like in an h-card, h-event, h-entry
# 05:02 tantek I'm not sure I've ever seen it - but it could happen accidentally as a bug, a default slipping through
# 05:03 tantek thanks KartikPrabhu, I'll file an editorial issue
# 05:04 tantek KartikPrabhu: always good to have someone verify your analysis
KevinMarks joined the channel
# 05:07 tantek KartikPrabhu: it's served me quite well so far, applying more scientific rigor to web standards development
# 05:10 KartikPrabhu if I am interpreting this correctly, you mean to suggest the language so that other discovered webmention endpoints ( which are not loopbacks) can still be used
# 05:11 KartikPrabhu which brings me to my next question, do sites advertise more than one webmention endpoint?
cweiske joined the channel
KevinMarks joined the channel
KevinMarks_ and ChrisAldrich joined the channel
# 07:24 tantek alright for now I'll limit the loopback filtering to webmention and pingback discovery
# 08:00 bear KartikPrabhu i'll check on that later - right now i'm heading to bed after a crazy evening where I had to save someone from themselves because they didn't realize they were having a stroke
# 08:00 bear but yea, I think I implemented it but let's be sure
chrisaldrich1 joined the channel
# 08:14 tantek and just got loopback testing working on endpoints
tantek joined the channel
# 08:34 Loqi Ok, I'll tell them that when I see them next
cweiske joined the channel
# 09:22 tantek voxpelli: no explicit deadline per se, however, implicitly, the sooner the better, especially for Webmention, since now 500+ W3C members are looking at it and deciding on whether to vote for it to advance to Recommendation
# 09:23 voxpelli tantek: so higher priority with Webmentions? I should give KevinMarks_ PR a look then
# 09:24 tantek yes, higher priority for Webmentions at the moment
# 09:26 tantek plus if you know any W3C members, encourage them to vote YES to advance Webmention to Recommendation
mblaney joined the channel
# 11:19 mblaney tantek I just found out my landlord is a W3C member, probably pushing the relationship to bring it up though! ;-)
# 11:23 mblaney btw reverted my change to the loopback defn... had /8 masking wrong in my head :-P
# 11:39 mblaney good point cweiske. shouldn't be too hard to add.
# 11:55 mblaney the only trick being that you also need to check for optional square brackets in urls, because using a character that delimits port numbers makes sense.
# 11:58 mblaney oh and :: can collapse an arbitrary number of sections.
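A rough sketch of the two IPv6 points above (square brackets in URLs, and `::` collapsing runs of zero sections), assuming Python's standard library does the parsing rather than hand-written string matching. The helper name `ipv6_host` is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def ipv6_host(url: str):
    """Return the URL's host as an IPv6Address, or None if it is not one."""
    # urlsplit removes the brackets: "[::1]:8080" yields hostname "::1".
    host = urlsplit(url).hostname
    try:
        addr = ipaddress.ip_address(host or "")
    except ValueError:
        return None
    return addr if addr.version == 6 else None
```

Because the parsed address objects compare by value, the collapsed and fully written forms of the loopback address end up equal, so a single `is_loopback` test covers both spellings.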
# 12:36 sknebel note that you should also not accept hostnames that point in DNS to loopback
# 12:37 sknebel and that there are other ways of writing IPs than separated by dots
# 12:46 sknebel using a proxy or the system firewall IMHO is the best way of enforcing it, if you run on shared hosting it's mostly your hoster's job to protect themselves and other users against it
loicm_, nebulon and tantek joined the channel
gRegorLove joined the channel
# 16:54 aaronpk that will only catch cases where the URL given is literally 127.*.*.*
# 16:54 tantek no it catches all URLs with any 127.*.*.* hostname
# 16:56 aaronpk right, so it won't catch e.g. "http://localhost/"
# 16:56 aaronpk or even more sneaky, any other domain that happens to resolve to 127.0.0.1
# 17:00 Loqi gRegorLove: tantek left you a message 1 day, 19 hours ago: mind updating the home page indieweb.org with latest / next HWC / IWC event infos? Thanks!
# 17:10 tantek gRegorLove: since when can URLs take IPv6 loopback addresses?
# 17:10 tantek can you construct clickable links for those and add them to the wiki page?
# 17:10 tantek otherwise it looks quite technically theoretical
# 17:10 aaronpk depends on whether your client resolves ipv6 addresses
# 17:11 tantek no that's not a depends, what does the URL spec say?
# 17:12 bear osx does, windows doesn't, some versions of ubuntu server side don't
# 17:13 tantek gRegorLove: does that mean you have to pre-resolve the domain before calling cURL?
# 17:13 bear and then you also get localhost for ipv6 issues
# 17:13 tantek or can you tell cURL to not do localhost IPs? or?
# 17:14 bear any good tool simply takes the netaddr part of the URL and requests it to be resolved
# 17:14 bear they shouldn't care or know that it's a "localhost" or "loopback"
# 17:14 gRegorLove tantek: My code is using PHP's dns_get_record and checking the A or AAAA records, which I guess does not catch meta redirects
# 17:15 bear netaddr is what the domain portion of a URL is resolved into
# 17:16 tantek bear then I'm confused, you said netaddr is requested to be resolved, then also it is what is resolved into?
# 17:16 bear netaddr is literally the network address of the host being targeted by a URL
# 17:16 bear so that happens during DNS resolution
# 17:17 bear (answering from the point of view of a web developer now -- don't even want to bring up other transport issues)
# 17:17 tantek I think "IP address" is a more used term than "netaddr"
# 17:18 bear IP address is an address used to identify a single machine or server on the network
# 17:18 bear I use netaddr because it's referenced by a lot of libraries when parsing urls
# 17:19 bear but i'll stop doing that as it's not web dev
# 17:19 bear but you do need to separate "network location" from "ip address"
# 17:20 bear localhost is a network location that also happens to be commonly resolved to 127.0.0.1
# 17:20 bear network location is the FQDN part of a URL
# 17:21 bear i'm going to have to go back thru them and fill out some details when i'm not work distracted
# 17:21 tantek wish when we tunneled like that that Loqi would know enough to go back and wikilink the previous dfn use of the next jargon term
# 17:22 bear well, if i'm being snarky I would say yes, it's what .io domains get resolved into as their domain registrar is flakey
# 17:24 tantek gRegorLove: are you calling the dns resolve call on every webmention URL host before you send your webmention
# 17:24 bear for the loopback test - I'm wondering if a simple list of domains to avoid would work
# 17:25 tantek not if aaronpk is going to keep creating new ones
# 17:27 bear yea, that code would fail on quite a few loopback/localhost domains
# 17:27 gRegorLove I don't think I'm checking for loopback on received webmentions yet
# 17:28 bear localhost is a convention - the only way to know is to get the ip address of it
# 17:28 bear loopback ip addresses are a well defined set
# 17:28 gRegorLove I use: filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_NO_RES_RANGE)
# 17:29 bear most domain-ipaddress libraries have a way of telling if an IP is private
# 17:30 tantek hmm the localhost6 subdomain didn't resolve for me
# 17:32 bear checking if the returned IP address is one of the 127.* or 169.* is the safest (or ::1 for ipv6)
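A sketch of the "resolve first, then check the returned IP" approach, assuming Python's `socket.getaddrinfo` as the resolver. The function names and the injectable `resolver` parameter are hypothetical, added so the check can be exercised without network access. The "169.*" above is read here as the link-local block 169.254.0.0/16, which is an assumption.

```python
import ipaddress
import socket


def resolved_addresses(host, resolver=socket.getaddrinfo):
    """Return the set of IP addresses the resolver reports for host."""
    infos = resolver(host, None)
    # Each entry ends with a sockaddr tuple whose first item is the address.
    # Strip any IPv6 zone suffix such as "%eth0" before parsing.
    return {ipaddress.ip_address(info[4][0].split("%")[0]) for info in infos}


def resolves_to_local(host, resolver=socket.getaddrinfo):
    """Return True if any resolved address is loopback or link-local."""
    return any(
        addr.is_loopback or addr.is_link_local
        for addr in resolved_addresses(host, resolver)
    )
```

With the default resolver, lookup failures surface as `socket.gaierror`, which a caller would need to handle. The system resolver normally follows CNAME chains on its own, so the check sees the final A or AAAA answers.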
# 17:32 tantek interesting, neither my local machine nor my server can resolve that domain
# 17:32 aaronpk on the command line, try `dig aaaa localhost6.webmention.rocks`
# 17:33 bear localhost6.webmention.rocks. 7199 IN AAAA ::1
# 17:34 bear dig is a command line tool used to query DNS servers for information about a domain's Zone definition
KevinMarks joined the channel
# 17:36 tantek gRegorLove: mind starting a new "How To" section on /loopback that documents your techniques for detection / avoidance?
# 17:38 bear gRegorLove - yea, I always get a nice spike in metrics when networky things are worked on
# 17:40 gRegorLove Loqi needs a "remind me to" feature so I can queue up stuff like that for later. A la Slack
# 17:41 aaronpk 2 minutes until gRegorLove don't forget to do the thing
# 17:41 Loqi I added a countdown scheduled for 2016-11-02 5:43pm GMT+0000 (#5930)
# 17:41 bear 8 hours until tell me to review/edit new wiki links
# 17:41 Loqi I added a countdown scheduled for 2016-11-02 9:41pm EDT (#5931)
# 17:41 tantek User:Gregorlove.com << mind starting a new "How To" section on [[loopback ]] that documents your techniques (2016-11-02 in irc) for detection / avoidance?
# 17:41 Loqi ok, I added "mind starting a new "How To" section on [[loopback ]] that documents your techniques (2016-11-02 in irc) for detection / avoidance?" to the "See Also" section of /User:Gregorlove.com
# 17:42 aaronpk notice that bear's timer is in his local timezone :)
# 17:42 bear aaronpk++ on human centric bot design
# 17:42 Loqi aaronpk has 13 karma in this channel (1128 overall)
# 17:43 tantek waits for !todo kind of like !tell but adds to people's "To Do" section like << does to a "See Also"
# 17:44 Loqi gRegorLove don't forget to do the thing
# 17:45 tantek hey at least Loqi is not creating / adding to an "Inbox" section on your user page ;)
# 17:47 bear isn't that the "add a see also link to wiki page" syntax
# 17:49 Loqi loqi has 1 karma in this channel (409 overall)
# 17:49 Loqi aaronpk has 14 karma in this channel (1129 overall)
# 17:53 tantek gRegorLove++ for asking for it (the feature that is, “a "remind me to" feature so I can queue up stuff like that for later. A la Slack”)
# 17:53 Loqi gregorlove has 7 karma in this channel (87 overall)
KevinMarks joined the channel
# 18:11 aaronpk hm i seem to have lost my notes on setting up my screenshot->micropub workflow
chrisaldrich_ joined the channel
# 18:18 tantek aaronpk, check your queue of stuff to write posts about!
# 18:20 sknebel bear, do you see any reason to handle localhost in your Webmention code vs in the system firewall (for systems where you control that)?
# 18:20 sknebel hacking DNS resolution in requests is a bit annoying to do
# 18:21 aaronpk the loopback address doesn't go over the network though, so the firewall doesn't apply
# 18:32 bear if the person (or network person or whoever) has defined a domain to point to localhost then we should honour it
# 18:32 bear in the "real world" localhost is invalid IMO for received webmentions
# 18:33 tantek we should not honor obvious mistakes or (perhaps unintentional) attempts to access local (webmention sender) resources by the external (webmention receiver) host
# 18:33 bear right - you're saying what I was thinking in a better way
# 18:34 bear things coming into my site need to be checked to make sure the source url is safe and also to clean the target url
# 18:34 bear as an attacker could make a target url contain a malformed url
# 18:34 bear I will, gathering up info and chewing it over in my brain now
# 18:35 bear I think I will make it very secure by default and add a debug flag to allow dev tests to use local host
# 18:36 bear (i'm realizing that my python libs are now viewed by more folks and used as a source of patterns so I have to be very purposeful in the changes I make to them)
# 18:36 sknebel I've firewalled off private RFC 1918 IP space too, but that's because I know I don't have services that might want to use webmention there
# 18:37 sknebel published software might be used on an intranet where that isn't the case
# 18:37 tantek bear, and that's a good explanation for why I don't open source the rest of Falcon
# 18:37 bear yea, I think the pattern should be block everything and only if needed allow a whitelist of ips
# 18:39 bear ipv6 means we also have to look into RFC 4193 now
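One way to sketch the "block by default, allow a whitelist" pattern, covering loopback, the RFC 1918 ranges and the RFC 4193 unique local range mentioned above, plus link-local. The `BLOCKED` list and the `is_allowed` name are illustrative assumptions, not code from any library discussed here.

```python
import ipaddress

# Ranges refused by default: loopback, RFC 1918 private space, link-local,
# and their IPv6 counterparts (RFC 4193 unique local is fc00::/7).
BLOCKED = [ipaddress.ip_network(n) for n in (
    "127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
    "169.254.0.0/16", "::1/128", "fc00::/7", "fe80::/10",
)]


def is_allowed(ip, allowlist=()):
    """Return True if ip may be contacted; allowlist entries override blocks."""
    addr = ipaddress.ip_address(ip)
    # An explicit allowlist entry (CIDR string) wins over the default blocks.
    if any(addr in ipaddress.ip_network(n) for n in allowlist):
        return True
    return not any(addr in net for net in BLOCKED)
```

An intranet deployment could pass something like `allowlist=["10.1.0.0/16"]` to re-enable a specific internal range. This sketch does not handle IPv4-mapped IPv6 addresses, which would need a separate check.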
# 18:49 tantek e.g. if a webmention endpoint is on a different domain than where I discovered it, consider not sending unless it's on a whitelist (bridgy, webmention.io, webmention.heroku etc.)
# 18:50 aaronpk hm it sounds like what you actually want is for the endpoint to confirm it is an endpoint and that it handles webmentions for a given domain
# 18:51 tantek you can't ask the endpoint anything if it's already a malformed URL
# 18:52 aaronpk you're going to have to check if it's a malformed URL, and if you don't, your HTTP client will fail out anyway
barryf joined the channel
# 18:59 bear loopback checking, IMO, is adding a third layer of checks to the domain matching and vouch checks already in place
# 18:59 bear *after* you have a good domain, then resolve it to find out if it's a loopback
# 19:00 barryf Hello all. I'm almost done with my Micropub.rocks tests but I've hit 804: rejecting an unauthorized access token. I need to generate a token but don't know of a quick way to do so. Does anyone have a live tool I could use to log in and generate one?
# 19:00 aaronpk where do your tokens come from right now? are you using tokens.indieauth.com or your own server?
# 19:02 aaronpk I can't think of a quick way to do that, but you need to generate a token that doesn't have "create" scope. You could log in to Quill and change the scope that it's requesting for example
# 19:04 barryf I thought about hacking together something for that purpose. Sounds like it might be a useful tool. When you say I could change the scope via Quill, is there a way of configuring the scope it requests?
# 19:06 aaronpk but yeah you should be able to change the URL that quill redirects you to and adjust the scope value that's in the query string
KevinMarks joined the channel
# 19:06 barryf Nice! Creating my own endpoint is on my list. Need to finish off my new server software first. Nearly there.
ChrisAldrich joined the channel
chrisaldrich1 joined the channel
barryf joined the channel
KevinMarks, tantek, KevinMarks_ and gRegorLove joined the channel
# 23:22 tantek alright let me sync a few more changes with what I'm selfdogfooding and I can cut a release
# 23:22 tantek what's live on my server has been stable for a while, and no API breaking changes
# 23:43 aaronpk gRegorLove: get_dns_record isn't a built-in function, what is that?
# 23:44 sknebel gRegorLove: you should mention that the resolved IP should be used for all communication -> if you do this check, but then give the full domain to e.g. curl it will redo the resolve process and could get a different answer
# 23:45 sknebel (really should be in the general description, but I can't come up with a nice way of explaining it right now... really should go to sleep and try tomorrow ;))
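A hedged sketch of the "use the resolved IP for all communication" idea: resolve once, validate every answer, and hand back the IP to connect to together with the original host name for the `Host` header. The name `pin_target` and the injectable `resolver` argument are hypothetical. Only loopback is rejected here, and the HTTPS details (SNI and certificate checks against the original name) are left out.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def pin_target(url, resolver=socket.getaddrinfo):
    """Resolve the URL's host once and return (ip, port, host).

    Raises ValueError if the host is missing, does not resolve,
    or any resolved address is a loopback address.
    """
    parts = urlsplit(url)
    host = parts.hostname
    if not host:
        raise ValueError("URL has no host")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    # Strip any IPv6 zone suffix before parsing the address strings.
    ips = [info[4][0].split("%")[0] for info in resolver(host, port)]
    if not ips:
        raise ValueError("host did not resolve")
    for ip in ips:
        if ipaddress.ip_address(ip).is_loopback:
            raise ValueError(f"{host} resolves to loopback address {ip}")
    # The caller connects to this IP and sends `host` in the Host header,
    # so no second DNS lookup can return a different answer.
    return ips[0], port, host
```

For plain HTTP, the result could be used with `http.client.HTTPConnection(ip, port)` and an explicit `Host` header, which avoids the second lookup that handing the full domain to the HTTP client would trigger.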
# 23:46 gRegorLove filter_var will return the IP address if it's valid and not in the reserved range, otherwise it returns boolean false.
# 23:47 aaronpk also what happens if the domain is a CNAME to something?
# 23:47 aaronpk (I made localcname.webmention.rocks to test but it won't be active for another 15 minutes or so)
# 23:48 aaronpk localcname.webmention.rocks -> CNAME to localhost.webmention.rocks -> A 127.0.0.1
# 23:48 gRegorLove I don't think it will catch CNAME. Guess it depends if dns_get_record follows the chain
# 23:48 bear yea, just like url redirects - you have to follow the whole chain
# 23:51 sknebel I looked into teaching python's requests library to run such a check for each resolving it does... have to check if I can find the code for that again
# 23:52 gRegorLove sknebel: Interesting re: using the IP. Is the idea an attacker could serve a legit IP and then quickly change it to loopback?
# 23:52 bear yea, even the python libraries depend on the system's dns resolution for that - it becomes hard to check things
# 23:52 aaronpk gRegorLove: that's the basis of the attack i linked
# 23:53 sknebel bear: in requests there are some hooks you can plug into, but in the end that's among the reasons why I decided to go with just firewalling the process off
# 23:54 aaronpk yeah it seems like this needs to be part of HTTP client libraries
# 23:54 aaronpk i mean the other alternative is make sure you don't have any services listening on 127.0.0.1
# 23:54 bear ^^ that is the only true solution IMO
# 23:55 bear sure it's possible, but how likely is it for 99% of what you're doing
# 23:55 tantek or just have a logging service, for weakness/attack detection
# 23:56 gRegorLove Hah, from PHP docs "Because of eccentricities in the performance of libresolv between platforms, DNS_ANY will not always return every record, the slower DNS_ALL will collect all records more reliably."
# 23:56 bear I think it could be done for webmention and micropub using a "sanity check" helper
# 23:56 bear and the sanity checker would need to have a paranoia flag - how insane do you want it to check