2016-03-25 UTC
snarfed1 joined the channel
snarfed joined the channel
# 01:17 aaronpk snarfed: I've started using the value that comes back from the token endpoint as the ID, which lets you type stuff like "xyz.withknown.com" in the web sign in box and have the server return an ID of xyz.withknown.com/username
# 01:17 aaronpk that also means as long as the site consistently returns the same http/https URL as the identity, it doesn't matter which they type in when they log in
# 01:18 snarfed the "web sign in box"...meaning we'd redirect to indieauth first, and they'd type their URL/domain there instead of on our site?
# 01:20 snarfed so if client_id is missing scheme, indieauth defaults to...http...and follows redirects if necessary?
# 01:21 aaronpk man I wish I had called indieauth.com something else. Makes these conversations very confusing
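A minimal sketch of the sign-in handling aaronpk describes, assuming the scheme defaults to http:// when omitted; this is illustrative, not Known's or indieauth.com's actual code:

    <?php
    // Normalize whatever the user types into the web sign-in box.
    function normalize_me(string $input): string {
        $input = trim($input);
        if (!preg_match('#^https?://#i', $input)) {
            $input = 'http://' . $input;   // default scheme when the user omits it
        }
        if (empty(parse_url($input, PHP_URL_PATH))) {
            $input .= '/';                 // bare domains get a trailing slash
        }
        return $input;
    }

    echo normalize_me('xyz.withknown.com'), "\n";  // http://xyz.withknown.com/
    // The server then completes the auth flow and stores whatever "me" URL
    // the endpoint returns (e.g. https://xyz.withknown.com/username) as the
    // user's ID, regardless of which scheme they typed.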
yakker, M-RyanRix, begriffs and tantek joined the channel
# 02:57 KartikPrabhu KevinMarks: I am confused. Is the problem that someone else is annotating their site via a third-party extension and they don't want it?
# 02:58 KevinMarks yes, which has implications for webmentions etc. But the thing is that they are pulling her site and overlaying on it
# 02:59 KevinMarks well, she should be able to block their scraper with robots.txt
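What that blocking rule might look like, assuming the service honors robots.txt at all; the user-agent token below is a placeholder, since the fetcher's actual name (if it sends one) isn't known:

    # Hypothetical robots.txt rule for the annotation service's fetcher
    User-agent: AnnotationBot
    Disallow: /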
begriffs joined the channel
KartikPrabhu joined the channel
singpolyma joined the channel
# 03:11 aaronpk that is tough, anyone can do anything with your content once it's online. but i feel like it's disingenuous to copy not just the quoted snippets but the full article *plus* the whole layout of the site, making it look like something she supports
# 03:14 KevinMarks there were a whole series of 'annotate the web' startups that died
mlncn joined the channel
# 03:22 aaronpk also what? genius.it redirects to genius.com which is "Genius is the world’s biggest collection of song lyrics and crowdsourced musical knowledge."
# 03:25 KevinMarks I didn't see an obvious fetch from them in the logs when I tried it on one of my sites
# 03:29 KartikPrabhu yeah same-ish. It runs my JS but again does not download icons and avatars
snarfed joined the channel
# 03:30 KevinMarks ah, no, the webmentions are failing because the url being sent is genius.it/kevinmarks.com
yakker, wolftune and j12t joined the channel
j12t, yakker, strugee, friedcell and loic_m joined the channel
j12t, glennjones, loic_m, dietrich, tantek, loic_m_, Tristitia, friedcell, jrenslin, mlncn, j4y_funabashi, sivoais, Pierre-O, finchd, hs0ucy, snarfed, nitot, shiflett and Lancey joined the channel
# 15:22 tantek does that mean we have 3+ implementations of sending & receiving salmentions?
# 15:22 Loqi Salmentions are a protocol extension to Webmention to propagate comments and other interactions upstream by sending a webmention from a response to the original post when the response itself receives a response (comment, like, etc.) https://indiewebcamp.com/salmention
# 15:23 aaronpk ooh i'd better double check I still receive them after the rewrite
# 15:24 tantek like an indiewebify.me feature? or feature request?
wolftune joined the channel
# 15:41 KartikPrabhu are salmentions sent only to the in-reply-to post or all links linked to in the post?
# 15:41 KartikPrabhu afaik when you update a post you should resend webmentions to all links
j12t joined the channel
# 15:44 aaronpk in fact it says to re-send all previously sent webmentions, which means it also covers links that have since been removed from the post
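A sketch of the resend rule just described: on update, notify the union of the targets previously pinged and the targets the post links to now, so removed links also learn about the change. send_webmention() stands in for whatever webmention client the site actually uses:

    <?php
    // Targets to notify on update = previously sent ∪ currently linked.
    function targets_to_notify(array $previously_sent, array $current_links): array {
        return array_values(array_unique(array_merge($previously_sent, $current_links)));
    }

    $previous = ['https://example.com/reply-context', 'https://other.example/post'];
    $current  = ['https://example.com/reply-context'];  // one link was removed

    foreach (targets_to_notify($previous, $current) as $target) {
        // send_webmention($source, $target);  // hypothetical client call
        echo "would send webmention to $target\n";
    }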
[kevinmarks] joined the channel
# 16:05 [kevinmarks] Well, we lost an implementation when Kyle switched to known from redwind
# 16:06 kylewm and I'm a bit hesitant to add salmentions to Known since everything is synchronous
# 16:13 aaronpk sounds like it might be time to add something like wp-cron to known?
begriffs, KartikPrabhu, snarfed and shiflett_ joined the channel
snarfed1, shiflett_, shiflett and snarfed joined the channel
# 16:47 snarfed switching all bridgy instagram users to scraping in a minute
j4y_funabashi joined the channel
# 16:55 aaronpk i think i have a bunch of updates to launch there actually
begriffs, gRegorLove and quails joined the channel
# 17:14 snarfed switching bridgy instagram back to API for now. found scraping bugs to fix!
# 17:24 kylewm re: time for Known to get a queue, is this a use case for Amazon SQS? like a hosted service where I can send events, and it would dequeue them and send them back to my site as POSTs to a given endpoint
# 17:26 Loqi [kylewm] and I'm a bit hesitant to add salmentions to Known since everything is synchronous
# 17:27 snarfed SQS would definitely work but maybe a bit heavy, esp since it would require an AWS account, either known's or the users' own.
# 17:28 snarfed some lightweight PHP package would maybe be ideal
# 17:28 snarfed reaches the end of his knowledge and stops talking.
# 17:28 aaronpk there are queue mechanisms that use a swappable backend
# 17:29 kylewm like, a lightweight php package that people could self-host?
# 17:29 aaronpk that way withknown.com could run a production one on SQS, but self-hosted ones could use a built-in one by default
# 17:30 snarfed kylewm: yeah, but not even self host, just link in directly like the rest of the third party libs known uses
# 17:31 sknebel SQS can connect to all kinds of AWS services, but not talk to the outside world
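One way to read "swappable backend" here, as a sketch (interface and class names are hypothetical, not Known's API): plugins target a small queue interface, self-hosted installs get a built-in DB-backed implementation, and a hosted service like withknown.com could drop in an SQS-backed one behind the same interface:

    <?php
    interface JobQueue {
        public function push(string $job, array $payload): void;
        public function pop(): ?array;
    }

    // Default backend for self-hosted installs; the array stands in for
    // a real database table.
    class DbJobQueue implements JobQueue {
        private array $rows = [];
        public function push(string $job, array $payload): void {
            $this->rows[] = ['job' => $job, 'payload' => $payload];
        }
        public function pop(): ?array {
            return array_shift($this->rows);
        }
    }

    // An SqsJobQueue implementing the same interface could be swapped in
    // without touching any calling code.
    $queue = new DbJobQueue();
    $queue->push('send_webmention', ['target' => 'https://example.com/']);
    var_dump($queue->pop());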
# 17:33 ben_thatmustbeme huh.... looking at how slack integrations work... i wonder if i could just use a personal slack account as my interface to my site
# 17:34 ben_thatmustbeme when i get a webmention bring it in to slack channel, and my responses generate a post on my site
# 17:34 aaronpk that's basically how my new Monocle is going to work
# 17:34 kylewm snarfed: you'd still need to run a separate process somehow right? (re: link in directly like the rest of the third party libs)
# 17:35 snarfed kylewm: depends. not if it uses the filesystem as the backing store, or if you just say durability is best effort
# 17:35 aaronpk you can use the DB as the backing store too. it's just a question of when the task actually runs
# 17:35 snarfed oh you mean for execution, not storage. yeah, or wp-cron style, or who knows
# 17:41 aaronpk KevinMarks: kinda, but most of the time the web server has a limit on how long a script can run anyway
# 17:41 kylewm KevinMarks: yeah, Known does stuff like that in some places
# 17:43 kylewm it's a PITA because apache and nginx handle it differently
# 17:43 kylewm so like when I try to do the export, I always get a Gateway Timeout
# 17:44 aaronpk that's not actually going to be fixed by using wp-cron or another method that runs the queue via POST request, since you'll still be limited by that timeout
# 17:45 aaronpk it's more like the super long-running jobs need to be rewritten to perform their tasks incrementally and in a way that can be resumed part way through
# 17:45 gRegorLove Yes, PHP can keep executing after the page is delivered. That's how the "lazy cron" works in ProcessWire and the PW Webmention plugin uses it for async
# 17:46 KevinMarks what I was thinking is that you could have a queue endpoint that you post to, it returns and keeps executing, and you then exit
# 17:46 aaronpk that only works if the task takes less time than the nginx/apache timeout
# 17:48 KevinMarks right, but once you have that you can do what you said and split up the long list of things into a chain of them
# 17:48 aaronpk and in the case of nginx with php-fpm, it's actually nginx that's hanging up, not php quitting on its own
# 17:48 KevinMarks I'm assuming PHP yields when waiting on network etc here, which may not be right
# 17:49 aaronpk "gateway timeout" means nginx didn't get a response from the backend in time, so even if php is letting itself run forever, nginx will give up
# 17:49 kylewm aaronpk: really interesting point about wp-cron, i hadn't thought of that
# 17:50 aaronpk but yes step 1 is going to be rewriting anything that runs >30 seconds to be able to run in smaller pieces on multiple requests
# 17:50 aaronpk odds are async webmention verification won't take 30 seconds, so you can get away with the lazy cron approach for that
# 17:52 kylewm and if you hit the cron every five minutes or something, then a big job could take a really long time
# 17:52 aaronpk yeah but if you run it on every request like wp-cron it would go faster
# 17:52 KevinMarks when testing salmentions with acegiak I had to turn up appengine's timeout from 5 seconds as sometimes fetching her posts was blocked by the feedreader task running
# 17:53 KevinMarks though I suppose if you POST and then load the posted page you're guaranteed at least one
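A sketch combining the two ideas above: flush the response first (fastcgi_finish_request() under php-fpm, which is what the "lazy cron" pattern relies on), then work through a job in chunks that stay under the proxy timeout, persisting a cursor so the next request resumes where this one stopped. The job data and time budget are illustrative:

    <?php
    if (function_exists('fastcgi_finish_request')) {
        fastcgi_finish_request();  // response is sent; php-fpm keeps running
    }

    $items  = range(1, 1000);   // stand-in for a long job, e.g. an export
    $cursor = 0;                // a real install would load this from the DB
    $budget = 25;               // seconds, safely below a 30s proxy timeout
    $start  = time();

    while ($cursor < count($items) && (time() - $start) < $budget) {
        // process $items[$cursor] here (e.g. one webmention verification)
        $cursor++;
    }
    // persist $cursor; the next request (lazy-cron style) picks up from it.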
# 18:00 kylewm aaronpk: does nginx actually kill the thread before it's done executing, or does it just stop waiting for it to return a response?
Pierre-O joined the channel
tvn joined the channel
begriffs and wolftune joined the channel
mlncn joined the channel
# 20:20 bear nginx will close the socket which should cause php-fpm to kill the process and recycle it
# 20:21 bear to make sure your php has time I would use php-fpm pools and have one pool with a longer request_terminate_timeout and then configure nginx to set the pool for a location to the higher threshold one
# 20:21 bear nginx is acting as a normal proxy so it will always just close the socket for a send/read timeout
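Roughly what bear's setup might look like, as config fragments (pool names, socket paths, and timeouts are illustrative, and the php-fpm pool would need its usual user/pm directives too):

    ; php-fpm: a second pool that allows long-running job requests
    [jobs]
    listen = /var/run/php-fpm-jobs.sock
    request_terminate_timeout = 300s

    # nginx: route job URLs to that pool and wait longer for a response
    location /jobs/ {
        fastcgi_pass unix:/var/run/php-fpm-jobs.sock;
        fastcgi_read_timeout 300s;
        include fastcgi_params;
    }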
hober joined the channel
# 20:38 kylewm i'm thinking maybe the way forward with Known is to continue to do things like POSSE and PuSH and webmentions synchronously by default but add hooks so that a plugin could offload those tasks to a queue instead
# 20:40 bear IMO known should always allow for sync processing because the average person will not want to work with a job queue
# 20:41 kylewm i just want to have a way to experiment first ya know
# 20:41 bear and if the code for doing a job is exactly the same but called from different entry-point helpers... then it's a total win IMO
# 20:42 kylewm bear: do you think background jobs should be triggered by a long-running background process totally outside of the request/response world, or triggered by a POST?
# 20:42 bear POST as that is what web folks expect
# 20:42 kylewm that's the fundamental confusion i have right now
# 20:43 bear and you don't have the pain of maintaining a daemon
# 20:43 bear using a POST means you can isolate the timeout settings in php-fpm
# 20:43 kylewm ok so like google appengine does, rather than like celery
# 20:43 bear if your location is /jobs/* then you can have that handled by a different php-fpm pool
# 20:44 bear that's from production php ops experience (I have tons of scars and hard-fought lessons)
# 20:45 aaronpk oh for the record, AWS SQS totally does work with non-amazon services
# 20:49 kylewm so if i do my roll-your-own thing that way then it should be possible to abstract to SQS for the hosted service if they ever want to
# 20:49 bear I don't see why not - it would just be another plugin like thing to define the helper entry point
# 20:50 kylewm wonders how much of my total time on iwc irc has been me trying to understand task queues
# 20:50 bear sync, sync+post, async+sqs, async+postgres
# 20:51 kylewm bear: it kind of seems like you are doing a Will Shortz puzzle there
# 20:52 bear Weekend Edition Sunday is one of my all time fav shows
# 20:53 aaronpk now i'm considering writing a queue adapter for Laravel which does what bear is suggesting
# 20:53 aaronpk queues the job in the DB like normal, but then makes a POST to some /jobs/* URL to actually process it on a different fpm pool
# 20:53 aaronpk that does sound easier to maintain than a separate background process
# 20:53 bear and also runnable in environments where a lot of php folk find themselves in
# 20:54 bear php + db + a bit of apache/nginx config
# 20:54 aaronpk even if it runs on the same fpm pool it would be not terrible
# 20:54 aaronpk most of my tasks don't actually last >30 seconds anyway
# 20:54 bear yea, worst case is you have one long timeout
# 20:55 bear heck, even if you do - break the tasks up into 25 second chunks and chain them
# 20:56 bear the public facing endpoint never changes
# 20:56 aaronpk and if i make it just another Laravel queue adapter, I can always run a real background task if I want to
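A sketch of the pattern aaronpk describes, in plain PHP rather than a real Laravel queue adapter: persist the job, then poke the /jobs/ endpoint with a short, fire-and-forget POST so the differently-configured fpm pool picks it up. The table schema and URL are illustrative:

    <?php
    function enqueue_and_poke(PDO $db, string $job, array $payload): void {
        // 1. Queue the job in the DB like normal.
        $stmt = $db->prepare('INSERT INTO jobs (name, payload) VALUES (?, ?)');
        $stmt->execute([$job, json_encode($payload)]);

        // 2. Poke the worker endpoint without waiting for it to finish.
        $ch = curl_init('https://example.com/jobs/run');  // illustrative URL
        curl_setopt_array($ch, [
            CURLOPT_POST           => true,
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_TIMEOUT_MS     => 200,  // don't block on the worker
        ]);
        curl_exec($ch);   // a timeout here is expected and harmless
        curl_close($ch);
    }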
snarfed joined the channel
# 21:10 kylewm Huh you could end up running some jobs in parallel that way
[shaners] joined the channel
# 21:12 [shaners] !tell gregorlove: Yeah. I need to build out the redirects feature in Dark Matter to better handle the new URL scheme. (I’ve dropped the nth of day bit from my paths.)
# 21:12 Loqi Ok, I'll tell them that when I see them next
# 21:13 [shaners] I’ve been working out of the Pivotal office in Santa Monica, CA for the past few weeks.
snarfed joined the channel
# 21:14 [shaners] Today I signed up to do a lightning talk about #indieweb & Dark Matter. (on April 26)
# 21:14 [shaners] And I’ll prolly setup an evening session and if it has interest, a recurring evening session (weekly/monthly).
snarfed joined the channel
# 21:31 bear is the server on UTC or an EU timezone that hasn't shifted yet?
KartikPrabhu joined the channel
# 21:32 aaronpk and cron isn't smart enough to support different timezones per job
# 21:34 bear yea, it's all one TZ - if you want fancy - use fcron
begriffs joined the channel
yakker joined the channel
# 22:10 sknebel hm, why is IWC Nuremberg only in there as text, and not linked?
# 22:10 aaronpk the newsletter is generated from the microformats on the event page
# 22:16 sknebel hm, I wanted to fix it, but it looks identical to the düsseldorf one to me?
begriffs joined the channel
# 22:34 bear wouldn't the no-iframes content header prevent that kind of appropriation?
# 22:36 KevinMarks it isn't an iframe as far as we can tell; I think they fetch it from the browser
# 22:38 KevinMarks when I tried it on one of my sites I didn't see a robots.txt probe or a non-browser useragent
# 22:39 bear not honouring robots.txt is a cardinal sin -- one of the many small reasons I wish the ops world would adopt a code-of-conduct
[shaners] joined the channel
# 22:41 Loqi Ok, I'll tell him that when I see him next
# 22:41 bear oh, so they have the server act as a retrieval proxy
# 22:42 KevinMarks then they inject their JS into it and have my browser fetch the rest
# 22:42 bear that's borderline sketchy - slippery slope stuff to not think of honouring robots
# 22:43 bear that's like an ISP allowing pizzahut to put overlays of coupons in front of every visitor of dominos
# 22:44 bear archival viewing is still different than current viewing with injected content IMO
# 22:46 sknebel yeah, they load all the assets directly, but the main page html comes from their server
tantek joined the channel
# 22:51 sknebel that's why scripts and co partly break: it's suddenly a cross-origin request, and a security violation the browser blocks
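For reference, the "no-iframes" headers bear alludes to, as an nginx sketch. Per the discussion above, these only stop other origins from framing a page; they do nothing against a service that fetches the HTML server-side and re-serves it from its own origin:

    # Blocks framing by other origins; does not block server-side proxies.
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header Content-Security-Policy "frame-ancestors 'self'" always;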
snarfed and yakker joined the channel