2016-09-02 UTC
AngeloGladding and loicm__ joined the channel
KevinMarks_, tantek and KevinMarks joined the channel
KevinMarks_ and tantek joined the channel
cweiske, KevinMarks and loicm__ joined the channel
# 07:33 cweiske regarding my ES performance problems: I need to play around with keep-alive connections
KevinMarks, loicm__ and cmal joined the channel
# 08:41 cweiske ES performance: keep-alive did not change anything. replacing a does-document-exist query with a get-document query shrank the time for a document with some hundred urls from 18s to 1.5s
# 08:43 cweiske haha. HTTP_Request2's socket adapter is 1.5 times faster than the curl adapter
# 09:38 cweiske funny which bugs you can find once you dig into a project
# 10:08 cweiske .. but the crawler is now in 2015 already, which is a huge improvement compared to the old code
rMdes, loicm__, cmal, KevinMarks and miklb joined the channel
cmal joined the channel
cmal joined the channel
tantek joined the channel
cweiske joined the channel
cmal joined the channel
# 15:38 cweiske chat.indieweb.org, config file: are channels given as "#chan" or as "chan"?
# 15:39 aaronpk heh this config file got to be a little more than just config. i'll post a copy
# 15:40 cweiske I had to manually add static functions to it to make it work
# 15:43 aaronpk depends on whether it's used for URL matching or locating files on disk
# 15:43 aaronpk meh it's fine, i'm not really motivated to change it
# 15:49 aaronpk btw in case it wasn't obvious, the fact that this uses a database is only a temporary measure right now
# 15:50 aaronpk it's actually supposed to be reading the log files from disk, but I didn't finish migrating all the logs before I launched newloqi and this site
# 15:51 aaronpk yeah just wanted to warn you so you don't end up doing a lot of work with the DB since i'm going to rip all that out soon
# 15:54 aaronpk oh weird. i thought i looked up the right format for that
# 15:56 cweiske it was nearly correct but gave "UTC" when it wanted 0000
# 16:08 cweiske the one major issue is that join messages are not ignored but the full text of the page is shown in the result
# 16:10 cweiske probably the db vs. logfile thing that aaronpk mentioned
# 16:11 aaronpk yeah it's unfortunately harder to query for "nearby" entries with the file approach
# 16:12 aaronpk they are. i didn't say it was impossible, just harder
# 16:12 aaronpk with SQL it's super easy to be like "give me the previous 10 things"
# 16:13 aaronpk but with the file, the previous 10 things might span to other files
# 16:13 cweiske everyone stop now and document that as #1 reason against flat file storage!!
# 16:13 tantek huh, I did implement that for my next/prev buttons in Falcon but I'm not sure what the trick was
# 16:13 aaronpk finding the next 1 item is easier, but finding the next 10 can span up to 10 files
# 16:14 aaronpk (super low traffic channels might have only one message a day)
# 16:14 tantek sure. my "most recent n articles" code has to traverse backwards through files similarly
# 16:15 tantek the traverse backwards/forwards through storage files is common code
# 16:15 aaronpk i'll do it eventually, but it's a lot of code i didn't have to write for the DB approach
# 16:16 aaronpk i need that kind of seeking functionality for QuartzDB anyway so i'll only have to write it once
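A minimal sketch of the "give me the previous 10 things" lookup over flat files, to illustrate why it can span several files. It assumes one log file per day named YYYY-MM-DD.log with one entry per line; that layout and the `previous_entries` name are hypothetical and not taken from the chat.indieweb.org or QuartzDB code.

```python
from pathlib import Path


def previous_entries(log_dir, day, line_index, n):
    """Return up to n entries immediately before (day, line_index), oldest first.

    Assumes (hypothetically) one file per day named YYYY-MM-DD.log,
    with one entry per line.
    """
    # ISO dates sort lexicographically, so sorting the names sorts by date.
    days = sorted(p.stem for p in Path(log_dir).glob("*.log") if p.stem <= day)
    collected = []
    for d in reversed(days):
        lines = (Path(log_dir) / f"{d}.log").read_text().splitlines()
        if d == day:
            # Only entries before the current position count in the starting file.
            lines = lines[:line_index]
        # Take entries from the end of this file until we have n of them.
        while lines and len(collected) < n:
            collected.append((d, lines.pop()))
        if len(collected) >= n:
            break
    collected.reverse()
    return collected
```

On a low-traffic channel with one message a day, asking for 10 entries opens 10 files, which is the extra work the database query avoids.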
# 16:16 voxpelli Had to write a similar thing against a Rest API that had a fixed upper count limit that was lower than what I needed
# 16:16 tantek that's what I ended up doing, writing it once, and then depending on it in new and interesting ways with various features
# 16:19 aaronpk i will merge/release that later but in the middle of something else right now
gRegorLove, miklb, tantek and cmal joined the channel
KevinMarks joined the channel
KevinMarks joined the channel
# 18:37 tantek.com edited /annotation (+145) "link to use-cases, separate News Genius page, silo vs other examples, specific Criticism subhead, link to WG, note CG was previous"
# 18:51 aaronpk "The last missing piece: one of the best features about dynamic blogs is the ability to drag-and-drop images right into a post." ... "I fixed this by hacking together a simple service which lets me drag-and-drop images and automatically upload them to my server. The site gives you back a URL and a markdown image tag ready to use."
# 18:51 aaronpk that sounds similar to the idea of the micropub media endpoint
KevinMarks and tantek joined the channel
# 20:29 voxpelli tantek: at work we build our site with React and it works fully without js (and actually doesn't yet or ever use React on frontend – not yet needed in any way)
rMdes and tantek joined the channel
snarfed and dmaczka joined the channel
# 21:21 snarfed dmaczka sknebel: rejecting doesn't have to be synchronous. you can return a sync 202 and still reject later, out of band
# 21:21 tantek re: redirect responsibilities, I think it tends to fall more on the receiver than the sender, EXCEPT for webmention endpoint discovery, for which I think the sender MUST follow redirects
# 21:21 tantek one example is shortdomains. you may have multiple URLs for a post (short domain, long) and a sender may try to webmention any of them as the target
# 21:22 snarfed also, on the receiver side, "check that target is a valid resource for which it can accept Webmentions" is generally expected to be internal. ie you're the receiving site, you know your pages, you can check whether a URL is for a page or for something else that would redirect without making an HTTP request to yourself
# 21:22 aaronpk from my understanding of the question, it's also a matter of whether you want to accept webmentions for short URLs that you didn't create that point to your post
# 21:23 sknebel snarfed: sure, but if it is an external URL you can't tell internally
# 21:23 aaronpk i'm not actually sure i'd *want* to accept those webmentions, but not 100% sure on that
# 21:23 dmaczka tantek, yep, that's how this started, I was testing my webmention sender/receiver, had been using a nice happy source/target, then tried on a tweet I made linking back to a post, but of course twitter shortened my url
# 21:23 tantek if they're not domains you control, I think you may reject
# 21:24 dmaczka and so my nice-and-simple reject-all-hosts-I-don't-control rejected it
# 21:24 aaronpk it's like someone says to you "hey I wrote a reply to your post, and also made my own alternate URL for your post here"
# 21:24 tantek can we punt on that until Twitter itself supports sending Webmentions?
# 21:24 aaronpk tho the indieweb equivalent of that is if I were to wrap all links in my posts through my own short URL redirector
# 21:25 aaronpk which incidentally I used to do on my website pre-2003
# 21:25 snarfed dmaczka: so yeah, one alternative is to return 202, verify async, and follow target redirects
# 21:25 snarfed tantek: not really, lots of people and sites generate and use short urls, not just twitter
# 21:25 aaronpk so that i could check which outgoing links people were clicking on
# 21:25 dmaczka I think I will do that for now... because right now will something like brid.gy send webmentions on urls in tweets?
# 21:26 dmaczka which I'd want to catch
# 21:26 tantek snarfed, is this primarily a silo problem then? Twitter, Tumblr, FB?
# 21:26 snarfed dmaczka: yes but bridgy is careful to follow (unwrap) all urls to be nice to receivers
# 21:26 tantek I'm more interested in the indie to indie case
# 21:26 aaronpk tantek: probably primarily, although i just gave you an indieweb example that I actually did!
# 21:26 tantek where the source post uses its own URL shortener for all outbound links
# 21:26 snarfed yeah, primarily but not solely. people manually link with short urls too. just not globally like twitter, tumblr, some wp.com, etc
# 21:26 dmaczka snarfed: but doesn't that negate the reason for including target in the webmention in the first place: to make it easy for receiver to scan the source for it?
# 21:27 tantek so the question is how does a receiver verify that?
# 21:27 aaronpk dmaczka: the source URL parameter still has to be in the source HTML as an exact match
# 21:27 snarfed tantek: like i mentioned, you 202 and follow target redirects async
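A rough sketch of the "202 now, verify later" flow described here, assuming an in-memory queue and a receiver for the hypothetical host example.com. The names `receive`, `process_pending` and `MY_HOSTS` are made up for illustration, and the `resolve` callable stands in for whatever actually follows the target's redirects.

```python
from collections import deque
from urllib.parse import urlsplit

MY_HOSTS = {"example.com"}  # hypothetical: hosts this receiver accepts mentions for

pending = deque()  # mentions accepted with 202 but not yet verified


def receive(source, target):
    """Synchronous part: cheap syntax checks only, then queue and answer 202."""
    for url in (source, target):
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            return 400
    if source == target:
        return 400
    pending.append((source, target))
    return 202


def process_pending(resolve):
    """Asynchronous part: follow target redirects, then accept or reject.

    `resolve` is any callable mapping a URL to its final URL after redirects.
    """
    accepted, rejected = [], []
    while pending:
        source, target = pending.popleft()
        final = resolve(target)
        if urlsplit(final).hostname in MY_HOSTS:
            accepted.append((source, final))
        else:
            rejected.append((source, target))
    return accepted, rejected
```

The sender only ever sees the 202; a mention whose target turns out not to resolve to one of the receiver's hosts is dropped later, out of band.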
# 21:28 tantek I think there is a "respect hyperlinking and giving others search-juice" value we could use here
# 21:28 dmaczka well, the one I'm making will:)
# 21:28 tantek that is, if a site is "up" enough for you to be sending a webmention to, then you really should be openly linking directly to it
# 21:28 snarfed yup that's the beauty of making it optional in the spec, with maybe an opinionated recommendation
# 21:28 aaronpk yeah deciding whether a target URL is "a valid resource for which it can accept webmentions" is up to you
# 21:28 tantek the only reason I'm considering making my own short URL wrappers for sites I link to is if I don't trust them to stay up - usually silos
# 21:29 snarfed tantek: but even those silos can accept wms sometimes
# 21:29 aaronpk i'm pretty sure more of my outgoing links to indieweb sites are now gone
# 21:30 tantek a-ha - some silos allow you to edit your templates sufficiently to provide webmention endpoint discovery?
# 21:31 aaronpk yep and also wow i forgot how powerful the discovery step of webmention is
# 21:31 sknebel since you don't actually have to fetch and parse the target-URL, but just do a HEAD request (and follow 30x codes) the work is not as bad
# 21:31 tantek sknebel I doubt any of those silos allow you to add LINK HTTP headers to HEAD requests
# 21:32 aaronpk tantek: no that's about following the e.g. t.co redirect
# 21:32 sknebel tantek: sorry, wrong context. I meant for just checking if the URL redirects to my page at the end
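A sketch of the check sknebel describes: follow 30x responses with HEAD requests only, then see whether the chain ends on one of your own hosts. The function names, the hop limit and the `my_hosts` parameter are assumptions for illustration; `head` is injectable so the logic can be exercised without touching the network.

```python
import http.client
from urllib.parse import urljoin, urlsplit


def http_head(url):
    """Send one HEAD request; return (status, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def final_url(url, head=http_head, max_hops=5):
    """Follow 30x responses with HEAD requests and return the last URL reached."""
    for _ in range(max_hops):
        status, location = head(url)
        if status not in (301, 302, 303, 307, 308) or not location:
            return url
        url = urljoin(url, location)  # Location may be relative
    return url  # gave up after max_hops; the caller decides what to do


def redirects_to_my_site(target, my_hosts, head=http_head):
    """True if the target ends up, after redirects, on one of my_hosts."""
    return urlsplit(final_url(target, head)).hostname in my_hosts
```

No response body is fetched or parsed at any hop, which is why the cost stays low compared with verifying the source.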
# 21:33 snarfed i actually doubt *any* current wm receivers follow redirects on target urls. bridgy for blogs doesn't.
# 21:33 tantek ok. but the sender still has to set the *target* to the literal link that is in the *source* right?
# 21:33 aaronpk it's like if I send a webmention that looks like source=aaronpk.com/foo target=t.co/12345 where t.co/12345 actually redirects to tantek.com/foo, I still need to have t.co/12345 in my post.
# 21:33 tantek so for a tweet source, you'd have to set the target to the tco URL
# 21:33 snarfed wordpress's webmention plugin doesn't (follow target redirects). iirc known doesn't.
# 21:33 aaronpk but the receiver would need to check if t.co/12345 redirects to a URL on tantek.com in order to know what post it's for
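To illustrate the exact-match requirement in the t.co example: the receiver looks for the literal `target` value among the links in the source document, regardless of where that URL later redirects. This is a simplified sketch that only inspects `href` attributes; `LinkCollector` and `source_links_to` are hypothetical names.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect every href attribute value found in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "href" and value:
                self.hrefs.add(value)


def source_links_to(source_html, target):
    """True if the source contains the target URL as an exact href match."""
    collector = LinkCollector()
    collector.feed(source_html)
    return target in collector.hrefs
```

With a source that links to https://t.co/12345, a webmention with that target passes this check, while one naming the final tantek.com URL as target does not, even though both point at the same post.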
# 21:34 tantek so do we require receivers to follow redirects on the 'target' param before checking to see if they handle it?
# 21:34 tantek or can they do a simple domain prefix check on the target URL?
# 21:34 GWG snarfed, I changed the redirects following in WordPress
# 21:35 snarfed GWG: you're thinking about sending wms? this is for receiving them, and checking the target url
# 21:35 tantek aaronpk: does webmention.io verify the target pre-redirect, and then follow redirects on the target and provide that final target destination?
# 21:35 snarfed looking at the current repo head, the wp plugin definitely doesn't follow target redirects when receiving
# 21:36 aaronpk because it's meant to support receiving for multiple domains on your same account
# 21:36 aaronpk but that actually probably means following redirects on the target is more important
# 21:36 aaronpk because right now there are likely "orphaned" webmentions sitting there
# 21:37 tantek yeah - that's what I was wondering - does it provide both pre and post following redirects on the target
# 21:37 dmaczka another complication... so say the target is a t.co/foo... does my receiving server save that, or the expanded target uri for the purposes of later detecting when a webmention is re-sent?
# 21:37 dmaczka to e.g. update it
# 21:37 tantek dmaczka: I don't think you should have to save either?
# 21:39 tantek once the receiver determines the target, they check to see if they already have some record of the *source*
# 21:39 tantek there's no reason to save the target from a webmention AFAIK
# 21:39 tantek says he who hasn't yet implemented a webmention receiver :P
# 21:40 aaronpk speaking of which, just checked tantek's account in webmention.io
# 21:40 aaronpk it has received webmentions with a target domain of: tantek.com, ttk.me, t.co and snarfed.org
# 21:41 Loqi [Ryan Barrett] likes Kyle Mahan: A reply from 2014-03-26.
# 21:41 snarfed aaronpk: ooh how many total? and of those, how many are from twitter to his homepage? :P
# 21:41 snarfed heh, that snarfed.org post is a timeless classic. one of my all time faves.
# 21:42 aaronpk hm looks like the t.co didn't end up with any verified webmentions on it
# 21:42 snarfed hey btw GWG while you're here, any idea why my WP site isn't sending webmentions? it's using the wm plugin at git repo head. hasn't sent them for months now. nothing relevant in the debug log. :(
# 21:45 aaronpk wow webmention.io has received 64106 home page webmentions for tantek
# 21:45 tantek is really not looking forward to debugging those
# 21:46 Loqi bridgy has 49 karma (1 in this channel)
# 21:47 snarfed the wp wm receiving plugin(s) had some teething troubles for a while as they learned to handle mf2 post types
# 21:48 tantek snarfed, just curious if this is a plugin defaults issue, or something to do with how kylewm marked up his reply post or ... ?
# 21:49 snarfed it's an old artifact from bugs/missing features at that time
# 21:49 sknebel could you just resend the mention and have it fix itself?
# 21:50 snarfed sknebel: definitely! just hasn't been a priority. this conversation here is more than i've thought about it in years :P
# 21:51 sknebel now I wonder if there is value in randomly re-evaluating old WMs every now and then (like, one every night or every few hours) to automate that and clean up dead links
# 21:51 sknebel (which would be a reason to store the target-URL)
# 21:53 snarfed sknebel: what would you do when you find a dead link?
# 21:54 sknebel snarfed: good question, not sure what my personal preference would be
# 21:56 sknebel either treat it as delete (but on the other hand our convention is explicit delete with 410, but not all external sites do that of course), or hide the mention?
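One possible policy for the periodic re-check, sketched under assumptions: a stored mention is a dict with a `source` key, a 410 on the source counts as an explicit delete, and any other failure only hides the mention. The question is left open in the discussion, so this is one option rather than a recommendation; `reevaluate`, `nightly_check` and `fetch_status` are hypothetical names.

```python
import random


def reevaluate(mention, fetch_status):
    """Decide what to do with a stored mention after re-fetching its source.

    `fetch_status` is a callable returning the HTTP status for a URL,
    or None if the request failed entirely.
    """
    status = fetch_status(mention["source"])
    if status == 410:
        return "delete"  # explicit delete, per the 410 convention
    if status is None or status >= 400:
        return "hide"    # dead link without an explicit delete
    return "keep"


def nightly_check(mentions, fetch_status, rng=random):
    """Pick one stored mention at random and re-evaluate it."""
    if not mentions:
        return None
    mention = rng.choice(mentions)
    return mention, reevaluate(mention, fetch_status)
```

Hiding rather than deleting keeps the record around in case the source comes back.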
# 22:16 GWG I am thinking about receiving them
# 22:17 GWG snarfed, this is why we need better debugging and unit tests.
# 22:22 GWG snarfed, I had the same problem. I thought it was a cron issue
# 22:22 GWG I have a few ideas for debugging improvements.
# 22:22 snarfed ugh. so your site isn't sending outbound wms right now either?
# 22:23 GWG I wanted to try and figure it out.
# 22:23 GWG It is why I changed it so there is a hook for logging outbound
# 22:24 GWG snarfed, give me some time. I am trying to reimplement Webmentions.
# 22:25 GWG I want an implementation that might get into Core.
# 22:25 snarfed we have different goals then, i'd just like this bug fixed :P
# 22:26 GWG But either way, I intend to figure it out. I am hoping one will help me figure out the other.
tantek joined the channel
# 22:29 GWG snarfed, I know it is a problem. I think I may have created it.
# 22:29 GWG pfefferle committed a bunch of changes.
# 22:30 GWG But singpolyma was sending webmentions since then, so it seems to work.
# 22:31 GWG I decoupled the sending from the receiving code
# 22:32 GWG I want to continue working with the existing sending code.
# 22:33 GWG snarfed, they have no relationship.
# 22:34 GWG But the current code lacks error handling and unit testing beyond the endpoint discovery..
# 22:34 GWG snarfed, can you test sending with webmention.rocks?
# 22:36 snarfed GWG: sure. i expect it won't send though, so i doubt that will help
AngeloGladding joined the channel
# 22:37 GWG Well, will be dissecting the code this weekend and will send anything I learn to the issue.
# 22:37 GWG snarfed, I also haven't forgotten about Micropub
# 22:38 GWG And after I got jorbin to IWC, I thought I might be able to work on my agenda
# 22:39 GWG snarfed, agreed. But the window to talk feature projects is smaller.
# 22:40 GWG I have something on track for 4.7 that would let me delegate Pingbacks to webmention.io
# 22:42 GWG A lot of the tickets I am gardening are webmention related...if subtly.
# 22:45 snarfed it did back in april when i originally posted that. not sure which plugin version i was on then
# 22:45 GWG snarfed, will get back to you. I know what changed since April.
# 22:46 snarfed thanks! no guarantees, i may have been running code from before april
# 22:47 GWG snarfed, what do you think about logging sent Webmentions?
# 22:47 snarfed i mostly just care about actually sending them :P
# 22:48 GWG snarfed, to know what is going on, I need to have better debugging that can be turned on.
# 22:49 GWG And WordPress changed the HTTP API in 4.6 and for all I know that is an issue.
# 22:51 GWG I doubt it, but in the June rewrite I made some changes there.
# 22:52 GWG The committed changes we made were in June, so if there has been a problem since April...
# 22:53 GWG WordPress 4.5 was released April 12th. Possible relationship?
# 22:57 GWG I guess I should read through 372 bugs fixed
KevinMarks joined the channel
# 23:28 gRegorLove My ProcessWire Webmention plugin follows redirects on target_url
# 23:32 gRegorLove Though that's after verification, and verification checks if the hostname matches, so it would always fail if target=t.co/foo
# 23:46 KevinMarks Hm, an outbound redirection might be a good idea for link rot
# 23:48 aaronpk Not really. Might as well just change the URL in your post once you detect link rot
# 23:48 aaronpk Tho an outbound redirect would let you capture that functionality in an external service
# 23:49 tantek or on your own page, because you could detect (like Google does) if someone clicks on a link on your page, and then quickly goes *back* to your page
# 23:49 tantek that's a good red flag to use as an indicator that the link they clicked on is broken
# 23:50 tantek (you don't need every user to do that - i.e. folks that just open tabs - as long as even *a few* users do the click / go back, you can use it)
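A small sketch of the click-then-quickly-back heuristic, assuming you already log, per outbound click, how many seconds passed before the visitor returned to your page (or None if they never did). The thresholds and the `suspect_links` name are arbitrary illustrations, not measured values.

```python
from collections import Counter


def suspect_links(events, max_away_seconds=5, min_bounces=2):
    """Return outbound links that several visitors bounced straight back from.

    `events` is an iterable of (link_url, seconds_until_return or None).
    """
    bounces = Counter()
    for url, away in events:
        # Visitors who never came back (None), e.g. opened a tab, are ignored.
        if away is not None and away <= max_away_seconds:
            bounces[url] += 1
    return sorted(url for url, count in bounces.items() if count >= min_bounces)
```

Only a few quick returns are needed to flag a link, which matches the point that not every visitor has to exhibit the pattern.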