#Loqidenschub has 1 karma in this channel over the last year (2 in all channels)
#[tantek]aaronpk, as someone who has had to implement bits of AP I must confess I'm surprised at your response. Or did you mean not interested in participating in an explicitly global public conversation?
#[tantek]In contrast to public but contained discussion e.g. here in dev?
#DenSchuboh, i don't mind that response! i can totally understand it, especially in such a controversial topic, which can also be a huge time sink
#[tantek]I for one found aspects in denschub's post that could be used to improve nearly any standards discussion or community, including IndieWeb specs, microformats vocabularies etc
#[tantek]FWIW learning from others' mistakes is one of the cheapest ways to learn
#aaronpkI meant I don't have anything to add to the public conversation that's going on in those blog posts referenced
snarfed, KartikPrabhu and [eddie] joined the channel
#[eddie][cleverdevil] you mentioned previously that your podcast listens didn’t have the correct time entered on your site but since then you mentioned that you fixed it. Was it an issue with your script from Overcast or an issue in Known?
ichoquo0Aigh9ie, ichoquo0Aigh9ie_, KartikPrabhu, swentel, [tantek], cweiske, swentie, strugee, [mrkrndvs], leg, eli_oat and [kevinmarks] joined the channel
#[kevinmarks]This bit “But it's not good enough: for example, people have expressed that they want others to be able to read messages, but not reply to them.”
#[kevinmarks]You can't stop people from replying. You can stop displaying their replies.
#[kevinmarks]So adding a "don't @ me" flag to your posts does what? Gives notice that webmentions of it will be ignored?
[jgmac1106] joined the channel
#sknebeland potentially tells all other conforming implementations to discard posts that claim to be replies
#sknebelor, depending on the protocol design, the reply never reaches others. E.g. if I remember correctly, in Diaspora a reply is only distributed through the thing it's replying to, so that server has control over it
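A minimal sketch of the server-side gating sknebel describes, assuming a hypothetical `posts` table with a `replies_allowed` flag; the names and schema are illustrative and not part of any actual protocol:

```php
<?php
// Hypothetical reply-gating: a conforming server receives an incoming
// post that claims to be a reply, looks up the local parent post, and
// discards the reply if the parent disallows replies. In a
// Diaspora-style design, a discarded reply is also never relayed onward.
function acceptIncomingReply(PDO $db, array $post): bool {
    if (empty($post['in_reply_to'])) {
        return true; // not a reply, nothing to gate
    }

    $stmt = $db->prepare('SELECT replies_allowed FROM posts WHERE url = ?');
    $stmt->execute([$post['in_reply_to']]);
    $parent = $stmt->fetch(PDO::FETCH_ASSOC);

    // If we host the parent and it disallows replies, drop the reply.
    if ($parent && !$parent['replies_allowed']) {
        return false;
    }
    return true;
}
```

Note this only controls what conforming servers store and relay; as kevinmarks says above, nothing stops a non-conforming server from replying anyway.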
krychu and [mrkrndvs] joined the channel
#jeremycherfasIs there a name for the `.=` operator in PHP?
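For reference, `.=` is PHP's concatenating assignment operator (listed under assignment operators in the PHP manual); a minimal example:

```php
<?php
// `.=` is the concatenating assignment operator:
// $a .= $b is shorthand for $a = $a . $b
$log = "query took ";
$log .= "40 seconds";
echo $log; // prints "query took 40 seconds"
```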
#snarfedmaybe first decide what you want it to be, e.g. permanent archive for everyone, for some people, or for no one. then we can figure out architecture to support that choice
#sknebelshouldn't old stuff that's not accessed "only" take space, for the most part?
#aaronpksknebel: yea but indexes get updated and such around those old entries
#aaronpksnarfed: yeah i'm leaning towards dropping the whole idea of it being any sort of permanent archive, since that greatly simplifies the requirements
#aaronpkthe problem is i still do want some sort of permanent archive of (some) of the channels i have set up
#snarfedyou have seemingly had good luck decomposing things into many microservices
[cleverdevil] joined the channel
#snarfedalso try dropping archiving from aperture and make sure it actually fixes the problem. seems likely but not guaranteed
#[cleverdevil][aaronpk] I may be able to help as well. I’ll check and see if I can get some infra for you.
#sknebelhave you made sure the DB can do everything based on the indexes?
#aaronpk[cleverdevil]: thanks but i think throwing more hardware at the problem is just going to push the same issue down the road til later
#aaronpksknebel: i spot checked a few indexes of some of the slower and most common queries and it was using them
#snarfedeh if it's enough hardware it can be a long way down the road, ie many years. esp if it's just you or maybe just a few ppl permanently archiving
#sknebelI think the idea behind not doing that was that people like you (or me) write one that's easy to install so it's not everyone using Aperture :D
#aaronpkwonders what the plan is for Yarns around archiving content
#aaronpkthe ironic part of this is my original plans for building aperture (before microsub even) were to treat channels as folders of text files on disk, the same way I store my GPS data
#snarfedkey difference w/yarns is it's single user
#aaronpkeh pretty sure most of the load on my hosted aperture is from myself
#snarfedpossible! you may also be an outlier in usage terms
#sknebelhow important is it to have access to the archive through aperture?
#snarfedwould be interesting to break down load by user
#aaronpkmy GPS database is 6.2gb and over 10 million records and it contributes nothing to the overall load of the server
#sknebele.g. you could see if the DB is happier if you move old posts to an archive table, or do the text-file export for that
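A sketch of that archive-table move, assuming a hypothetical `entries_archive` table with the same schema as `entries` and a `published` column; none of this is Aperture's actual code:

```php
<?php
// Sketch: move entries older than a cutoff into an archive table so the
// hot table stays small. Table and column names are assumptions.
$db = new PDO('mysql:host=localhost;dbname=aperture', 'user', 'pass');
$cutoff = (new DateTime('-90 days'))->format('Y-m-d H:i:s');

$db->beginTransaction();
$db->prepare('INSERT INTO entries_archive SELECT * FROM entries WHERE published < ?')
   ->execute([$cutoff]);
$db->prepare('DELETE FROM entries WHERE published < ?')
   ->execute([$cutoff]);
$db->commit();
```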
#aaronpkand also i don't actually want to archive *all* channels, only a few of them
#aaronpkso maybe i set up something separate that pulls content from an aperture channel and saves it as text files, totally outside of the aperture code base
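A sketch of that separate archiver, pulling a channel via the Microsub timeline action and writing each entry to a text file. The endpoint URL, token, channel uid, file layout, and paging direction are assumptions for illustration:

```php
<?php
// Sketch: archive a Microsub channel as text files, entirely outside
// the Aperture code base. All concrete values here are assumptions.
$endpoint = 'https://aperture.example.com/microsub/1';
$token    = getenv('MICROSUB_TOKEN');
$channel  = 'notes';
$after    = null;

@mkdir("archive/$channel", 0777, true);

do {
    $url = $endpoint . '?action=timeline&channel=' . urlencode($channel)
         . ($after ? '&after=' . urlencode($after) : '');
    $ctx = stream_context_create(['http' => [
        'header' => 'Authorization: Bearer ' . $token,
    ]]);
    $response = json_decode(file_get_contents($url, false, $ctx), true);

    foreach ($response['items'] as $item) {
        // one JSON text file per entry, keyed by a hash of its url/uid
        $key = sha1($item['url'] ?? $item['uid'] ?? json_encode($item));
        file_put_contents("archive/$channel/$key.json",
            json_encode($item, JSON_PRETTY_PRINT));
    }
    // Microsub paging: request the next page with the returned cursor
    $after = $response['paging']['after'] ?? null;
} while ($after);
```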
#GWGI think jackjamieson wants a working version before we worry about archiving
#GWGI know my opinion was that archiving would be a bookmark post
#aaronpkone of the other challenges is how to handle content from feed pages: sometimes entries have bad/missing uids, or the dates are missing or super old even though the content is new. so another thought I had was to store only the current items in a feed page, and anything not in that page gets deleted
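One way to sketch that "mirror the current page" idea: after each fetch, make the stored set for a source exactly the items on the latest page, keyed by url. The schema, the url keying, and the unique key `REPLACE INTO` relies on are all assumptions:

```php
<?php
// Sketch of "store only the current page": upsert every item on the
// fetched page, then delete anything previously stored for this source
// that is no longer on the page. Assumes a unique key on
// (source_id, url) so REPLACE INTO works as an upsert.
function syncFeedPage(PDO $db, int $sourceId, array $items): void {
    if (!$items) {
        return; // empty fetch: keep what we have rather than wipe it
    }
    $urls = array_column($items, 'url');

    $db->beginTransaction();
    foreach ($items as $item) {
        $stmt = $db->prepare(
            'REPLACE INTO entries (source_id, url, data) VALUES (?, ?, ?)');
        $stmt->execute([$sourceId, $item['url'], json_encode($item)]);
    }
    // Anything missing from the current page gets removed, sidestepping
    // bad uids and stale dates entirely.
    $placeholders = implode(',', array_fill(0, count($urls), '?'));
    $stmt = $db->prepare(
        "DELETE FROM entries WHERE source_id = ? AND url NOT IN ($placeholders)");
    $stmt->execute(array_merge([$sourceId], $urls));
    $db->commit();
}
```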
#GWGaaronpk, how much do you want to archive personally?
#snarfedGWG: i think most of our discussion of "archiving" here has been about keeping just the feed data itself long term/permanently, not fetching and archiving entire posts
#snarfedprimarily around managing server load over time, not archiving as a feature
#aaronpkmy theory is that if these tables aren't just infinitely growing in size, things will go faster
#aaronpkhere's an example of a query that is currently very slow even though it's using an index:
#aaronpkselect count(*) as aggregate from `entries` where `entries`.`source_id` = 599 and `entries`.`source_id` is not null;
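For reference, the kind of EXPLAIN spot-check mentioned below: the `key` column shows which index MySQL chose and `rows` its scan estimate. Connection details are assumptions:

```php
<?php
// Spot-check index use on the slow count query via EXPLAIN.
$db = new PDO('mysql:host=localhost;dbname=aperture', 'user', 'pass');
$rows = $db->query(
    'EXPLAIN SELECT COUNT(*) FROM entries WHERE source_id = 599'
)->fetchAll(PDO::FETCH_ASSOC);
print_r($rows); // look for key = <the source_id index>, rows ≈ 78690
```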
#snarfedyou could probably add indices or tune to improve it. and you'll still have a decent write i/o burden even with a fixed size table, but load should stay fixed, not growing without bound
#aaronpkoh i could do my old trick that saved my butt during my startup days of moving the longblob column to a new entries_data table to keep the entries table smaller and fixed width columns
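That column-split trick in Laravel-migration form (Aperture is a Laravel app, but this exact migration and its names are illustrative, not Aperture's actual code):

```php
<?php
// Sketch: move the big blob payload off the hot `entries` table into a
// 1:1 `entries_data` table, so `entries` keeps only small fixed-width
// columns. Note Laravel's binary() maps to BLOB on MySQL; a raw
// ALTER would be needed for an actual LONGBLOB column.
use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;
use Illuminate\Support\Facades\DB;

class SplitEntriesData extends Migration
{
    public function up()
    {
        Schema::create('entries_data', function (Blueprint $table) {
            $table->unsignedBigInteger('entry_id')->primary();
            $table->binary('data'); // the former blob payload
        });
        DB::statement('INSERT INTO entries_data (entry_id, data)
                       SELECT id, data FROM entries');
        Schema::table('entries', function (Blueprint $table) {
            $table->dropColumn('data');
        });
    }
}
```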
#j4y_funabashialso might not need the not null clause? doesn't the id = xx negate the second clause?
#j4y_funabashiheh yeah it is a lot but MySQL can definitely do sub-second queries on multi-million row tables
#j4y_funabashihow long does the query take? sorry not offering solutions, just curious
#aaronpkvaries between 1-40 seconds depending on the rest of the server load of course
#jackythinks this'll make for an interesting blog post :)
#aaronpkthe trick with these things of course is that sometimes completely unrelated queries show up in the slow log when the whole server is under heavy load
#j4y_funabashiyeah on your graph it is IO wait time that is spiking so might not be query efficiency related at all
#aaronpkbut looking at the slow query log that one comes up a lot
#j4y_funabashigiven my limited understanding of explain output that one looks OK; as long as the actual count is around 78690 then it isn't doing unnecessary scans
#gRegorLoveI forget, but is it more efficient to count() on an indexed column instead of count(*)?
#aaronpk(also weird that the Laravel ORM is adding that)
#aaronpki thought i removed all instances of that code and switched to denormalizing it instead, but something somewhere is still calling it
#aaronpkturns out all i needed to know was whether there are any entries for that source, not the exact number, so i just added a column to the sources table
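A sketch of that denormalization, plus the cheaper EXISTS check that answers "any entries?" without a full count; `has_entries` is an assumed column name, not Aperture's actual schema:

```php
<?php
// Sketch: instead of COUNT(*)-ing entries per source on every request,
// keep a flag on `sources` and set it when entries are written.
function markSourceHasEntries(PDO $db, int $sourceId): void {
    $db->prepare('UPDATE sources SET has_entries = 1 WHERE id = ?')
       ->execute([$sourceId]);
}

// Even without the column, EXISTS stops at the first matching row
// rather than counting them all the way COUNT(*) does:
function sourceHasEntries(PDO $db, int $sourceId): bool {
    $stmt = $db->prepare(
        'SELECT EXISTS(SELECT 1 FROM entries WHERE source_id = ?)');
    $stmt->execute([$sourceId]);
    return (bool) $stmt->fetchColumn();
}
```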
#[chrisaldrich]aaronpk, is there something my site is doing that's causing it to dump out that much data? (I'm presuming it's the Known site, but do others do that too?)
#[chrisaldrich]I think my wordpress site has a reasonable RSS limit of maybe 40 which I'd upped since all my microposts stream by so quickly....
#[chrisaldrich]what method are you using that returns so much data? And is it paginating all the way down?
#aaronpkit's that i do actually store everything from a feed
#[chrisaldrich]Trying to compete with Google and Facebook are we? 😉