#dev 2019-01-15

2019-01-15 UTC
snarfed, [jgmac1106], [tantek] and [cleverdevil] joined the channel
#
[tantek]
denschub++
#
Loqi
denschub has 1 karma in this channel over the last year (2 in all channels)
#
[tantek]
aaronpk, as someone who has had to implement bits of AP I must confess I'm surprised at your response. Or did you mean not interested in participating in an explicitly global public conversation?
#
[tantek]
In contrast to public but contained discussion e.g. here in dev?
#
DenSchub
oh, i don't mind that response! i can totally understand it, especially on such a controversial topic, which can also be a huge time sink
#
[tantek]
I for one found aspects in denschub's post that could be used to improve nearly any standards discussion or community, including IndieWeb specs, microformats vocabularies etc
#
[tantek]
FWIW learning from others' mistakes is one of the cheapest ways to learn
#
aaronpk
I meant I don't have anything to add to the public conversation that's going on in those blog posts referenced
snarfed, KartikPrabhu and [eddie] joined the channel
#
[eddie]
[cleverdevil] you mentioned previously that your podcast listens didn’t have the correct time entered on your site but since then you mentioned that you fixed it. Was it an issue with your script from Overcast or an issue in Known?
[cleverdevil] joined the channel
#
[cleverdevil]
@eddie It was an issue in my script.
#
[cleverdevil]
And in my endpoint in Known.
#
[eddie]
Is the current gist fixed? Or was it beyond that script?
#
[cleverdevil]
I’ve updated it slightly since then. I can update the gist later this evening.
#
[eddie]
Cool, that’d be great. I’m gonna revise it to send via Micropub and do a single test run to see how it turns out.
#
[cleverdevil]
Awesome 😀
#
[cleverdevil]
I’ve had several requests from people for a service.
#
[eddie]
Haha wow. Yeah I’d definitely shy away from a service until Marco gets back to you 😆
#
[eddie]
But having a script floating around is fine 🙂
#
[eddie]
And it’ll be nice to have some different outputs: Known, Micropub, etc
#
[cleverdevil]
Agreed 😉
#
[cleverdevil]
If I were to do a service I’d want to update my approach to use Micropub against Known.
#
[cleverdevil]
That way it could be consistent.
snarfed joined the channel
#
[eddie]
That makes sense 👍
snarfed joined the channel
#
@chrisbiscardi
↩️ Starting to publish the posts I wrote while in Portland this weekend. First is Building gatsby-plugin-webmention https://www.christopherbiscardi.com/post/building-gatsby-plugin-webmentions
(twitter.com/_/status/1085065234690129920)
ichoquo0Aigh9ie, ichoquo0Aigh9ie_, KartikPrabhu, swentel, [tantek], cweiske, swentie, strugee, [mrkrndvs], leg, eli_oat and [kevinmarks] joined the channel
#
[kevinmarks]
This bit “But it's not good enough: for example, people have expressed that they want others to be able to read messages, but not reply to them.
#
[kevinmarks]
Had ActivityPub been a capability-based system instead of a signature-based system, this would never have been a concern to begin with: replies to the message would have gone to a special capability URI and then accepted or rejected” of https://blog.dereferenced.org/activitypub-the-worse-is-better-approach-to-federated-social-networking is confusing me
#
[kevinmarks]
The bit about "read but not reply" links to https://github.com/tootsuite/mastodon/issues/8565
#
[kevinmarks]
You can't stop people from replying. You can stop displaying their replies.
#
[kevinmarks]
So adding a "don't @ me" flag to your posts does what? Gives notice that webmentions of it will be ignored?
[jgmac1106] joined the channel
#
sknebel
and potentially tells all other conforming implementations to discard posts that claim to be replies
#
sknebel
or, depending on the protocol design, never reaches others. E.g. if I remember correctly, in Diaspora a reply is only distributed through the thing it's replying to, so that server has control over it
krychu and [mrkrndvs] joined the channel
#
jeremycherfas
Is there a name for the `.=` operator in PHP?
#
swentel
something with concatenating ?
KartikPrabhu joined the channel
#
jeremycherfas
That's what it does, but maybe it just doesn't have a name. Anyway, thanks.
#
swentel
concatenating assignment operator
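For reference, a minimal example of the concatenating assignment operator: `$a .= $b;` is shorthand for `$a = $a . $b;`.

```php
<?php
// `.=` appends the string on the right to the variable on the left.
$log = "query took ";
$log .= "40 seconds"; // same as: $log = $log . "40 seconds";
echo $log;            // prints "query took 40 seconds"
```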
KartikPrabhu, [Rose], [svandragt], [voss], [jgmac1106], [pfefferle], swentel, [eddie], snarfed, gRegorLove, [kim_landwehr], [tantek] and [cleverdevil] joined the channel
#
[cleverdevil]
!tell [eddie] I updated the gist just now, fell asleep last night before I remembered to do it 😉
#
Loqi
Ok, I'll tell them that when I see them next
[eddie] joined the channel
#
[eddie]
Sweet, sounds good 🙂
KartikPrabhu, krychu, snarfed, [zak], [chrisaldrich] and j4y_funabashi joined the channel
#
aaronpk
so, Aperture is officially getting slow now, and I need to figure out a solution
#
aaronpk
the database machine's disk usage is at like 100%
#
aaronpk
notice how the latency climbs steadily from when aperture was launched in july
#
snarfed
ugh. everything is i/o bound eventually
#
snarfed
aaronpk: happy to help if you want someone to bounce ideas off of
#
aaronpk
i put the 7-day limit on public accounts to build myself this escape route
#
aaronpk
but my own account has been archiving everything forever
#
aaronpk
and i think it just might not be sustainable to do that
#
jacky
> disk usage is 100%
#
snarfed
it can be if you decide to, you'd just need a different architecture
#
snarfed
maybe first decide what you want it to be, e.g. a permanent archive for everyone, only some people, or no one. then we can figure out an architecture to support that choice
#
sknebel
shouldn't old stuff that's not accessed "only" take space, for the most part?
#
aaronpk
sknebel: yea but indexes get updated and such around those old entries
#
aaronpk
snarfed: yeah i'm leaning towards dropping the whole idea of it being any sort of permanent archive, since that greatly simplifies the requirements
#
aaronpk
the problem is i still do want some sort of permanent archive of (some) of the channels i have set up
#
snarfed
you have seemingly had good luck decomposing things into many microservices
[cleverdevil] joined the channel
#
snarfed
also try dropping archiving from aperture first and make sure that actually fixes the problem. seems likely but not guaranteed
#
[cleverdevil]
[aaronpk] I may be able to help as well. I’ll check and see if I can get some infra for you.
#
sknebel
have you made sure the DB can do everything based on the indexes?
#
aaronpk
[cleverdevil]: thanks but i think throwing more hardware at the problem is just going to push the same issue down the road til later
#
aaronpk
sknebel: i spot checked a few indexes of some of the slower and most common queries and it was using them
#
[cleverdevil]
Fair enough!
#
snarfed
eh if it's enough hardware it can be a long way down the road, ie many years. esp if it's just you or maybe just a few ppl permanently archiving
#
snarfed
up to you all though
#
snarfed
throwmoneyattheproblem++
#
Loqi
throwmoneyattheproblem has 1 karma over the last year
#
snarfed
throwsomeoneelsesmoneyattheproblem++
#
Loqi
throwsomeoneelsesmoneyattheproblem has 1 karma over the last year
#
aaronpk
heh true
#
sknebel
throwsomeoneelsesmoneyattheproblemaslongasyoudonthavetospendtonsoftimeinreviewmeetings++
#
Loqi
throwsomeoneelsesmoneyattheproblemaslongasyoudonthavetospendtonsoftimeinreviewmeetings has 1 karma over the last year
#
[cleverdevil]
Heh, you can get pretty far just vertically scaling 😉
#
[cleverdevil]
Another recommendation would be to consider making it super easy to install for folks like me.
#
[cleverdevil]
I am currently putting load on hosted Aperture and would be fine self-hosting if it were a quick and easy thing to do.
#
[cleverdevil]
You could also start charging money 😉
#
[cleverdevil]
(Which I would happily pay!)
#
sknebel
I think the idea behind not doing that was that people like you (or me) write one that's easy to install so it's not everyone using Aperture :D
#
[cleverdevil]
Ah, good point.
#
aaronpk
yep there's that, and also i can't start charging until i at least know the cost and how this will need to scale
#
[cleverdevil]
Well, I do think that someone was working on a WordPress-based microsub server.
#
snarfed
honestly i'd discourage most of us from charging for our indieweb services unless it's someone who really wants to do a full fledged startup
#
snarfed
otherwise it'll usually be much more added pain (collecting money, support) than gain
#
[cleverdevil]
That's true... makes sense.
#
GWG
cleverdevil, jackjamieson. Yarns. It's in beta
#
@qubyte
↩️ @jgarber @glitch The site is there, but I'm getting a 400 when I try to auth (no issue with http://indieauth.com).
(twitter.com/_/status/1085246867670253570)
#
aaronpk
wonders what the plan is for Yarns around archiving content
#
aaronpk
the ironic part of this is my original plans for building aperture (before microsub even) were to treat channels as folders of text files on disk, the same way I store my GPS data
#
snarfed
key difference w/yarns is it's single user
#
aaronpk
eh pretty sure most of the load on my hosted aperture is from myself
#
snarfed
possible! you may also be an outlier in usage terms
#
sknebel
how important is it to have access to the archive through aperture?
#
snarfed
would be interesting to break down load by user
#
aaronpk
my GPS database is 6.2gb and over 10 million records and it contributes nothing to the overall load of the server
#
sknebel
e.g. you could see if the DB is happier if you move old posts to an archive table, or do the text-file export for that
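A minimal SQL sketch of that archive-table idea, assuming a hypothetical `entries_archive` table with the same schema and a `published` timestamp to split on:

```sql
-- Hypothetical: move entries older than 90 days into a same-schema
-- archive table so the hot `entries` table stays small.
START TRANSACTION;

INSERT INTO entries_archive
  SELECT * FROM entries
  WHERE published < NOW() - INTERVAL 90 DAY;

DELETE FROM entries
  WHERE published < NOW() - INTERVAL 90 DAY;

COMMIT;
```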
#
snarfed
disk space isn't load, obvs
#
aaronpk
so far, my actual use of Aperture/Monocle/etc hasn't really involved diving into super old archives
#
snarfed
(eg your graph was i/o)
#
aaronpk
and also i don't actually want to archive *all* channels, only a few of them
#
aaronpk
so maybe i set up something separate that pulls content from an aperture channel and saves it as text files, totally outside of the aperture code base
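A rough sketch of what that separate archiver could look like, paging through a Microsub channel's timeline with the `after` cursor; the endpoint, token, and channel uid here are placeholders:

```php
<?php
// Hypothetical standalone archiver: page through a Microsub channel's
// timeline and write each entry to a JSON text file on disk.
$endpoint = 'https://aperture.example/microsub'; // placeholder endpoint
$token    = getenv('MICROSUB_TOKEN');            // placeholder bearer token
$channel  = 'notifications';                     // placeholder channel uid

if (!is_dir('archive')) {
    mkdir('archive');
}

$after = null;
do {
    $url = $endpoint . '?action=timeline&channel=' . urlencode($channel);
    if ($after) {
        $url .= '&after=' . urlencode($after);
    }

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, ['Authorization: Bearer ' . $token]);
    $response = json_decode(curl_exec($ch), true);
    curl_close($ch);

    foreach ($response['items'] ?? [] as $item) {
        // One file per entry, keyed on its url (or uid) so reruns overwrite.
        $name = sha1($item['url'] ?? $item['uid'] ?? json_encode($item));
        file_put_contents("archive/$name.json",
            json_encode($item, JSON_PRETTY_PRINT));
    }

    $after = $response['paging']['after'] ?? null;
} while ($after);
```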
#
GWG
I think jackjamieson wants a working version before we worry about archiving
#
GWG
I know my opinion was that archiving would be a bookmark post
#
aaronpk
one of the other challenges is around how to handle content from feed pages, since sometimes entries have bad/missing uids, or the dates are missing or super old even though the content is new, so another thought I had was to basically store only the current items in a feed page and anything not in that page gets deleted
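A sketch of that mirror-the-page approach in Eloquent-flavored PHP; the `Entry` model and column names are assumptions. After the current page's items are stored, anything for that source no longer on the page gets dropped:

```php
<?php
// Hypothetical sync: treat the feed page as the source of truth.
// $pageItems is the freshly parsed list of entries on the feed page.
$currentUrls = [];
foreach ($pageItems as $item) {
    $currentUrls[] = $item['url'];
    // upsert the entry here (unchanged from the normal storage path)
}

// Anything stored for this source that is no longer on the page goes away.
Entry::where('source_id', $source->id)
    ->whereNotIn('url', $currentUrls)
    ->delete();
```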
#
GWG
aaronpk, how much do you want to archive personally?
#
snarfed
GWG: i think most of our discussion of "archiving" here has been about keeping just the feed data itself long term/permanently, not fetching and archiving entire posts
#
snarfed
primarily around managing server load over time, not archiving as a feature
#
aaronpk
my theory is that if these tables aren't just infinitely growing in size, things will go faster
#
aaronpk
here's an example of a query that is currently very slow even though it's using an index:
#
aaronpk
select count(*) as aggregate from `entries` where `entries`.`source_id` = 599 and `entries`.`source_id` is not null;
#
aaronpk
no joins even
#
snarfed
yeah good guess
#
snarfed
you could probably add indices or tune to improve it. and you'll still have a decent write i/o burden even with a fixed size table, but load should stay fixed, not growing without bound
#
GWG
Okay. I was focusing on user need
#
aaronpk
i guess at least one thing i can do right now is try to remove that query
j4y_funabashi joined the channel
#
j4y_funabashi
how many items are in the entries table?
#
aaronpk
about 400,000
#
aaronpk
oh i could do my old trick that saved my butt during my startup days of moving the longblob column to a new entries_data table to keep the entries table smaller and fixed width columns
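In SQL terms that trick is roughly the following (table and column names are illustrative): split the wide blob into a 1:1 side table so `entries` itself is all fixed-width columns.

```sql
-- Hypothetical migration: move the longblob out of the hot table.
CREATE TABLE entries_data (
  entry_id BIGINT UNSIGNED NOT NULL PRIMARY KEY,
  data     LONGBLOB NOT NULL,
  FOREIGN KEY (entry_id) REFERENCES entries (id)
);

INSERT INTO entries_data (entry_id, data)
  SELECT id, data FROM entries;

ALTER TABLE entries DROP COLUMN data;
```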
#
j4y_funabashi
also might not need the not null clause? doesn't the id = xx negate the second clause?
#
aaronpk
oops 400k was my dev copy
#
aaronpk
it's actually 1.8 million
[schmarty] joined the channel
#
[schmarty]
wow that's a lotta rows
#
[cleverdevil]
Have you run an EXPLAIN on the query to be sure its actually using the indexes?
#
aaronpk
`| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |`
`| 1 | SIMPLE | entries | ref | entries_source_id_url_index | entries_source_id_url_index | 8 | const | 78690 | Using index |`
#
j4y_funabashi
heh yeah it is a lot but MySQL can definitely do sub-second queries on multi-million-row tables
#
j4y_funabashi
how long does the query take? sorry not offering solutions, just curious
#
aaronpk
varies between 1-40 seconds depending on the rest of the server load of course
#
jacky
thinks this'll make for an interesting blog post :)
#
aaronpk
the trick with these things of course is that sometimes completely unrelated queries show up in the slow log when the whole server is under heavy load
#
j4y_funabashi
yeah on your graph it is I/O wait time that is spiking so it might not be query efficiency related at all
#
aaronpk
but looking at the slow query log that one comes up a lot
#
j4y_funabashi
given my limited understanding of explain output that one looks OK; as long as the actual count is around 78690 it isn't doing unnecessary scans
#
gRegorLove
I forget, but is it more efficient to count() on an indexed column instead of count(*)?
#
aaronpk
i thought that didn't matter anymore
#
aaronpk
spot testing it doesn't seem to matter
#
gRegorLove
hm. how about without the `is not null`?
#
aaronpk
(also weird that the laravel ORM is adding that)
#
aaronpk
i thought i removed all instances of that code and switched to denormalizing it instead, but something somewhere is still calling it
#
aaronpk
turns out all i needed to know was whether there are any entries for that source, not the exact number, so i just added a column to the sources table
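In Eloquent terms the difference is roughly this (model and column names assumed):

```php
<?php
// Before: scans/counts every row for the source just to learn "any?"
$count = Entry::where('source_id', $source->id)->count();

// Cheaper: stop at the first matching row.
$hasEntries = Entry::where('source_id', $source->id)->exists();

// Cheaper still, as described above: denormalize onto `sources` when
// the first entry is stored, so the check never touches `entries`.
if (!$source->has_entries) {        // hypothetical boolean column
    $source->has_entries = true;
    $source->save();
}
```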
#
aaronpk
oh lol that's a different query
#
aaronpk
something is trying to return all entries from a source without a limit and it's chrisaldrich's feed so there are 36,000 rows
#
@12voltnews
So great to see MTX back in the North Hall at CES! The company’s lineup of Powersports products was very impressive. In addition for 2019 the new Jackhammer amplifiers, microsub woofer… https://www.instagram.com/p/BsqvJRPApp4/?utm_source=ig_twitter_share&igshid=1j7yevmcbdsc9
(twitter.com/_/status/1085264802476376067)
krychu joined the channel
#
j4y_funabashi
rogue queries are fun
#
aaronpk
ohh it's when someone adds an existing feed, it tries to go add all the entries to their channel
#
aaronpk
i clearly need some limit there
KartikPrabhu joined the channel
#
jacky
hopes it wasn't me
#
aaronpk
i do need some way to report feed errors back to the user
#
aaronpk
looking at the logs there are a bunch that are failing for various reasons
#
aaronpk
some of them are tumblr returning http 401 for the request, some are granary.io instagram 401s
#
aaronpk
chrisaldrich's known site is down right now
KartikPrabhu joined the channel
#
aaronpk
omg the whole thing is faster now
#
[schmarty]
when chrisaldrich's site went down?
#
aaronpk
no when i fixed two things
#
aaronpk
1) stop trying to add *all* past entries from a feed into a new channel, limit to the most recent 100
#
aaronpk
2) stop counting records when all you wanted to know was whether there are any
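Fix 1, sketched in Eloquent-flavored pseudocode (names assumed), caps the backfill when someone subscribes to an already-known feed:

```php
<?php
// Hypothetical backfill: copy only the newest 100 entries from an
// existing source into the newly subscribed channel.
$recent = Entry::where('source_id', $source->id)
    ->orderByDesc('published')
    ->limit(100)
    ->get();

foreach ($recent as $entry) {
    $channel->entries()->syncWithoutDetaching([$entry->id]); // assumed pivot
}
```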
[chrisaldrich] joined the channel
#
[chrisaldrich]
I'll get around to fixing my secondary site sometime soon... 😉
#
[chrisaldrich]
Sounds a little like this issue I ran across last week: https://twitter.com/lmorchard/status/1083790866941132800
#
[chrisaldrich]
aaronpk, is there something my site is doing that's causing it to dump out that much data? (I'm presuming it's the known site, but do others do that too?)
#
aaronpk
it's just that you have so many posts
#
[chrisaldrich]
seems like it's been a while since I've been able to be an edge case that breaks something... 🙂
#
[chrisaldrich]
I think my wordpress site has a reasonable RSS limit of maybe 40, which I'd upped since all my microposts stream by so quickly...
#
[chrisaldrich]
what method are you using that returns so much data? And is it paginating all the way down?
#
aaronpk
it's that i do actually store everything from a feed
#
[chrisaldrich]
Trying to compete with Google and Facebook are we? 😉
#
jacky
lolol
[tantek] joined the channel
#
[tantek]
was just about to say 😂
#
[tantek]
aaronpk, sounds like your "archiving" use-case could be delegated to a separate service
#
[tantek]
and whoa scrollback
#
[chrisaldrich]
Have you managed to get aaronpk.archive.org yet?
#
[tantek]
though yes, archiving "everything" (for some value of "everything") *and* indexing it sure does start to sound like a search engine
#
[tantek]
aside: when you try to have one monolithic specification/protocol solve everything: https://www.amazon.com/Wenger-16999-Swiss-Knife-Giant/dp/B001DZTJRQ/
[eddie] joined the channel
#
[tantek]
(must read the reviews)
#
jacky
-> #-chat
[jgmac1106] joined the channel
#
aaronpk
look at that latency drop
#
snarfed
aaronpk++
#
Loqi
aaronpk has 82 karma in this channel over the last year (261 in all channels)
#
aaronpk
that'll last me a while longer now
#
j4y_funabashi
throwmoneyattheproblem--
#
Loqi
throwmoneyattheproblem has 0 karma over the last year
#
gRegorLove
throwlimit100attheproblem++
#
Loqi
throwlimit100attheproblem has 1 karma over the last year
KartikPrabhu, [benatwork] and [cleverdevil] joined the channel
#
[cleverdevil]
[aaronpk]++
#
Loqi
[aaronpk] has 83 karma in this channel over the last year (262 in all channels)
j12t, krychu and [kevinmarks] joined the channel
#
[kevinmarks]
i wonder if my tweet list via granary is causing pain - it has ~2000 unread usually
#
aaronpk
i doubt it. the problem was really caused because multiple people subscribed to the same feed
#
@nhoizey
One of the features I feared missing if I left @jekyllrb for @eleven_ty was Webmention. It looks like it's pretty easy, after all, thanks Max for showing the way!
(twitter.com/_/status/1085300789684432902)
[Rose], [smerrill] and [cleverdevil] joined the channel
#
[cleverdevil]
FYI, Marco Arment has told me that the current rate limit on his OPML export is 10 hits per day per user.
#
[cleverdevil]
But he made it clear that he makes no promises that it’ll stay that way.
#
[cleverdevil]
So, if you are planning on using my scripts or writing your own to track your listens, I’d recommend staying well below that limit.
snarfed joined the channel
#
[Rose]
Reasonable.
blueyed joined the channel
#
[smerrill]
re DNS: I use CoreDNS with this Corefile: https://github.com/skpy/Dockerfiles/blob/master/coredns/Corefile
#
[smerrill]
I’ve run it in Docker, but I also run it on a pair of Raspberry Pis. I configure my home router to point to my Pis for DNS. works a treat.
#
blueyed
[smerrill]: Thanks!
[jgmac1106] joined the channel
#
jacky
the more I look into microsub, the less likely I am to implement it within Koype
#
jacky
like I'd want this to be externalized