#dev 2017-01-20

2017-01-20 UTC
#
@call_user_func
[New]pfefferle/wordpress-webmention-for-comments Webmention support for (threaded) comments https://packagist.org/packages/pfefferle/wordpress-webmention-for-comments
(twitter.com/_/status/822232191857922048)
tantek, KevinMarks, gRegorLove, KevinMarks_ and KartikPrabhu joined the channel
#
@sl007
@aaronpk Friend analysed changes : "This time they don't shoot in the knee they cripple complete thigh" + last in https://github.com/aaronpk/IndieAuth.com/issues/130
(twitter.com/_/status/822343134617235456)
cweiske, sknebel_ and tantek joined the channel
marcthiele and tantek joined the channel
#
seblog.nl
edited /Webmention (+125) "/* Extensions */ added summaries"
#
@schestowitz
#Webmention is latest #W3C Recomm. (tracking) What about #privacy ? What about #swpats -free standards? Rather than #drm and other rubbish?
(twitter.com/_/status/822392899971055616)
tantek, marcthiele and arush joined the channel
#
GWG
Good morning, all.
tantek and KevinMarks joined the channel
#
@dissolve333
@schestowitz I don't really understand what you are saying? How does DRM relate at all to #webmention?
(twitter.com/_/status/822460679189921792)
#
tantek.com
edited /tinbox (-52) "done invite people to 2017-01-25 HWC SF"
gRegorLove and tantek joined the channel
#
aaronpk
oh gosh i broke something and now my twitter posse is going rogue
#
tantek.com
edited /100DaysOfIndieWeb (+185) "start a How to section with info from aaronpk and other stuff on the page"
#
aaronpk
oh... i think i'm running into concurrency issues writing to disk
#
tantek
with p3k?
#
aaronpk
i have one process syndicating my post, and another one expanding the reply context
#
tantek
you posting too much? ;)
#
tantek
interesting!
#
aaronpk
and then they clobber each other when they try to update the file
#
aaronpk
my last couple posts glitched out in different ways. some were missing the reply context, some were missing the syndication URL
#
aaronpk
i'm just gonna scale back to a single background process for now
#
aaronpk
i guess this is a problem with storing the whole post data in one atomic unit
#
aaronpk
i would have the same problem with mysql if I were storing the whole post contents in a single JSON field for example
#
voxpelli
aaronpk: can't you lock the file with flock()?
#
voxpelli
and have the process that fails to acquire the lock wait and sleep until the lock is released?
#
aaronpk
sure, but that's a bunch of work i haven't done yet :)
#
aaronpk
oh that wouldn't actually help me anyway
#
aaronpk
because they'd still clobber each other cause they read the post contents into memory in order to manipulate the properties
#
aaronpk
also i'm using this storage wrapper which allows storing files on things that might not be the filesystem so i don't actually have access to filesystem locking https://laravel.com/docs/5.2/filesystem
#
voxpelli
right, yeah, that would complicate things
#
aaronpk
either way, the problem is when two processes load the post into RAM, manipulate it there, then try to write it back to disk
#
aaronpk
breaking up the data into separate units (either by using different columns or tables in an RDBMS, or using different files for a file-based approach) is probably the best way to solve it
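
A minimal sketch of the split-into-separate-files idea aaronpk describes, assuming each background task owns exactly one part of a post. The function names (`write_part`, `load_post`) and the one-JSON-file-per-part layout are hypothetical, not p3k's actual storage format.

```python
import json
import tempfile
from pathlib import Path


def write_part(post_dir: Path, part: str, data: dict) -> None:
    """Write one independently owned part of a post to its own file."""
    post_dir.mkdir(parents=True, exist_ok=True)
    target = post_dir / f"{part}.json"
    # Write to a temp file, then rename, so readers never see a half-written file.
    with tempfile.NamedTemporaryFile(
        "w", dir=post_dir, delete=False, suffix=".tmp"
    ) as tmp:
        json.dump(data, tmp)
    Path(tmp.name).replace(target)


def load_post(post_dir: Path) -> dict:
    """Merge every part file into a single dict for rendering."""
    post = {}
    for part_file in sorted(post_dir.glob("*.json")):
        post.update(json.loads(part_file.read_text()))
    return post
```

Because the syndication task only ever writes `syndication.json` and the reply-context task only ever writes its own file, neither can overwrite the other's result. As sknebel_ notes later in the log, this gets harder once two tasks need to change the same field.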
#
voxpelli
sounds like the easiest one at least
#
aaronpk
otherwise i basically have to build my own locking mechanism, and processes that might write the file would have to request a write lock at the time they open the file
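
A rough sketch of that write-lock idea, assuming a local Unix filesystem where `fcntl.flock` is available (which, per aaronpk's note about the storage wrapper, may not hold in his setup). The helper name `update_post` is hypothetical. The key point is that the lock is taken before the read and held until after the write, so the whole read-modify-write cycle is serialized.

```python
import fcntl
import json
from pathlib import Path


def update_post(path: Path, changes: dict) -> dict:
    """Read, modify and write a JSON post file while holding an exclusive lock."""
    path.touch(exist_ok=True)
    with open(path, "r+") as f:
        # Blocks until no other process holds the lock; acquired *before* reading.
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            raw = f.read()
            post = json.loads(raw) if raw else {}
            post.update(changes)
            # Rewrite the file in place with the merged properties.
            f.seek(0)
            f.truncate()
            json.dump(post, f)
            f.flush()
            return post
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Note that `flock` is advisory: it only protects against other processes that also take the lock before touching the file.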
#
tantek
exactly. building your own locking mechanism = probability of building your own deadlocking mechanism ;)
#
aaronpk
so for now, since everything is queued and processed in a background task, i'm just going to reduce to one process so that everything happens sequentially
#
tantek.com
edited /100DaysOfIndieWeb (+241) "/* Brainstorming */ 100 days of positive posts"
#
sknebel_
aaronpk: that's why I put different generated bits into different files for now
#
sknebel_
but that probably also gets annoying once you get to the point where changed fields start to overlap between tasks
#
aaronpk
this wasn't a problem for me until i had two tasks that often take a long time. POSSEing being one, and fetching reply context (or repost content) being the other.
#
tantek
aaronpk: I'm surprised you're not fetching/caching the reply context as part of the authoring UI / flow
#
aaronpk
tantek: that would involve every client being responsible for that task
#
aaronpk
instead, i can have a client that sends only the "repost-of" or "in-reply-to" property as a URL
#
tantek
I suppose I would expect good UIs (clients) to *have to* do that purely for UX reasons
#
aaronpk
my IRC client certainly doesn't need to
#
tantek
like helping the user be sure they are replying to the right thing
#
aaronpk
i have that in a private IRC server
#
tantek
of course you do :)
#
tantek.com
edited /100DaysOfIndieWeb (+256) "/* More 100 days projects */ 100 days of positive news"
#
aaronpk
I also trust my own website to fetch the parts of the repost/reply post that I need more than I trust other clients
#
tantek
that makes sense
#
tantek
at least from a validation / updating perspective too
#
aaronpk
of course if a client sends me an in-reply-to value that is a full h-cite object, it'll store it just the same
#
tantek
I'm still uncertain it makes sense (for me at least) to store *someone else's data* in the same storage / file as *my data*
#
aaronpk
for the repost case i think it makes sense
#
tantek
cache vs. persistence data policies
#
aaronpk
i'm less certain about it for reply contexts, but it was convenient to do so
#
tantek
hmm, convenience seems like a bad reason to do that, and how data stores end up getting bloated with things they shouldn't have
#
tantek
repost makes sense because you as the author want to absolutely capture the snapshot in time of the thing you reposted
#
aaronpk
well i have to store it *somewhere* in order to render it. it was easier to put it in the post file than come up with a scheme for storing it outside the post file
#
voxpelli
I would store reply-contexts the way I store webmentions I think
#
tantek
right, my approach for storing that kind of thing is the same for webmentions - stuff from other sources, not me
#
tantek
heh voxpelli :)
#
voxpelli
in a threaded conversation the reply-context could even be the exact same data as the webmention I have already received on another post
#
aaronpk
voxpelli: yeah that was what i was originally diagramming for this project, but i still haven't come up with a long term plan for storing webmention content
#
aaronpk
my disk storage in p3k-v1 ended up having some issues that i haven't figured out how to resolve yet
#
tantek
basically for each storage file I have (bim) I plan to store a second file of the cached stuff from others
#
tantek
and yes that means two file loads instead of one, but it also means a very simple data / file persistence / cache policy
#
voxpelli
I store everything as json keyed on their normalized URL:s and then make them into likes, reply-contexts etc based on relations between that url and another url
#
aaronpk
"normalized"?
#
aaronpk
and does that mean you have one giant file with everything?
#
tantek
what is a normalized URL?
#
Loqi
It looks like we don't have a page for "normalized URL" yet. Would you like to create it?
#
aaronpk
or is the URL the filename?
#
voxpelli
aaronpk: I store it in PostgreSQL so my keys are in there, but I guess I could just as well eg. sha-hash them and put them as files on disk
#
aaronpk
ah postgres okay
#
aaronpk
yeah i was considering a sha-hash as well. v1 made a filename based on the URL, so I had files like this which is great for readability https://media.aaronpk.com/Screen-Shot-2017-01-20-10-57-52.png
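
A small sketch of the hash-as-filename idea, assuming SHA-256 (the log only says "sha-hash") and a `.json` extension. The function name `storage_filename` is hypothetical.

```python
import hashlib


def storage_filename(url: str) -> str:
    """Derive a fixed-length, filesystem-safe filename from a URL."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return f"{digest}.json"
```

The trade-off is the one aaronpk points at: hashed names avoid problems with slashes and long URLs, but lose the readability of URL-based filenames.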
#
voxpelli
can't really come up with a good definition of normalized URL
#
aaronpk
start with what do you do to normalize a URL?
#
voxpelli
I think I do more normalization than many would think is okay
#
tantek
more than following rel-canonical?
#
aaronpk
the only normalization that I did was lowercase domain name and remove :80 and :443 if present
#
voxpelli
tantek: I don't think I follow rel-canonical, but maybe I do after I salmentioned my code
#
aaronpk
i don't think i did the percent decoding thing but probably should have
#
tantek
that sounds like enough material to start stubbing an article aaronpk, especially since you have a real world implementation!
#
aaronpk
well, it's an old implementation, not in use anymore
#
KevinMarks
Is that the same as spiderpig does?
#
aaronpk
KevinMarks: it's similar, but spiderpig had different constraints so it ended up doing a little more than that
#
voxpelli
a normalized URL is a URL with non-significant alternatives removed, like the default :80 port
#
loqi.me
created /normalized_URL (+124) "prompted by tantek and dfn added by voxpelli"
#
voxpelli
Which eg. means that I treat http and https as the same currently
#
aaronpk
oh yeah i did too
#
tantek
voxpelli: you may find fortune.com violates that
#
voxpelli
and I remove any double /
#
voxpelli
as well as any trailing /
#
voxpelli
I also remove any www. subdomain
#
tantek
what is no-www?
#
Loqi
no-www is a movement to deprecate use of "www." at the start of URLs as being redundant, unnecessary, and a waste of resources https://indieweb.org/no-www
#
aaronpk
wow that's bold
#
voxpelli
but I only use the normalized URL for matching different URL:s against each other, to avoid duplicates and to allow for my embed code to actually find all mentions
#
aaronpk
presumably two different URLs could overwrite each other in your storage tho?
#
voxpelli
yes, but only if someone has some very weird implementation
#
voxpelli
double /, trailing / and www. most often have the very same content as any URL:s without it
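
The normalization rules mentioned in this exchange (lowercase host, drop `:80` and `:443`, treat http and https as the same, collapse double slashes, drop the trailing slash, drop `www.`) can be sketched as below. This is an illustration only, not voxpelli's or aaronpk's actual code. The function name `normalize_url` is hypothetical, and dropping the fragment while keeping the query string are assumptions not discussed in the log.

```python
import re
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Return a scheme-less key for matching URLs that likely name the same page."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # urlsplit already lowercases the hostname
    if host.startswith("www."):
        host = host[len("www."):]
    # Keep only non-default ports.
    if parts.port not in (None, 80, 443):
        host = f"{host}:{parts.port}"
    # Collapse repeated slashes and drop any trailing slash.
    path = re.sub(r"/{2,}", "/", parts.path).rstrip("/")
    key = host + path
    if parts.query:
        key += "?" + parts.query
    return key
```

As in the discussion, the result is meant as a matching key only, not as a URL to fetch. Two genuinely different pages can collapse to the same key on sites that treat these variants differently.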
#
tantek.com
edited /100DaysOfIndieWeb (+62) "/* Brainstorming */ positive before negative"
#
aaronpk
what is the dutch word for the dutch language?
#
Loqi
It looks like we don't have a page for "dutch word for the dutch language" yet. Would you like to create it?
#
aaronpk
Nederlands?
#
tantek
aaronpk: you may find this link handy for that question in general (language-name word for the language-name language) https://en.wikipedia.org/wiki/Main_Page#p-lang-label
#
aaronpk
good call
sknebel joined the channel
#
tantek.com
edited /100DaysOfIndieWeb (+29) "/* 100 Days of 500 Words */ 100dagen500woorden"
#
tantek
hmm, besides aaronpk, only sebsel is using a complete hashtag
#
sebsel
tantek with # you mean?
#
sebsel
I'm just copying aaronpk :)
#
sebsel
oh, I see #indieweb now
KartikPrabhu joined the channel
#
tantek.com
edited /100DaysOfIndieWeb (+128) "100 Days of Positive Posts moved from Brainstorming to Other"
#
KartikPrabhu
lost all my tags while moving databases! ;(
#
KartikPrabhu
database--
#
Loqi
database has -1 karma in this channel (-2 overall)
#
voxpelli
no backups? :(
#
tantek
KartikPrabhu: wat? they were not in the export?
#
tantek
I think I gave up on "meta" tags that sit outside the content storage
#
KartikPrabhu
they were but they are stored as some "relational" thing and not as plaintext. So all tags got exported but the "relation" between them and posts broke :(
#
tantek
inline hashtags in the content are harder to "lose"
#
tantek
sorry to hear that KartikPrabhu :(
#
KartikPrabhu
i did all this moving to make it easier to finally move to file-storage
#
voxpelli
very interesting though in relation to the earlier discussion of splitting up data among many files – can make data loss easier
#
tantek
any chance you can check archive.org for your tags on your permalinks?
#
tantek
voxpelli: not quite sure I follow. the problem was the ethereal "relational" things, rather than different plain text stores
#
KartikPrabhu
right, my original database setup was that "tag" is an object and "post" is an object and there is a "relation" between them that MySQL magically manages
#
loqi.me
created /static_site_generation (+34) "prompted by tantek and dfn added by tantek"
#
KartikPrabhu
i am no DB expert so I made this after reading things on the web
#
voxpelli
tantek: well, KartikPrabhu thought he had all pieces, but he didn't – the more spread out one's stuff is, the easier to forget one part. Having everything at the same place makes that impossible
#
tantek
voxpelli: having everything in files in the file *system* is a form of everything in the same place, especially if they're in the same root folder
#
voxpelli
KartikPrabhu: usually no magic relations in MySQL, relations are something you most often define at query time there – matching one value against another. (Compare to eg. Neo4j where a relation is a first class object)
#
KartikPrabhu
that shows how much I know :P
#
voxpelli
tantek: well, formatting drives or erasing databases – both times you need to ensure that you have extracted everything you want or else rely on you having backups of things from before you erased it
#
voxpelli
KartikPrabhu: but since there's nothing magical, then maybe you have exported it after all?
#
KartikPrabhu
yeah it is possible, I am looking at my export JSON to check that
#
voxpelli
if you can put parts of the export up somewhere we can have some more eyes
#
KartikPrabhu
my text editor is having trouble with the large JSON file :P
#
KevinMarks
Usually you have an id on each, post_id and tag_id, and a table that has post_id, tag_id pairs in it
#
tantek
right, ids for tags just in case you want to "rename" the tags and have them stay assigned to all the posts you assigned them to, instead of having the name of the tag *be* its ID
#
KartikPrabhu
yeah I think that database which has the post-tag pair didn't get exported
#
KevinMarks
Well, also to make the table fixed size
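
The layout KevinMarks and tantek describe can be sketched with an in-memory SQLite database. The table and column names (`post`, `tag`, `post_tag`, `title`, `name`) and the sample rows are made up for illustration, and this is not KartikPrabhu's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (post_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tag  (tag_id  INTEGER PRIMARY KEY, name TEXT UNIQUE);
    -- The join table: fixed-size rows of (post_id, tag_id) pairs.
    CREATE TABLE post_tag (
        post_id INTEGER REFERENCES post(post_id),
        tag_id  INTEGER REFERENCES tag(tag_id),
        PRIMARY KEY (post_id, tag_id)
    );
    INSERT INTO post VALUES (1, 'First article');
    INSERT INTO tag  VALUES (1, 'indieweb'), (2, 'webmention');
    INSERT INTO post_tag VALUES (1, 1), (1, 2);
""")


def tags_for(post_id: int) -> list:
    """Resolve a post's tags by joining through post_tag at query time."""
    rows = conn.execute(
        "SELECT tag.name FROM tag JOIN post_tag USING (tag_id) "
        "WHERE post_tag.post_id = ? ORDER BY tag.name",
        (post_id,),
    )
    return [name for (name,) in rows]


# Renaming a tag touches one row; every post keeps its assignment.
conn.execute("UPDATE tag SET name = 'webmentions' WHERE tag_id = 2")
```

With this layout the "relation" is nothing more than the rows in `post_tag`, so an export that includes `post` and `tag` but skips `post_tag` keeps every tag and every post while losing which belongs to which.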
#
KartikPrabhu
now I am thinking of just storing the tags as a comma-separated text field
#
KevinMarks
The question is if you need to make tag pages, are you better off with a db or a generation script
tantek joined the channel
#
KartikPrabhu
KevinMarks: yes, that is what I need to look into
#
KevinMarks
With a static generator it will build the tag pages for each update
#
tantek
hey at least you have tag pages
#
KartikPrabhu
my tag pages are now all blank! :P
#
sknebel
KartikPrabhu: does the internet archive maybe have old copies you could throw through a microformat parser for recovery?
#
tantek
it is probable
#
KartikPrabhu
sknebel: yeah that is a way to recover them
#
sknebel
seems like not very recent ones though
#
KartikPrabhu
my last article post is from 2015 so should be fine :P
#
sknebel
oh, ok then ;)
#
loqi.me
created /calendar_heatmap (+220) "prompted by tantek and dfn added by sknebel"
#
loqi.me
created /forestry.io (+112) "prompted by tantek and dfn added by [keithjgrant]"
#
loqi.me
edited /visualization (+23) "sknebel added "[[calendar heatmap]]" to "See Also""
#
tantek.com
edited /Events (+123) "/* January */ update confirmed locations for 2017-01-25 HWC"
AbeEstrada joined the channel
#
aaronpk
eep sorry to hear that KartikPrabhu!
#
KartikPrabhu
it isn't that bad. Only my article posts had tags and so only about 50 of them
#
KartikPrabhu
but I'm glad this happened before i started tagging notes
tantek joined the channel