2013-01-14 UTC
hober, tantek, mxuribe, tilgovi, Nadreck, andreypopp and zztr joined the channel
friedcell and jaquecoustaeau joined the channel
tantek, eschnou, danbri and barnabywalters joined the channel
danbri__ joined the channel
mxuribe joined the channel
eschnou joined the channel
eschnou, tantek and ianloic joined the channel
Ajis and barnabywalters joined the channel
danbri___ joined the channel
tantek, zztr and sdboyer joined the channel
barnabywalters joined the channel
# 19:19 tommorris tantek: ah, it's not a space-separated hashtag, it's just a different hash symbol from Unicode
# 19:20 barnabywalters I should dig out the twitter parsing messups I’ve found — although most of them are just inconsistencies between different twitter apps :/
# 19:21 tommorris I'm gonna hopefully do the <ins datetime> and <del datetime> stuff shortly. been a bit busy with wikidrama.
# 19:21 tantek tommorris - twitter also fails to link irc: links
# 19:22 barnabywalters tommorris: I’ll have a crack at implementing it too. As it’s only working on bare tags, a simple regex replace should suffice
# 19:22 tommorris tantek: will post a few more tweets later with lots more obscure protocols.
# 19:23 tommorris (I wonder if the IETF have a process for deprecating old protocols? I'm pretty sure nobody is using wais:// anymore.)
# 19:23 tommorris well, they removed gopher from Firefox a while back, which is pretty much the death-knell.
# 19:24 tantek tommorris - check out the WHATWG URL spec for the latest
# 19:24 tommorris anyway, my server does a full parser of a post before storing anyway, so it's trivial to add another callback for checking for ins and del.
# 19:25 tommorris well, it does markdown->html, then parses the HTML for a variety of things. can't remember what.
# 19:26 barnabywalters tommorris: ah, love that ;) I am aware of the dangers. In this case I think it’ll be safe
# 19:28 tantek regexing HTML is usually a good way to leave yourself open to various injection attacks
# 19:34 barnabywalters tantek: I cannot find any examples (nor think of any, but I’m no cracker) which would affect finding <ins> and replacing it with <ins datetime="" cite="">
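[Editor's note: a minimal Python sketch of the "bare tags only" regex replace discussed above. It is not barnabywalters' actual code; the function name `stamp_bare_tags` and the example URL are hypothetical. It only touches `<ins>` and `<del>` tags that have no attributes at all, and the cite value is escaped before insertion. As tantek warns, a regex cannot tell markup from text inside comments or scripts, so this is only reasonable on input you already trust.]

```python
import html
import re
from datetime import datetime, timezone

# Matches only bare opening tags: <ins> or <del> with no attributes.
BARE_TAG = re.compile(r"<(ins|del)>")

def stamp_bare_tags(markup, when, cite_url):
    """Add datetime and cite attributes to bare <ins>/<del> tags.

    `when` is assumed to be a timezone-aware datetime.
    """
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    safe_cite = html.escape(cite_url, quote=True)

    def add_attrs(match):
        # A function replacement avoids backslash handling in the URL.
        return '<{} datetime="{}" cite="{}">'.format(
            match.group(1), stamp, safe_cite)

    return BARE_TAG.sub(add_attrs, markup)

if __name__ == "__main__":
    now = datetime(2013, 1, 14, 20, 30, tzinfo=timezone.utc)
    print(stamp_bare_tags("old <del>text</del><ins>words</ins>",
                          now, "http://example.com/rev/1"))
```

Tags that already carry attributes are left alone, so running it twice over the same post does not stamp an edit a second time.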
eschnou joined the channel
# 20:30 barnabywalters tommorris: I wrote the auto ins/del datetime/cite adder! currently deploying…
# 20:34 tantek barnabywalters - where did you see the cite attribute on ins/del?
# 20:35 tommorris heh, so HTML5 is keeping ins/del @cite but not @longdesc. There seems to be some inconsistency here. ;-)
# 20:36 tommorris and I agree with the removal of it. just seems rather inconsistent.
# 20:38 tantek tommorris - cite attr on blockquote is similar, but perhaps slightly more useful
# 20:38 tommorris it's actually quite usable: something like Wikipedia could use it to point to a diff.
# 20:38 barnabywalters tantek: apparently browser makers have missed or ignored both of the attributes, as neither do anything useful
# 20:38 tantek barnabywalters/tommorris - how are you actually using the cite attr on ins/del? I've found no use for it (though as I was typing it, just thought of one).
# 20:39 tommorris well, if you had version controlled posts, being able to say "this chunk of text was added/removed in this revision"
# 20:39 barnabywalters tantek: well, initial plan is to write a bit of js which allows the user to select an update. js toggles various ins/del elements depending on their datetime, to recreate the document as it was at the time of the selected update
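[Editor's note: barnabywalters describes doing this in browser JS; the sketch below shows the same idea in Python so the logic can be read on its own. It is an illustration under stated assumptions, not his implementation: datetimes are assumed to be in the form `2013-01-14T20:30:00Z`, only text is emitted (other tags are dropped), and the names `AsOf` and `text_as_of` are hypothetical. An insertion is shown if it happened at or before the chosen moment; a deletion is still shown if it happened after it.]

```python
from datetime import datetime
from html.parser import HTMLParser

FMT = "%Y-%m-%dT%H:%M:%SZ"  # assumed datetime attribute format

class AsOf(HTMLParser):
    """Collect the text of a document as it stood at `moment`."""

    def __init__(self, moment):
        super().__init__()
        self.moment = moment
        self.stack = []  # one visibility flag per open ins/del
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("ins", "del"):
            return
        stamp = dict(attrs).get("datetime")
        if stamp is None:
            # No datetime: treat as the current state of the document.
            visible = (tag == "ins")
        else:
            changed = datetime.strptime(stamp, FMT)
            if tag == "ins":
                visible = changed <= self.moment
            else:
                visible = changed > self.moment
        self.stack.append(visible)

    def handle_endtag(self, tag):
        if tag in ("ins", "del") and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Text is kept only if every enclosing ins/del is visible.
        if all(self.stack):
            self.out.append(data)

def text_as_of(markup, moment):
    parser = AsOf(moment)
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

A timeline scrubber would call something like `text_as_of` once per selected update time.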
# 20:40 tommorris have any of you guys seen Wikitrust? it's a Firefox extension that lets you see how much you should "trust" Wikipedia edits.
# 20:41 tommorris they use web of trust type algorithms to do a sort of 'blame' on the page
# 20:41 tommorris it's based on whether the person who added it is a new or experienced user, and how often their edits have been reverted.
# 20:42 tommorris I'm hoping that having +sysop and 50,000+ edits means mine don't go orange, but I haven't tried it recently. ;-)
# 20:42 tommorris but each of those 'blame' fragments could be marked up with <ins cite="{diff URL}" datetime="{when it was changed}" />
# 20:44 barnabywalters I’m actually thinking I might implement the reverter as a browser extension which adds a timeline scrubber to any webpages marked up with ins/dels with @datetime
# 20:44 tantek barnabywalters - for the scenario you describe, viewing changes by date/time - all you need are datetime attrs on ins/del.
# 20:44 tantek tommorris's answer seems more interesting, and is roughly akin to the idea that popped in my head as soon as I typed "found no use"
# 20:45 tantek the use I thought is clustering ins/del elements into transactions
# 20:45 tantek but for that all you need is a unique URL per "transaction"
# 20:45 tantek you don't actually need a *description* of the transaction
# 20:45 tantek in particular, I've often wanted to cluster one del with one ins
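[Editor's note: a rough Python sketch of the clustering tantek describes, where ins/del elements that share a cite URL are treated as one "transaction". The class and function names and the `/edits/1` style URLs are made up for illustration, and well-formed, non-overlapping markup is assumed.]

```python
from collections import defaultdict
from html.parser import HTMLParser

class EditCollector(HTMLParser):
    """Group <ins>/<del> elements by the value of their cite attribute."""

    def __init__(self):
        super().__init__()
        self.transactions = defaultdict(list)
        self.open = []  # stack of (tag, cite, text parts)

    def handle_starttag(self, tag, attrs):
        if tag in ("ins", "del"):
            self.open.append((tag, dict(attrs).get("cite"), []))

    def handle_data(self, data):
        # Text belongs to every ins/del element currently open.
        for _, _, parts in self.open:
            parts.append(data)

    def handle_endtag(self, tag):
        if tag in ("ins", "del") and self.open:
            kind, cite, parts = self.open.pop()
            self.transactions[cite].append((kind, "".join(parts)))

def group_edits(markup):
    """Return {cite URL: [(kind, text), ...]} for one document."""
    collector = EditCollector()
    collector.feed(markup)
    collector.close()
    return dict(collector.transactions)

if __name__ == "__main__":
    doc = ('<del cite="/edits/1">teh</del><ins cite="/edits/1">the</ins>'
           ' cat <ins cite="/edits/2">sat</ins>')
    print(group_edits(doc))
```

Here the paired del and ins land in the same group purely because they share a URL, regardless of how many seconds apart they were typed.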
# 20:45 tommorris I still need to build a modern version of acts_as_git for history-browsing.
# 20:46 barnabywalters arguably you could do that with datetime too, as it’s highly unlikely for multiple edits to happen at exactly the same second
# 20:46 tantek but the "same second" also doesn't work for paired ins / del edits
# 20:46 tommorris barnabywalters: generally, they'll edit conflict, but sometimes you'll have people editing different sections of the same page at the same time.
# 20:46 tantek I'll often delete the text many seconds before I finish inserting the new text.
# 20:46 tommorris remembers trying to edit 'Amy Winehouse' the day she died. That was not fun.
# 20:47 tantek so if you're depending on precise seconds for ins/del pairing that will fail
# 20:47 tantek also, seconds is usually too imprecise as well
# 20:47 barnabywalters tantek: ah, I was still thinking under the assumption of the attributes being automatically filled in
# 20:47 tommorris one's editor might be able to actually give that kind of precision
# 20:48 tommorris Vim's undo-tree for instance: someone might be able to use that.
# 20:48 tommorris but if you are doing clustering, you probably want to cluster ins/del's
# 20:48 tantek undo stacks usually keep track of edits / inserts differently
# 20:49 tantek whereas what cite would be useful for is a higher level of "as the author I consider these part of the same 'edit' "
# 20:49 tommorris but I believe we may have reached Pareto's point of stop-what-the-fuck-are-you-thinking about now.
# 20:50 tantek ok, I'll note cite attr on ins/del as potential point for future extension if I ever get around to needing to pair ins/del more than on a date-specific granularity
# 20:50 barnabywalters yeah. My site is there for experimentation anyway — there’s no harm in getting this data now, as a use might emerge as I generate more and more of it
# 20:54 barnabywalters tommorris: wikitrust is not working for me, I just get a weird TEXT_NOT_FOUND error
# 20:54 tommorris barnabywalters: weird. I haven't tried it for a while. might want to email them or poke 'em on some social media service.
# 20:55 tommorris I think it's a university project, so it could just be they've graduated and buggered off
# 20:58 barnabywalters hm, no news and no contact links :/ Probably safe to assume it’s a dead project
# 21:43 tommorris I'm dorky enough to use my own posting UI, but across the various platforms (iOS, Android, web etc.) Twitter is pretty ubiquitous.
# 21:46 tommorris tantek: I think I may have found a way to use OSM data relatively painlessly
# 21:52 tommorris now change "find.html" to "find.js" and enjoy some JSON goodness.
# 22:17 tommorris my plan is to have it so that I use OSM to lookup, but replicate the details on my own site.
# 22:18 tommorris tantek: your post on Dodgeball's picoformat-style checkin syntax is quite interesting to that end.
# 22:18 tantek oh you mean that little aside rant in my checkie praise post? ;)
# 22:20 tommorris and you could go to that URI and get a map, previous checkins and any other metadata I care to add.
# 22:32 tantek agree with storing venue URLs locally on one's own domain
# 22:32 tantek considers adding him retroactively as my apprentice since I so strongly encouraged him to show up
# 22:45 tantek aaronpk, tommorris, I find opaque IDs to be an anathema and a weakpoint architecturally and in terms of data integrity/longevity
# 22:46 tantek opaque IDs also smell of "database think" - which is typically a bad mindset for designing anything that's expected to last.
# 22:46 aaronpk true, which is why I didn't use opaque IDs for my new site, but I can't think of anything better for a venue
# 22:49 tommorris well, I might use a unique alphanumeric ID assigned by the user (i.e. me)
# 22:51 aaronpk what happens when there is more than one place with the same name, like starbucks
# 22:54 tantek so opaque ID dumb, so just use an algorithmic ID instead
# 22:55 tantek obvious suggested algorithmic ID: date of venue URL creation converted to newbase60 epoch days. add a slug if you want.
# 22:55 tommorris opaque ID may be dumb, but I don't see it as being that big of a deal.
# 22:55 tantek same as what I already have as an algorithmic id for my posts
# 22:56 tommorris ultimately, the URI is the identifier. how the URI gets constructed is fairly arbitrary.
# 22:56 tantek it's only arbitrary to folks who haven't learned that some URLs survive better than others ;)
# 22:58 tantek just as I have "t4Mz1" for a *T*ext note post I posted today (the first)
# 22:59 tantek if I created a venue today I'd use "v4Mz1" (for the first venue I created today)
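[Editor's note: a small Python sketch of the ID scheme tantek outlines: a type prefix, then the creation date as days since the Unix epoch written in NewBase60, then the count of items of that type created that day. The NewBase60 alphabet is the published one (no I, O or l); the function names, and encoding the per-day count in NewBase60 as well, are assumptions made for this illustration.]

```python
from datetime import date

# NewBase60 digits: 0-9, A-Z without I and O, underscore, a-z without l.
ALPHABET = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ_abcdefghijkmnopqrstuvwxyz"
EPOCH = date(1970, 1, 1)

def to_newbase60(n):
    """Encode a non-negative integer in NewBase60."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, rem = divmod(n, 60)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))

def algorithmic_id(prefix, created, ordinal):
    """Build prefix + epoch days + nth item of that day."""
    days = (created - EPOCH).days
    return prefix + to_newbase60(days) + to_newbase60(ordinal)

if __name__ == "__main__":
    today = date(2013, 1, 14)
    print(algorithmic_id("t", today, 1))  # t4Mz1
    print(algorithmic_id("v", today, 1))  # v4Mz1
```

2013-01-14 is day 15719 since the epoch, which is 4, 21, 59 in base 60 and therefore "4Mz", matching the IDs quoted in the channel.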
# 23:00 ianloic why do you believe that these URLs will survive better? (also, hi!)
# 23:03 tantek ianloic - not "believe" - based on experience. URLs based on opaque ids (and links to them) die all the time. URL shorteners. news media sites that go down with their article IDs etc.
# 23:03 tommorris will probably do it very lazily with incrementing integers, because he's lazy. He is fully aware that tantek will probably have good reason to say "see, I was right!" in a year or so.
# 23:04 tantek whereas when those URLs happen to have *some* kind of information, a date, a slug, a topic, anything - you can typically reconstruct it or even relink it to another location
# 23:04 tantek but when you loose your database of opaque ids - forget it
# 23:04 Loqi tantek meant to say: but when you lose your database of opaque ids - forget it
# 23:04 tantek e.g. you upgrade your CMS, wipe previous database/tables etc. - oops, IDs gone.
# 23:05 tantek happens all the time with media/content sites
# 23:05 tantek if the IDs are generated and/or include other information, you have a chance of scripting it back together, perhaps even with automatic redirects
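[Editor's note: a hedged Python sketch of the "scripting it back together" point. Given only an ID such as "t4Mz1" and the published algorithm, the type, creation date and per-day number can be recovered without any database. It assumes a one-character prefix and a three-character date part (true for dates from late 1979 onwards for several centuries); `parse_id` is a hypothetical name.]

```python
from datetime import date, timedelta

# NewBase60 digits: 0-9, A-Z without I and O, underscore, a-z without l.
ALPHABET = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ_abcdefghijkmnopqrstuvwxyz"
EPOCH = date(1970, 1, 1)

def from_newbase60(text):
    """Decode a NewBase60 string to an integer."""
    n = 0
    for ch in text:
        n = n * 60 + ALPHABET.index(ch)  # ValueError on unknown characters
    return n

def parse_id(short_id):
    """Split an ID into (type prefix, creation date, nth item that day)."""
    prefix, day_part, ordinal_part = short_id[0], short_id[1:4], short_id[4:]
    created = EPOCH + timedelta(days=from_newbase60(day_part))
    return prefix, created, from_newbase60(ordinal_part)

if __name__ == "__main__":
    print(parse_id("t4Mz1"))  # ('t', datetime.date(2013, 1, 14), 1)
```

With the date and ordinal in hand, a redirect script could look up "first text note of 2013-01-14" in whatever archive survives, which is the recovery path an opaque ID does not offer.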
# 23:06 tommorris problem is, if I lose my database of opaque IDs, I also lose the content that goes with it. the solution to that is to, oh, make sure it's backed up. ;-)
# 23:06 tommorris and that will probably at some point hopefully include some kind of peer-mirroring
# 23:06 tantek tommorris - maybe. again, just looking at broader experience here of what's historically happened on the web
# 23:06 tantek designing by "probably at some point hopefully include" is not a good strategy
# 23:07 tantek assuming you never get to "probably at some point hopefully"
# 23:10 tommorris the aaronsw situation makes it seem like a good idea to have some peer backups. a process for backing up our friends things.
# 23:10 tantek yes, peer backups have been discussed many times. too low on anyone's priority list to ever get implemented.
# 23:10 ianloic tantek, I don't disagree that opaque IDs are bad, but I'm not convinced that your "algorithmic IDs" are better. They look opaque to me...
# 23:11 tantek ianloic - they are until you publish the algorithm openly, and then they're forever non-opaque
# 23:11 tommorris might be as simple as writing a little bash script that you run monthly to backup your friend's sites.
# 23:11 tantek and then maintaining those scripts / configurations across server moves etc.
# 23:11 tantek nevermind the auth needed to make it work securely
# 23:17 tantek ianloic - in the case of my algorithmic ids, I've published the algorithm several times, on my own site, in interviews, and now in this IRC channel. so discoverability is high :)
# 23:19 ianloic tantek, fair enough I guess.
# 23:19 tantek reimplementing an algorithm is way easier than recovering an opaque data table
# 23:20 tantek so yeah, basically, opaque IDs dumb (or maybe lazy).
# 23:20 tantek (if it's for ephemeral stuff, sure go right ahead, authentication tokens etc.)
# 23:30 ianloic tantek, at this stage I'd assume that I won't - I'm doing very non-web stuff professionally at the moment.
# 23:30 ianloic It's kind of great :)
# 23:33 tantek ianloic - it wasn't a professional invitation :)
# 23:34 ianloic tantek, I have a mortgage, I'd have a hard time justifying this as anything but a vacation to my wife
# 23:35 ianloic tantek, though these days she's more involved in web stuff than me. She was at CityCamp Oakland in a professional capacity.
# 23:35 tantek ianloic - perhaps you can save up for your "hobby" budget then ;)
# 23:36 tantek assumes there is life beyond the union of "professional" and "wife".
# 23:48 ianloic tantek, potentially.
# 23:52 ianloic tantek, w/ your URL shortener - how do you find your type prefix in practice? Do you find it useful? Do you think of it more as a namespace or as hungarian-notation?