#indiewebcamp 2013-05-25

2013-05-25 UTC
tantek joined the channel
# 00:15 
aaronpk holy cow, I've already downloaded 63 megs of HTML from external pages in my reply context code
# 00:16 
tantek aaronpk - perhaps we could add cheap archiving to business-models :)
# 00:16 
tantek or maybe we should just shove archive HTML pages into static github pages
# 00:16 
aaronpk whoa!
# 00:17 
tantek wonders if that can be done programmatically with a reasonable URL structure
# 00:17 
aaronpk probably just use archive.org's URL structure
# 00:17 
tantek makes sense
# 00:17 
tantek re-use!
# 00:17 
aaronpk http://web.archive.org/web/20020223202325/http://www.aaronparecki.com/
# 00:18 
tantek so what's a short github user name we can create just for this sake
# 00:18 
aaronpk http://domain/YYYYMMDDHHMMSS/url
# 00:18 
tantek (and encourage people to mirror
# 00:18 
Loqi !calc (and encourage people to mirror
# 00:18 
tantek )
# 00:18 
tantek we could all share the same github user/community
# 00:18 
tantek just for this
# 00:18 
aaronpk interesting
# 00:18 
aaronpk could be indieweb/archive
# 00:18 
aaronpk github.com/indieweb/archive
# 00:18 
tantek and since we keep snapshots by datetime index… we won't collide
# 00:18 
tantek I'm thinking even shorter
# 00:18 
tantek just to keep the URLs shorter
# 00:18 
tantek and a new account
# 00:18 
aaronpk ah
# 00:18 
tantek so we don't need to say /archive
# 00:19 
tantek archive.org says "web" twice for no reason
# 00:19 
tantek web.
# 00:19 
aaronpk oh right! root-level gh pages
# 00:19 
tantek and /web/
# 00:19 
tantek dumb dumb dumb
# 00:19 
tantek exactly
# 00:19 
aaronpk yea, probably some technical reason
# 00:19 
tantek lame
# 00:19 
aaronpk this is sad https://github.com/web
# 00:20 
aaronpk but https://github.com/webarchive is available
# 00:20 
tantek plus I'd suggest, ahem, NewBase60 encoding of epoch days + t + NewBase60 seconds into the day
# 00:20 
tantek to make it even shorter
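The compact timestamp tantek proposes here (epoch days, a literal "t", then seconds into the day, each in NewBase60) can be sketched roughly as follows. This is a minimal illustration, not the CASSIS implementation: num_to_sxg mirrors the function name mentioned later in this log, while short_timestamp is a hypothetical helper invented for the example, and treating the timestamp as UTC is an assumption.

```python
from datetime import datetime, timezone

# NewBase60 alphabet: digits, uppercase without I and O, underscore,
# lowercase without l (60 characters in total).
SXG_DIGITS = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ_abcdefghijkmnopqrstuvwxyz"


def num_to_sxg(n: int) -> str:
    """Encode a non-negative integer as a NewBase60 string."""
    if n < 0:
        raise ValueError("NewBase60 is only defined here for n >= 0")
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, digit = divmod(n, 60)
        out.append(SXG_DIGITS[digit])
    return "".join(reversed(out))


def short_timestamp(dt: datetime) -> str:
    """Hypothetical helper: epoch days + 't' + seconds into the day (UTC)."""
    epoch_seconds = int(dt.astimezone(timezone.utc).timestamp())
    days, seconds = divmod(epoch_seconds, 86400)
    return f"{num_to_sxg(days)}t{num_to_sxg(seconds)}"


if __name__ == "__main__":
    when = datetime(2005, 3, 10, 3, 19, 44, tzinfo=timezone.utc)
    print(short_timestamp(when))            # compact form
    print(when.strftime("%Y%m%d%H%M%S"))    # archive.org-style 14-digit form
```

Under these assumptions the compact form is about half the length of the 14-digit YYYYMMDDHHMMSS form, which is the trade-off against readability that aaronpk raises next.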
# 00:20 
aaronpk heh
# 00:20 
aaronpk that tends to be harder to handle for people though, cause it's not built in to every language's date library
# 00:21 
tantek it's ported to over a half dozen languages
# 00:21 
tantek fully open source
# 00:21 
tantek worth the shorter URLs
# 00:21 
aaronpk hm there is a ruby gem. we should get a composer package up for PHP
# 00:21 
aaronpk you kind of lose the readability of the URLs though
# 00:22 
tantek aaronpk - I think there is a composer package set up for CASSIS.js which has the PHP implementation :)
# 00:22 
tantek barnabywalters helped me with that
# 00:22 
aaronpk ah nice
# 00:22 
tantek uh, those numbers are not really readable
# 00:22 
tantek I used to think they were readable like 10 years ago
# 00:22 
aaronpk well I can parse it out pretty quick
# 00:22 
tantek but I've been since corrected ;)
# 00:22 
tantek that's what I used to think too
# 00:22 
aaronpk huh
# 00:22 
tantek my opinion has been changed
# 00:23 
aaronpk so... github.com/webarchive? would be webarchive.github.io/
# 00:23 
tantek aside: appears if you force-quit iTunes it will no longer see iPods until you restart?
# 00:24 
tantek people are squatting the single letter github accounts
# 00:24 
tantek that's lame
# 00:25 
aaronpk w-a.github.io is available
# 00:26 
tantek ooh
# 00:26 
aaronpk kinda weird, but short
# 00:32 
tantek plus we place no copyright claims and state it is purely for library/archive purposes only, and anyone is welcome to clone (not modify) and keep the same terms
# 00:32 
tantek this could be interesting
# 00:32 
aaronpk it is done https://github.com/i-a
# 00:32 
tantek once we get it going, we could even likely talk archive.org into maintaining a mirror of it
# 00:32 
tantek boom - nicely done
# 00:32 
aaronpk and it's git, so it's easy for anyone to mirror!
scor joined the channel
# 00:33 
tantek this is the kind of stuff that you end up figuring out / building when you scratch your own itches and follow them to their logical conclusions
# 00:33 
tantek I don't think anyone in FSW circles came up with the idea of a distributed collaborative mirroring of posts that they replied to!
# 00:33 
tantek despite it being a really simple idea
# 00:34 
tantek performance shouldn't be an issue either - as rarely should we need to reparse the HTML
# 00:34 
aaronpk everyone was so focused on building a full stack that did everything
# 00:34 
tantek yeah
# 00:34 
tantek instead of distributed systems that cooperatively grew
# 00:35 
tantek I think I may still store parsed JSON bits for speed of retrieval / display on my server, but then keep a URL to the copy of the HTML on i-a
# 00:35 
aaronpk yea I think that makes sense
# 00:38 
tantek.com created /IndieArchive (+233) "stub" (view diff)
# 00:39 
aaronpk wondering how long it takes github to build the gh-pages site or if I did this right
# 00:40 
aaronpk "Changes may take up to ten minutes to be visible." oh
# 00:43 
aaronpk what's a good tagline...
# 00:44 
Loqi [mention] http://aaronparecki.com/notes/2013/05/24/1/ linked to http://indiewebcamp.com/business-models (pingback)
# 00:45 
tantek.com edited /IndieArchive (+594) "URL design" (view diff)
# 00:46 
tantek.com edited /IndieArchive (+90) "/* URL design */ which functions to use" (view diff)
# 00:47 
tantek archiving the web that the indieweb deems worth linking to
# 00:47 
aaronpk hah nice
# 00:47 
tantek wouldn't surprise me if someone asks for a feed of all the URLs
# 00:47 
tantek as they happen
# 00:47 
tantek get linked to
# 00:48 
aaronpk wow
# 00:48 
tantek we build up the write-access contributors through social web of trust from the initial seed of indieweb camp attendees
# 00:48 
aaronpk totally. I've got a github team set up for it
# 00:48 
tantek that have their own indieweb sites set up - whatever they used to sign-into indiewebcamp.com
# 00:48 
aaronparecki.com edited /IndieArchive (+24) (view diff)
# 00:50 
tantek.com created /NewBase60 (+186) "stub" (view diff)
# 00:53 
aaronpk aaand here we go http://i-a.github.io/
# 00:54 
tantek.com edited /IndieArchive (+1253) "capture thoughts before they disappear into IRC archives - re-organize later" (view diff)
# 00:55 
aaronpk this IRC -> wiki -> web flow is really nice
# 00:56 
tantek.com edited /IndieArchive (+473) "more thoughts, live raw files" (view diff)
# 00:57 
aaronparecki.com edited /IndieArchive (+129) "/* URL design */ trailing slash required on root-level domains." (view diff)
# 00:57 
tantek.com edited /IndieArchive (-1) "/* URL design */ -/" (view diff)
# 00:58 
tantek.com edited /IndieArchive (+12) "we know" (view diff)
# 01:00 
tantek.com edited /IndieArchive (+183) "explicit https example to demonstrate non-http schemes" (view diff)
# 01:01 
tantek ok well that was an unexpected brainstorm for end of the week / Friday afternoon when we should be mentally burned out and stuff
# 01:01 
aaronpk hah totally
# 01:02 
tantek.com edited /IndieArchive (+169) "/* Thoughts */ inception was a result of one person's itch, that another thought of a scratch for" (view diff)
# 01:02 
aaronparecki.com edited /IndieArchive (+0) "/* URL design */ update archive.org examples to a timestamp when my site had more stuff on it since they are clickable links" (view diff)
# 01:02 
tantek aaronpk - wouldn't have happened without you venting about your itch, and the idea popping into my head as a result. I don't think either of us would have thought of this purely independently, at least not for a while
# 01:03 
aaronpk agreed!
# 01:03 
aaronpk not bad, less than an hour from itch to scratch to full wiki page with docs and a simple live site
# 01:06 
tantek.com edited /IndieArchive (+3) "update NewBase60 equivalents" (view diff)
# 01:06 
aaronpk oh thanks, was about to do that :)
# 01:07 
tantek I use my own site and my JSeval favelet
# 01:07 
tantek available from http://favelets.com/
# 01:07 
tantek :)
# 01:07 
aaronpk I put a newbase60 converter on http://pin13.net/ ... I use it a lot
# 01:07 
tantek so I can just paste in functions like ymd_to_sd() and execute them
# 01:08 
aaronpk nice, where does it get the function from?
# 01:08 
tantek num_to_sxg() is just one function
# 01:08 
tantek it's in CASSIS.js
# 01:08 
tantek which my site loads
# 01:08 
aaronpk oh! gotcha
# 01:08 
tantek so it's available to the JSeval favelet ;)
# 01:08 
aaronpk i will keep that in mind
# 01:08 
tantek e.g. 1. go tantek.com, 2. click JSEval, 3. type in CASSIS function to run it and get a value
# 01:08 
aaronpk i just open up the chrome console (command+option+j)
# 01:09 
aaronpk then I get autocompleting function names too
# 01:09 
tantek yeah - FF has a console for that too
# 01:09 
tantek I just wanted a *really* simple little alert box
# 01:09 
tantek no cheesed up UI
# 01:09 
tantek getting in my way distracting me
# 01:09 
aaronpk lol
# 01:09 
tantek plus - hey - works on MOBILE
# 01:09 
tantek :P
# 01:11 
tantek.com edited /IndieArchive (+8) "XFN is the only really useful social web of trust" (view diff)
# 01:13 
tantek.com created /XFN (+336) "stub" (view diff)
# 01:14 
tantek.com created /xfn (+17) "r" (view diff)
# 01:19 
tantek.com edited /IndieArchive (+16) "full stack? yeah that's a monoculture" (view diff)
# 01:20 
tantek Loqi ate my tweet since I mentioned you ;)
# 01:21 
aaronpk ah yea saw it in my other channel again!
# 01:21 
aaronpk https://twitter.com/t/status/338102381088747523
# 01:21 
@t itch, scratch, ~1hr IRC brainstorming: @aaronpk &amp; I figured out a distributed #indieweb archive http://t.co/jQYECrq12W (ttk.me t4Q91)
# 01:22 
tantek still got the & overescaping bug too
spinnerin joined the channel
# 01:26 
tantek aaronpk - note that another side effect of the modified URL structure is that you can look at what was archived for a particular day
# 01:26 
tantek since each day gets its own directory
# 01:28 
aaronpk yes, and the nice thing about that is the root folder will only ever have as many folders as there are days archived
# 01:28 
aaronpk rather than the archive.org where each full second-precision date is its own folder
# 01:30 
tantek yeah - might even be manageable in a "normal" local laptop filesystem for browsing
# 01:30 
tantek this could get really interesting
# 01:30 
aaronpk using the same format p3k uses to store right now, the subfolders will also be neatly organized since the only thing in the XXX/YYY folder would be the domain name
# 01:31 
tantek yes
# 01:32 
tantek after an initial dump, e.g. with your and Barnaby's HTML archives of your reply-to originals, we can even lock down old directories
# 01:32 
tantek to prevent unintentional modifications to the past
# 01:32 
tantek or maybe we only allow new submissions to the past day and into the future.
# 01:33 
aaronpk example for reference: http://www.flickr.com/photos/aaronpk/8809487777/in/photostream/
# 01:33 
tantek that's a reasonable time stamp trust granularity - we can also check commit log timestamps vs. claimed directory timestamps to determine archiving lag
# 01:33 
tantek nice example!
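The lag check tantek mentions (commit log timestamp versus claimed directory timestamp) could look something like the sketch below. It assumes a layout of one NewBase60 epoch-day directory containing NewBase60 seconds-into-the-day directories, both in UTC; sxg_to_num follows CASSIS naming, while claimed_time and archiving_lag are hypothetical names made up for this example. How the commit time is read out of git is left out entirely.

```python
from datetime import datetime, timedelta, timezone

# NewBase60 alphabet: digits, uppercase without I and O, underscore,
# lowercase without l.
SXG_DIGITS = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ_abcdefghijkmnopqrstuvwxyz"


def sxg_to_num(s: str) -> int:
    """Decode a NewBase60 string; strict, so unknown characters raise ValueError."""
    n = 0
    for ch in s:
        n = n * 60 + SXG_DIGITS.index(ch)
    return n


def claimed_time(day_dir: str, seconds_dir: str) -> datetime:
    """Hypothetical helper: turn the two directory names back into a UTC datetime."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(days=sxg_to_num(day_dir),
                             seconds=sxg_to_num(seconds_dir))


def archiving_lag(day_dir: str, seconds_dir: str, commit_time: datetime) -> timedelta:
    """Hypothetical helper: how long after the claimed snapshot time the commit landed."""
    return commit_time - claimed_time(day_dir, seconds_dir)


if __name__ == "__main__":
    # Illustrative values only: a made-up commit one hour after the claimed time.
    commit = datetime(2005, 3, 10, 4, 19, 44, tzinfo=timezone.utc)
    print(archiving_lag("3_C", "3Kj", commit))
```

A negative lag would mean a snapshot claims a time later than its own commit, which is one thing such a check could flag.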
# 01:34 
tantek oh my goodness I need to get on my bike and get home before I keep coming up with more on this - I'll think on the bike ride home ;)
# 01:34 
tantek always a pleasure collaborating with you aaronpk
# 01:34 
tantek :)
tantek and mxuribe joined the channel
# 02:54 
aaronpk relating to access control for IndieArchive... the fact that it's on Github will be really convenient
# 02:55 
aaronpk anybody is able to fork the project right now, and can always be adding files to their own branch
# 02:55 
aaronpk after they have a number of successful commits, and we see that their site behaves according to the project guidelines we set up, we can begin accepting pull requests
# 02:56 
tantek makes sense
# 02:56 
tantek we still have to solve the HTTP headers + HTML file problem
# 02:56 
tantek we really should store both
# 02:56 
aaronpk we can store headers in a .headers file or something
# 02:56 
tantek especially if we expect to donate this to archive.org longer term
# 02:57 
aaronpk so there'd be two files on disk... 3_C/3Kj/aaronparecki.com/index.html and 3_C/3Kj/aaronparecki.com/index.html.headers
# 02:58 
tantek that seems reasonable - I don't have any better ideas
# 02:58 
aaronpk or... 3_C/3Kj/aaronparecki.com/.index.html
# 02:58 
aaronpk marked as a hidden file, so it doesn't show up in the file browser UIs
# 02:59 
tantek or .headers.index.html ?
# 02:59 
aaronpk that's even less likely to conflict!
# 02:59 
tantek also encodes the "meaning" of it in the name
# 02:59 
aaronpk or, outside the main folder structure entirely: headers/3_C/3Kj/aaronparecki.com/index.html
# 03:00 
tantek parallel structure, interesting
# 03:00 
aaronpk at least that way we know filenames will never conflict
# 03:01 
tantek that's why they came up with the whole .well-known nonsense
# 03:01 
aaronpk heh
mxuribe joined the channel
# 03:15 
tantek ideally I'd prefer to keep all such "meta" stuff as close to the real thing as possible
# 03:16 
aaronpk yea that is my instinct too
# 03:16 
aaronpk trying to minimize the chance of conflicting filenames, .headers.[filename] seems best
# 03:17 
aaronparecki.com edited /business-models (+48) "add Network Redux" (view diff)
# 03:17 
tantek this is interesting: http://webapps.stackexchange.com/questions/40911/web-archive-links-without-header
# 03:18 
tantek (i'm searching for how to archive a web page with headers :) )
# 03:18 
aaronpk lol!
# 03:18 
tantek so how does one view the raw http headers of something on archive.org?
# 03:18 
aaronpk i've never tried
# 03:19 
aaronpk gotta run, bye for now!
# 03:20 
tantek ttyl!
# 03:21 
tantek more interesting / seemingly related: http://www.stevesouders.com/blog/2009/10/19/http-archive-specification-firebug-and-httpwatch/
# 03:22 
tantek http://blog.httpwatch.com/2009/10/19/httpwatch-version-62-supports-data-exchange-with-firebug/
# 03:23 
tantek note to standards people: links to Google Groups messages are shit and disappear, e.g. on that previous link, it links to: http://groups.google.com/group/firebug-working-group/web/http-tracing---export-format
# 03:23 
tantek which errors out
# 03:23 
tantek back to googling
# 03:23 
tantek google groups sucks for finding things
# 03:24 
tantek google search works though
# 03:24 
tantek so google is good at search, bad at being a content silo
# 03:24 
tantek http://www.softwareishard.com/blog/har-12-spec/
# 03:24 
tantek and apparently: https://dvcs.w3.org/hg/webperf/raw-file/tip/specs/HAR/Overview.html
# 03:25 
tantek well there's that if we want to go crazy with archiving the entire http request and response
# 03:26 
tantek which I'd rather not - seems like overkill, and wrong for a distributed shared project
# 03:26 
@alastairtouw On the one hand, there’s all this great writing on @medium. On the other hand, it’s all on @medium. #indieweb
# 03:27 
tantek what if we just keep it dumb/simple, make the requests always just be a simple "GET" - no other weird params, no cookies
# 03:27 
tantek and the response just has the entire raw HTTP headers - just as with the HTML, we're saving the raw stuff in case we want to reparse it later
# 03:28 
tantek the HAR format seems to imply some amount of HTTP header processing into JSON - which is antithetical to the methodology of archiving
# 03:29 
tantek ok, based on a quick read of that (over-designed) spec, I think saving the raw http headers as a text file is fine for us
# 03:29 
tantek and the .headers.[filename] convention is fine too.
# 03:30 
tantek I doubt most (if any) of the folders will contain more than just one file and its headers
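The on-disk layout settled on above (day directory, seconds directory, domain, file, plus a sibling .headers.[filename] holding the raw response headers) can be sketched as a path mapping. This is only an illustration: archive_paths is a hypothetical function, and mapping a URL that ends in "/" to index.html and ignoring query strings are assumptions that the log does not settle.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def archive_paths(day_dir: str, seconds_dir: str, url: str) -> tuple[str, str]:
    """Hypothetical helper: return (body_path, headers_path) for an archived URL."""
    parts = urlsplit(url)
    path = parts.path
    # Assumption: directory-style URLs are stored as index.html.
    if path == "" or path.endswith("/"):
        path += "index.html"
    body = PurePosixPath(day_dir, seconds_dir, parts.netloc, path.lstrip("/"))
    # Headers live next to the body, using the .headers.[filename] convention.
    headers = body.with_name(".headers." + body.name)
    return str(body), str(headers)


if __name__ == "__main__":
    body, headers = archive_paths("3_C", "3Kj", "http://aaronparecki.com/")
    print(body)     # 3_C/3Kj/aaronparecki.com/index.html
    print(headers)  # 3_C/3Kj/aaronparecki.com/.headers.index.html
```

Keeping the headers as a dot-prefixed sibling means the body file can still be opened directly in a browser, and the headers file stays out of most file-browser listings.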
mxuribe1, duckbillp and tantek joined the channel
# 06:41 
aaronpk I thought about saving the whole HTTP response too, but it seemed like it would just make it that much harder to deal with later. for example you wouldn't be able to open it in a browser
# 06:44 
tantek .headers.[filename].txt ?
# 06:45 
tantek what's so hard about that?
andreypopp joined the channel
# 08:22 
@xtof_fr @anthere sorry for my cavalier tweet yesterday. @robert_vinet is in the picture. kisses and have a good weekend http://christopheducamp.com/d/2013-05-25 #indieweb #mediawiki
andreypopp and eschnou joined the channel
# 11:43 
@domenicoperri @indiewebcamp 2013 - IndieWebCamp http://indiewebcamp.com/2013 via @t
# 11:44 
@domenicoperri I've made plans for @indiewebcamp  2013 http://awe.sm/jFAJA via @plancast
# 11:48 
@domenicoperri I've made plans for IndieWebCamp 2013 http://awe.sm/s1DL2
# 12:12 
@domenicoperri #USERMEDIA domenicoperri starred indieweb/webmention https://github.com/indieweb/webmention?utm_source=dlvr.it&utm_medium=twitter #invispide
laurian, andreypopp, xtof, peck_lx, barnabywalters and duckbillp joined the channel
# 16:02 
tantek aaronpk - IndieAuth.com copy/links need updating - they refer to 2012 ;)
duckbillp joined the channel
# 16:35 
aaronpk oops!
# 16:37 
aaronpk will genericize the text
# 16:39 
aaronpk tantek: oh I meant saving the entire HTTP response (headers\n\nbody) in the file
# 16:54 
Loqi 1 files modified, 1 new files in aaronpk/IndieAuth/master by aaronpk https://github.com/aaronpk/IndieAuth/compare/be82e944fe04...bce57947f78d
# 16:54 
Loqi aaronpk: Update indiewebcamp links to generic instead of 2012. Add app.net logo to supported provider list
barnabywalters, eschnou, scor, Nadreck and andreypopp joined the channel
# 20:21 
@xtof_fr @egadenne #pattern #chronodream ? Proposal posted at http://mydatalabs.com/Projets#pattern_chronor?ve
eschnou, andreypopp, tantek, jalbertbowdenii and danbri joined the channel