#indiewebcamp 2013-05-25

2013-05-25 UTC
tantek joined the channel
#
aaronpk
holy cow, I've already downloaded 63 megs of HTML from external pages in my reply context code
#
tantek
aaronpk - perhaps we could add cheap archiving to business-models :)
#
tantek
or maybe we should just shove archive HTML pages into static github pages
#
tantek
wonders if that can be done programmatically with a reasonable URL structure
#
aaronpk
probably just use archive.org's URL structure
#
tantek
makes sense
#
tantek
re-use!
#
tantek
so what's a short github user name we can create just for this sake
#
tantek
(and encourage people to mirror
#
Loqi
!calc (and encourage people to mirror
#
tantek
we could all share the same github user/community
#
tantek
just for this
#
aaronpk
interesting
#
aaronpk
could be indieweb/archive
#
aaronpk
github.com/indieweb/archive
#
tantek
and since we keep snapshots by datetime index… we won't collide
#
tantek
I'm thinking even shorter
#
tantek
just to keep the URLs shorter
#
tantek
and a new account
#
tantek
so we don't need to say /archive
#
tantek
archive.org says "web" twice for no reason
#
aaronpk
oh right! root-level gh pages
#
tantek
dumb dumb dumb
#
tantek
exactly
#
aaronpk
yea, probably some technical reason
#
tantek
plus I'd suggest, ahem, NewBase60 encoding of epoch days + t + NewBase60 seconds into the day
#
tantek
to make it even shorter
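
A minimal sketch of the timestamp scheme described above: NewBase60 days since the Unix epoch, a literal `t`, then NewBase60 seconds into the UTC day. The alphabet and digit loop follow the published NewBase60 definition (the one behind CASSIS's num_to_sxg()); the name `archive_stamp` and its `sep` parameter are hypothetical and only here for illustration.

```python
from datetime import datetime, timezone

# NewBase60 alphabet: digits, uppercase without I and O, underscore,
# lowercase without l.
ALPHABET = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ_abcdefghijkmnopqrstuvwxyz"

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def num_to_sxg(n: int) -> str:
    """Encode a non-negative integer as a NewBase60 string."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, d = divmod(n, 60)
        digits.append(ALPHABET[d])
    return "".join(reversed(digits))


def archive_stamp(dt: datetime, sep: str = "t") -> str:
    """Hypothetical helper: epoch days + sep + seconds into the UTC day.

    Expects a timezone-aware datetime.
    """
    delta = dt.astimezone(timezone.utc) - EPOCH
    return num_to_sxg(delta.days) + sep + num_to_sxg(delta.seconds)


if __name__ == "__main__":
    print(archive_stamp(datetime(2013, 5, 25, tzinfo=timezone.utc)))
```

Midnight UTC on this log's date is epoch day 15850, so the call above prints `4QAt0`.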
#
aaronpk
that tends to be harder to handle for people though, cause it's not built in to every language's date library
#
tantek
it's ported to over a half dozen languages
#
tantek
fully open source
#
tantek
worth the shorter URLs
#
aaronpk
hm there is a ruby gem. we should get a composer package up for PHP
#
aaronpk
you kind of lose the readability of the URLs though
#
tantek
aaronpk - I think there is a composer package setup for CASSIS.js which has the PHP implementation :)
#
tantek
barnabywalters helped me with that
#
aaronpk
ah nice
#
tantek
uh, those numbers are not really readable
#
tantek
I used to think they were readable like 10 years ago
#
aaronpk
well I can parse it out pretty quick
#
tantek
but I've since been corrected ;)
#
tantek
that's what I used to think too
#
tantek
my opinion has been changed
#
aaronpk
so... github.com/webarchive? would be webarchive.github.io/
#
tantek
aside: appears if you force-quit iTunes it will no longer see iPods until you restart?
#
tantek
people are squatting the single letter github accounts
#
tantek
that's lame
#
aaronpk
w-a.github.io is available
#
aaronpk
kinda weird, but short
#
tantek
plus we place no copyright claims and state it is purely for library/archive purposes only, and anyone is welcome to clone (not modify) and keep the same terms
#
tantek
this could be interesting
#
tantek
once we get it going, we could even likely talk archive.org into maintaining a mirror of it
#
tantek
boom - nicely done
#
aaronpk
and it's git, so it's easy for anyone to mirror!
scor joined the channel
#
tantek
this is the kind of stuff that you end up figuring out / building when you scratch your own itches and follow them to their logical conclusions
#
tantek
I don't think anyone in FSW circles came up with the idea of a distributed collaborative mirroring of posts that they replied to!
#
tantek
despite it being a really simple idea
#
tantek
performance shouldn't be an issue either - as rarely should we need to reparse the HTML
#
aaronpk
everyone was so focused on building a full stack that did everything
#
tantek
instead of distributed systems that cooperatively grew
#
tantek
I think I may still store parsed JSON bits for speed of retrieval / display on my server, but then keep a URL to the copy of the HTML on i-a
#
aaronpk
yea I think that makes sense
#
tantek.com
created /IndieArchive (+233) "stub"
#
aaronpk
wondering how long it takes github to build the gh-pages site or if I did this right
#
aaronpk
"Changes may take up to ten minutes to be visible." oh
#
aaronpk
what's a good tagline...
#
tantek.com
edited /IndieArchive (+594) "URL design"
#
tantek.com
edited /IndieArchive (+90) "/* URL design */ which functions to use"
#
tantek
archiving the web that the indieweb deems worth linking to
#
aaronpk
hah nice
#
tantek
wouldn't surprise me if someone asks for a feed of all the URLs
#
tantek
as they happen
#
tantek
get linked to
#
tantek
we build up the write-access contributors through social web of trust from the initial seed of indieweb camp attendees
#
aaronpk
totally. I've got a github team set up for it
#
tantek
that have their own indieweb sites setup - whatever they used to sign-into indiewebcamp.com
#
tantek.com
created /NewBase60 (+186) "stub"
#
tantek.com
edited /IndieArchive (+1253) "capture thoughts before they disappear into IRC archives - re-organize later"
#
aaronpk
this IRC -> wiki -> web flow is really nice
#
tantek.com
edited /IndieArchive (+473) "more thoughts, live raw files"
#
aaronparecki.com
edited /IndieArchive (+129) "/* URL design */ trailing slash required on root-level domains."
#
tantek.com
edited /IndieArchive (-1) "/* URL design */ -/"
#
tantek.com
edited /IndieArchive (+12) "we know"
#
tantek.com
edited /IndieArchive (+183) "explicit https example to demonstrate non-http schemes"
#
tantek
ok well that was an unexpected brainstorm for end of the week / Friday afternoon when we should be mentally burned out and stuff
#
aaronpk
hah totally
#
tantek.com
edited /IndieArchive (+169) "/* Thoughts */ inception was a result of one person's itch, that another thought of a scratch for"
#
aaronparecki.com
edited /IndieArchive (+0) "/* URL design */ update archive.org examples to a timestamp when my site had more stuff on it since they are clickable links"
#
tantek
aaronpk - wouldn't have happened without you venting about your itch, and the idea popping into my head as a result. I don't think either of us would have thought of this purely independently, at least not for a while
#
aaronpk
agreed!
#
aaronpk
not bad, less than an hour from itch to scratch to full wiki page with docs and a simple live site
#
tantek.com
edited /IndieArchive (+3) "update NewBase60 equivalents"
#
aaronpk
oh thanks, was about to do that :)
#
tantek
I use my own site and my JSeval favelet
#
aaronpk
I put a newbase60 converter on http://pin13.net/ ... I use it a lot
#
tantek
so I can just paste in functions like ymd_to_sd() and execute them
#
aaronpk
nice, where does it get the function from?
#
tantek
num_to_sxg() is just one function
#
tantek
it's in CASSIS.js
#
tantek
which my site loads
#
aaronpk
oh! gotcha
#
tantek
so it's available to the JSeval favelet ;)
#
aaronpk
i will keep that in mind
#
tantek
e.g. 1. go to tantek.com, 2. click JSEval, 3. type in a CASSIS function to run it and get a value
#
aaronpk
i just open up the chrome console (command+option+j)
#
aaronpk
then I get autocompleting function names too
#
tantek
yeah - FF has a console for that too
#
tantek
I just wanted a *really* simple little alert box
#
tantek
no cheesed up UI
#
tantek
getting in my way distracting me
#
tantek
plus - hey - works on MOBILE
#
tantek.com
edited /IndieArchive (+8) "XFN is the only really useful social web of trust"
#
tantek.com
created /XFN (+336) "stub"
#
tantek.com
created /xfn (+17) "r"
#
tantek.com
edited /IndieArchive (+16) "full stack? yeah that's a monoculture"
#
tantek
Loqi ate my tweet since I mentioned you ;)
#
aaronpk
ah yea saw it in my other channel again!
#
@t
itch, scratch, ~1hr IRC brainstorming: @aaronpk &amp
#
Loqi
I figured out a distributed #indieweb archive http://t.co/jQYECrq12W (ttk.me t4Q91)
#
tantek
still got the & overescaping bug too
spinnerin joined the channel
#
tantek
aaronpk - note that another side effect of the modified URL structure is that you can look at what was archived for a particular day
#
tantek
since each day gets its own directory
#
aaronpk
yes, and the nice thing about that is the root folder will only ever have as many folders as there are days archived
#
aaronpk
rather than archive.org, where each full second-precision date is its own folder
#
tantek
yeah - might even be manageable in a "normal" local laptop filesystem for browsing
#
tantek
this could get really interesting
#
aaronpk
using the same format p3k uses to store right now, the subfolders will also be neatly organized since the only thing in the XXX/YYY folder would be the domain name
#
tantek
after an initial dump, e.g. with your and Barnaby's HTML archives of your reply-to originals, we can even lock down old directories
#
tantek
to prevent unintentional modifications to the past
#
tantek
or maybe we only allow new submissions to the past day and into the future.
#
tantek
that's a reasonable time stamp trust granularity - we can also check commit log timestamps vs. claimed directory timestamps to determine archiving lag
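
A rough sketch of that lag check, assuming the `day/seconds` directory layout used elsewhere in this discussion and a commit time already parsed from the git log. The function names and the one-day default are illustrative assumptions; the decoder is strict, and it does not map look-alike characters the way a fuller NewBase60 decoder might.

```python
from datetime import datetime, timedelta, timezone

ALPHABET = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ_abcdefghijkmnopqrstuvwxyz"
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def sxg_to_num(s: str) -> int:
    """Decode a NewBase60 string (strict: ValueError on unknown characters)."""
    n = 0
    for ch in s:
        n = n * 60 + ALPHABET.index(ch)
    return n


def claimed_time(day_dir: str, sec_dir: str) -> datetime:
    """UTC time claimed by a day directory plus a seconds directory."""
    return EPOCH + timedelta(days=sxg_to_num(day_dir),
                             seconds=sxg_to_num(sec_dir))


def archiving_lag(day_dir: str, sec_dir: str, commit_time: datetime) -> timedelta:
    """How long after the claimed snapshot time the commit was made."""
    return commit_time - claimed_time(day_dir, sec_dir)


def is_recent(day_dir: str, sec_dir: str, commit_time: datetime,
              max_lag: timedelta = timedelta(days=1)) -> bool:
    """Hypothetical policy: accept only snapshots claimed within max_lag."""
    return archiving_lag(day_dir, sec_dir, commit_time) <= max_lag
```

A pre-receive hook or a reviewer script could call `is_recent` for each new directory in a pull request and flag anything older than the agreed window.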
#
tantek
nice example!
#
tantek
oh my goodness I need to get on my bike and get home before I keep coming up with more on this - I'll think on the bike ride home ;)
#
tantek
always a pleasure collaborating with you aaronpk
tantek and mxuribe joined the channel
#
aaronpk
relating to access control for IndieArchive... the fact that it's on Github will be really convenient
#
aaronpk
anybody is able to fork the project right now, and can always be adding files to their own branch
#
aaronpk
after they have a number of successful commits, and we see that their site behaves according to the project guidelines we set up, we can begin accepting pull requests
#
tantek
makes sense
#
tantek
we still have to solve the HTTP headers + HTML file problem
#
tantek
we really should store both
#
aaronpk
we can store headers in a .headers file or something
#
tantek
especially if we expect to donate this to archive.org longer term
#
aaronpk
so there'd be two files on disk... 3_C/3Kj/aaronparecki.com/index.html and 3_C/3Kj/aaronparecki.com/index.html.headers
#
tantek
that seems reasonable - I don't have any better ideas
#
aaronpk
or... 3_C/3Kj/aaronparecki.com/.index.html
#
aaronpk
marked as a hidden file, so it doesn't show up in the file browser UIs
#
tantek
or .headers.index.html ?
#
aaronpk
that's even less likely to conflict!
#
tantek
also encodes the "meaning" of it in the name
#
aaronpk
or, outside the main folder structure entirely: headers/3_C/3Kj/aaronparecki.com/index.html
#
tantek
parallel structure, interesting
#
aaronpk
at least that way we know filenames will never conflict
#
tantek
that's why they came up with the whole .well-known nonsense
mxuribe joined the channel
#
tantek
ideally I'd prefer to keep all such "meta" stuff as close to the real thing as possible
#
aaronpk
yea that is my instinct too
#
aaronpk
trying to minimize the chance of conflicting filenames, .headers.[filename] seems best
#
aaronparecki.com
edited /business-models (+48) "add Network Redux"
#
tantek
(i'm searching for how to archive a web page with headers :) )
#
tantek
so how does one view the raw http headers of something on archive.org?
#
aaronpk
i've never tried
#
aaronpk
gotta run, bye for now!
#
tantek
ttyl!
#
tantek
note to standards people: links to Google Groups messages are shit and disappear, e.g. on that previous link, it links to: http://groups.google.com/group/firebug-working-group/web/http-tracing---export-format
#
tantek
which errors out
#
tantek
back to googling
#
tantek
google groups sucks for finding things
#
tantek
google search works though
#
tantek
so google is good at search, bad at being a content silo
#
tantek
well there's that if we want to go crazy with archiving the entire http request and response
#
tantek
which I'd rather not - seems like overkill, and wrong for a distributed shared project
#
@alastairtouw
On the one hand, there’s all this great writing on @medium. On the other hand, it’s all on @medium. #indieweb
#
tantek
what if we just keep it dumb/simple, make the requests always just be a simple "GET" - no other weird params, no cookies
#
tantek
and the response just has the entire raw HTTP headers - just as with the HTML, we're saving the raw stuff in case we want to reparse it later
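
One way to picture the "dumb/simple" approach, as a sketch only: build a bare GET with no cookies or extra parameters, then split the raw response bytes at the first blank line so the header block and the body can be saved untouched. Both function names are hypothetical, and the sketch does not decode chunked or compressed bodies.

```python
from typing import Tuple


def plain_get_request(host: str, path: str = "/") -> bytes:
    """Build a minimal GET request: no cookies, no extra headers."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def split_raw_response(raw: bytes) -> Tuple[bytes, bytes]:
    """Split raw response bytes into (header block, body), both unparsed.

    The header block keeps the status line and every header line exactly
    as received, so it can be written to a text file and reparsed later.
    """
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no blank line separating headers from body")
    return head + b"\r\n", body
```

Keeping the two parts in separate files means the HTML still opens directly in a browser, while the headers stay available as plain text.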
#
tantek
the HAR format seems to imply some amount of HTTP header processing into JSON - which is antithetical to the methodology of archiving
#
tantek
ok, based on a quick read of that (over-designed) spec, I think saving the raw http headers as a text file is fine for us
#
tantek
and the .headers.[filename] convention is fine too.
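
A small sketch of how the `.headers.[filename]` convention could map a URL to the two files on disk. The function name is hypothetical, and two things are assumptions not settled in the chat: directory-style URLs are saved as `index.html`, and the scheme and query string are ignored.

```python
from pathlib import PurePosixPath
from typing import Tuple
from urllib.parse import urlsplit


def archive_paths(day_dir: str, sec_dir: str, url: str) -> Tuple[str, str]:
    """Return (page path, headers path) for one archived URL."""
    parts = urlsplit(url)
    path = parts.path
    if path == "" or path.endswith("/"):
        # Assumption: a directory-style URL is stored as index.html.
        path += "index.html"
    page = PurePosixPath(day_dir) / sec_dir / parts.netloc / path.lstrip("/")
    # Headers sit next to the page, prefixed so the name cannot collide
    # with a real file and reads as what it is.
    headers = page.with_name(".headers." + page.name)
    return str(page), str(headers)


if __name__ == "__main__":
    print(archive_paths("3_C", "3Kj", "http://aaronparecki.com/"))
```

For the example discussed above this yields `3_C/3Kj/aaronparecki.com/index.html` and `3_C/3Kj/aaronparecki.com/.headers.index.html`.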
#
tantek
I doubt most (if any) of the folders will contain more than just one file and its headers
mxuribe1, duckbillp and tantek joined the channel
#
aaronpk
I thought about saving the whole HTTP response too, but it seemed like it would just make it that much harder to deal with later. for example you wouldn't be able to open it in a browser
#
tantek
.headers.[filename].txt ?
#
tantek
what's so hard about that?
andreypopp joined the channel
#
@xtof_fr
@anthere sorry for my cavalier tweet yesterday. @robert_vinet has the image. kisses and have a good weekend http://christopheducamp.com/d/2013-05-25 #indieweb #mediawiki
andreypopp and eschnou joined the channel
#
@domenicoperri
I've made plans for IndieWebCamp 2013 http://awe.sm/s1DL2
#
@domenicoperri
#USERMEDIA domenicoperri starred indieweb/webmention https://github.com/indieweb/webmention?utm_source=dlvr.it&utm_medium=twitter #invispide
laurian, andreypopp, xtof, peck_lx, barnabywalters and duckbillp joined the channel
#
tantek
aaronpk - IndieAuth.com copy/links need updating - they refer to 2012 ;)
duckbillp joined the channel
#
aaronpk
will genericize the text
#
aaronpk
tantek: oh I meant saving the entire HTTP response (headers\n\nbody) in the file
#
Loqi
1 files modified, 1 new files in aaronpk/IndieAuth/master by aaronpk https://github.com/aaronpk/IndieAuth/compare/be82e944fe04...bce57947f78d
#
Loqi
aaronpk: Update indiewebcamp links to generic instead of 2012. Add app.net logo to supported provider list
barnabywalters, eschnou, scor, Nadreck and andreypopp joined the channel
#
@xtof_fr
@egadenne #pattern #chronodream ? Proposal posted at http://mydatalabs.com/Projets#pattern_chronor?ve
eschnou, andreypopp, tantek, jalbertbowdenii and danbri joined the channel