2015-04-30 UTC
# 00:02 tantek hmm - I'm having trouble with a note article distinction
# 00:04 tantek what started as a medium note (no pun intended), became a long note, then with lists (yes plural)
# 00:05 tantek and it still has the tone of a quick "note", but now is starting to look/structure like a blog post / article.
# 00:05 tantek and yet I'd almost rather prefer the informality of a long (semi-structured) note than do anything so formalizing as put a name/title on it
# 00:05 tantek does anyone else ever have these issues when writing medium/long notes?
benwerd joined the channel
# 00:13 aaronpk these are all problems with trying to map URLs to a filesystem
# 00:13 tantek aaronpk - I'm still thinking on a solution for case insensitivity.
# 00:14 aaronpk i haven't heard of a solution for the file+folder problem yet either
# 00:14 tantek not sure how it escapes it, but it "works" in the UI
# 00:14 aaronpk that doesn't sound very portable, and doesn't work on the command line so probably also doesn't work from code, but haven't tried in code yet
# 00:15 GWG If it is that long, I always switch to a note.
# 00:15 aaronpk i wouldn't be able to put that on a linux filesystem to run on a web server though
# 00:15 tantek huh? why not try the UI action then doing an ls in the terminal to see how it works?
# 00:16 aaronpk which means now I can't store files with ":" in the URL
# 00:17 tantek hmm I can't put a ":" in the name of a folder in the Finder
# 00:18 tantek oh easy hack, use "//" for a ":" - only the trailing single-slash "/" is used for folder names :)
# 00:19 tantek URL escape both ":" and any capital letters. Done
# 00:19 aaronpk all of these sound like hacks, or like clever workarounds that are osx specific
wolftune joined the channel
# 00:21 aaronpk osx is the only way i can have both "test" and "test/" on the filesystem
# 00:21 tantek you can't use ":" in the filename on other systems?
# 00:22 aaronpk i'd better document the three problems again just to double check
# 00:23 tantek just leave it to me to come up with short syntax hacks, whether for class names, or file / folder names ;)
# 00:23 aaronpk of course URL escaping capital letters loses the readability of those
# 00:23 tantek yeah, acceptable compromise for the rarer frequency of capital letters
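A minimal sketch of the "URL escape both ':' and any capital letters" idea from the lines above, assuming percent-escapes with lowercase hex digits so the stored name contains no capitals at all. The function name `escape_segment` is hypothetical, and escaping "%" itself is an added assumption so that the mapping can be reversed.

```python
from urllib.parse import unquote


def escape_segment(segment: str) -> str:
    """Percent-escape ':', '%', and ASCII capital letters in one path segment.

    Lowercase hex digits are used so the result is safe on a
    case-insensitive filesystem. Non-ASCII capitals are left alone here.
    """
    out = []
    for ch in segment:
        if ch in ":%" or "A" <= ch <= "Z":
            out.append(f"%{ord(ch):02x}")
        else:
            out.append(ch)
    return "".join(out)


# Reversing the mapping is ordinary percent-decoding.
print(escape_segment("Germany"))
print(unquote(escape_segment("Germany")))
```

As noted in the conversation, the cost is readability: every capital letter turns into a three-character escape.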
# 00:26 tantek solving that kind of problem I find much easier than things like note/article distinction :/
cmhobbs joined the channel
# 00:31 GWG If it is that long, I always switch from a note.
# 00:33 tantek one consideration is the effect on POSSEing, specifically, how Bridgy Publish will treat it, e.g. when POSSEing to FB
# 00:34 tantek specifically, it seems Bridgy Publish ignores semantic markup in an article like lists, which are often essential to convey the meaning of a post
# 00:34 tantek whereas if I do the whole thing with plain text and whitespace, then the list bullets / numbers and formatting are all done there, which Bridgy Publish (mostly) propagates
# 00:35 aaronpk in order to be able to store 2015 2015/ and 2015/Germany
# 00:35 tantek I do find that I have to edit the Bridgy Publish FB POSSE copy and manually add back the linebreaks
mdik joined the channel
# 00:37 aaronpk this does mean this will always require code to serve these files back via http, since this won't work with apache or nginx filesystem serving
# 00:37 tantek curious what snarfed / kylewm thinks in terms of what Bridgy Publish could/should do with both markup inside articles, and whitespace inside notes
# 00:38 aaronpk lastly, regarding versioning and human-readability of the filesystem, i think it would make more sense to put the timestamp at the end
# 00:38 aaronpk archive/example.com/path/to/file/DDD/SSS or file.DDD.SSS
# 00:39 snarfed i'd love to support more kinds of formatting, but i'm not very interested in implementing it myself
# 00:39 tantek aaronpk - I think that results in more folders :(
# 00:39 aaronpk otherwise content from example.com could be spread around any number of DDD/SSS folders
# 00:39 GWG snarfed: Got a chance to look at that plugin?
# 00:39 tantek do I need to add "Got a moment?" to CommProtocols? ;)
# 00:40 GWG tantek: I flaunt social conventions.
# 00:40 aaronpk i'm imagining asking myself this later: "where is the latest version of X" or "how many versions of X do I have"
# 00:40 GWG snarfed: The Indie-Webactions one?
# 00:40 tantek aaronpk I believe archivists keep things by year in general
# 00:40 GWG snarfed: I continue to try to get to stable.
# 00:40 tantek that is, the question of, what sites did I reference / archive in year / day x?
# 00:41 aaronpk archivists also came up with the WARC format so...
# 00:41 tantek is more interesting / frequent than "where is the latest version of X" or "how many versions of X do I have"
# 00:41 aaronpk in practice i have needed to find a file on disk in order to delete the cached version more often than i have ever asked myself "what did I reference in Y"
# 00:42 aaronpk "what sites did I reference in Y" can almost as easily be answered by just reading my web pages from that year
# 00:42 GWG I'm still trying to figure out...Portland or Edinburgh
# 00:42 tantek aaronpk - another problem - putting the DDD/SSS at the end breaks the URL path
# 00:43 tantek since you end up putting the filename.ext inside DDD/SSS :(
# 00:43 aaronpk no, i would tack that on to the end in either case
# 00:43 aaronpk remember this is the filesystem representation, which has nothing to do with the URL it's served from
# 00:43 tantek which works really well for the *typical* case
# 00:43 aaronpk it's just a matter of mapping that URL to a filesystem path
# 00:45 tantek wondering if "tag." for the file would be better, and "tag" for the folder
# 00:45 tantek the advantage being, that the directory structures would actually map to URLs
# 00:45 tantek like relative paths in the files would actually work "locally"
# 00:46 tantek e.g. being able to just double-click a .html file in a folder, and have it find relative paths to .css files etc.
# 00:46 tantek IMO inspectable archives are far more reliable
# 00:46 tantek as in, inspectable *cross-platform* *without* having to *always* run custom code
# 00:47 aaronpk i wouldn't count on that working most of the time, based on my experience trying to archive several websites
# 00:47 aaronpk it is in practice, because people have css/js files on other domains, or with weird characters in them all the time
# 00:47 tantek that's a horrible mischaracterization. I'd say 80/20 is *same domain*
# 00:48 tantek except external libraries like jquery and webfonts
# 00:53 Loqi slack/snarfed: btw tantek sorry, didn't mean to be so brusque. I'd happily accept PRs against html2text!
# 00:53 tantek ok for the article with markup -> text for FB etc.
# 00:53 tantek e.g. serializing various semantic HTML elements into default presentation. p blockquote ol ul li
snarfed, KevinMarks, KevinMarks__, yakker, j12t and KartikPrabhu joined the channel
# 02:56 kylewm tantek: how do you think long notes with whitespace should look in a reader?
# 02:57 kylewm or just notes with whitespace, length is irrelevant
# 02:57 kylewm my impression is that your use of white-space: pre-wrap; is somewhat unorthodox?
# 03:02 tantek kylewm: it's not unorthodox at all, but rather following a pattern established by both Twitter and Facebook
# 03:04 tantek this is perhaps a key distinguishing and useful factor of explicitly typing notes vs. articles - notes can be expected to preserve whitespace, and auto-link / auto-embed
# 03:05 kylewm looking at twitter css now, i wouldn't have guessed that's how they did linebreaks. interesting!
# 03:05 tantek whereas articles are expected to have explicit markup instead of preserving whitespace and auto-linking / embedding
# 03:05 tantek kylewm: yeah I tried to be pretty thorough before implementing :)
# 03:07 kylewm so your photos in notes are autolinked, and not embedded with <img> markup?
# 03:08 kylewm does anyone else in IWC have pure-plaintext notes like that?
# 03:09 tantek better: tantek.com/w/Markdown#Hyperlinkswithlinktext
# 03:09 tantek kylewm: AFAIK - *everyone's* plain text notes work that way
# 03:10 kylewm my notes are Markdown so linebreaks turn into <br>s etc.
# 03:10 tantek well for those that actually built a separate "note" type of post, rather than just hacking notes as title-less articles in their existing blog posting system
# 03:10 tantek yes it was less work to use white-space prewrap and it was good enough for twitter
snarfed joined the channel
# 03:12 tantek and FB too - when I was manually POSSEing to FB - I could just copy/paste my notes directly into it and it would "do the right thing" - no markup needed
# 03:12 kylewm any suggestion on how to support this in Woodwind? use whitespace pre-wrap if there's no title?
# 03:13 tantek yes, per "notes can be expected to preserve whitespace, and auto-link / auto-embed"
# 03:13 kylewm well, your post has the links already autolinked and the photos already autoembedded
# 03:14 tantek CASSIS auto_link is smart enough to not doubly do so
# 03:14 tantek you can call it twice on the same input and you get the same thing
# 03:14 tantek the bigger challenge is p-content vs e-content
# 03:16 tantek sorry what I meant was auto_link(x) == auto_link(auto_link(x))
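The property `auto_link(x) == auto_link(auto_link(x))` is idempotence. The following is a rough sketch of one way to get it, not the CASSIS implementation: text already inside `<a>...</a>` (or inside any other tag) is passed through untouched, and only bare URLs in plain text are wrapped. The regexes are simplified assumptions and do not handle schemeless domains or trailing punctuation.

```python
import re

# Bare http(s) URLs in plain text (simplified assumption).
URL_RE = re.compile(r"https?://[^\s<>\"]+")
# Existing anchors (with their contents) or any other single tag.
SKIP_RE = re.compile(r"(<a\b[^>]*>.*?</a>|<[^>]+>)", re.IGNORECASE | re.DOTALL)


def auto_link(text: str) -> str:
    """Wrap bare URLs in <a> tags, leaving existing markup alone."""
    parts = SKIP_RE.split(text)
    out = []
    for i, part in enumerate(parts):
        if i % 2 == 1:
            # Odd indices are the captured markup chunks: keep as-is.
            out.append(part)
        else:
            out.append(
                URL_RE.sub(
                    lambda m: f'<a href="{m.group(0)}">{m.group(0)}</a>', part
                )
            )
    return "".join(out)


note = 'see http://example.com/a and <a href="http://example.org">http://example.org</a>'
once = auto_link(note)
print(once == auto_link(once))
```

Because the second pass finds every URL already inside markup, it has nothing left to link, which is what makes calling it on already-linked content safe.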
# 03:16 kylewm yeah that's what i mean, if there's no p-name which is a title separate from the content
# 03:17 kylewm i use a slightly different algorithm in mf2util, but it's based on this one
# 03:17 tantek anyway - the "is this a note" algorithm is more complex than just "no title" that's all
# 03:17 tantek hence I keep conditionalizing it "for notes" or "if it's a note"
# 03:18 tantek I'm really curious how people see a FB post that you have explicitly hidden from your timeline/profile
# 03:18 tantek (evidence that some are seeing it - likes on it)
# 03:22 kylewm i see p3k also uses pre-wrap, not sure how i missed this detail
# 03:22 tantek (especially since we have a different approach - hence encouragement of documentation to show diversity of approaches)
# 03:29 kylewm KartikPrabhu: really nice. i'd sorta like if it rounded off to the nearest word if my selection is a little sloppy
# 03:33 kylewm looking at Fragmentions for poets to see if it uses the punctuation in a meaningful way
# 03:34 KartikPrabhu also I'm using the latest syntax with one # and %20 escaped spaces, since if this works the space -> + is not needed
# 03:34 Loqi slack/tantek: Note that space to + is only really valid in ? Query params
# 03:34 kylewm yeah I think this UI is really nice. although the document sort of illustrates that paragraph level quoting is not always specific enough
# 03:35 KartikPrabhu kylewm: true! but that is a fragmention.js issue that can be very independently fixed
# 03:37 KartikPrabhu also my code does not check if the text is the first occurrence or not, which might make it fail
danlyke joined the channel
# 03:40 kylewm KartikPrabhu: although it does give you immediate feedback if your selection fails
# 03:40 kylewm one of the first ones i tried, there was an earlier instance
danlyke, LCyrin and j12t joined the channel
j12t, lukebrooker and KevinMarks joined the channel
# 05:59 KevinMarks Kylewm I'm using the same note technique for app.willsomeone.com
# 06:00 KevinMarks Tantek, have you looked at inlining tweets or other note urls like you do with images?
elf-pavlik joined the channel
# 06:09 Loqi slack/snarfed: KevinMarks_: re googlebot and app engine fetch, it shouldn't
glennjones, j12t, KevinMarks__ and nloadholtes joined the channel
loic_m and pfefferle joined the channel
Jihaisse and j12t joined the channel
tantek, pfefferle and j12t joined the channel
j12t_, petermolnar and jonnybarnes joined the channel
tilgovi, Sebastien-L, j12t, adactio, LynnCyrin, evalica and j12t_ joined the channel
# 09:37 acegiak yo, if I were gonna show someone one youtube video to introduce them to the indieweb movement what should I show them?
squeakytoy, stream7, eschnou, elima and LauraJ joined the channel
LauraJ, frzn, elima and KevinMarks joined the channel
# 11:18 rhiaro adactio: I couldn't make 20/21 June in Brighton but could do 11/12 July
# 11:19 rhiaro !tell barnabywalters: less likely I could do 11/12 July in Berlin now (but not impossible); if there was one in Brighton then instead I'd go to that
# 11:19 Loqi Ok, I'll tell him that when I see him next
KevinMarks__ joined the channel
stream7_ joined the channel
friedcell joined the channel
mlncn and KevinMarks joined the channel
KevinMarks__, LauraJ and LaurieJ joined the channel
Erkan_Yilmaz, LauraJ, parzzix, danlyke, KevinMarks, j12t, bupkes, snarfed, KevinMarks__ and fourtonfish joined the channel
eschnou, glennjones, zero-gravitas, KevinMarks, KevinMarks__, chalettu, tantek, KevinMarks___, j12t, chalettu_, parzzix, tvn, csarven, friedcell, AcidNerd, elima, wolftune, evalica, danlyke and snarfed joined the channel
# 16:03 aaronpk kylewm: have you considered showing the syndication URLs on posts in woodwind?
j12t, todrobbins, yakker, mlncn and snarfed joined the channel
fourtonfish and snarfed joined the channel
# 16:33 aaronpk oh i didn't include anything about timestamps yet
indie-visitor joined the channel
# 16:38 Loqi Welcome, indie-visitor! Set your nickname by typing /nick yourname
# 16:39 snarfed false positive for PSC extractors since it ends in (10.10 Yosemite)
todrobbins joined the channel
# 16:54 aaronpk actually 10.10 is shorthand for the IP address 10.0.0.10
# 16:58 tantek snarfed, you're welcome to re-use the regex for ccTLDs from CASSIS.js auto_link :D
almereyda joined the channel
# 16:59 snarfed don't get me wrong, cassis itself is great, i just really don't want to maintain a copy of that if i can avoid it
# 17:00 tantek what are the chances that new countries will be introduced? ;)
# 17:00 snarfed when i worked on a payment processing system that supported most of the world for a few years, it happened roughly twice a year :P
# 17:03 tantek aaronpk - yeah - looks good - though what did you think of keeping foldernames the same (no : / ) and using a trailing "." for extensionless filenames like "tag" ?
# 17:03 aaronpk would you also append a "." to filenames like "styles.css"?
# 17:04 aaronpk otherwise you're back in the same boat, where a folder that has a "." can't be used as a filename (example.com/foo.css and example.com/foo.css/bar)
# 17:04 kylewm aaronpk: syndication posts on urls in woodwind -- right now I show them if the syndicated post was also found by woodwind... do you think it would be useful to show them all the time?
# 17:06 aaronpk tantek: haha actually the archive itself is going to have URLs like that
# 17:06 kylewm snarfed: tantek: I use cassis to count characters in my UI and my own python regex to shorten the actual tweet text -- bit me the other day when I was writing about hub.mode and hub.url. cassis correctly ignored them, but my code thought they were urls
# 17:07 tantek pretty sure that's why twitter gave up on auto-linking plain ccTLDs
# 17:07 tantek which is why PSC work at all - they depend on that one neat trick
# 17:07 aaronpk kylewm: i don't know if you've seen the hackernews feed in your logs, but I add syndication URLs for the posts pointing to the HN URL, and it would be useful to show those in woodwind
# 17:09 kylewm that's interesting! do you think that's an overload of "syndication", or not?
barnabywalters joined the channel
# 17:10 aaronpk depends on whether syndication is meant to mean syndicated by the author of the post
# 17:10 aaronpk I've submitted my own posts to HN before, and included the syndication url to the HN version on my post
# 17:11 aaronpk and I've seen many other HN posts link to the HN copy, whether or not it was submitted by the author
KevinMarks and Erkan_Yilmaz joined the channel
# 17:13 aaronpk tantek: another thing I was thinking is that it'd be useful to be able to keep the HTTP headers that were part of fetching the URL (in order to preserve content-type or modified date for example)
# 17:14 aaronpk so I was thinking about storing the headers in a file alongside the page, like page.headers
# 17:14 aaronpk at which point I could store the page in page.data, and then folders could just be folders without the ":"
# 17:14 barnabywalters aaronpk: taproot/archive stores the headers in a .txt file with the same name as the .html file
# 17:15 aaronpk barnabywalters: do you store css or image files too?
# 17:15 aaronpk i'm trying to make this work for all kinds of files
# 17:15 barnabywalters my archives are quite big enough without storing a bunch of CSS, much less images!
# 17:16 aaronpk i'm at 400mb now, but if I gzip that it'll be a lot smaller
Erkan_Yilmaz joined the channel
# 17:18 barnabywalters if I did more often then there would be more versions, as microformats changes will trigger a new version creation
KartikPrabhu joined the channel
# 17:20 barnabywalters when I was building taproot/archive I looked into using a zip archive as the fake filesystem. It has a bunch of benefits including vastly reduced filesize, fairly easy to inspect (just unzip) and it works like a key-value based file storage system, where trailing slashes are significant and any character can be used in the keys
# 17:21 barnabywalters the problem with that is that as soon as you unzip it, the keys break (IIRC — I was experimenting with this a looong time ago)
# 17:21 aaronpk yeah you'd have the case sensitivity and weird-chars-in-filenames problem when you unzip it
# 17:22 barnabywalters there’s also the danger that it could become corrupt more easily than a regular filesystem
# 17:22 barnabywalters but that could be mitigated by regularly unzipping the active archive onto long-term backup media
j12t joined the channel
# 17:27 barnabywalters IIRC that occurred to me when I was building taproot/archive, but I couldn’t figure out a good solution so ignored it
# 17:27 aaronpk oh actually as described they would not conflict, but my examples don't match the description!
# 17:28 aaronpk yeah all folders get a colon appended, including the domain name "folder"
# 17:28 aaronpk so it'd be http/example.org:80:/path:/to:/file and http/example.org:/80:/path:/to:/file
# 17:29 aaronpk however my intent was not to include the colon on the domain name, but maybe it's actually necessary after all
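The "all folders get a colon appended" scheme can be sketched as below. This is an illustration under assumptions, not aaronpk's code: the function name `url_to_path` is hypothetical, the scheme folder is left without a colon to match the examples given, and query strings and fragments are simply ignored.

```python
from urllib.parse import urlsplit


def url_to_path(url: str) -> str:
    """Map a URL to a relative filesystem path, appending ':' to every folder.

    The host (including any port) counts as a folder; the final path
    segment is treated as the file and gets no colon.
    """
    parts = urlsplit(url)
    # Drop the empty string that precedes the leading slash.
    segments = parts.path.split("/")[1:]
    *folders, leaf = segments if segments else [""]
    pieces = [parts.scheme, parts.netloc + ":"]
    pieces += [folder + ":" for folder in folders]
    pieces.append(leaf)
    return "/".join(pieces)


print(url_to_path("http://example.org:80/path/to/file"))
print(url_to_path("http://example.org/80/path/to/file"))
```

With the colon on the domain folder, a host with a port and a path that starts with the same digits map to different paths, and "test" and "test/" no longer compete for one name.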
# 17:34 tantek barnabywalters: have you told adactio about this? I bet he'd be interested, per huffduffer and all that.
# 17:35 barnabywalters icecast2 was super easy to set up on my server, but getting apache to proxy requests through was a bit trickier
# 17:36 barnabywalters so if I type Icecast2 (different capitalisation) will loqi handle it correctly?
sparveri1s joined the channel
# 17:44 kylewm wow, this is hilarious. someone on my dorm wrote an aggregator for everyone running Winamp with the SpyAmp plugin, so you could see what everyone else on the floor was listening to
natwelch, tilgovi, revere, KartikPrabhu and todrobbins joined the channel
# 17:51 barnabywalters so now I’m trying to figure out a way of automatically recording every icecast broadcast I do, storing it in an archive by mountpoint (URL) and datetime
# 17:53 barnabywalters but as I’m proxying requests to it through apache, maybe there’s some apache feature I can (ab)use
acegiak and zero-gravitas joined the channel
# 17:56 barnabywalters I wonder if there’s already a tool for shimming logfiles into pub/sub type event dispatcher systems
# 17:56 aaronpk oh you could even create a symlink from the plain filename to the latest versioned filename
# 17:57 barnabywalters aaronpk: hm, I actually hadn’t thought about making them public, but that would make total sense if I did
# 18:03 barnabywalters aaronpk: that’s certainly robust, but the lack of correct file extensions makes it much less observable
# 18:03 barnabywalters I also initially tried .headers and got fed up of not being able to hit space and quicklook at them
tvn joined the channel
# 18:04 aaronpk the problem is a URL that has no file extension might be an image, or might be a css file, and there's no way to tell from the URL
# 18:04 aaronpk or css files might end in a query string and lots of numbers styles.css?v=18345
# 18:05 barnabywalters I strip query strings from URLs, but get the feeling you want your archiver to be less opinionated :)
# 18:06 aaronpk KartikPrabhu: that kind of thing tends to be filesystem-dependent
# 18:06 KartikPrabhu aaronpk: no i mean use the mime-type of the request to decide the file extension?
# 18:07 Loqi KartikPrabhu meant to say: aaronpk: no i mean use the mime-type of the web request to decide the file extension?
# 18:07 aaronpk that sounds very prone to accidental clobbering on the filesystem
# 18:08 barnabywalters you could maintain a list of the most common mime types and their equivalents, and use .data for rarer files
# 18:08 aaronpk i'm pretty sure i could come up with a list of examples that would be impossible to store that way
# 18:09 aaronpk another problem is relying on a mime type to determine path means you can't determine the path based only on the URL, so it becomes hard to programmatically find things later
# 18:14 aaronpk there's always going to be a tradeoff between the inspectability of the files vs robustness vs assumptions made about ppls URLs
# 18:14 aaronpk with spiderpig, I made several assumptions that allowed me to create files on disk that can be served directly by a web server and result in the same website. however i had to do things like force every page to end in a slash, adding redirects that weren't on the original site
# 18:16 aaronpk but that's safe for me to do in this case because the archive is replacing the original site, so I don't need to worry about replacing that URL
# 18:22 aaronpk hmm i suppose a glob could find a file named styles.css.YMD.hms.css or photo.YMD.hms.png
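A small sketch of the glob lookup suggested above, assuming YMD is eight digits and hms is six digits (the log does not spell out the exact format). The function name `find_versions` is hypothetical.

```python
import glob
import os


def find_versions(directory: str, name: str) -> list[str]:
    """Return stored versions of `name`, oldest first.

    Matches files like styles.css.YMD.hms.css or photo.YMD.hms.png,
    whatever extension was appended after the timestamp.
    """
    stamp = "[0-9]" * 8 + "." + "[0-9]" * 6
    pattern = os.path.join(
        glob.escape(directory), glob.escape(name) + "." + stamp + ".*"
    )
    # Fixed-width timestamps sort chronologically as plain strings.
    return sorted(glob.glob(pattern))
```

The last element of the returned list would be the latest version, which is also the natural target for a symlink from the plain filename.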
tantek joined the channel
# 18:23 aaronpk that would slow down access time slightly, but that's not the end of the world
# 18:24 aaronpk also could fix that by making a symlink from .data to .css
JarOfGreen and KevinMarks_ joined the channel
# 18:29 aaronpk i kind of like that, it solves the inspectability issue
eschnou, zero-gravitas, KevinMarks and yakker joined the channel
KevinMarks__ and afrogeek joined the channel
csarven and eschnou joined the channel
# 19:08 snarfed KevinMarks_: looks like more the openstack, docker, etc crowds
j12t, KevinMarks_, frzn_, acegiak and fkooman joined the channel
almereyda joined the channel
mlncn, parzzix and j12t joined the channel
j12t, LCyrin, torrorist, tantek, elima, tantek_, lukebrooker, friedcell and wolftune joined the channel
# 21:57 Loqi [mention] Barnaby Walters posted 'So far, a large part of my experimentations with graphical dataflow programming have been using Puredata which, whilst usable for general pr...' linking to http://indiewebcamp.com/https (/articles/how-to-stream-live-audio-over-the-web-using-icecast2-and-puredata/)
friedcell1 joined the channel
# 22:41 KevinMarks_ Last week I wrote about Facebook’s AOL-like dominance and concluded, “What might be the broadband to Facebook’s dial-up?” The answer, I think, is this open Twitter: an identity system for the rest of the web that connects people and apps according to interests, not just superficial relationships, and monetizes accordingly.
# 22:42 snarfed sure, no biggie, throw that together in a weekend or two :P
tantek and KartikPrabhu joined the channel
# 22:58 kylewm it's kind of terrifying to think of twitter as the identity and communication substrate of the internet :p
# 23:05 KevinMarks as I last night described indieweb to people as "like an open twitter" I read that differently
# 23:22 GWG KevinMarks: I thought identi.ca was like an open Twitter
# 23:22 KevinMarks i explained the difference between a monoculture and actually open
snarfed joined the channel
# 23:32 GWG I think the Internet could use an https/http 2 push
# 23:33 tantek aaronpk - indeed - mass mozilla dev-platform thread on that with lots of debate from many sides
# 23:35 aaronpk frankly i didn't expect a resolution on that so quickly
# 23:38 tantek there's lots of caveats and to-be-scheduled/scoped in that "resolution"
frzn and snarfed joined the channel
j12t joined the channel