#dev 2019-08-06
2019-08-06 UTC
jjuran, KartikPrabhu, GWG, ben_thatmustbeme and IWSlackGateway joined the channel
KartikPrabhu, [tantek], [svandragt] and IWSlackGateway joined the channel
cweiske and [fluffy] joined the channel
IWSlackGateway, [tonz], [frank], [pfefferle] and jeremycherfas joined the channel
# petermolnar I came to learn a terrifying conclusion today: after dealing with so many archivers and bookmark managers, I went back to the idea to download everything (css, js, images), inline them, and save the html of a page. Except... I realized I'm inlining a silly amount of tracking JS as well. Which means I'd need to filter JS and CSS based on domains... for archiving. I'm starting to seriously consider we should actually go back to gopher.
# petermolnar 9.5MB HTML file O.O
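The domain-based filtering petermolnar describes could look roughly like the sketch below. It is a minimal illustration, not taken from any archiver mentioned in this log: the blocklist entries and the function names `is_blocked`, `to_data_uri` and `inline_or_drop` are all hypothetical, and the resource bytes are assumed to have been downloaded already.

```python
import base64
from urllib.parse import urlparse

# Hypothetical blocklist; a real one would come from a maintained tracker list.
BLOCKED_DOMAINS = {"tracker.example", "ads.example"}


def is_blocked(url, blocked=BLOCKED_DOMAINS):
    """Return True if the URL's host is a blocked domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocked)


def to_data_uri(content, mime_type):
    """Encode already-downloaded bytes as a base64 data: URI for inlining."""
    encoded = base64.b64encode(content).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def inline_or_drop(url, content, mime_type):
    """Return a data: URI for allowed resources, or None to drop the resource."""
    if is_blocked(url):
        return None
    return to_data_uri(content, mime_type)
```

Returning `None` for blocked hosts leaves it to the caller to decide whether to remove the referencing `<script>` or `<link>` element entirely or keep a placeholder.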
# Zegnat That is basically how I have been archiving, petermolnar. Inline everything. Headless Chromium + https://github.com/WebMemex/freeze-dry and then dump the entirety to an HTML file
# Zegnat 1.90 MB for a tweet: https://wiki.zegnat.net/cache/?md5=21375e87bdb275765d602cc68a4c9c23
# petermolnar I wasn't aware of this lib, I'll try it
# petermolnar ah, this is nmp
# petermolnar npm
# Zegnat My together cobbling: https://github.com/Zegnat/node-beanstalkd-web-archiver
[svandragt], [tantek] and [jgmac1106] joined the channel
[Lewis_Cowles] joined the channel
# [Lewis_Cowles] Anyone else archiving using a proxy? If you pull into your same domain, you can use a backend language to do things like replace specific js files, use or pass to jQuery to parse DOM using DOM, then paste details elsewhere.
# [Lewis_Cowles] Very old version of a proxying scraper (takes a list of URLs), extracts details
# [Lewis_Cowles] This one was a version used to scrape an old Drupal to migrate towards WordPress-esque JSON
eli_oat, loicm and [tonz] joined the channel
# [jgmac1106] oooh so you can have them offline?
[KevinMarks] joined the channel
# [KevinMarks] 1.9MB for a tweet is something Maciej should mock
# treora Yep, freeze-dry stores original URLs of inlined resource in data-original-${attribute}
# treora It is something I came up with however, would be nicer to have some standard to follow.
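To illustrate the naming convention treora describes, here is a small sketch that swaps an attribute for a data: URI and keeps the original URL under `data-original-<attribute>`. This is not freeze-dry's code (freeze-dry is a JavaScript library); the function `inline_attribute` and the plain-dict representation of attributes are assumptions made for the example.

```python
def inline_attribute(attrs, name, data_uri):
    """Replace attrs[name] with a data: URI, keeping the old value
    under data-original-<name>. Returns a new dict; the input is not modified."""
    updated = dict(attrs)
    if name in updated:
        updated[f"data-original-{name}"] = updated[name]
    updated[name] = data_uri
    return updated


# Usage: an <img> element's attributes before and after inlining.
img = {"src": "https://example.org/photo.png", "alt": "A photo"}
inlined = inline_attribute(img, "src", "data:image/png;base64,AAAA")
```

After the call, `inlined["data-original-src"]` holds the original URL, so an archive reader can still tell where the inlined bytes came from.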
# jeremycherfas [jgmac1106] One of the best uses of Dropbox for me is that all my PDFs (at least, the ones I have bibliographised in Bookends) live in a Dropbox folder.
[jgmac1106] joined the channel
# [jgmac1106] it was mendeley for me, now I have been just grabbing the link to a pdf and making a bookmark in Known, or just uploading to my server if I can't share
# [jgmac1106] Mendeley synced to Google Drive for awhile, but trying to own it all myself, but I have privilege of academic access, most knowledge not locked away
# [jgmac1106] most links would be to a paywall for most
# jeremycherfas Even when I have access, I download for preference. Means I can annotate, share etc.
# jeremycherfas Mendeley sucked for me.
[snarfed] joined the channel
# jeremycherfas I highly recommend Bookends on Mac OSX
# jeremycherfas I suspect one reason online versions -- I have tried only Mendeley and, briefly, Zotero -- suck is because they do not have sufficient focus or a decent business plan.
# jeremycherfas I didn't. Too happy with Bookends. But honestly, that's one application where I would be happy to subscribe because it is so important to me.
# [jgmac1106] mainly they suck because humans suck at data entry or just argued about what kind of data the files should have....no pdf parses well, most citation databases full of garbage,....and because pdfs suck in general...
# [jgmac1106] will look into Bookends
# [jgmac1106] ..but I really want to keep trying to do this all from my own site....at least the notes, maybe a manual copy
[eddie] joined the channel
# jeremycherfas Bookends does a very good job of parsing from PDFs and DOIs, and Publishers these days do a very good job of providing bibtex and others for download.
# jeremycherfas Bookends is not on your site, but is on your machine. And I think it would be a simple matter to present data on your own site. I know you could do static export of HTML formatted any which way you choose with links to the PDF if you have it.
[Rasul_Kireev] joined the channel
# [jgmac1106] yeah my workflow has been this lately: https://quickthoughts.jgregorymcverry.com/2019/08/06/learning-to-learn-online-a-study-of-perceptual-changes-betweenmultiple
# [jgmac1106] I add a pdf to my server, bookmark it, go back and read and add blockquotes and notes
# [jgmac1106] been doing that more than hypothes.is lately
seki[m4 and funwhilelost[m] joined the channel
# [jgmac1106] now that I have another 50 gigs of storage I can finally move all my pdfs over
[pfefferle] and [renem] joined the channel
# [renem] [snarfed] As a follow up to my ActivityPub "problems" with Mastodon.host (if you remember) with bridgy.fed, I created a new account on mastodon.social and it worked as it should. Sorry for the work, but I didn't reach the admins of mastodon.host and also got no information that bridgy.fed or any other services are "blocked".
# [jgmac1106] what I would need to figure out first jeremycherfas is how to batch rename a ton of pdfs to just author and title.
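One possible approach to the batch rename, sketched under a big assumption: the author and title for each file are already available as a mapping (for example from a bibliography export), since reliably reading them out of the PDFs themselves is the hard part. The names `safe_name` and `plan_renames` are hypothetical, and the sketch only builds a rename plan without touching any file.

```python
import re
from pathlib import Path


def safe_name(author, title, max_length=100):
    """Build an 'Author - Title.pdf' filename with unsafe punctuation removed."""
    raw = f"{author} - {title}"
    cleaned = re.sub(r"[^\w\s-]", "", raw)          # drop characters like : ? /
    cleaned = re.sub(r"\s+", " ", cleaned).strip()  # collapse runs of whitespace
    return cleaned[:max_length].rstrip() + ".pdf"


def plan_renames(folder, metadata):
    """Return (old_path, new_path) pairs for PDFs listed in metadata.

    metadata maps a current filename to an (author, title) tuple.
    Nothing is renamed here, so the plan can be reviewed first.
    """
    plan = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        if pdf.name in metadata:
            author, title = metadata[pdf.name]
            plan.append((pdf, pdf.with_name(safe_name(author, title))))
    return plan
```

Once the plan looks right, each pair could be applied with `old_path.rename(new_path)`, ideally after checking that no two files map to the same new name.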
# [jgmac1106] [Rasul_Kireev] you can use this tool to check your h-card: https://indiewebify.me/
# [jgmac1106] and we have been working on the MDN docs for microformats that has some examples: https://developer.mozilla.org/en-US/docs/Web/HTML/microformats
# [Rasul_Kireev] [jgmac1106] I have been using this tool and is the reason I'm asking you guys. Because I had trouble declaring two separate h-cards, I ended up adding a span h-card around the whole page, which is probably not the best solution.
# [jgmac1106] so if you had p-card as a CSS selector you can use microformats too <section class="p-card h-card"></section> that would be your h-card
jackjamieson joined the channel
# [jgmac1106] what is your url?
# [Rasul_Kireev] Thanks, I will head over to MDN resource you mentioned to check it out.
# [Rasul_Kireev] rasulkireev.com
# [Rasul_Kireev] [jgmac1106] Thanks for your help! I'm sorry, I didn't want to bother anyone too much with an issue like these!
# [jgmac1106] move the h-card from the span and to the div with the navbar id
# [jgmac1106] I just don't see the closing </span> but I could be missing it
# [jgmac1106] I would move the h-card up and delete the span,
# [jgmac1106] that would mean your nav items would be included in your h-card you can decide if you like that or do not like it and then act accordingly
# [jgmac1106] and never feel like it is a bother with any issue, that is why the community is here
# [jgmac1106] you could turn that entire page into one gigantic h-card, some people do that
# [jgmac1106] Who is zegnat?
# Loqi Martijn van der Ven is a long-time web tinkerer living in Sweden (CEST or Europe/Stockholm timezone). Pronouns: he or they https://indieweb.org/User:Vanderven.se/martijn/
# [jgmac1106] but I think the missing closing </span> is root of most of your trouble you don't close until the very bottom and after the </div> where the span opens
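A span that opens inside a div but closes after the `</div>` can be spotted mechanically. The sketch below uses Python's standard `html.parser` to report end tags that arrive out of order; the class `TagBalanceChecker` and the helper `find_tag_problems` are names invented for this example, and the sample markup is made up rather than copied from rasulkireev.com. It will also flag elements whose closing tags HTML treats as optional (such as `<p>` or `<li>`), so treat its output as hints rather than a validator.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Track open tags and record end tags that arrive out of order."""

    def __init__(self):
        super().__init__()
        self.stack = []     # (tag, line) for elements that are still open
        self.problems = []  # human-readable reports

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.stack.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        line = self.getpos()[0]
        open_tags = [name for name, _ in self.stack]
        if tag not in open_tags:
            self.problems.append(f"line {line}: stray </{tag}> with no matching open tag")
            return
        if open_tags[-1] != tag:
            self.problems.append(
                f"line {line}: </{tag}> closes while <{open_tags[-1]}> is still open")
        # Pop up to and including the matching open tag.
        while self.stack:
            name, _ = self.stack.pop()
            if name == tag:
                break


def find_tag_problems(html):
    """Return a list of nesting problems found in the given HTML string."""
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    unclosed = [f"line {line}: <{name}> is never closed" for name, line in checker.stack]
    return checker.problems + unclosed


sample = '<div id="navbar"><span class="h-card"><a href="/">Home</a></div></span>'
problems = find_tag_problems(sample)
```

For the sample above, the first report points at the `</div>` that closes while the h-card span is still open, which is the pattern described in the message before this note.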
# [jgmac1106] in the above example you can see an entire page h-card....but the question of what happens when I have two h-cards on a page comes up often...and I am not sure I can give an exact answer
# KartikPrabhu what is representative h-card
# Loqi The representative h-card for a page is an h-card on that page that represents that page, if any, as not all pages are about a person or organization, a page might not have a representative h-card https://indieweb.org/representative_h-card
# KartikPrabhu that ^
[timothy_chamber joined the channel
# [jgmac1106] ohh I always thought it was an either or...I put u-uid OR u-url..I never put both...goes to fix many a things
# [Rasul_Kireev] Thanks all, I have a bunch of things to try updating now! Thanks for examples too!
# [Rasul_Kireev] [jgmac1106] why do you ask about zegnat? Do I have that somewhere on my page?I have no idea who or what that is 🙈
# [jgmac1106] ohh sorry, no I was asking our friendly bot to give me a url.
# [jgmac1106] zegnat has a single page...really gigantic h-card so I was showing example
# [jgmac1106] Who is jgmac1106?
# Loqi J. Gregory McVerry (Greg) is an educator trying to use the web to help engineer better teachers https://indieweb.org/User:Jgregorymcverry.com
# [jgmac1106] So Loqi helps us out.
# [jgmac1106] Who is Loqi?
# Loqi Loqi is a friendly and useful bot/digital therapist present in the IndieWeb discussion channels https://indieweb.org/User:Loqi.me
# [jgmac1106] You can also use Kaja to check your microformats
# [jgmac1106] !mf2 rasulkireev.com
# [jgmac1106] well you saw an example of that working above
# [jgmac1106] tonz, omz13, jeremycherfas, thx for inspiration now transferring 1,778 pdfs to my url
# [jgmac1106] I should have done this forever ago...though never had the space
# [Rasul_Kireev] Wow. That's incredible. Not something I see everyday.
[KevinMarks] joined the channel
# [jgmac1106] Loqi does lots of cool things.
# [jgmac1106] What is an h-card?
# Loqi h-card is the microformats2 vocabulary for marking up people, organizations, and venues on web sites https://indieweb.org/h-card
# [jgmac1106] we can ask questions, find about people, ask about times, make memes, and feed loqi
# [Rasul_Kireev] Ok, I have to ask. How do you feed it?)
# [Rasul_Kireev] Feed Loqi
# [Rasul_Kireev] No, doesn't work 🙈
# [jgmac1106] Feed Loqi All the Things
# jackjamieson Not Acceptable! An appropriate representation of the requested resource could not be found on this server. This error was generated by Mod_Security.
# jackjamieson Documenting my mod_security woes in case it's useful for anyone's future reference. On my WordPress site I started getting the following error whenever I try to log into an app using the IndieAuth plugin:
# jackjamieson (whoops, reverse the order of my last two messages :)
# jackjamieson My host is bluehost. I contacted their technical support and they asked me to identify the URL of the plugin. I gave them the authorization endpoint URL (https://jackjamieson.net/wp-json/indieauth/1.0/) and the token endpoint URL (https://jackjamieson.net/wp-json/indieauth/1.0/token)
# jackjamieson The support representative disabled mod_security at those URLs, but apparently it kept re-enabling itself for some reason. So it's now been escalated to their technical team who will email me
[fluffy] joined the channel
[tantek] joined the channel
# [jgmac1106] I now own all my pdfs from my url...will have to remove some doubles that gathered over the years...but excited to never have to worry about "it is on this machine, google drive, etc" http://jgregorymcverry.com/readings/
# [jgmac1106] probably should password protect page too not to get take down notices from publishers
# [tantek] [jgmac1106] start with immediately adding a rule to your http://jgregorymcverry.com/robots.txt that blocks all bots from that directory
# [jgmac1106] What is robot.txt?
# Loqi It looks like we don't have a page for "robot.txt" yet. Would you like to create it? (Or just say "robot.txt is ____", a sentence describing the term)
# aaronpk robot.txt is /robots.txt
# [jgmac1106] this is it actually I think: https://indieweb.org/robots
# [jgmac1106] that is the link to the actual robot.txt on the wiki
# [jgmac1106] nvm this is it: https://indieweb.org/robots_txt
# [jgmac1106] User-agent: *
# [jgmac1106] Disallow: /readings/ I think that would be correct yes?
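The two-line rule above can be checked locally with Python's standard `urllib.robotparser` before uploading it. This is a minimal sketch: the helper `is_allowed` and the bot name `ExampleBot` are made up for the example, and the paths under the site are illustrative. Note that robots.txt is only a request to well-behaved crawlers; it does not restrict access the way password protection would.

```python
from urllib.robotparser import RobotFileParser

# The rule as proposed in the chat.
ROBOTS_TXT = """\
User-agent: *
Disallow: /readings/
"""


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


blocked = is_allowed(ROBOTS_TXT, "http://jgregorymcverry.com/readings/paper.pdf", "ExampleBot")
open_page = is_allowed(ROBOTS_TXT, "http://jgregorymcverry.com/about", "ExampleBot")
```

Here `blocked` comes out `False` and `open_page` comes out `True`, meaning the rule covers everything under `/readings/` and leaves the rest of the site crawlable.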
# [jgmac1106] thanks I may work on that page so it doesn't take as many clicks to learn, but first finish grades and do NYC page
# [jgmac1106] but robot.txt page is up..
# [jgmac1106] ..then I go to metya and see tantek already had same thought
# [tantek] [jgmac1106] added examples to here for you that you can copy/paste and edit accordingly: https://indieweb.org/robots_txt#Examples
# [jgmac1106] tantek++
# [jgmac1106] its a tough balance as long scrolls hurt comprehension as well...why I like accordion boxes....
# [jgmac1106] the worst is food recipes now
# [jgmac1106] similar issue....I want the cook time and I have to read about your journey to some market in Almafa
[miklb] joined the channel
# [jgmac1106] old twitter design paradigm?
[eddie] joined the channel
# aaronpk my recipes are pretty barebones https://aaronparecki.com/recipes
# Loqi [Aaron Parecki] Habanero Hot Sauce https://aaronparecki.com/2018/12/25/8/habaneros.jpg
# [fluffy] yeah my recipe-recipes are pretty barebones too, http://beesbuzz.biz/food
[grantcodes] joined the channel
# [grantcodes] One of my next projects is going to be a sort of micro.blog equivalent for recipes 🙂
# [grantcodes] Oh that's cool. Looks like it's more for storing than publishing though?
# [grantcodes] Not tried it but this looks quite interesting for recipes too: https://www.copymethat.com/ (random name though)
# [grantcodes] I'm hoping to do something with git so you can copy other people's recipes and then change them and show the differences too
# [grantcodes] Also for my recipes site I am planning on severely limiting the intro and outro junk that no one cares about (mentioned above) - maybe some language recognition to see if they mention "my family really loves xyz"
# [grantcodes] [miklb] It was only the twitter copy that was truncated or the content in wp as well?
# [grantcodes] Ah got it now :thumbsup: and yeah that's not great
[pfefferle] joined the channel
[tonz], jackjamieson and [KevinMarks] joined the channel
# [KevinMarks] Aaron has good h-recipe markup https://aaronparecki.com/recipes
# [KevinMarks] Ooh, works for the whole category http://www.unmung.com/indiecard?url=https%3A%2F%2Faaronparecki.com%2Frecipes
# Loqi [Aaron Parecki] Habanero Hot Sauce https://aaronparecki.com/2018/12/25/8/habaneros.jpg
[manton], [timothy_chamber and [fluffy] joined the channel
# [fluffy] Hmm, unmung doesn’t do a great job of formatting my site. Am I doing something wrong in my mf2? (probably!) http://www.unmung.com/indiecard?url=http%3A%2F%2Fbeesbuzz.biz%2Fblog%2F
loicm, eli_oat1 and KartikPrabhu joined the channel
# [KevinMarks] I'll have a look, I shouldn't throw up on the user like that
# Loqi mod_security is a web application firewall for the Apache web server https://indieweb.org/mod_security
# jackjamieson sknebel - good call. Will do so now
# [KevinMarks] talking of hacks that went too far, there is way too much conditional code in the jinja templates in unmung
# [KevinMarks] it's doing an OK job on your recipes too, though not h-recipe parsing
[pfefferle] joined the channel
# [fluffy] okay http://www.unmung.com/mf2?url=http%3A%2F%2Fbeesbuzz.biz%2Fblog%2F&html=&pretty=on is showing it correctlyish
gRegorLove joined the channel
# [KevinMarks] I may be using the author url - I'll check later
# jamietanna[m] Spent the evening restructuring how my site stores data so its easier to work with micropub - not yet there but will be awesome when it's done so I can then carry on with writing the micropub endpoint
jjuran, gRegorLove_, KartikPrabhu, [fluffy]1 and [KevinMarks]1 joined the channel