#dev 2022-08-19

2022-08-19 UTC
sp1ff, jacky, nertzy, gRegorLove_, geoffo, bterry, alex11, gRegorLove__, thelounge7384, tetov-irc, tbbrown and chenghiz_ joined the channel
#
[snarfed]
hackermention status: reached August 2009, 734k items processed, 133 wms sent successfully to 10 domains, new ones include oauth.net, raamdev.com, dangerouslyawesome.com, and more. none visible yet.
#
[snarfed]
will document more eventually, once it's gotten a bit farther
#
[snarfed]
also I found the whole HN archive on BigQuery. with that I could redesign this to look at just posts and their links, do wm discovery, and only then fetch comments and send wms to domains with endpoints. I'll do that for Reddit
#
[snarfed]
may also end up with a dataset of ~1M ish domains and their webmention endpoints (if any)
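A minimal sketch of the "discover the webmention endpoint first, only then send" flow described above, assuming the requests and BeautifulSoup libraries; this is not hackermention's actual code, and the URLs and function names are illustrative:
```python
# Sketch of "discover the endpoint first, only then send": fetch the target,
# look for a webmention endpoint in the Link header or the HTML, and only
# POST if one exists. Not hackermention's actual implementation.
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

def discover_webmention_endpoint(target):
    """Return the target's webmention endpoint URL, or None if it has none."""
    resp = requests.get(target, timeout=10)
    # 1. HTTP Link header, e.g.  Link: <https://example.com/wm>; rel="webmention"
    if "webmention" in resp.links:
        return urljoin(resp.url, resp.links["webmention"]["url"])
    # 2. First <link> or <a> element with rel="webmention" in the HTML body
    el = BeautifulSoup(resp.text, "html.parser").find(["link", "a"], rel="webmention")
    if el is not None and el.get("href") is not None:
        return urljoin(resp.url, el["href"])
    return None

def send_webmention(endpoint, source, target):
    """Notify the endpoint that `source` links to `target`."""
    return requests.post(endpoint, data={"source": source, "target": target}, timeout=10)

endpoint = discover_webmention_endpoint("https://example.com/some-post")
if endpoint:
    send_webmention(endpoint, "https://news.ycombinator.com/item?id=123",
                    "https://example.com/some-post")
```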
tbbrown and jamietanna joined the channel
#
jamietanna
What's needed to make https://github.com/indieweb/micropub-extensions/issues/11 visible in q=config? Not super obvious when reading through the thread
#
Loqi
[EdwardHinkle] #11 Stable Property: visibility
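For context, the visibility extension from that issue has servers list the values they support in the q=config response, roughly as in this sketch; the Micropub URL and token are placeholders:
```python
# Rough sketch of checking whether a Micropub server advertises visibility
# support in its q=config response, per the extension discussed in issue #11.
# The endpoint URL and bearer token are placeholders.
import requests

resp = requests.get(
    "https://example.com/micropub",
    params={"q": "config"},
    headers={"Authorization": "Bearer XXXX"},
    timeout=10,
)
config = resp.json()
# A server supporting the extension includes the values it accepts, e.g.
# {"media-endpoint": "...", "visibility": ["public", "unlisted", "private"]}
print(config.get("visibility", []))
```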
[manton] joined the channel
#
[manton]
Rant: Tumblr changed their posts export format, so now I need to rewrite my importer. What can we do to get the .bar format more widely adopted? Maybe support in WordPress? Open source tools?
#
[manton]
What is blog archive format?
#
Loqi
blog archive format is a data format proposed by Manton Reece for the export of a blog, based on a zip file and top level HTML h-feed inside, that is supported by micro.blog https://indieweb.org/blog_archive_format
#
[manton]
More complaining: this is what Tumblr’s export dates look like: `<span id="timestamp"> February 26th, 2007 3:13pm </span>`. This is what .bar looks like: `<time datetime="2022-08-14T12:47:23-0500" class="dt-published">2022-08-14 17:47:23 +0000</time>`
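A rough sketch of packaging posts into a .bar ZIP with a top-level h-feed and dt-published timestamps like the example above; the index.html file name and overall layout here are assumptions, see https://indieweb.org/blog_archive_format for the actual format:
```python
# A minimal, illustrative .bar package: one top-level HTML file holding an
# h-feed of h-entry posts with dt-published timestamps like the example
# above. The "index.html" name is an assumption, not the spec.
import zipfile
from html import escape

posts = [
    {"url": "https://example.com/2022/08/14/hello",
     "published": "2022-08-14T12:47:23-0500",
     "content": "Hello world"},
]

entries = "\n".join(
    '<article class="h-entry">'
    f'<a class="u-url" href="{escape(p["url"])}">'
    f'<time class="dt-published" datetime="{escape(p["published"])}">'
    f'{escape(p["published"])}</time></a>'
    f'<div class="e-content">{escape(p["content"])}</div>'
    '</article>'
    for p in posts
)
html = f'<!DOCTYPE html>\n<html><body><div class="h-feed">\n{entries}\n</div></body></html>'

with zipfile.ZipFile("export.bar", "w", zipfile.ZIP_DEFLATED) as bar:
    bar.writestr("index.html", html)
```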
#
GWG
[manton]: I might be convinced to help with that
#
[manton]
@GWG That would be awesome.
#
GWG
[manton]: File format is easy enough... but it's a zip file, isn't it?
#
[manton]
Maybe starting with a WP plugin that could just export posts (not even images) would be a good first step.
#
[manton]
Yep, it’s a ZIP.
#
GWG
Exporting as a single file I've done.. never generated a zip... would be worried about timeouts
#
[manton]
Yeah, it’s definitely easier if it can be done in the background outside of a web request. I wonder how other WP export plugins handle this.
#
[manton]
Or the built-in export.
#
sknebel
an incremental thing probably makes sense. e.g. something that packages 100 posts per call
#
[manton]
That sounds good. I noticed WP.com fires off a background task and then the UI hits the server every second to check if it’s done. But obviously WP.com is its own thing and they might have more flexibility than self-hosted WP.
#
sknebel
and then either a client manually or some JS on the settings page can request that repeatedly to collect the full dump
#
GWG
WordPress cron can be used to break up a job
#
GWG
But it triggers on load unless you have it triggered another way. I run it as a cron job every 5 minutes
#
GWG
How about an external one that generates it using the rest API?
#
GWG
That would work for anything public
#
[manton]
Interesting idea.
#
[manton]
Also did confirm that self-hosted WP does a posts export all in one request. It doesn’t include images so it’s fairly quick for small/medium sites.
#
[manton]
I guess with a companion service that hits the API and packages up the posts, you could have a lightweight WP plugin that linked to it too, so it’s discoverable from within WP.
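A sketch of the kind of companion exporter being discussed: page through a site's public WordPress REST API, 100 posts per request as sknebel suggested, and hand the results to something like the ZIP packaging above. The /wp-json/wp/v2/posts endpoint is standard WordPress; the rest is illustrative:
```python
# Illustrative sketch of an external exporter that pages through a
# WordPress site's public REST API, 100 posts per request, and collects
# them for packaging (e.g. into a .bar ZIP as sketched earlier).
import requests

def fetch_all_posts(site):
    """Yield all public posts from a WordPress site via /wp-json/wp/v2/posts."""
    page = 1
    while True:
        resp = requests.get(
            f"{site}/wp-json/wp/v2/posts",
            params={"per_page": 100, "page": page},
            timeout=30,
        )
        if resp.status_code == 400:  # WordPress returns 400 once page is past the end
            break
        resp.raise_for_status()
        batch = resp.json()
        if not batch:
            break
        yield from batch
        page += 1

posts = list(fetch_all_posts("https://example.com"))
print(len(posts), "posts fetched")
```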
#
[manton]
The domains export.blog or archive.blog are available but at $700/year. 😞
#
[manton]
Just brainstorming whether an external service that facilitated blog transfer would be worth pursuing. Don’t want to get ahead of myself, though.
#
[tantek]
what is migration
#
Loqi
migration in the context of the indieweb refers to the process of moving your indieweb site from any one or more of one CMS / web host / DNS provider / URL design / domain name to another https://indieweb.org/migration
#
[tantek]
[manton] maybe some useful notes in there ^ ?
#
[tantek]
or at least a place to add requests for help
#
[manton]
Thanks [tantek], I think I had missed that page.
kinduff joined the channel
#
[tantek]
[manton] you might also be interested in...
#
[tantek]
what is backfill
#
Loqi
backfill is the action of importing all your past posts, typically from a social media silo, into your own site https://indieweb.org/backfill
#
kinduff
Hello 👋 I was wondering if there is a way to replay webhooks from webmention.io
#
aaronpk
kinduff: hm no i don't think i built that. you should be able to get the same data from the API tho
#
kinduff
i'll take a dive, thanks for the direction!
#
aaronpk
oh and make sure you use the .jf2 URLs instead of .json since that's the format it uses for the webhook
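For reference, a sketch of pulling the same data via the API's .jf2 endpoint; the token and domain values are placeholders:
```python
# Sketch of fetching the same data the webmention.io webhook delivers,
# via the API's .jf2 endpoint. Token and domain are placeholders.
import requests

resp = requests.get(
    "https://webmention.io/api/mentions.jf2",
    params={"token": "XXXX", "domain": "example.com", "per-page": 100},
    timeout=30,
)
resp.raise_for_status()
for mention in resp.json().get("children", []):
    print(mention.get("wm-property"), mention.get("url"))
```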
#
kinduff
good tip! thank you!
#
[tantek]
aaronpk, would you be open to a PR to https://github.com/aaronpk/webmention.io#api adding an FAQ section? I feel like I've heard this question asked before
#
Loqi
[aaronpk] webmention.io: Easily enable webmentions and pingbacks on any web page
#
aaronpk
definitely
#
superkuh
RIP Google search, 1998-2019. Google search will no longer return more than 400 results for any query. It's not even enough to get through the SEO spam.
#
kinduff
yeah, it's a bummer
#
kinduff
especially when spammers can get around any changes pretty fast
#
sknebel
google a word, click through to the last page of search results, it comes a lot sooner than you'd think
#
superkuh
I just had the 400 number confirmed on google support forums.
#
superkuh
It's only really obvious if you're logged in and have 100 results per page.
#
[tantek]
they've been trimming that over time for a while. didn't realize it had gotten down to 400!
#
sknebel
i.e. if I google "cheese", it says "approx 1.920.000.000 results", until I hit page 32 and suddenly it's "approx 317 results"
#
superkuh
Yep.
#
sknebel
and I'm fairly sure there are more than 317 websites that mention cheese
alex11 joined the channel
#
superkuh
See me do that with "vector" on this youtube screen capture video: https://www.youtube.com/watch?v=1fTbnA6qOh8
#
sknebel
I could imagine some people here have blogs that have more than 317 posts mentioning cheese ;)
#
superkuh
Only 388 results for the word "vector" on the entire web.
#
superkuh
With the default 10 results per page you have to waste a lot of time getting to the upper 30s pagination, but it'll crap out before 400 every time.
#
sknebel
448 hits for "coffee". now that one I'm sure some blogs mention more often ;)
#
superkuh
Sorry for being so off-topic, again. This realization just shocked me.
#
[tantek]
superkuh, it's sorta on-topic, e.g. if you have more than 400 mentions for cheese on your own blog, you should use your own search indexing / UI on your own server instead of using Google to provide your search UI for your personal site
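One possible way to do that, and only a sketch: index your posts in SQLite's FTS5 full-text engine (assuming your SQLite build includes it) and query it from your own search page; table and column names are illustrative:
```python
# One possible approach to self-hosted site search (not the only one):
# index post text in SQLite's built-in FTS5 engine and query it from your
# own search UI. Table and column names are illustrative.
import sqlite3

db = sqlite3.connect("site-search.db")
db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS posts USING fts5(url, title, content)")
db.execute(
    "INSERT INTO posts (url, title, content) VALUES (?, ?, ?)",
    ("https://example.com/2022/08/19/cheese", "Cheese", "A post that mentions cheese."),
)
db.commit()

# bm25() returns lower (more negative) scores for better matches, so plain
# ascending ORDER BY puts the best results first. No 400-result ceiling here.
for url, title in db.execute(
    "SELECT url, title FROM posts WHERE posts MATCH ? ORDER BY bm25(posts) LIMIT 50",
    ("cheese",),
):
    print(url, title)
```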
#
[tantek]
what is search?
#
Loqi
search in the IndieWeb usually refers to searching your personal site for your own content (and/or caches of content you’ve responded to), sometimes searching IndieWeb chat archives or the IndieWeb wiki, or the nascent IndieWeb Search index and service to search across community posts https://indieweb.org/search
#
superkuh
Search is the one aspect of my site I still don't have locally. :(
#
Loqi
[tantek] #180 add FAQ / replay webhooks? q&a from #indieweb-dev
#
superkuh
Comments, indieweb stuff, all that is local. But search I just have a google link.
#
[tantek]
superkuh, you could add a strong caveat about using Google for your personal site search UI here: https://indieweb.org/search#search_box_-_level_2, citing the reports of "only 400" results from Google.
superkuh joined the channel
#
kinduff
would add algolia, works pretty nice
#
[tantek]
is that what you use on your own site?
#
kinduff
i do not, but i've used it on client websites i've built
#
kinduff
mine doesn't have a search (yet)
#
[tantek]
that's usually the bar for documenting options like that: someone who has experience making it work on their own site
#
[tantek]
or rather, I think it'd be ok to document possibilities that have yet to be implemented on a personal site in a Brainstorming section
gRegorLove_ and jeroen[m] joined the channel
#
@clawfire
Well, webmentions were fun, but nobody feels like using them. So I've installed a little @commento_io and I'm going to self-host my comments on the blog.
(twitter.com/_/status/1560690834500210691)
#
capjamesg
Why does XML have namespaces?
#
aaronpk
➡️🐉
#
capjamesg
What are namespaces?
#
Loqi
namespaces are a mechanism to allow multiple properties of the same name to exist in the same object while avoiding conflicts between them https://indieweb.org/namespaces
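To make that concrete in XML terms, two elements both named title can coexist when each is qualified by a namespace URI; the URIs below are made up:
```python
# Illustration of the namespace idea Loqi describes, in XML terms: two
# elements both named "title" coexist because each is qualified by a
# (made-up) namespace URI.
import xml.etree.ElementTree as ET

doc = """
<entry xmlns:book="http://example.com/ns/book"
       xmlns:movie="http://example.com/ns/movie">
  <book:title>Plan 9 from Bell Labs</book:title>
  <movie:title>Plan 9 from Outer Space</movie:title>
</entry>
"""

root = ET.fromstring(doc)
# ElementTree exposes namespaced names as {uri}localname:
for el in root:
    print(el.tag, "->", el.text)
# {http://example.com/ns/book}title -> Plan 9 from Bell Labs
# {http://example.com/ns/movie}title -> Plan 9 from Outer Space
```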
#
sknebel
good answer Loqi :D
#
capjamesg
I was just watching a video on Plan 9 namespaces (namespacing for OS, entirely separate from XML namespaces) but XML namespaces came to mind.
#
capjamesg
A big leap (!)
#
sknebel
ah, I thought you stumbled over fallout from today's w3c experiment to redirect w3.org to https :D
#
capjamesg
Oh wow.
#
capjamesg
"The primary reason for this is that we wanted to avoid causing issues for software requesting machine-readable resources from www.w3.org such as HTML DTDs, XML Schemas, and namespace documents."
#
[snarfed]
capjamesg did you mean "Why does XML?"
#
sknebel
Update 19 Aug 2022: We ended the second test early, at 17:30 UTC today due to several complaints that this change was impacting production services. We plan to conduct another test in two weeks, for 48 hours starting at 17:00 UTC on Sep 1, ending at 17:00 UTC Sep 3. If you have dependencies on our web site in your production services please work to remove them, or update them to handle redirections and https.
#
capjamesg
Why does XML? [snarfed]
#
[snarfed]
exactly
#
capjamesg
sknebel For how long can that go on haha.
#
capjamesg
I don't understand [snarfed] :D
#
[snarfed]
exactly
#
capjamesg
Did you get my message about Plan 9 snarfed by the way?
#
[snarfed]
"XML, I don't understand"
#
capjamesg
T-shirt idea? :D
#
[snarfed]
plan 9 msg, a few min ago, here? yup I see it
#
[snarfed]
wait wat
#
capjamesg
"snarf" = "copy" in plan 9
#
capjamesg
You don't copy a word. You "snarf" a word.
#
sknebel
tbh, "getsnarf.io" sounds just like some hipster startup :P
#
[snarfed]
correct
#
capjamesg
sknebel :D
#
capjamesg
I want to go back to Plan 9 but it is so different to the OSes I have used.
#
capjamesg
Anyway. What was the topic of discussion in here today?
#
[snarfed]
topic was conneg--, always conneg--
#
Loqi
conneg has -10 karma in this channel over the last year (-13 in all channels)
#
sknebel
archive exports. search.
#
capjamesg
Oh! archive exports!
#
capjamesg
I limit results from my site search engine to 50 but only because I never implemented pagination.
#
capjamesg
I should get around to that.
#
Loqi
it is probable
#
[tantek]
namespaces--
#
Loqi
namespaces has -1 karma over the last year
#
[tantek]
what is ➡
#
Loqi
It looks like we don't have a page for "➡" yet. Would you like to create it? (Or just say "➡ is ____", a sentence describing the term)
#
[tantek]
xmlnamespaces--
#
Loqi
xmlnamespaces has -1 karma over the last year
#
@kevinmarks
↩️ Could webmention be made to work with IPFS urls too? Or would you need an HTTP endpoint to POST to?
(twitter.com/_/status/1560723190439186432)
tetov-irc and petermolnar joined the channel