#dev 2021-10-07

2021-10-07 UTC
#
[tantek]
I find this particularly frustrating as well. Kinda wish there was an easy path to "here's how to use your iPhone/iPod/iPad-mini to create & update your own projects for your own site" https://twitter.com/Pinboard/status/1445890223846526986
#
@Pinboard
And part of it is definitely that everyone today carries an incredible computer in their pocket and... you can't do anything with it. My first computer program at age 8 made an IBM desktop write profanity in an infinite loop. I wouldn't begin to know how to do that on an iPhone.
(twitter.com/_/status/1445890223846526986)
[tw2113_Slack_] and edburns[d] joined the channel
#
gRegor
aaronpk, Monocle is giving me invalid_token_response, "did not return 'me'" but my logging shows it's in the JSON
#
aaronpk
interesting
#
aaronpk
are you returning the json content type header?
#
gRegor
My token endpoint did change, so was first getting 403 in Monocle. Logged in to Aperture to refresh it
#
gRegor
Ah, that might be it
#
gRegor
Hm, no, same error now
#
aaronpk
so that's actually monocle reporting the error from Aperture
#
aaronpk
can you screenshot it?
#
gRegor
Adding more logging. I track each time a token is last accessed, the token Monocle is sending is definitely valid, so it does seem to be something with the response
#
aaronpk
hm that is less helpful than i was hoping
#
gRegor
The token verification request is for the Monocle token, but yeah the error seems to be from Aperture, which seems odd? Shouldn't it be the Aperture token?
#
aaronpk
monocle gets the token, it sends it to aperture, aperture tries to verify it at your site
#
aaronpk
do you have access to the token?
#
aaronpk
can you make this curl request to your token endpoint and paste the result? curl -i https://example.com/token -H "Accept: application/json" -H "Authorization: Bearer XXXXX"
#
aaronpk
`"me "`
#
aaronpk
you have a space
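
Editor's note: the bug above (a key spelled `"me "` instead of `"me"`) is easy to miss when reading logged JSON by eye. A minimal sketch of a check for it, assuming a token verification response shaped like the hypothetical `body` below; the function name and the sample values are illustrative and not taken from anyone's code:

```python
import json


def find_suspicious_keys(raw_body: str) -> list[str]:
    """Return top-level keys that carry leading or trailing whitespace."""
    data = json.loads(raw_body)
    return [key for key in data if key != key.strip()]


# Hypothetical response body reproducing the bug: "me " instead of "me".
body = '{"me ": "https://example.com/", "scope": "read"}'

data = json.loads(body)
print("me" in data)                # False: the exact key "me" is missing
print(find_suspicious_keys(body))  # the padded key shows up here
```

Printing keys with `repr()` in server-side logging would surface the same problem, since the trailing space becomes visible inside the quotes.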
#
gRegor
\o/ I'm in, thanks!
#
aaronpk
woohoo
#
Loqi
😃
#
@aaronpk
↩️ Dynamic Client Registration, but afaik no major provider supports this because they *want* RPs to have a pre-established relationship. We built IndieAuth to avoid the need for any client registration and it works great for that use case: https://aaronparecki.com/2018/07/07/7/oauth-for-the-open-web
(twitter.com/_/status/1445944383933595651)
gRegor, [tantek]1, IWSlackGateway, [snarfed]1, hendursaga, kogepan, jjuran, hendursa1, [pfefferle] and tetov-irc joined the channel
#
capjamesg[d]
I have noticed that some sites have feeds on all pages.
#
capjamesg[d]
I wonder how I can determine "representative feeds" that will capture most of a site's new content without reindexing individual posts on account of their being linked to a feed (i.e. comments feed).
#
capjamesg[d]
The homepage is the obvious way to go.
#
capjamesg[d]
But this would mean that likes, etc. might not be discovered as easily.
#
nekr0z
My site might be a problem. I have several homepage feeds (h-feed as a separate page because I couldn't fit it into the homepage markup, and 4 RSS feeds for different languages/combinations), every category and tag has its own RSS feed (in case someone wants to subscribe to just that; 4 feeds actually, the language variants), and also every page has a feed of comments. Should be a nightmare for your crawler :-D
#
capjamesg[d]
nekr0z The crawler has found at least 5,000 feeds today 😄
#
capjamesg[d]
That's why I raised this.
#
capjamesg[d]
There's also the matter of choosing which format is representative.
#
capjamesg[d]
I have an RSS and h-feed on my home page. They will both mark up the same data. Which should be considered representative? Otherwise I would be crawling two different feeds for the same content.
#
capjamesg[d]
Thus increasing crawl time and load for the site being crawled.
#
capjamesg[d]
As it stands, the crawler has found 1 feed for every 10 pages.
#
nekr0z
capjamesg[d]: That's a tough one: define "representative feed for a site". Because if you're looking for a feed that has _everything_ published on the site, my site doesn't have it ;-)
#
capjamesg[d]
There are contingencies for that 🙂
#
capjamesg[d]
(namely, crawling sitemaps and, although this has not been coded yet, scheduling full recrawls)
#
[KevinMarks]
h-feed can be richer, but if you have backcompat parsing you'll get wordpress mf1 ones which can be a bit thin. You may need to compare them on first crawl and pick a winner
#
GWG
[KevinMarks]: I have a better way to get WordPress sites.
#
GWG
They have a built in API
#
[KevinMarks]
a lot of technorati secret sauce was correlating the feed version with the html version to decide which was richer. You can also have the case where the h-feed or atom/rss feed is summary not content, where you may need to crawl the post page for full.
#
capjamesg[d]
I forgot about the WP API. Which is worth considering due to the number of sites that use WP.
#
[KevinMarks]
also atom and h-feed can tell you if they're summary vs content, but rss uses description for both so you have to guess
#
capjamesg[d]
There is no backcompat for mf1. Should there be?
#
capjamesg[d]
That makes sense.
#
capjamesg[d]
> "correlating the feed version with the html version to decide which was richer. "
#
capjamesg[d]
What matters right now is just retrieving URLs. The crawler doesn't ingest data directly from XML / JSON Feed / WebSub.
#
[KevinMarks]
if you're using the python microformats parser it will construct a minimal h-feed from wordpress mf1 markup
#
capjamesg[d]
Oh that is cool.
#
capjamesg[d]
I am using mf2py which looks to support mf1 too.
#
[KevinMarks]
right, but that can be confusing as the fallback depends on the theme - hentry is kinda vestigial in WP
#
capjamesg[d]
That makes sense. I think some backcompat is a nice to have right now. The engine parses a lot of mf2 so adding mf1 support would take time.
hs0ucy and akevinhuang joined the channel
#
[snarfed]1
aaronpk re Dynamic Client Registration, Mastodon does its own thing, not that, right?
#
aaronpk
it's similar but not based on the standard IIRC
#
sknebel
very similar yes, but not entirely compatible
#
[snarfed]1
:thumbsup:
#
sknebel
maybe they based it on an early draft?
#
sknebel
it's weirdly close for not being identical
gRegor and jjuran joined the channel
#
nekr0z
capjamesg[d]: looks like indieweb-search has troubles parsing pages on my site. Some have already been recrawled to show author and an avatar, but more often than not they show the author of the first comment, not the article itself :)
#
capjamesg[d]
Ah no 😦
#
capjamesg[d]
What is your domain name?
#
capjamesg[d]
In terms of code, it's easy to fix that. But not without doing a full recrawl 🤦
#
nekr0z
capjamesg[d]: evgenykuznetsov.org
#
nekr0z
capjamesg[d]: I don't mind waiting for a recrawl, that's not the issue. I just wanted to let you know that maybe the code needs fixing.
#
capjamesg[d]
The code does need fixing 😄
#
capjamesg[d]
This is related to the discussion on representative h-feeds. I shouldn't choose a h-card if it is in a h-feed.
jjuran joined the channel
#
nekr0z
Actually, it's not the first comment it gets, but the first like, i.e. the first face in the facepile. It ignores the fact that every post has a proper p-author in the h-entry, and instead takes the first u-author from the facepile (and that happens to be inside h-cite u-like, so it should definitely not count as the post's author no matter what).
#
nekr0z
capjamesg[d]: Nope, these cards are not in h-feed, just in a separate h-cite and u-like (or p-comment) each.
#
capjamesg[d]
I have disabled the feature where h_card_domain != domain name.
#
capjamesg[d]
I think I'm actually choosing the first h-card on a page.
#
capjamesg[d]
If none exists, I revert to the home page h-card.
#
capjamesg[d]
(if there is none on the home page, then the value for h_card is set to none)
#
capjamesg[d]
Your icons will appear, just after I recrawl again 🙂
#
capjamesg[d]
The issue was that I relied on the first h-card on a page other than the home page, even if, as you said, it was in a u-like-of, etc.
#
nekr0z
capjamesg[d]: Not in this case, no. The "real" p-author is much higher up every page :)
#
capjamesg[d]
Great catch! nekr0z++
#
Loqi
nekr0z has 4 karma in this channel over the last year (5 in all channels)
#
capjamesg[d]
Yeah, indeed!
#
capjamesg[d]
I need to do a bit more debugging...
#
capjamesg[d]
Search is hard 😦
#
capjamesg[d]
But fun 🙂
#
nekr0z
I remember running into all kinds of issues when I was writing the code to find a representative h-card on a page; it turned out that people do all kinds of strange and seemingly counter-intuitive things with mf2 :) The code ended up being 3x longer than the initial version.
#
nekr0z
And there's a spec for what counts as a representative h-card, mind you!
#
capjamesg[d]
I have the spec open now.
#
capjamesg[d]
Thanks for reminding me about that.
#
nekr0z
I'd suggest that the author of a h-entry should be considered whoever is mentioned as p-author or u-author directly under h-entry (i.e. not inside any h-cite, p-comment or some such). At least that was the logic I used on my site, and the microformats wiki seems to agree.
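
Editor's note: a minimal sketch of the rule described above, assuming the page has already been parsed into the standard mf2 JSON structure (for example by mf2py). In that structure an author nested inside a like or comment lives under that nested item's own `properties`, so reading only the h-entry's direct `author` property skips the facepile automatically. The function name and sample data are hypothetical, and this covers only the first step of the authorship algorithm (no author-page fetching):

```python
def entry_author(h_entry: dict):
    """Return the name (or raw value) of the author listed directly on an h-entry."""
    authors = h_entry.get("properties", {}).get("author", [])
    if not authors:
        return None
    first = authors[0]
    if isinstance(first, dict):
        # Embedded h-card: prefer its name, fall back to its plain value.
        names = first.get("properties", {}).get("name", [])
        return names[0] if names else first.get("value")
    return first  # plain string: a name or a URL


# Hypothetical parsed h-entry: a like with its own author, plus the real author.
entry = {
    "type": ["h-entry"],
    "properties": {
        "like": [{
            "type": ["h-cite"],
            "properties": {"author": [{
                "type": ["h-card"],
                "properties": {"name": ["Someone Who Liked It"]},
            }]},
        }],
        "author": [{
            "type": ["h-card"],
            "properties": {"name": ["Post Author"]},
        }],
    },
}

print(entry_author(entry))
```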
#
capjamesg[d]
Oh, really interesting... I'll adhere to this spec in the logic.
#
capjamesg[d]
Yeah. The author should be linked rather than "guessing" by choosing the first h-card.
#
capjamesg[d]
Especially considering, say, the h-card author could post their h-card in the footer, below all the comments.
#
nekr0z
There's but one flaw in this logic.
#
nekr0z
If I ever decide to invite someone to write a post on my site (let's say my wife or my father who don't have their own websites), I'll make sure to identify them as p-author, but that h-card would not be representative per spec.
[pfefferle] and gRegorLove_ joined the channel
#
[KevinMarks]
Well, if it's inside the post it's representative for the post, just not the site
#
nekr0z
[KevinMarks]: Sure, but we don't have a spec for that (or do we?)
#
[snarfed]1
we have separate authorship vs representative h-card algorithms, right? https://indieweb.org/authorship
#
capjamesg[d]
nekr0z You can contribute a solution to the engine if you like 😄 https://github.com/capjamesg/indieweb-search/blob/main/crawler/add_to_database.py#L35
strugee_ joined the channel
#
[tantek]1
snarfed++ thank you
#
Loqi
snarfed has 33 karma in this channel over the last year (61 in all channels)
#
[tantek]1
also the embedded image here is a good case for Micropub (and "just text files on my computer") https://twitter.com/RobertHaisfield/status/1385228672580218882
#
nekr0z
capjamesg[d]: I didn't learn that much Python to be of any good there yet, but thanks for the confidence.
#
nekr0z
capjamesg[d]: also, what [snarfed]1 said about the authorship
#
nekr0z
[snarfed]1++
#
Loqi
[snarfed]1 has 1 karma over the last year
#
nekr0z
capjamesg[d]: https://indieweb.org/authorship-spec - I wasn't aware this existed, but apparently it does!
Zegnat, Guest2366 and unrelentingtech1 joined the channel
#
capjamesg[d]
nekr0z 🙂
#
capjamesg[d]
+1 [snarfed]
#
capjamesg[d]
nekr0z and there is a test suite!
#
capjamesg[d]
I'll add support for this.
#
capjamesg[d]
Great find!
#
capjamesg[d]
What is an "author-page"?
#
Loqi
It looks like we don't have a page for "author-page" yet. Would you like to create it? (Or just say "author-page is ____", a sentence describing the term)
#
capjamesg[d]
nekr0z I saw it was italicized in the spec algorithm.
#
capjamesg[d]
"if there is no author-page and the h-entry's page is a permalink page, then ..."
#
nekr0z
capjamesg[d]: As far as I understand, if `author` does not contain an `h-card` but is a http(s) URL instead, the page it links to is called author-page
#
capjamesg[d]
nekr0z I just wrote some code that complies with the spec. Thanks for helping!
#
capjamesg[d]
I love the .rocks test suites for indieweb technologies.
#
capjamesg[d]
aaronpk Do you have any intentions to make a microsub.rocks service?
#
aaronpk
eventually 😅
#
aaronpk
there's a long list tho
#
nekr0z
capjamesg[d]: oh, you're welcome! I can't code in python much (yet), but I very much like what you're doing and am eager to help in the ways I can, even if it's only testing and reporting ;)
#
GWG
aaronpk: Where do you keep your itch list?
#
aaronpk
combination of text files, notion, todoist, and teuxdeux
#
aaronpk
it's not a good system haha
#
capjamesg[d]
Thank you nekr0z! Feel free to file as many bugs as you find! (and if you feel like the search engine is missing something that you want to see, let me know!)
#
capjamesg[d]
I want to make it as "indieweb" as possible 🙂
#
capjamesg[d]
What is teuxdeux?
#
Loqi
It looks like we don't have a page for "teuxdeux" yet. Would you like to create it? (Or just say "teuxdeux is ____", a sentence describing the term)
#
aaronpk
no thanks loqi
#
GWG
I should update my list too
#
Loqi
I agree
#
angelo
capjamesg[d]: fyi the published dates of your posts are "2021-00-29"
#
hs0ucy
At least the year is not 1970 ^^
#
angelo
the dates of the posts *on your homepage*
#
capjamesg[d]
My blog home page?
#
capjamesg[d]
That's odd haha.
#
capjamesg[d]
Thanks for sharing aaronpk.
#
capjamesg[d]
Ah, I was using a capital M for month when it should have been lower case.
#
Loqi
angelo has 1 karma in this channel over the last year (2 in all channels)
#
[tantek]1
that sounds like datetime sprintf codes or something
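
Editor's note: the "2021-00-29" dates are consistent with a strftime-style format string in which `%M` (minute) was used where `%m` (month) was intended. The log does not show the site's actual template, so this is a sketch using Python's `strftime` codes, with a made-up timestamp:

```python
from datetime import datetime

# Hypothetical post timestamp; the minute happens to be 00.
published = datetime(2021, 9, 29, 14, 0)

wrong = published.strftime("%Y-%M-%d")  # %M is the minute, not the month
right = published.strftime("%Y-%m-%d")  # %m is the zero-padded month

print(wrong)  # 2021-00-29
print(right)  # 2021-09-29
```

Because a date-only value usually has its time set to midnight, the minute is 00 and the mistake yields a plausible-looking but impossible month.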
#
capjamesg[d]
I just fixed that issue.
#
capjamesg[d]
Thanks everyone!
marksuth1, [schmarty]1, [tw2113_Slack_]1, IWSlackGateway3, [snarfed] and [fluffy] joined the channel
#
capjamesg[d]
nekr0z authorship code is now public on GitHub. Thanks again!
#
[fluffy]
[capjamesg] let’s move that markdown chat here 🙂
#
[fluffy]
Some Markdown processors do support alternate rendering modes/backends/whatever but generally they don’t work so great on content with embedded HTML.
#
[fluffy]
For example, Misaka provides a means to replace the rendering stuff, which I use in Publ for specific things, but in those contexts I generally strip out all/most of the HTML.
#
capjamesg[d]
I use Typora for almost all of my non-dev markdown (i.e. adding posts manually to my site in my IDE). They render quite a bit of HTML but tbh I don't like that feature.
#
[fluffy]
like in Publ I have a Markdown renderer that’s specifically for rendering table of contents entries.
#
[fluffy]
(which still emits HTML in the end but it strips out everything that wouldn’t be relevant to a TOC)
#
capjamesg[d]
That makes sense.
#
capjamesg[d]
In my Jekyll case, I am more using .md in the spirit of front matter. I actually wonder if I could use a .html file instead.
#
[fluffy]
also FWIW there’s also an HTML postprocessor that everything goes through; HTML entries just don’t go through the Markdown processing stuff first.
#
capjamesg[d]
I think Jekyll might support .html and front matter as part of their parsing.
#
[fluffy]
yeah Publ supports plain-HTML entries as well as Markdown
#
capjamesg[d]
(But all of my blog posts are in almost completely pure markdown because, well, I wrote them in markdown)
#
sknebel
I think some inline HTML is fine. e.g. if you wanted to use <ins>/<del> tags.
#
[fluffy]
and Publ uses RFC2822-style frontmatter, as opposed to Jekyll’s YAML-based stuff
#
capjamesg[d]
Wow! I had no idea they existed sknebel.
#
[fluffy]
RFC2822 is way easier to parse since you can use the standard email parsing library 🙂
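
Editor's note: a minimal sketch of the point above, using Python's standard-library `email` package to split RFC 2822-style headers from the body. The header names (`Title`, `Date`, `Tag`) and the sample entry are assumptions for illustration, not Publ's or Pelican's actual schema or code:

```python
from email import message_from_string

# Hypothetical entry file: headers, a blank line, then the Markdown body.
entry_text = """Title: Hello world
Date: 2021-10-07
Tag: indieweb
Tag: markdown

This is the *body* of the entry.
"""

msg = message_from_string(entry_text)

title = msg["Title"]         # first value of a header
tags = msg.get_all("Tag")    # repeated headers come back as a list
body = msg.get_payload()     # everything after the first blank line

print(title, tags)
print(body)
```

Header lookups are case-insensitive, and repeated headers give multi-valued fields without any extra syntax.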
#
capjamesg[d]
I did a quiz on semantic HTML a few weeks ago. It was fun.
#
[fluffy]
yeah ins and del are the semantic equivalents of strikeout
#
[fluffy]
Markdown has support for del via `~~` but not for ins
#
[fluffy]
Publ and Pelican both use RFC2822 for their frontmatter formatting, and are both written in Python… coincidence?!?!
#
[fluffy]
Sadly I haven’t found any Markdown editors which support RFC2822, everything that has any specific frontmatter support only supports Jekyll-style.
#
sknebel
and python "happens" to have a fitting parser in the stdlib? :D
#
capjamesg[d]
I like Jekyll's suggested layout because it is clear to me. Front matter is visually and programmatically distinct from the page content.
#
hs0ucy
That's why i'm not a fan of front matter ... it's easy to extend a SSG to other markup languages without it
#
[fluffy]
eh, the only argument I can see for that is if frontmatter is optional. But in Publ it isn’t.
#
sknebel
I thought about doing custom "HTML" tags. e.g. something like <person-tag>fluffy</person-tag>, then HTML parse the output and replace that with some generated <a href="" ...>
#
sknebel
(I think some pieces of that actually might exist in my website code somewhere)
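
Editor's note: a rough sketch of the custom-tag idea above. It uses a regular-expression substitution rather than a real HTML parse, which is a simplification of what was described; the `PEOPLE` lookup table, the URL, and the `h-card` class on the generated link are all assumptions:

```python
import re
from html import escape

# Hypothetical lookup from short names to profile URLs.
PEOPLE = {"fluffy": "https://example.com/fluffy"}

PERSON_TAG = re.compile(r"<person-tag>(.*?)</person-tag>", re.DOTALL)


def expand_person_tags(html_text: str) -> str:
    """Replace <person-tag>name</person-tag> with a link when the name is known."""
    def replace(match: re.Match) -> str:
        name = match.group(1).strip()
        url = PEOPLE.get(name)
        if url is None:
            return escape(name)  # unknown person: keep the bare name
        return f'<a class="h-card" href="{escape(url, quote=True)}">{escape(name)}</a>'

    return PERSON_TAG.sub(replace, html_text)


print(expand_person_tags("<p>Thanks <person-tag>fluffy</person-tag>!</p>"))
```

A regex is enough for a flat tag like this one; nested or attribute-bearing custom tags would call for an actual HTML parser.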
#
[fluffy]
yeah quasi-HTML is a fine way to do things
#
capjamesg[d]
I can see the convenience factor in that.
#
capjamesg[d]
Especially for common things.
#
sknebel
because in "all" non-html markup formats custom elements seemed kind of annoying to do or to remember
#
[fluffy]
Movable Type used its own custom SGML tags for its templating system, which I had mixed feelings about but it at least had the advantage of kinda-sorta working with Dreamweaver.
#
capjamesg[d]
Am I right in saying you can actually make custom HTML elements?
#
capjamesg[d]
I think I saw this somewhere.
#
hs0ucy
yep
#
sknebel
(I think rst did it somewhat well? but then I didnt like rst for other reasons I guess)
#
capjamesg[d]
I like my markdown as simple as I can get it 🙂
#
[fluffy]
HTML5 provides a mechanism yeah, although most people don’t follow it correctly 😛
#
capjamesg[d]
[fluffy] I don't have any use case for it.
#
[fluffy]
man I really cannot wrap my head around rst
#
capjamesg[d]
But it's interesting.
#
capjamesg[d]
That's what I was reading.
#
capjamesg[d]
I have mixed feelings on that. On the one hand, there is flexibility. But on the other, it makes it harder to distinguish semantic HTML from custom HTML to those who haven't done much with HTML.
#
hs0ucy
yep
#
capjamesg[d]
I haven't used it so I can't say it is bad or good. Those are my thoughts right now, however.
#
sknebel
also requires JS, i.e. not something that would work with elements in mf2
#
hs0ucy
it's more for app or templating ... component
#
capjamesg[d]
Semantic HTML is rich in itself. I can't think of a moment where I thought to myself "what I need here is a new HTML element..."
#
capjamesg[d]
That makes sense hs0ucy.
#
hs0ucy
capjamesg[d]: HTML has ~130 elements already ^^
#
capjamesg[d]
More than enough!
#
capjamesg[d]
(for most use cases, I'm sure there are edge cases)
#
hs0ucy
i agree
#
hs0ucy
Anyway with tables and divs you can do everything :P
#
nekr0z
<sknebel> "I thought about doing custom "..." <- This is one of the reasons I like Hugo so much. It uses Go-style templates and lets you easily declare pseudo-tags (or "shortcodes") that can be quite complex and use them in markdown or html content files.
#
sknebel
true, using a templating language in the data files also works for that
akevinhuang2 joined the channel
#
sknebel
there was an interesting engine that did things on DOM level if I remember right, if I could remember the name ...
#
sknebel
" Soupault makes HTML a first-class citizen. Like web browsers, it can parse HTML and manipulate the element tree of the page. However, it saves the result in a static page rather than displays it on screen. "
#
hs0ucy
sknebel: it's an SSG or what?
#
[KevinMarks]
The new html element I want is <playlist >
#
capjamesg[d]
What would that do [KevinMarks]?
#
[KevinMarks]
And yes I could in theory use SMIL but no.
#
sknebel
hs0ucy: yes
#
capjamesg[d]
> Anyway with tables and divs you can do everything 😛
#
[KevinMarks]
It would play a series of <audio> or <video> elements without gaps
#
[KevinMarks]
Instead we have HLS. Sigh.
#
hs0ucy
[KevinMarks]: SMIL is deprecated no?
#
aaronpk
HLS serves a different purpose than a series of <video> elements
#
[KevinMarks]
Right. I mean I could use it to make a playlist but then I'm writing 80s playlist formats in a separate file.
#
hs0ucy
[KevinMarks]: multiple <audio> elements in <ul> ? :P
hendursaga joined the channel
#
[KevinMarks]
Without gaps is the hard part. You really need to be the browser to do that right. You can do it with low level audio api but that's a pain.
#
hs0ucy
You mean sound gaps?
#
hs0ucy
sknebel: there's good ideas in soupault
#
hs0ucy
maybe I'll borrow some for my ssg
#
sknebel
yeah, it's different with some nice ideas
#
sknebel
the creator was in chat here too at least for a while
#
[KevinMarks]
Sound gaps, but video gaps too. To avoid sound gaps you need to be sample accurate, which needs native code
#
[fluffy]
You could do things gapless using webaudio but that’s like using a tactical nuke to take care of a crabgrass infestation
#
[fluffy]
… wait no, deploying a tactical nuke is a lot easier than using webaudio
[Ed_Beck], voxpelli_, Allie, chenghiz_, willnorris, tetov-irc and gRegor joined the channel
#
@samuelgoto
↩️ @aaronpk on a related note: does any part of IndieAuth break when browsers block third party cookies?
(twitter.com/_/status/1446259167178608640)
hs0ucy joined the channel