#dev 2022-04-08

2022-04-08 UTC
nertzy, lagash, YimingWu[d], Silicon[d], jacky, gRegor, gRegorLove_, johnnrs[d] and sayanarijit[d] joined the channel
#
jacky
as I'm reading https://www.inkandswitch.com/cambria/, I'm wondering if some sort of version stamping should be added to things like Microsub and Micropub
#
jacky
to help with versioning and API translations
#
jacky
like if we have a breaking change (like renaming `action` to `act` or introducing PATCH for particular operations), this has a notion of a translation layer (Stripe's approach is probably the most ideal) so requests can be 'upgraded' or 'downgraded' to the version understood by a server
#
jacky
this is meant for local-first software but I can see how this benefits non-local-first
#
jacky
tl;dr: it's schema migrations for API contracts
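A minimal sketch of the translation-layer idea above, assuming integer version numbers and a registry of single-step migrations. The names `MIGRATIONS`, `_rename` and `translate` are hypothetical and are not part of Micropub, Microsub or Cambria; only the `action` to `act` rename comes from the discussion.

```python
from typing import Callable, Dict, Tuple

Payload = Dict[str, object]


def _rename(old: str, new: str) -> Callable[[Payload], Payload]:
    """Build a migration step that renames one top-level key."""
    def apply(payload: Payload) -> Payload:
        out = dict(payload)  # copy so the caller's payload is untouched
        if old in out:
            out[new] = out.pop(old)
        return out
    return apply


# Hypothetical registry: (from_version, to_version) -> single-step translation.
MIGRATIONS: Dict[Tuple[int, int], Callable[[Payload], Payload]] = {
    (1, 2): _rename("action", "act"),  # upgrade
    (2, 1): _rename("act", "action"),  # downgrade
}


def translate(payload: Payload, have: int, want: int) -> Payload:
    """Walk one version at a time from `have` to `want`."""
    step = 1 if want > have else -1
    version = have
    while version != want:
        payload = MIGRATIONS[(version, version + step)](payload)
        version += step
    return payload


old_request = {"action": "update", "url": "https://example.com/post"}
new_request = translate(old_request, have=1, want=2)
```

Chaining single-step migrations means each new version only needs one upgrade and one downgrade function, rather than a translation for every pair of versions.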
gRegorLove_, jacky, trig[d], johnnrs[d], Asaf_Agranat[d], laker[d], cygnoir[d], dovedozen[d], sayanarijit[d], hepphepp[d], samhenrigold[d], YimingWu[d], aspenmayer[d], Silicon[d], Jeremiah[d], tracydurnell[d], indieweb-irc-bri, corenominal[d], Nan[d], capjamesg[d], hoenir, shaunix[d], wackycity[d], balupton[d] and mro joined the channel
#
capjamesg[d]
jacky I was thinking about the societal impacts of large-scale, easy to access / crawl social graphs.
#
capjamesg[d]
With h-cards, governments could maintain a database of changes to your profile over time without having to circumvent any social media businesses' crawling limits.
#
capjamesg[d]
I am unsure whether this is a legitimate concern but it feels like one.
#
capjamesg[d]
What if everyone in the UK had an h-card on their site? Would people start mapping friends to create massive datasets for advertising purposes?
MarkJR84[d] joined the channel
#
doosboox
capjamesg[d]: it definitely is a valid concern. That said they wouldn't really have a way of mapping "friends" in a reliable manner, nor know which forums you are frequenting or what kind of information you usually search for.
#
doosboox
capjamesg[d]: btw, I think I've asked this before but what do you use for the indexing and search for your search engine?
#
capjamesg[d]
That is valid. I think I let XFN creep in there a bit without being explicit. I concur with what you said about mapping friends.
#
capjamesg[d]
I use Elasticsearch doosboox.
#
capjamesg[d]
8GB server holds the 400k or so documents.
#
capjamesg[d]
As for the actual querying, I have a Python Flask server that wraps around Elasticsearch and turns human queries ("what is X...") into the right schema.
#
capjamesg[d]
The schema can vary depending on if a query is a "discover" query (find people whose h-cards mention something) or if the query needs to be ordered in some way.
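As a rough illustration of a request body that varies by query type, here is a sketch that builds Elasticsearch query DSL as a plain dictionary. The field names (`h_card`, `title`, `content`) and the `kind` and `order_by` parameters are assumptions made for the example, not the schema of the actual index.

```python
from typing import Any, Dict, Optional


def build_search_body(
    terms: str, kind: str = "default", order_by: Optional[str] = None
) -> Dict[str, Any]:
    """Return an Elasticsearch request body for the given kind of query."""
    if kind == "discover":
        # Hypothetical field holding text taken from a page's h-card.
        query: Dict[str, Any] = {"match": {"h_card": terms}}
    else:
        # Hypothetical fields for ordinary full-text search.
        query = {"multi_match": {"query": terms, "fields": ["title", "content"]}}

    body: Dict[str, Any] = {"query": query}
    if order_by:
        # Only add a sort clause when the query asks for an ordering.
        body["sort"] = [{order_by: {"order": "desc"}}]
    return body
```

A Flask view could pass the resulting dictionary to whichever Elasticsearch client it uses; that wiring is left out here.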
#
doosboox
capjamesg[d]: how much of this have you built yourself and how much is off the shelf components?
#
capjamesg[d]
Elasticsearch is off the shelf. The crawler and search result representation (query cleaning, featured snippet extraction, etc.) are mine.
gRegor joined the channel
#
capjamesg[d]
Then things like post type discovery are part of indieweb-utils.
#
doosboox
how do you parse and translate human queries?
#
capjamesg[d]
The approach is naive right now.
#
capjamesg[d]
If a question contains a "what is" at the beginning, for example, the engine looks for a <dfn>, a direct answer in HTML documents that is likely to match based on a few semantic rules, and a couple of other things.
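One way to sketch the `<dfn>` lookup with only the Python standard library is shown below. The class and function names are hypothetical, and this covers only the `<dfn>` part of the rule, not the other semantic checks mentioned.

```python
from html.parser import HTMLParser
from typing import List


class DfnCollector(HTMLParser):
    """Collect the text content of every <dfn> element in a document."""

    def __init__(self) -> None:
        super().__init__()
        self._depth = 0               # > 0 while inside a <dfn>
        self._buffer: List[str] = []  # text pieces of the current <dfn>
        self.terms: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "dfn":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "dfn" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.terms.append("".join(self._buffer).strip())
                self._buffer = []

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def find_definitions(html: str) -> List[str]:
    """Return the defined terms found in an HTML string."""
    parser = DfnCollector()
    parser.feed(html)
    parser.close()
    return parser.terms
```

A "what is X" query could then compare X against the returned terms for each of the top few results.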
#
capjamesg[d]
Or if a question starts with "who is", the engine will look to retrieve an h-card for the person whose name is mentioned if one is available.
#
capjamesg[d]
To keep this efficient, this search only runs on the top few results.
#
capjamesg[d]
The assumption is that if the engine has information on, say, my h-card, it would show up highly for a search "who is jamesg.blog" (with "who is" filtered out because it is not useful information to query in the index).
#
capjamesg[d]
At scale, these sorts of naive rules might fall apart as an index grows and have to give way to more complex logic. But I'm not building a really big search engine like Google 🙂
#
capjamesg[d]
I also remove some punctuation and transform the final, cleaned query into Elasticsearch syntax (e.g. if a user has provided a keyword that requires a certain filter to be used). I can't think of any of these syntax examples off the top of my head but I remember building at least one.
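The query-cleaning steps described above could look roughly like this. It is a sketch under assumptions: the intent names, the exact punctuation set and the lowercasing step are guesses for illustration, not the engine's real rules.

```python
import re
from typing import Tuple

# Hypothetical mapping from question prefix to the kind of answer to look for.
QUESTION_PREFIXES = {"what is": "definition", "who is": "person"}

# Strip only some punctuation; dots and hyphens stay so domains survive.
_PUNCTUATION = re.compile(r"[?!,;:\"']")


def clean_query(raw: str) -> Tuple[str, str]:
    """Return (intent, cleaned terms) for a human-written query."""
    text = " ".join(raw.lower().split())  # lowercase, collapse whitespace
    intent = "default"
    for prefix, name in QUESTION_PREFIXES.items():
        if text.startswith(prefix + " "):
            intent = name
            # Drop the prefix: it carries no information for the index.
            text = text[len(prefix) + 1:]
            break
    text = _PUNCTUATION.sub("", text).strip()
    return intent, text
```

The returned intent can then decide whether to look for a `<dfn>` or an h-card in the top few results.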
petermolnar, hoenir, capjamesg[d], shaunix[d], corenominal[d], wackycity[d], MarkJR84[d], edburns[d], indieweb-irc-bri, Crypto[d], Nezteb[d], edgeduchess[d], grantcodes[d], mro, niklasfyi[d], Murray[d], YimingWu[d], laker[d] and omz13 joined the channel
#
jamietanna
jacky +1 on versioning, but I'd say there are quite a few industry-used means for doing it we could follow? That does look interesting as an approach
[James_Van_Dyne] joined the channel
#
petermolnar
if anyone wants the avatars from /chat-names , I made a quick hack at petermolnar.net/indiewebavatars.php?name=[username] but it's far from perfect
#
capjamesg[d]
petermolnar++
#
Loqi
petermolnar has 8 karma in this channel over the last year (40 in all channels)
tetov-irc and Murray[d] joined the channel
#
Caesar[m]
<petermolnar> "if anyone wants the avatars from..." <- Maybe an idealistic dream, but from an indieweb viewpoint it would be great if avatars were picked up from our websites instead of having to manually update them at /chat-names, wiki sparklines, etc
balupton[d], kimberlyhirsh[d] and aspenmayer[d] joined the channel
#
Murray[d]
Wait, Loqi is a dinosaur on Discord!
#
Murray[d]
petermolnar++
#
Loqi
petermolnar has 9 karma in this channel over the last year (41 in all channels)
tracydurnell[d], petermolnar, hepphepp[d], dovedozen[d], Nan[d], nertzy, mro, jacky, gRegor, baracurda, cambridgeport90, samhenrigold[d], sayanarijit[d], Ramon[d], omz13 and Jeremiah[d] joined the channel
#
jacky
so my site uses the async webmention callback flow (https://indieweb.org/Webmention-brainstorming#Asynchronous_status_notification) to handle ingestion of Webmentions so it's more of a push flow versus pulling/polling (although I do have support for that to make quick importing simpler)
#
jacky
actually I think I answered my own potential question (how do I work with services that don't support callbacks?) - by mainly avoiding them or falling back to a poll of webmentions
#
sknebel
not sure what "mainly avoiding them" means in this case, since you can't choose if the site you send a WM to supports it or not
mro joined the channel
#
jacky
ah I should have mentioned that my site doesn't do a lot of the work for webmention processing
#
jacky
like Lighthouse does the actual work of sending and receiving, and it could probably do some work to just invoke the callback, marking the webmention as sent, after an hour or so if nothing happened
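A sketch of that timeout fallback, assuming a simple in-memory tracker. All names here are hypothetical and this is not Lighthouse's actual code; treating a silent receiver as "sent" after an hour is one reading of the message above.

```python
from dataclasses import dataclass, field
from typing import Dict, List

FALLBACK_AFTER_SECONDS = 60 * 60  # "an hour or so"


@dataclass
class PendingMention:
    source: str
    target: str
    sent_at: float
    status: str = "pending"


@dataclass
class MentionTracker:
    pending: Dict[str, PendingMention] = field(default_factory=dict)

    def record_sent(self, mention_id: str, source: str, target: str, now: float) -> None:
        """Remember a webmention that was just sent and awaits a callback."""
        self.pending[mention_id] = PendingMention(source, target, sent_at=now)

    def on_callback(self, mention_id: str, status: str) -> bool:
        """Push path: the receiver reported a status for this mention."""
        mention = self.pending.get(mention_id)
        if mention is None:
            return False
        mention.status = status
        return True

    def expire_stale(self, now: float) -> List[str]:
        """Fallback path: mark mentions with no callback after the timeout."""
        expired = []
        for mention_id, mention in self.pending.items():
            if mention.status == "pending" and now - mention.sent_at >= FALLBACK_AFTER_SECONDS:
                mention.status = "sent"
                expired.append(mention_id)
        return expired
```

Passing `now` in explicitly keeps the tracker easy to test; a real service would call `expire_stale` from a periodic job with the current time.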
#
sknebel
ok, yeah, for integration between bits of your site you can of course do that
#
sknebel
(I would've considered Lighthouse part of it)
#
jacky
okay I think I have a decent flow now that's all push-based
#
jacky
the tests lead me to think so lol
#
jacky
might blog about it
gRegor, mro, baracurda, adstew, JPax[m], KartikPrabhu, Silicon[d], cygnoir[d], Christian_Olivie, Darius_Dunlap[d], jacky and yequari[d] joined the channel
#
jacky
what is rel=subscribe
#
Loqi
rel-subscribe is an experimental rel value for linking from your home page to your subscription endpoint, and is currently prototyped by Aaron Parecki on aaronparecki.com; try the Follow button at https://aaronparecki.com/follow or any permalink https://indieweb.org/rel-subscribe
#
jacky
keep coming back to this
#
jacky
like if something like https://subtome.com was baked into browsers, I'd be glad
#
sknebel
in old Firefox, readers could register themselves for feeds
#
sknebel
but instead of extending that to feed discovery with a button in the UI they killed that stuff completely
[jeremycherfas] joined the channel
#
jacky
that could have been a really good feature in itself
#
jacky
hmm I wonder if a bridge to use as a subscription endpoint could help people
#
jacky
like if they don't have Microsub on their site but they do use Feedly, it could point them to a page that'd subscribe them in Feedly
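The bridge idea could be sketched as a lookup from reader name to a subscribe-page URL template. The Feedly URL pattern below is an assumption based on the commonly used subscribe-link form and should be verified against Feedly before relying on it; `READER_TEMPLATES` and `subscribe_url` are hypothetical names.

```python
from urllib.parse import quote

# Assumed URL templates; check each reader's documentation before use.
READER_TEMPLATES = {
    "feedly": "https://feedly.com/i/subscription/feed/{feed}",
}


def subscribe_url(reader: str, feed_url: str) -> str:
    """Build the page a bridge could redirect to for a given reader."""
    template = READER_TEMPLATES[reader]  # raises KeyError for unknown readers
    # Percent-encode the whole feed URL so it is safe as a path segment.
    return template.format(feed=quote(feed_url, safe=""))
```

A bridge endpoint would look up the visitor's preferred reader and redirect to the returned URL.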
#
jacky
runs to user page
#
jacky
tbh the simplest form of this would be people being able to use their site to follow one another (which is good in itself) - the field could be autopopulated with the URL
#
jacky
I do think [schmarty] wrote something about autocomplete and URLs tho
#
[schmarty]
and in answer to a clarifying suggestion: https://martymcgui.re/2020/05/26/121444/
#
Loqi
[Marty McGuire] Thanks, Ryan! I see the same behavior on Firefox, but probably wasn’t clear explaining it (“start typing from somewhere in the middle”). I do use this when I’m on my full computer, and it helps! Where I get really frustrated is on my iOS dev...
#
jacky
heh nice
#
@jackyalcine
↩️ I do! I actually wrote this reply to you from my site (heavy lifting done by http://brid.gy). I like having a place where I can point to people and say, “I did that. It might not be awesome, it might look weird, but I DID THAT”. And then put a… https://jacky.wtf/2022/4/ve/veExvi96fU7VC-75oi8f-neO
(twitter.com/_/status/1512539024329744387)
#
[tantek]1
jacky++
#
Loqi
jacky has 29 karma in this channel over the last year (70 in all channels)
KartikPrabhu, paulrobertlloyd and ShinyCyril joined the channel
#
Loqi
[James] Building a search engine for my blog: Part II
alex11 joined the channel
#
Caesar[m]
Heh, looks like some escaping is needed... unescaped `<title>` tag (and others) in the content
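For the escaping issue, a minimal sketch using Python's standard library, assuming the preview text is inserted into HTML as-is. The function name `safe_preview` is hypothetical.

```python
from html import escape


def safe_preview(text: str) -> str:
    """Escape &, <, > and quotes so tag names in a preview render as text."""
    return escape(text, quote=True)
```

With this, a preview mentioning the `<title>` tag would display the tag name instead of being parsed as markup.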
tetov-irc joined the channel
#
jacky
looks like the great firewall is tripping up my server https://jacky.wtf/2022/4/9f/9f4pvjwdAJvFn8Oj-nC34e0K
#
jacky
it got that by reading the plain HTML of the page (I leave `h-koype-stubbed-from-head` as a hint when I handle my reply contexts)
nertzy joined the channel