#dev 2024-09-23

2024-09-23 UTC
AramZS and Dryusdan joined the channel
#
[tantek]
capjamesg[d], Technorati was known for *real time search* which literally no one else did, except Google prototyped/shipped an "80%" solution, enough to seem like "competition" to Technorati, then a few years after Technorati folded (for many reasons, can share more informally), Google shut off their real time Blog Search.
#
[tantek]
So today, literally no one does cross-site real time search
#
[tantek]
sure you can search Twitter using Twitter in near realtime. and you can use Google Alerts for some tiny fraction of the web (various news publications), but it's something that doesn't exist, anywhere
#
aaronpk
didn't google even drop the realtime twitter search they launched a while ago?
#
[tantek]
yes that too
#
[tantek]
Google Search is pretty much C grade right now, barely usable 70% of the time.
#
[tantek]
other web searches are of even lower quality
#
[tantek]
so "technorati for source code repos" would probably mean some way to index changes to source code repos in real time, i.e. show search results within seconds of when a merge request was landed
[tw2113] joined the channel
#
[tw2113]
ugh, i dread any time i need to perform a search
#
catgirlin.space
i find kagi to be pretty good i think,,,,,
#
catgirlin.space
am confused what you mean by real time search exactly [tantek]
#
catgirlin.space
like, doing a search on funny search engine would go and do a site search on every website it knows about? orrrr
#
[tantek]
no it already has it all indexed in real time
#
[tantek]
so doing a search checks that index and returns you a result
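
A toy sketch of the idea described above, in Python: posts are indexed the moment they arrive, and a search only reads that index, returning date-ordered results. How new posts actually reached Technorati's index is not described in this log, so `add_post` is a hypothetical entry point, and the class name and example URLs are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone


class RealTimeIndex:
    """Toy inverted index that is updated as soon as a post is received."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> set of post URLs
        self._published = {}               # post URL -> publish time

    def add_post(self, url, text, published):
        # Index at write time, so a search never has to crawl anything.
        self._published[url] = published
        for term in text.lower().split():
            self._postings[term].add(url)

    def search(self, term):
        # A search only consults the index; newest posts come first.
        urls = self._postings.get(term.lower(), set())
        return sorted(urls, key=lambda u: self._published[u], reverse=True)


index = RealTimeIndex()
index.add_post("https://example.com/old", "notes on webmention",
               datetime(2024, 9, 22, tzinfo=timezone.utc))
index.add_post("https://example.com/new", "more webmention notes",
               datetime(2024, 9, 23, tzinfo=timezone.utc))
print(index.search("webmention"))
```

The point of the design is that the cost is paid when a post is published, not when someone searches, which is what makes results available within seconds.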
#
catgirlin.space
ooo
#
[tantek]
this is what Technorati did for (nearly?) all blogs back in the mid-2000s. millions of blogs
#
catgirlin.space
is that uh, kinda similar-ish to what indexnow does? telling search engines that something changed and then they can crawl it again if they want to,,,,
#
[tantek]
you could blog something and minutes later, eventually seconds, someone could search for it and find your blog post
#
[tantek]
nope, not at all because no search engines actually crawl things in real time
#
[tantek]
or "promptly" in response to being told "something changed"
#
[tantek]
mostly those signals are ignored I find
#
catgirlin.space
wait was technorati just, constantly polling feeds,,,, for like, every single blog then? :woozycat:
#
[tantek]
no it was not polling either
#
[tantek]
catgirlinspace you can read the http://enwp.org/Technorati article for some more background if you're curious
#
catgirlin.space
huh
#
[tantek]
what is Technorati
#
Loqi
Technorati was a real-time blog search engine that provided date-ordered results for text phrases or links, typically within seconds of when people published on their blogs https://indieweb.org/Technorati
#
[tantek]
and that
#
[tantek]
so no, don't try to imagine what "was technorati just" because you're very unlikely to figure it out from first principles in a matter of seconds — it was built by a small handful of very clever engineers over months and improved over a few years
#
catgirlin.space
so confused how it knew about new posts then. skimming the wikipedia article it doesn't seem to detail that?
#
catgirlin.space
> Tantek Çelik was the site's Chief Technologist.
#
catgirlin.space
omg that's you, that's so cool,,,,,
#
[tantek]
!tell [snarfed] ah I see my posts are gone from lots of tag searches now. that's really too bad / sad as that's one place people do (re-)discover posts, when they themselves blog about a topic, and then go see what others have said on that topic previously. re: "still looking at exactly what/how I can repair" - here is a suggestion: if I resend a webmention to BF for an old post of mine, BF should go deliver it to everyone. if BF thinks it
#
Loqi
Ok, I'll tell them that when I see them next
#
[tantek]
already delivered it, then BF should send an UPDATE. then if it still does not show up in Masto profiles/tag-searches, then those are Masto bugs and you can help with filing them
#
[tantek]
I realize that still requires manual work on my part to re-webmention BF for a bunch of my posts (all of them since Oct 2022 lol?) however, at least that will shake out a bunch of bugs in BF/Masto interactions and then we can file bugs and advocate for Mastodon folks to fix the Mastodon problems with properly handling UPDATEs
#
[snarfed]
yes! BF already does all that, including updates, and it also compares what it fetches from your site to what it last delivered. in this case, they'd be the same, so it wouldn't send any updates
#
[tantek]
So I'd have to alter blank space or something before re-webmentioning?
#
[tantek]
like add a blank space at the end of a line?
#
[tantek]
presumably BF does not do any "clean-up" before it "compares what it fetches from your site to what it last delivered"
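
A minimal sketch of the comparison being discussed, in Python. This is not Bridgy Fed's code: `should_send_update` is a hypothetical name, and the exact, un-normalized comparison is only the presumption voiced above, not something confirmed in this log.

```python
def should_send_update(fetched: str, last_delivered: str) -> bool:
    # Assumed behaviour: byte-for-byte comparison with no clean-up,
    # so any difference, even whitespace, counts as a change.
    return fetched != last_delivered


last = "<p>Hello world</p>"
print(should_send_update("<p>Hello world</p>", last))   # unchanged post
print(should_send_update("<p>Hello world </p>", last))  # one added space
```

Under that assumption an unchanged post produces no update, while adding a single blank space would be enough to trigger one.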
#
[tantek]
so let's file those bugs
#
[snarfed]
but I'm also looking at doing this manually on my end, which will be easier and more efficient than you modifying a bunch of posts and sending wms
#
[tantek]
because that has been frustrating for too long (Mastodon ignoring new tags on UPDATEs etc.)
#
[tantek]
ah that of course would be better
#
[tantek]
LMK how I can help!
#
[snarfed]
alternatively since the posts already exist, we can find and construct the POSTs for Mastodon's search for each post URL, ie https://fed.brid.gy/r/https://tantek.com/... , since searching for those makes the instance re-fetch the post
#
Loqi
[preview] Tantek Çelik
#
[tantek]
ah yes that makes A LOT more sense
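
A small Python sketch of the URL construction [snarfed] describes: prefixing each original post URL with `https://fed.brid.gy/r/` to get the address to search for on a Mastodon instance. The helper name `bridged_url` and the example.com post URLs are placeholders for illustration; the search request itself is left out because its details are not given in this log.

```python
BRIDGE_PREFIX = "https://fed.brid.gy/r/"


def bridged_url(post_url: str) -> str:
    # The bridged address is the prefix followed by the full original URL.
    return BRIDGE_PREFIX + post_url


# Placeholder post URLs; a real run would use the site's own permalinks.
post_urls = [
    "https://example.com/2024/post-1",
    "https://example.com/2024/post-2",
]
for url in post_urls:
    print(bridged_url(url))
```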