#dev 2023-06-15

2023-06-15 UTC
#
gRegor
Unfortunately I think Bridgy Chrome extension is dead for me https://indieweb.org/Instagram#Automated_behavior_warning
#
gRegor
enjoyed it while it lasted, though [snarfed]++
#
Loqi
[snarfed] has 98 karma in this channel over the last year (155 in all channels)
[tw2113_Slack_] joined the channel
#
[snarfed]
gRegor yeah I took it down from AMO and Chrome web stores and from the Bridgy docs a while ago
#
[snarfed]
glad you only got a warning and not your account suspended
gRegorLove_, [Benjamin_Turne] and [schmarty] joined the channel
#
Soni
any updates on "web-ring search"?
#
Soni
(without crawlers)
#
[tantek]
what is web-ring search?
#
Loqi
It looks like we don't have a page for "web-ring search" yet. Would you like to create it? (Or just say "web-ring search is ____", a sentence describing the term)
#
[tantek]
Soni, can you be more specific about which project you're asking about?
#
Soni
it's an idea we brought up a few times: web-ring search, without crawlers
#
Soni
each website in the ring would make its own index using its own criteria, and share that index with the rings
#
Soni
the rings would combine the indexes somehow and provide search
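
A minimal sketch of the idea described above, assuming each member site publishes its index as a mapping from a term to a list of its own URLs. The index format and the names `merge_indexes` and `search` are hypothetical; nothing in the discussion specifies them.

```python
from collections import defaultdict

# Hypothetical index format: each site maps a term to a list of its own URLs.
SiteIndex = dict[str, list[str]]


def merge_indexes(site_indexes: dict[str, SiteIndex]) -> dict[str, list[str]]:
    """Combine the indexes shared by each site into one ring-wide index."""
    merged: dict[str, set[str]] = defaultdict(set)
    for index in site_indexes.values():
        for term, urls in index.items():
            # Normalize case so "Webmention" and "webmention" land together.
            merged[term.lower()].update(urls)
    # Sort the URLs so the merged index is stable between runs.
    return {term: sorted(urls) for term, urls in merged.items()}


def search(ring_index: dict[str, list[str]], query: str) -> list[str]:
    """Return the URLs that match every term in the query (simple AND search)."""
    terms = query.lower().split()
    if not terms:
        return []
    results = set(ring_index.get(terms[0], []))
    for term in terms[1:]:
        results &= set(ring_index.get(term, []))
    return sorted(results)
```

The ring never fetches member pages here; it only reads the indexes the members chose to share. How those indexes are transferred and kept fresh is left open.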
#
Soni
is it harder to use than global search? yeah. but not by much. and it sidesteps LLMs entirely.
#
[tantek]
hmm, if it was brought up a few times, did anyone document the idea on the wiki?
[lifeofpablo] joined the channel
#
[tantek]
what is wikify?
#
Loqi
wikifying is the practice of capturing information and ideas on the wiki https://indieweb.org/wikify
#
[snarfed]
"sorry to say that right now it's a constraint on the network - we won't be federating with any other services (in production) for the time being. We are about to launch a federation sandbox, most of the technical work is finished, but federation opens up a lot of issues with large scale moderation that neither we nor the community are prepared to handle"
#
[snarfed]
totally fair, and may well be the right idea, but still disappointing
#
gRegor
I think this was the webring search discussion https://chat.indieweb.org/dev/2023-04-29#t1682799845444300
#
Loqi
[preview] [Soni] do we have search yet?
mandaris, chenghiz_, Xe, jbrr[m] and win0err joined the channel
#
vikanezrimaya
[snarfed]: re: community operated bluesky feeds with complete views of the network: do you have any links? You have piqued my interest in bluesky
gRegor, bterry, jeremycherfas and omz13 joined the channel
#
[tantek]
Lest anyone think Indieweb is the only community / methodology being picked on, check out this well written critique:
#
[tantek]
Lots of lessons to learn from in there
#
bkil
Soni: I could still mostly repeat myself from the last conversation. You would need to work on your idea a bit more to see its weaknesses and then perhaps blog about it. The size of an index is usually in the same ballpark as the data itself. Thus, all participating indexers would need to refetch (= "crawl") the content index of everyone else, which is even less efficient than recrawling only the modified or new pages based on a sitemap or Atom feed.
#
omz13
[tantek] that was an interesting read, and it does raise some good points (the UX sucks, following sucks, the protocol sucks)
#
vladimyr
^ truly great and thoughtful/thought-provoking read, you could add culture sucks to that list too
Seirdy and ahappydeath joined the channel
#
Soni
bkil: thankfully we have RFC 3229
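
For context, RFC 3229 defines delta encoding in HTTP: the client sends the entity tag it already holds in `If-None-Match` together with an `A-IM` header listing the delta formats it accepts, and a server that supports this may answer `226 IM Used` with only the changes. Below is a minimal client-side sketch, assuming a hypothetical index URL and a server that offers the `vcdiff` instance manipulation. It only builds the request and classifies the response status; it does not send anything or apply a delta.

```python
import urllib.request


def build_delta_request(url: str, cached_etag: str) -> urllib.request.Request:
    """Build a GET request asking for a delta against the cached copy."""
    request = urllib.request.Request(url)
    # Entity tag of the copy we already hold.
    request.add_header("If-None-Match", cached_etag)
    # Instance manipulations we can decode (assumption: the server offers vcdiff).
    request.add_header("A-IM", "vcdiff")
    return request


def classify_response(status: int) -> str:
    """Map an HTTP status code to what the response body contains."""
    if status == 226:
        return "delta"      # 226 IM Used: body is a delta against the cached copy
    if status == 304:
        return "unchanged"  # 304 Not Modified: cached copy is still current
    if status == 200:
        return "full"       # server ignored A-IM and sent the whole resource
    return "error"
```

A server that does not implement RFC 3229 simply returns `200` or `304`, so a client written this way degrades to an ordinary conditional GET.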
#
Soni
omz13: nobody cares about the protocol, since it's merely an implementation detail
#
Soni
anyone who says the protocol matters is someone who only cares about programmers
holiday_medley joined the channel
#
bkil
Laugh all you want, but if your protocol is not open at least in some sense of the word, "all your data belong to US". I.e., I couldn't implement my own frontend (let alone mesh-first backend) for Facebook, because they won't let me.
#
bkil
Despite the fact I would like to change quite a few things about it, improve its compatibility, performance, color scheme, hide useless things, etc. The last guy who tried to simulate that with an aftermarket extension was erased completely for life without appeal.
#
Soni
use AI to build your extension, we heard that makes it legal
#
Soni
but also, the context is mastodon. mastodon does have an open protocol.
#
Soni
email is a crap-ass protocol and it still hasn't been replaced
#
Soni
the last big improvement in email was TLS. the second last was MIME.
#
vikanezrimaya
[snarfed]: oooh, interesting! but seems like you were mentioning what is called App Views, and I meant Big Graph Services. So I'm gonna restate my question: are there people around bluesky that are planning to run independent BGS instances? because that seems like bluesky's weak point regarding decentralization
#
vikanezrimaya
I know federation isn't online yet, but I wonder if people are already planning on experimenting with this.
#
vikanezrimaya
Honestly considering the code already seems to be open source, why wouldn't somebody just put up a small-ish test network on their own hardware as an alternative to staging.bsky.app PDS and BGS?
#
vikanezrimaya
The feed generator stuff is extremely interesting though
#
[snarfed]
right, I was answering the use case you asked for, since "BGS" is pure plumbing
#
[snarfed]
but yes the plan is BGSes entirely independent of Bluesky PBLLC
#
[snarfed]
and yes there are already multiple independent PDSes online, based on Bluesky's code, eg https://stems.social/ , and independent implementations in progress (mine is one)
#
[snarfed]
there are already a number of complete, realtime updated archives of all of Bluesky's data even without BGSes being fully specced out and federated, eg https://bsky.jazco.dev/
#
[snarfed]
the disappointing flipside of all this is that it sounds like turning on federation may take longer than we hoped. https://chat.indieweb.org/dev/2023-06-15#t1686797968044800
#
Loqi
[preview] [[snarfed]] "sorry to say that right now it's a constraint on the network - we won't be federating with any other services (in production) for the time being. We are about to launch a federation sandbox, most of the technical work is finished, but federation ope...
#
vikanezrimaya
wow, that graph at bsky.jazco.dev is impressive. i'm starting to drool
#
vikanezrimaya
i wonder if there's a faster way to get invites than a waitlist — maybe someone here knows?
#
vikanezrimaya
(if you do, can I ask for an invite?)
#
vikanezrimaya
[snarfed]: oops, forgot to ping you
#
capjamesg
vikanezrimaya I think I still have an invite I can give out if you need one.
#
vikanezrimaya
@capjamesg#4492 can you dm me in discord? vikanezrimaya.xyz -- apparently I just got a prompt for a username change
#
vikanezrimaya
also that's a domain of mine
#
vikanezrimaya
it even links to my old website
#
vikanezrimaya
sent you a friend request on discord
#
[tantek]
here's an AP-boostable permalink to that twittermigration fail blog post apparently from the author: https://finecity.social/notes/9fwhw09aor (running Calckey according to the view source!)
#
Loqi
[preview] [Bloonface] I decided to choose violence. An essay on why the #TwitterMigration failed, some things to learn for the next time it's a possibility, and some thoughts on things that the #RedditMigration has done differently, especially with #Kbin. https://blog.b...
#
vikanezrimaya
Okay, now that I'm not completely missing out on bluesky fun: is there already a way to POSSE to bluesky? aaronpk seems to have some mentions of bluesky in quill, and some posts look suspiciously like posse 😝
#
[tantek]
what is bluesky
#
Loqi
Bluesky is a project run by Jay Graber and initially proposed by Jack Dorsey, former Twitter CEO, to “develop an open and decentralized standard for social media – The goal is for Twitter to ultimately be a client of this standard.” https://indieweb.org/Bluesky
#
[tantek]
yeah it's not easily discoverable from there
jshmlr and [manton] joined the channel
#
[manton]
Shameless plug: http://Micro.blog can POSSE to Bluesky from any feed. 🙂
#
[tantek]
I might do that actually except for replies
#
[tantek]
Replies to posts on bsky that is
#
[manton]
Yeah, I don’t do anything special with replies. Those are difficult to POSSE without wires getting crossed with usernames, etc.
#
[tantek]
indeed, I have some experience with the challenges of POSSEing replies 🙂
tei_ joined the channel
#
omz13
Soni when I talked about "the protocol sucks" I was summarizing what was in bloonface's post; and, yes, most users could not care less about protocols... well, until things take too long, or things need to run on bigger machines, or they need big pipes to cope with highly inefficient payloads, and then, suddenly, they might care (or probably not, until it hits them in their wallet, which seems to be the real trigger)
#
[tantek]
[manton] could you add a brief summary of how to set up POSSEing from a http://micro.blog account to Bluesky here? https://indieweb.org/Bluesky#How_to_POSSE (maybe in a new paragraph or subsection if you prefer)
#
[tantek]
!tell capjamesg I put pyatproto here because of its name: https://indieweb.org/AT_Protocol#Libraries - however if you think it will be "Bluesky only" for the near future (3-6 months?) perhaps we should move it to the /Bluesky page for now, until there's more information about how soon it might support "arbitrary" AT Proto instances (of which few if any exist AFAIK)
#
Loqi
Ok, I'll tell them that when I see them next
gRegor, tei_1 and tei_ joined the channel
#
[snarfed]
I'd love to carve out some time to do https://github.com/snarfed/bridgy/issues/1453
#
[snarfed]
focused on Bridgy Fed right now
#
[snarfed]
PRs always welcome though!
#
capjamesg
vikanezrimaya Hello!
#
Loqi
capjamesg: [tantek] left you a message 1 hour, 32 minutes ago: I put pyatproto here because of its name: https://indieweb.org/AT_Protocol#Libraries - however if you think it will be "Bluesky only" for the near future (3-6 months?) perhaps we should move it to the /Bluesky page for now, until there's more information about how soon it might support "arbitrary" AT Proto instances (of which few if any exist AFAIK)
#
capjamesg
It will probably be Bluesky only for a while [tantek].
#
capjamesg
vikanezrimaya I promised you some alt text information!
#
vikanezrimaya
capjamesg: nyan-nyan! ✨
#
capjamesg
I haven't played around with it yet, but you have encouraged me to get set up with Recognize Anything Model (RAM). Maybe I'll take a look this weekend.
#
capjamesg
Rather than using object detection (what is in this object and where is it?), I'd recommend using a captioning model. These models are built specifically for generating captions.
#
Loqi
[preview] [xinyu1205] recognize-anything: Code for the Recognize Anything Model (RAM) and Tag2Text Model
#
capjamesg
I don't know the model size though. If it is prohibitively big, I have more suggestions.
#
capjamesg
The captioning models I have seen take up a fair bit of RAM (in the traditional computing sense), but they can identify a vast range of things. Whereas object detection models can generally identify fewer things, unless you use a large, general model like Grounding DINO or Facebook's Detic.
#
capjamesg
In the back of my mind, I have an idea for a service that, given an image URL, returns a caption from one of these models.
#
capjamesg
BLIP is quite well known for captioning, developed by Salesforce Research. BLIPv2, also developed by Salesforce, is great, but is massive (20GB RAM+ if I recall).
#
capjamesg
The trouble with my API idea is that it would probably require a GPU, so I'd need to enforce strict limits and make sure captions can only be generated once per image. That's a lot of hassle.
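
A minimal sketch of the "once per image" constraint, assuming the expensive model call is hidden behind a function that takes an image URL and returns a caption. The class name `CaptionCache`, the `captioner` callable, and the simple global quota are all hypothetical; no real captioning model is invoked here.

```python
import hashlib
from typing import Callable


class CaptionCache:
    """Caches one caption per image URL so the expensive model runs only once."""

    def __init__(self, captioner: Callable[[str], str], max_new_captions: int):
        self._captioner = captioner          # hypothetical stand-in for a model call
        self._remaining = max_new_captions   # crude global limit on new captions
        self._captions: dict[str, str] = {}  # URL hash -> caption

    def caption(self, image_url: str) -> str:
        key = hashlib.sha256(image_url.encode("utf-8")).hexdigest()
        if key in self._captions:
            # Already captioned: return the stored result without using quota.
            return self._captions[key]
        if self._remaining <= 0:
            raise RuntimeError("caption quota exhausted")
        self._remaining -= 1
        self._captions[key] = self._captioner(image_url)
        return self._captions[key]
```

Keying on the URL means the same image served from two URLs would be captioned twice; hashing the image bytes instead would avoid that at the cost of downloading the image first.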
#
capjamesg
In hindsight, that was a lot of vocabulary. Let me know if I should take a step back.
#
vikanezrimaya
I played around with RAM on Huggingface, and it's... decent, I guess? It seems to produce somewhat coherent results, though human tagging will obviously outperform it in regard to the quality and creativity of an alt-text caption.
#
capjamesg
Yes indeed.
#
capjamesg
I haven't played around with RAM, although I want to!
#
capjamesg
I have had some really nice descriptions from it. Again, though, if I recall the model is really big, which makes it impractical and expensive to use.
#
vikanezrimaya
The tag2text RAM model seems promising for alt-text generation, as it produces coherent text as opposed to raw keywords.
#
capjamesg
tag2text RAM is 4.48 GB. I wonder what the performance impact is for inference.
#
vikanezrimaya
4.48? that sounds small. I wonder if this means I can feasibly use it on a CPU w/o quantization (I heard it's the magic trick that makes models smaller but less accurate)
#
vikanezrimaya
My laptop has 16GB of RAM, so maybe I'll be able to run it. But my server only has 4 gigabytes, so I guess if I want to roll this out, I'll have to buy myself a bigger server
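
A rough back-of-the-envelope sketch of the sizes being discussed. It assumes the 4.48 GB checkpoint consists entirely of 32-bit floating point weights, which the conversation does not confirm, and it ignores activations and runtime overhead. The function name `estimate_weight_sizes_gb` is hypothetical.

```python
def estimate_weight_sizes_gb(checkpoint_gb: float, bytes_per_weight: int = 4) -> dict[str, float]:
    """Estimate the weight storage needed at different numeric precisions.

    Assumption: the checkpoint is nothing but weights stored at
    `bytes_per_weight` bytes each (4 bytes = float32).
    """
    weights_in_billions = checkpoint_gb / bytes_per_weight
    return {
        "float32": weights_in_billions * 4,  # 4 bytes per weight
        "float16": weights_in_billions * 2,  # 2 bytes per weight
        "int8": weights_in_billions * 1,     # 1 byte per weight (8-bit quantization)
    }


sizes = estimate_weight_sizes_gb(4.48)
for precision, gigabytes in sizes.items():
    print(f"{precision}: about {gigabytes:.2f} GB")
```

Under that assumption the weights alone would not fit in 4 GB at full precision, while a 16 GB machine would have room to spare. Whether a quantized version stays accurate enough for alt text would have to be tested.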
#
vikanezrimaya
[manton]: re: micro.blog bluesky support: I assume it's a paid feature or?... what's the workflow? do I POSSE to micro.blog which does POSSE to bluesky?
tei_1 joined the channel
#
capjamesg
Let me know if you have any more questions vikanezrimaya!
#
capjamesg
Wow, RAM is amazing!
#
vikanezrimaya
i hate python
#
vikanezrimaya
i can't reproduce the environment locally
#
vikanezrimaya
and docker is a kludge, not a solution
#
vikanezrimaya
i feel hungry now
#
capjamesg
I'll give it a go and report back!
[KevinMarks] joined the channel
#
vikanezrimaya
also I read the code, and yes it can theoretically run on CPU instead of CUDA
#
[manton]
@vikanezrimaya Yes, POSSE used to be kind of separate but to simplify things it’s just rolled into the $5/month hosting, even if you don’t need hosting. You basically add RSS feeds to your account and then from there http://Micro.blog will pull them in and send them off to other services like Bluesky.
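
A minimal sketch of the general "pull a feed, send new items onward" pattern described above. This is not Micro.blog's implementation; the function name `new_feed_items` and the choice to identify items by `guid` (falling back to `link`) are assumptions for illustration, and the step that actually posts to another service is left out.

```python
import xml.etree.ElementTree as ET


def new_feed_items(rss_xml: str, seen_ids: set[str]) -> list[dict[str, str]]:
    """Return RSS items not seen before, and record them in `seen_ids`."""
    root = ET.fromstring(rss_xml)
    fresh = []
    for item in root.iter("item"):
        # Prefer the guid; fall back to the link if the feed has no guid.
        item_id = item.findtext("guid") or item.findtext("link") or ""
        if not item_id or item_id in seen_ids:
            continue
        seen_ids.add(item_id)
        fresh.append({
            "id": item_id,
            "title": item.findtext("title") or "",
            "link": item.findtext("link") or "",
        })
    return fresh
```

Each item returned would then be handed to whatever code cross-posts to the destination service; persisting `seen_ids` between runs is what prevents duplicate posts.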
#
vikanezrimaya
sounds somewhat complicated. might be just easier and more fun to implement just enough atp to speak to a pds myself
#
vikanezrimaya
thanks for the plug tho, this inspires me because now I know it's not only possible, but has been done at least twice
tei_1 joined the channel
#
capjamesg
sknebel I need to think through how that would work.
#
capjamesg
Wrong one.
#
Loqi
[preview] [clovaai] donut: Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
#
capjamesg
That's not OCR. That's document understanding.
#
Loqi
[preview] [JaidedAI] EasyOCR: Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
#
vikanezrimaya
Just got my first Bluesky post using curl! Kittybox support for Bluesky may be easier than I thought.
#
vikanezrimaya
I just need to implement logging into Bluesky, token management and also posting.
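
A minimal sketch of the two requests involved, logging in and then posting. It assumes the XRPC method names `com.atproto.server.createSession` and `com.atproto.repo.createRecord` and the Bluesky-hosted PDS at `https://bsky.social`; as noted in this conversation the docs have lagged behind, so check the current lexicons before relying on these. The function names are hypothetical, and the code only builds the requests without sending them.

```python
import json
import urllib.request
from datetime import datetime, timezone

PDS = "https://bsky.social"  # assumption: the Bluesky-hosted PDS


def login_request(handle: str, app_password: str) -> urllib.request.Request:
    """Build the login request; the JSON response should carry accessJwt and did."""
    body = json.dumps({"identifier": handle, "password": app_password})
    return urllib.request.Request(
        f"{PDS}/xrpc/com.atproto.server.createSession",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_request(did: str, access_jwt: str, text: str) -> urllib.request.Request:
    """Build the request that creates a plain-text post in the user's repo."""
    record = {
        "$type": "app.bsky.feed.post",
        "text": text,
        # UTC timestamp in ISO 8601 form with a trailing Z.
        "createdAt": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
    }
    body = json.dumps({"repo": did, "collection": "app.bsky.feed.post", "record": record})
    return urllib.request.Request(
        f"{PDS}/xrpc/com.atproto.repo.createRecord",
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_jwt}",
        },
        method="POST",
    )
```

Passing either request to `urllib.request.urlopen` would send it. Token management would mean storing the tokens from the login response and refreshing them when the access token expires, which this sketch does not cover.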
#
vikanezrimaya
Also bluesky's docs are outdated lol
tei_1 and tei_ joined the channel
#
[manton]
If it helps, I also wrote this blog post with a bunch of Bluesky RPC examples. Should have everything for basic posting including image upload, inline links, etc. https://www.manton.org/2023/04/29/getting-started-with.html
#
Loqi
[preview] [Manton Reece] Getting started with Bluesky XRPC
#
vikanezrimaya
[manton]: looks useful, thank you!
Nuve joined the channel
#
capjamesg
Wow.
win0err joined the channel
#
capjamesg
I have a strange issue with my blog. The sparkline post count is non-deterministic.
#
capjamesg
It isn't accurate, either.
#
capjamesg
I have no idea why!
#
gRegor
Sure jamesbot hasn't been posting? XD
#
[tantek]
What was the Google Domain that someone was having trouble recovering? [KevinMarks]? Maybe Squarespace customer service will be better?
tei_ joined the channel
#
[schmarty]
Google divesting of their domain service is verrrry interesting
tei_ joined the channel
#
[tantek]
Then what was the point of their gTLD buying spree? Squatting?
#
sknebel
I don't think those are part of it
#
sknebel
but it's not 100% clear
#
[tantek]
Evergreen
tei_1 joined the channel
#
gRegor
Wait, what's up with Google Domains?
#
gRegor
I hope Squarespace still keeps free whois protection
[Murray] joined the channel