#dev 2024-06-13

2024-06-13 UTC
geoffo, [aciccarello], timmarinin and wagle joined the channel
#
aaronpk
Um how is this possible? "private DMs from Mastodon were mirrored on their public site and searchable. How this is even possible is beyond me, as DM’s are ostensibly only between two parties, and the message itself was sent from two hackers.town users"
#
aaronpk
This implies mastodon is sending out DMs to federated servers...??
timmarinin joined the channel
#
[tantek]
would that be indirect messages then? 😉
[tw2113] joined the channel
#
[tw2113]
mastodon has DMs?
#
[tw2113]
i didn't even know they were into dungeons and dragons
#
[0x3b0b]
Now I'm trying to remember the context of that other funny thing...something about dnd mode and the person reading it being utterly baffled for a moment...
#
[Joe_Crawford]
There’s no such thing as a DM on Mastodon. They do have “private mentions” but drafting one warns: “Posts on Mastodon are not end-to-end encrypted. Do not share any sensitive information over Mastodon.”
nertzy joined the channel
#
[tantek]
correct, they are posts with an audience of two.
#
[0x3b0b]
I think the underlying technical AP term is "mentioned actors only." My understanding is that messages with that visibility _should_ only be federated if they mention an actor on another instance and _should_ only be visible to the mentioned actors...and to anyone with administrative access to the server or its hosting platform...but I'm also under the impression that they are Kind Of A Mess.
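(For illustration: in ActivityPub terms, a "mentioned actors only" note is just a Note addressed via `to`, with no `as:Public` and no followers collection. The addresses below are made up, and nothing about this is encrypted.)

```json
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Note",
  "attributedTo": "https://instance.example/users/alice",
  "to": ["https://other.example/users/bob"],
  "content": "Delivered only to the actors listed in 'to'."
}
```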
#
[Joe_Crawford]
Or whoever is explicitly mentioned
#
[0x3b0b]
I am relatively confident that Microblogpub does a _pretty good_ job with keeping them as private as they are supposed to be, but I still treat them more as a way to keep a conversation from _bugging my followers_ than as a way to keep that conversation _private._ Partly because I'm a single-user instance, so unless I'm making a note to myself (or posting ciphertext, I guess), I'm trusting at least the people I'm talking to and all their instance...
#
[0x3b0b]
... admins with the message.
#
aaronpk
... and the instance software
#
[0x3b0b]
...accurate
timmarinin, H4kor and nertzy joined the channel
#
capjamesg[d]
How does everyone store images on their blog?
[Jo] joined the channel
#
[Jo]
badly
#
capjamesg[d]
I'm starting to accrue many images and I don't want to store them in a Git repository any more.
timmarinin and [qubyte] joined the channel
#
[qubyte]
I still just commit them. I even had a GH action triggered by image additions which converts to various sizes and types and commits those too.
#
[qubyte]
I’d rather not have too much binary stuff in a git repo, but using a bucket of some kind instead is a bunch of engineering I don’t want to do.
ttybitnik and [Murray] joined the channel
#
[Murray]
My blog is backed by Craft CMS, so I store them there. If it ever gets too large I _might_ consider moving them to a cloud host, but I hate their pricing models so want to avoid that as long as possible
#
capjamesg[d]
A cloud bucket would be great but I don't think I want all the hassle.
AramZS joined the channel
#
schmudde
Curious what the hassle would be, capjamesg[d]. If you have a public bucket, you get a public URL. Is the problem that you want a link that is permanent?
#
capjamesg[d]
schmudde I think it's about getting the bucket set up and setting up capped billing, etc.
#
aaronpk
i also still just add mine to the git repo
#
aaronpk
hasn't broken things yet
#
capjamesg[d]
It takes like a minute to upload all those to my server on every push though.
#
capjamesg[d]
It's not things breaking but the time it adds to my site going live.
#
aaronpk
i'm not sure i understand the problem
#
aaronpk
are you pushing a full copy of your entire site every time you add a new post or something?
#
pcarrier[d]
capjamesg[d] as for not pushing everything every time, xmit solves that with its own uploader. I'd be curious to know what your hosting looks like so we can explore alternatives… if it exposes a local filesystem over a shell, `rsync` might be all you need
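(In the simple case that's a single incremental-upload command; the paths and host below are placeholders.)

```sh
# Sync only changed files from the local build output to the server,
# removing remote files that no longer exist locally.
rsync -avz --delete _site/ user@example.com:/var/www/html/
```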
#
schmudde
RE: Buckets - I didn't think about billing. I use public buckets for low-traffic public assets, and I'm not too worried about speed. Almost all my cost is the amount of media stored, and it ends up being less than $1/month.
#
schmudde
But you probably have more sophisticated needs.
#
capjamesg[d]
My current workflow is to upload a photo to my Git repo, where my static site is built, and then a zipped version of everything -- images and HTML -- is uploaded to my server.
#
capjamesg[d]
Could I have a separate repo for assets that I alias to a subdomain?
#
capjamesg[d]
Using LFS.
#
[tantek]
capjamesg[d] re: "store images on their blog", in general I don't (media storage seems like a fast path to exceeding bandwidth), so various forms of silo storage (which is obv fragile), or open communities (Wikimedia Commons, or IndieWeb if the image is IndieWeb-relevant), or Internet Archive
#
[snarfed]
capjamesg yeah re-uploading your entire site on every build, even zipped, seems pretty clearly like the part to change/improve there
#
aaronpk
agreed
#
aaronpk
but yeah if you used a subdomain for images outside of your static site build process that would be a workaround to actually solving that
#
[snarfed]
unrelated, I have a TOTP question for anyone here familiar. only loosely indieweb related, happy to take it to #indieweb-chat if we want
#
[snarfed]
if I understand right, the server generates and stores a random TOTP seed for each account that enrolls. the user uses that seed to generate a TOTP code, and the server uses the same seed to check that code
#
[snarfed]
so, if a server is breached, and seeds are stored with accounts, wouldn't those seeds usually be included in any data dump from that breach, just like (hopefully hashed) passwords, etc?
#
aaronpk
if the TOTP seed is stored in plaintext yes
#
aaronpk
the server could encrypt the TOTP seed, either with a single server secret or even encrypted using the user's password
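(A rough command-line sketch of the scheme being described, assuming `oathtool` and `openssl` are installed; "server-secret" is a placeholder.)

```sh
# Enrollment: the server generates a random base32 seed and shares it
# with the user's authenticator app, usually as a QR code.
SEED=$(head -c 20 /dev/urandom | base32)

# Login: the app and the server each derive the current 6-digit code
# from the same seed plus the current time; the server compares the two.
oathtool --totp -b "$SEED"

# At-rest protection: encrypt the stored seed with a server-side secret
# so a database dump alone doesn't yield usable seeds.
printf '%s' "$SEED" | openssl enc -aes-256-cbc -pbkdf2 -base64 -pass pass:server-secret
```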
#
aaronpk
but this is just another example of why TOTP isn't as good as webauthn since with webauthn the server is only storing the public key
#
[tantek]
the comparison is odd to me ("isn't as good as") because my understanding was that TOTP was intended only for secondary factors, while webauthn is intended as a primary factor
#
aaronpk
webauthn was originally intended as a secondary factor, but only recently as a primary factor with the "passkey" branding
#
[snarfed]
ah right, you can just encrypt the seed. obvious in retrospect. thanks!
#
[snarfed]
(and yeah webauthn is obviously better since it's phishing resistant, definitely agreed)
#
aaronpk
whether anyone actually *does* encrypt the seed is a different story
#
[snarfed]
I don't often hear about TOTP seeds in dumps, but I'm probably not following those closely enough, or attackers aren't grabbing them often enough, or the fraction of users with TOTP is small enough that they don't care 🤷
#
aaronpk
i suppose the impact of leaked TOTP seeds is minimal
#
capjamesg[d]
[snarfed] Agreed re: the solution of not uploading all the data.
#
capjamesg[d]
I wonder if it's worth setting up a GCP bucket.
#
capjamesg[d]
Wait, no. LFS with GitHub on a subdomain may work.
#
capjamesg[d]
I need to play around!
#
capjamesg[d]
Hm. I'll need to pay for bandwidth on LFS 😦
#
[snarfed]
why LFS? images are generally small enough that you don't need LFS
#
[snarfed]
as a data point, https://github.com/snarfed/snarfed.org has ~1200 images in it, across a wide range of sizes, no LFS, works fine
#
Loqi
[preview] [snarfed] snarfed.org: My web site
#
aaronpk
my website storage git repo is 17gb but it's working fine
#
capjamesg[d]
[snarfed] Do you serve directly from that repo?
#
capjamesg[d]
Also: I want the repo to be private.
#
[snarfed]
I don't serve directly from git, no
#
[snarfed]
private, sure, but that seems orthogonal to image storage, LFS, etc
#
[snarfed]
you can make a GH repo private
#
aaronpk
mine is private, it's not even on github
#
capjamesg[d]
Maybe I need a separate Git repo that rsyncs to my server.
#
[tantek]
private image-specific GitHub repo makes some sense
#
[tantek]
like github/username/images.example.com
#
[tantek]
though I wonder if it's worth separate repos for images, audio, video
#
[tantek]
or just one media.example....
#
[tantek]
or is github storage a bad idea for temporal media?
#
[snarfed]
temporal? ie media you might eventually want to delete?
#
[tantek]
time-based media
#
[tantek]
whatever is the better word for audio & video
#
[tantek]
streaming media?
#
[tantek]
where being able to FF/REW/scrub and performance thereof is a consideration
#
aaronpk
other than the large file issues with git, i don't see any relevance
#
[tantek]
some of that I think depends on some HTTP range operations that are more / less supported on some servers than others
#
aaronpk
for really large videos, you'd ideally want to chop them up as HLS rather than serving a single multi-gb file in the first place
#
[tantek]
aaronpk, the issue is you don't want to download an entire video to your mobile device before you play it
#
aaronpk
if you try hard enough and your server has HTTP range support you can make an HLS file and point it at a single multi-gb file and solve that
#
[tantek]
sigh see that's what I mean, as an independent publisher, I really don't want to have to deal with video production / streaming pipeline infrastructure and work-per-post
#
[tantek]
I'd like to be able to publish streaming media as "just one file" on a hosting service, and then use <audio> or <video> tags to "just" stream it in a low latency high performance way
#
[tantek]
don't make me think about post-processing the video file I recorded on my mobile into multiple files or other formats etc.
#
aaronpk
seems like you're asking for a browser feature then
#
[tantek]
no my point is HTTP has enough support to make all this work automatically, if the server supports it for large media / streaming files
#
aaronpk
something has to be aware of the video format in order to use the HTTP support, so that something should probably be the browser
#
aaronpk
otherwise it's going to be server-side preprocessing
#
[tantek]
no the server knows because it goes file extension -> content-type
#
aaronpk
most web servers are not aware of video codecs
#
aaronpk
whereas browsers are
#
[tantek]
the browser knows you are asking for streaming media due to use of an <audio> or <video> tag
#
[schmarty]
i used to store stuff in Git LFS and it was such a hassle! then i realized that under the hood, Git LFS is basically content-addressed storage over HTTP. sooo, I went ahead and made my micropub-media endpoint also save uploads at content-addressed paths and serve them that way, with an image proxy in front for resizing.
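(The core of that idea is tiny; filenames below are illustrative. The file's own hash becomes its URL path, so identical uploads dedupe for free.)

```sh
# Store an upload at a path derived from its SHA-256 digest.
HASH=$(sha256sum upload.jpg | awk '{print $1}')
cp upload.jpg "/var/www/media/${HASH}.jpg"
# Served as e.g. https://media.example.com/<hash>.jpg, with a resizing proxy in front.
```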
#
[tantek]
so the browser can request via HTTP "just give me enough to play for n seconds", and the server has to be smart enough to do that
#
aaronpk
the problem is that N seconds does not translate to M bytes without being aware of the video codec
#
[tantek]
this should not require any extra work on the side of the publisher
#
aaronpk
so *something* has to be aware of the video codec
#
[schmarty]
tantek: try telling this to apple
#
[schmarty]
loves their hot new patent-encumbered codecs that only other apple devices can understand.
#
[schmarty]
i don't post caturdays as often these days but to go from Live Photo as a loop / bounce in Photos to my website, I have to do two bounces through an app called Metapho, which lets me see the two secret parts of a Live Photo (still and video) to select the video, then again to have Metapho strip the identifying location and other metadata.
#
aaronpk
i'm fully on board with the workflow of upload a single video file to a web server and put the URL in a <video> tag. but the missing piece is the video codec to byte range mapping
#
[schmarty]
Metapho also takes care of MP4-ing it, I think? Part of why I adopted a commercial image /video caching proxy was because i could ask them to always coerce the video to MP4 so they could automatically convert the proprietary apple format to something browsers understand.
#
[schmarty]
anyway i guess i have a hard time accepting that indies should be expected to hide all this complexity when apple works to make it _more_ complex for anyone who dares step outside their tight ecosystem.
#
[tantek]
[schmarty] this is one of the reasons I gave up on posting videos a few years ago, it got too hard and the formats kept changing and being weirdly incompatible in unpredictable ways
#
[schmarty]
(that said, i am also a big advocate for promoting image and video proxies to be first-class indieweb building blocks because they are kickass!)
#
aaronpk
this is the reason i don't post more videos on my site too
#
[schmarty]
i smell an IWC session cooking 🍳
#
aaronpk
i don't have an automatic workflow for creating the HLS playlist or doing transcoding
#
[schmarty]
same! i stick to tiny videos so i don't feel bad about single-file mp4s 😅
#
aaronpk
yep, and even then i barely post videos because i don't have a good video micropub client 😂
#
[schmarty]
Shortcuts, baybeeeee (but also yes I feel you here)
#
aaronpk
so the few that i've done are creating a draft photo post, then going and uploading the video file over ftp later 😂
#
capjamesg[d]
I have decided to create a new repository for my images.
#
capjamesg[d]
And will only upload the new images if I can figure out how.
#
aaronpk
i had something working in shortcuts for a while but it broke and it's too difficult to debug so i gave up on it
#
capjamesg[d]
Thanks for your help, everyone!
#
capjamesg[d]
[schmarty] This would make for a great IWC session!
#
[schmarty]
i feel like mobile indieweb, at least on iOS, is markedly worse off than when I spoke about my setup at IWS 2019 https://archive.org/details/indieweb-summit-2019-own-your-mobile-experience
#
aaronpk
agreed
#
[tantek]
I mean honestly at this point I'm even further disincentivized to post any *recorded* images or video because I have no desire to contribute to training genAI models that will inevitably lead to unpredictable (and likely novel) abuse models 😞
#
[tantek]
whether of myself or especially of other people
#
[schmarty]
tantek: yep. i mean, my motivations are a complex ball of spaghetti but i feel pretty timid about putting anything new on the public web. 😩
#
aaronpk
i mean, that's a different story entirely
#
[snarfed]
and arguably applies to images and text too, right?
#
[snarfed]
er, sorry, to text
#
aaronpk
but i'd at least like to be able to post videos to a limited audience which has the same problems as posting publicly anyway
#
[schmarty]
for me it applies to my whole site. and i don't really consider it separate because it's about being able to post to my site. and i have to _want_ to post to my site before i will work on being able. 😅
#
aaronpk
yeah i am very motivated to figure out limited audience posting
#
[schmarty]
"i don't feel comfortable sharing this until i have a good limited audience solution" has different plumbing issues but it also prevents me from spending time on the rest of the barriers.
#
aaronpk
i went and privated a whole bunch of content on my website recently, including making a lot of the new content private by default
#
aaronpk
(and by private i really do mean private only visible to me)
#
[schmarty]
another good IWC session, probably 😄
#
[tantek]
^ when to post private by default
#
[tantek]
if someone figures out a publishing / storage workflow that keeps content from being default crawled by LLM training spiders, I'm listening
#
[tantek]
[snarfed] no text is different. I feel I have a much better grasp of the (ab)uses of LLMs trained on my text than on my photos/videos
#
[tantek]
and on the flip side, there is incentive to post text which will cause LLMs trained on it to say/do good things
#
[tantek]
it's like, pretend like you're a parent to every possible future LLM out there, what would you teach it to believe?
#
[tantek]
now keep that in mind while you write your public blog posts, as if your blog posts are teaching legions of new LLMs
#
capjamesg[d]
[tantek] Tracy has a good robots.txt for blocking AI bots: https://tracydurnell.com/robots.txt
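(The pattern is one blanket disallow per crawler token; GPTBot, CCBot, and Google-Extended are among the commonly blocked ones, though this only deters crawlers that honor robots.txt.)

```
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /
```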
#
capjamesg[d]
But, the pattern has been companies train models _then_ offer opt out.
#
[tantek]
capjamesg[d], robots.txt is only for "well-behaved" AI bots, and with the capitalist gold rush towards AI, startups by default are prioritizing "innovation" and "profit" over "well-behaved".
#
capjamesg[d]
I know, but it's better than nothing.
#
[tantek]
I'm not sure the time cost of writing up and maintaining your robots.txt is worth the marginal benefits
#
[tantek]
and yes the other problem ("companies train models _then_ offer opt out") is implications for POSSE
#
[tantek]
I am still reflecting on that
btrem joined the channel
#
aaronpk
interesting, `-c copy -hls_flags single_file` with ffmpeg copies the video without transcoding and creates a m3u8 playlist pointing to byte ranges in the same file, so that is a nice alternative to making a bunch of tiny files
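(Spelled out as a full command, with illustrative filenames: this remuxes without re-encoding and emits one playlist plus a single segment file addressed by byte ranges.)

```sh
# -c copy: no transcoding; -hls_list_size 0: keep all segments in the playlist;
# single_file: one output.ts, referenced via #EXT-X-BYTERANGE in output.m3u8.
ffmpeg -i input.mp4 -c copy -hls_time 6 -hls_list_size 0 -hls_flags single_file output.m3u8
```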
#
[tantek]
what is video
#
Loqi
video is a type of post where the primary content is a video file (recorded movie, animation etc.) typically with audio, and has growing support on the indie web https://indieweb.org/video
#
[tantek]
^ that page needs a lot of updating (lots of old/dead links) and perhaps some modern advice for how to actually make video posts that work
#
[tantek]
it is once again too hard
#
[tantek]
tbh, "upload your video to Internet Archive, then link/embed that" is my tl;dr when people ask me what path to take to publishing videos these days
#
superkuh
ffmpeg -movflags +faststart on some VPS, though I acknowledge "some VPS" implies cli skills.
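(That flag remuxes so the index, the "moov" atom, sits at the front of the file, letting playback start before the download finishes; filenames illustrative.)

```sh
# Streams are copied untouched; only the container layout changes.
ffmpeg -i input.mp4 -c copy -movflags +faststart output-faststart.mp4
```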
#
aaronpk
maybe that's what i should do, make a little video upload server of my own that only handles transcoding and m3u8 generation, then "upload my video to my media server and link/embed that" is my answer
#
[Joe_Crawford]
Oi, you're not kidding about the video page needing updating. It mentions Vine as if it had not gone defunct in 2017.
#
superkuh
I've been working on a HTML5 only (no JS) playlist sort of thing for switching between videos. It just uses CSS {display: none;} and :target {display: inline-block;} with anchors.
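(A minimal sketch of that approach, with made-up ids and filenames: each video stays hidden until its anchor is targeted.)

```html
<style>
  .playlist video { display: none; }
  .playlist video:target { display: inline-block; }
</style>
<div class="playlist">
  <a href="#clip1">Clip 1</a> <a href="#clip2">Clip 2</a>
  <video id="clip1" src="clip1.mp4" controls></video>
  <video id="clip2" src="clip2.mp4" controls></video>
</div>
```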
#
[Joe_Crawford]
Feel like it's such a large rewrite it deserves a group brainstorm if that's a thing. Video's evolution on the web deserves a proper history. It's supremely weird.
#
[Joe_Crawford]
No, really we just need to mention the indieweb use cases for video and what people expect to be able to do with video PESOS & POSSE.
#
[Joe_Crawford]
(but I do want a full history of web video! for me!)
#
[tantek]
[Joe_Crawford] no there are ways to incrementally improve the /video page section by section. no need to block on a rewrite
#
[Joe_Crawford]
of course you're right. my claim of what it needs is my admission that I have no idea where to start.
#
[tantek]
as with many things, at the beginning 🙂
#
[Joe_Crawford]
tantek++ for yoda shit
#
Loqi
tantek has 24 karma in this channel over the last year (105 in all channels)
#
chadsix
Web video was interesting. Mostly controlled by RealMedia until the spawn of DivX and later XviD. Things changed significantly from there as I recall anyway.
#
[qubyte]
Catching up… The reason I’ll eventually (once I get myself into gear) put my media into a bucket (and then stick a cache in front of it with immutable-ish headers) is so that git clones are light, but also (since I’m on netlify) I don’t want to slow deployments down when all I really need are references to files. I don’t want to mess around with git-lfs either. Images are all uploaded via a micropub endpoint, so retooling that to save
#
[qubyte]
to a bucket (and maybe pre-populate a cache) is pretty easy. It _is_ more moving parts though.
#
[Joe_Crawford]
I used to transcode to realmedia for game demos for Jamison/Gold. man that was slow. The videos were ridiculously small. like 200x300
#
chadsix
haha the good ol days!
#
[Joe_Crawford]
Had realmedia .ra and .ram files on my site until maybe 10 years ago.
#
[Joe_Crawford]
or was it .rm. one was a list of pointers? the other the data? can't recall.
#
chadsix
.rm xD
#
[Joe_Crawford]
and wmv alternative. and sometimes "just post a zipped up avi to download" -- so terrible.
#
chadsix
haha
thegreekgeek, gRegor and drizzt09 joined the channel
#
[KevinMarks]
Video time-to-byte-range mapping is built into mp4, as that is based on the QuickTime movie format, which got that right. Some other formats like Ogg and raw MPEG streams don't do that, so you have to guess the byte range to fetch when seeking. HLS/DASH reinvented this at the file-system layer with a playlist of files containing media chunks at known offsets, originally for live updating over TCP, and adopted for prerecorded playback
#
[KevinMarks]
because it was simpler to do than the full QT abstraction and enabled alternative-quality files per chunk to ease streaming ramp-up.
#
[KevinMarks]
QuickTime had really solved the entire audio video abstraction problem at every level, but Apple pissed it away by putting popup ads in QuickTime playback at the system level.
#
aaronpk
BlueSky is considering using the same new client discovery method as in issue 133
#
aaronpk
actually not considering, settled now
#
aaronpk
based on how easy it was for me to make the change in two indieauth servers, i want to encourage anyone else to try it in theirs asap and we can get this changed in the spec
#
aaronpk
[manton] can I interest you in a quick briefing on this? we can talk about the FedCM implications too
#
aaronpk
and GWG do you have bandwidth for making this change in the wordpress plugin?
#
GWG
I think I can. I need something to test against though
#
GWG
Should be easy enough to support
#
aaronpk
webmention.io does it now
#
GWG
Okay. I'll make the discovery pr next
#
aaronpk
oh and indielogin.com too
#
GWG
Which clients?
#
aaronpk
those clients
#
GWG
I thought indielogin wasn't a client traditionally
#
aaronpk
indielogin.com is an indieauth client if you log in with your website that is an indieauth server
sp1ff joined the channel
#
gRegor
aaronpk, I could try it on indiebookclub. Or is this an IndieAuth server update only?
#
aaronpk
it's both. it's more of a change for servers, a really minor change for clients
#
gRegor
Nice, I'll take a look