#dev 2024-09-13

2024-09-13 UTC
aaronpk, for your review (and anyone else interested in a summary of identity on the web) https://github.com/w3c/identity-web-impact/pull/39
[preview] [tantek] #39 add IndieAuth to Standards section with TR link
oh boy
I mean, assuming you're ok with IndieAuth being mentioned in that list
yes, simone should already know about it too, because of the FedCM work
geoffo, [0x3b0b], [pfefferle], gRegorLove_, ttybitnik, [sebbu], [qubyte], [tantek], IWSlackGateway, krove, nnrx_, srushe, lockywolf, lockywolf_, cuibonobo, barnaby, ancarda, capjamesg, vikanezrimaya, suki, nnrx, okCiel and roxwize joined the channel
jonnybarnes and eb joined the channel
to2ds joined the channel
indielogin.com doesn't work with punycode domains
(i mean, in unicode, not using punycode)
gRegorLove_ joined the channel
Unfortunately I think my static site generator (vitepress) is just too slow for a site that needs to render 100k different likes from decades of reddit and YouTube activity.
No matter what, I'm looking at a lot of work converting my site to another platform. I could go with a ssg or honestly I don't mind making something that needs a server to run on, so it can render on request. What tools are popular around here? I think I hear about 11ty a lot, but is it really really fast? That's a requirement for me now
thepaperpilot[d] could your code be slow?
By fast I technically mean not resource intensive. Shouldn't require a dozen gbs of ram and several hours to finish
pcarrier[d]: Well "my code" is just 100k markdown files vitepress is rendering, some css modifications, and a single simple element added to the layout template. I don't think the little code I have plays a part
thepaperpilot[d] OK. and rendering 100k markdown files takes how long?
It fails after about 2 hours because the heap maxes out. 32 gb
I asked the vitepress discord and they told me no ssg could do that... Which doesn't sound right to me
I'll try
It seems to me like once it writes a file it should be removed from memory, so running out shouldn't be a possibility unless I have a single file that is 32gb (I do not)
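A minimal sketch of the "write a file, then drop it from memory" idea, assuming pages have no cross-file dependencies. This is not how vitepress works internally. `render_page` and `render_all` are hypothetical names, and the renderer is a placeholder that only escapes the text instead of parsing markdown:

```python
import html
from pathlib import Path


def render_page(markdown_text: str, title: str) -> str:
    # Placeholder renderer (assumption): escapes the source and wraps it in <pre>.
    # A real build would call a markdown library here.
    body = html.escape(markdown_text)
    return f"<!doctype html><title>{html.escape(title)}</title><pre>{body}</pre>\n"


def render_all(src_dir: Path, out_dir: Path) -> int:
    """Render every .md file under src_dir, holding only one page in memory at a time."""
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    # rglob yields paths lazily, so the full file list is never materialized.
    for src in src_dir.rglob("*.md"):
        dest = out_dir / src.relative_to(src_dir).with_suffix(".html")
        dest.parent.mkdir(parents=True, exist_ok=True)
        text = src.read_text(encoding="utf-8")
        dest.write_text(render_page(text, src.stem), encoding="utf-8")
        count += 1  # text goes out of scope on the next iteration
    return count
```

Memory stays flat here only because no page needs to see any other page. Index or tag pages that list every post would need a separate pass over lightweight metadata.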
thepaperpilot[d] you can have dependencies between files (like access the collection from a template)…
OK I have created 100k md files in a directory. even `ls` is slow.
I don't know of anything in vitepress called a collection. I do have tags, but those are just index pages I generated paginated markdown files for in a script that runs before the build.
pcarrier[d]: Yeah I have noticed vs code does not like that folder being expanded, and explorer struggles as well. I still think it should be possible
I mean, I have a script that writes those files from the raw reddit and YouTube data and that works just fine. So a ssg should be able to handle it imo
But if I'm wrong, and need something like a DB that gets queried to render pages JIT, then fair enough. Do you or someone else have any recommendations?
sqlite3 and php/python/ruby/whatever-you-prefer are fine depending on your hosting options
i can't think of a filesystem that doesn't choke a bit on so many files in the same folder. if you can, i suggest grouping them in subfolders, maybe by year or month, to keep the number of files in a folder in the low-thousands-or-less.
I guess the trickiest part if it's a dynamic site is hooking it up to indiekit. Storing files on a git repo was going to be quite convenient for that
a number of modern *nix filesystems easily handle millions to billions of files, even in the same dir, but agreed, not all
(eg XFS, ZFS)
I'm locally developing on windows, but the server is some flavor of Linux, I don't recall which (I just use docker containers for everything, so the differences between distros rarely come up)
yeah linux, ls on zfs with 100k files -> 90ms
in any case, 100k entries will be nothing for sqlite3
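A rough sketch of what 100k entries in sqlite3 could look like, using only the Python standard library. The `likes` table, its columns, and the generated rows are all made up for illustration and are not the real Reddit or YouTube export format:

```python
import sqlite3


def build_db(n: int = 100_000) -> sqlite3.Connection:
    """Create an in-memory database holding n synthetic 'like' rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE likes ("
        " id INTEGER PRIMARY KEY,"
        " slug TEXT UNIQUE NOT NULL,"  # UNIQUE also creates an index for lookups
        " source TEXT NOT NULL,"
        " title TEXT NOT NULL)"
    )
    rows = (
        (f"like-{i}", "reddit" if i % 2 else "youtube", f"Liked item {i}")
        for i in range(n)
    )
    with conn:  # one transaction for the whole bulk insert
        conn.executemany(
            "INSERT INTO likes (slug, source, title) VALUES (?, ?, ?)", rows
        )
    return conn


def find_like(conn: sqlite3.Connection, slug: str):
    """Return (source, title) for a slug, or None if it does not exist."""
    return conn.execute(
        "SELECT source, title FROM likes WHERE slug = ?", (slug,)
    ).fetchone()
```

Wrapping the inserts in a single transaction matters more than anything else for bulk-load speed. For a real site, `":memory:"` would be replaced by a file path.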
Alright, so in theory a ssg should be fine, but I'll need to find a way for indiekit to split up the folders the posts go into by month
but yeah hard to know what "acceptable" performance should be for an SSG w/100k input files
SSG scaling is very different from filesystem scaling
i'm a fan of keeping folders relatively small anyway
that's why i went with YYYY/mm/dd/ folders, so the most files in a folder is the most number of posts I make on a day, which is usually well under 100
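A small sketch of moving a flat folder of posts into YYYY/mm/dd subfolders. It assumes each file name starts with a `YYYY-MM-DD-` prefix, which may not match the actual export. `shard_by_date` is a hypothetical helper, not part of indiekit or vitepress:

```python
import re
import shutil
from pathlib import Path

# Assumption: file names look like "2024-09-13-some-post.md".
DATE_PREFIX = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-")


def shard_by_date(flat_dir: Path, dest_root: Path) -> int:
    """Move dated .md files from flat_dir into dest_root/YYYY/mm/dd/. Returns the number moved."""
    moved = 0
    # Snapshot the listing first, since files are moved while iterating.
    for path in list(flat_dir.glob("*.md")):
        match = DATE_PREFIX.match(path.name)
        if match is None:
            continue  # leave files without a date prefix where they are
        year, month, day = match.groups()
        target_dir = dest_root / year / month / day
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(target_dir / path.name))
        moved += 1
    return moved
```

If the dates live in front matter rather than in file names, the regex step would be replaced by reading the date from each file.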
I mean, my website won't be critical to access quickly, so I don't mind if it takes a while to build. I just need it to succeed without crashing
Looking at the indiekit Jekyll preset it's already set up to split posts up by year/month/day folders, so that part will be handled. I guess I'll try to get my local posts in that structure and see if vitepress handles the posts better that way
Ruby on Linux in one zfs directory can create 100k files and read 100k files in less than 2s. https://gist.github.com/pcarrier/7a4fed67660271e6ffe4298a5191fa66
Thanks, if vitepress works out that'll save me a lot of work
but SSGs I know do a lot of complex work, you might be better off with your own tooling whether SSG or SSR
(roughly the same timing with ext4 FWIW)
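For anyone who wants to repeat that kind of measurement on their own filesystem, here is a rough Python sketch of the same idea. It is not the contents of the linked Ruby gist; `bench` and the file naming are assumptions for illustration, and timings will vary by machine and filesystem:

```python
import os
import time
from pathlib import Path


def bench(directory: Path, n: int = 100_000) -> tuple[float, float, int]:
    """Create n small files in one directory, then read them all back.

    Returns (create_seconds, read_seconds, files_read).
    """
    start = time.perf_counter()
    for i in range(n):
        (directory / f"{i}.md").write_text(f"# post {i}\n", encoding="utf-8")
    create_seconds = time.perf_counter() - start

    start = time.perf_counter()
    files_read = 0
    # scandir streams directory entries instead of building a full list.
    with os.scandir(directory) as entries:
        for entry in entries:
            with open(entry.path, encoding="utf-8") as f:
                f.read()
            files_read += 1
    read_seconds = time.perf_counter() - start
    return create_seconds, read_seconds, files_read
```

Calling `bench(Path(some_empty_dir))` on the target server gives a baseline for raw file I/O, which helps separate filesystem cost from whatever the SSG itself is doing.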
rozenglass joined the channel
pcarrier[d]: Nice, ext4 is probably what my vps has
Well if you do SSR then use a single sqlite3 and it doesn't matter
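A minimal sketch of rendering on request from a single sqlite3 database, written as a plain WSGI app with the Python standard library. The `posts` table, its `slug`, `title` and `body` columns, and `make_app` are hypothetical and only illustrate the shape of the approach:

```python
import html
import sqlite3


def make_app(conn: sqlite3.Connection):
    """Build a WSGI app that renders one post per request, looked up by URL path."""

    def app(environ, start_response):
        slug = environ.get("PATH_INFO", "/").strip("/")
        row = conn.execute(
            "SELECT title, body FROM posts WHERE slug = ?", (slug,)
        ).fetchone()
        if row is None:
            start_response(
                "404 Not Found", [("Content-Type", "text/plain; charset=utf-8")]
            )
            return [b"not found\n"]
        title, body = row
        page = (
            f"<!doctype html><title>{html.escape(title)}</title>"
            f"<h1>{html.escape(title)}</h1><p>{html.escape(body)}</p>"
        )
        start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
        return [page.encode("utf-8")]

    return app
```

For local testing, the app can be served with `wsgiref.simple_server.make_server("", 8000, make_app(conn)).serve_forever()`. Nothing is built ahead of time, so the number of posts affects only the database size, not build time or memory.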
If splitting the files into folders does not allow the build to finish successfully, I'm going to try migrating to Nuxt. I get to keep working on Vue, I think it'll be faster out of the box, and it supports both ssg and isr (and others) so I should be able to at least keep my code even if I need to change between a static and dynamic server
ttybitnik, sandra and rozenglass joined the channel