#dev 2023-03-18

2023-03-18 UTC
[asuh], IWSlackGateway, [tw2113_Slack_], [jacky], [tantek], HelloMatrina, gRegor, gRegorLove_, gRegorLove__, sivoais, [schmarty], bterry and [James_Van_Dyne] joined the channel
#
IWDiscordRelay
<c​apjamesg#4492> What is the best way to manage a dozen apps on a server? Docker? Tmux?
[James_Van_Dyne] joined the channel
#
IWDiscordRelay
<c​apjamesg#4492> [KevinMarks] do you know anything about vector stores?
#
IWDiscordRelay
<c​apjamesg#4492> I am trying to understand how faiss lets you query distributed indices.
#
IWDiscordRelay
<c​apjamesg#4492> They seem to use RPC.
[KevinMarks] joined the channel
#
[KevinMarks]
Anything like this involves a mixture of approaches to index compression. That explains the techniques they used, but how well it works will depend on the structure of the data.
#
[KevinMarks]
If you know the data well and what you want to use it for, you can squash it more for that domain, e.g. https://research.google/pubs/pub46522/
#
[KevinMarks]
The token embeddings have a high dimensionality but are very sparse in the raw form. The networks that model them are already more concentrated, but due to some degree of randomisation during training will have a different degree of sparseness, so coming up with representations that compress them well is a challenging problem. I expect that we will see ways to aggressively prune or compress models for specific applications.
#
IWDiscordRelay
<c​apjamesg#4492> How does an index split across multiple systems work?
#
IWDiscordRelay
<c​apjamesg#4492> Sharding?
Xe and mambang[m] joined the channel
#
[KevinMarks]
There is more than one way of doing sharding. If you can hash the thing being queried, you can direct the query to a single shard (or a few, if doing something like Chord). If you can't, you can query all shards in parallel and then gather the results. The challenges are in balancing the shards.
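(Editor's sketch of the two sharding approaches described above, in Python. This is a toy illustration, not how faiss does it: the function names `shard_for_key`, `local_top_k` and `scatter_gather`, the five sample vectors and the shard count are all hypothetical. Shards are plain in-memory dicts, the per-shard search is brute force, and the loop over shards stands in for parallel RPC calls.)

```python
import hashlib
import heapq
import math


def shard_for_key(key: str, num_shards: int) -> int:
    """Hash routing: a hashable key maps to exactly one shard."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def local_top_k(shard, query, k):
    """Brute-force search inside one shard; returns (distance, id) pairs."""
    scored = ((math.dist(query, vec), doc_id) for doc_id, vec in shard.items())
    return heapq.nsmallest(k, scored)


def scatter_gather(shards, query, k):
    """Ask every shard for its local top k, then merge into a global top k."""
    # Assumption: a real system would issue these calls in parallel over the network.
    partials = [local_top_k(shard, query, k) for shard in shards]
    return heapq.nsmallest(k, (hit for part in partials for hit in part))


# Hypothetical toy data: document id -> 2-D vector.
vectors = {
    "a": (0.0, 0.0),
    "b": (1.0, 0.0),
    "c": (0.0, 2.0),
    "d": (3.0, 3.0),
    "e": (0.5, 0.5),
}

NUM_SHARDS = 3
shards = [dict() for _ in range(NUM_SHARDS)]
for doc_id, vec in vectors.items():
    # Writes use hash routing, so a lookup by id only touches one shard.
    shards[shard_for_key(doc_id, NUM_SHARDS)][doc_id] = vec

# A nearest-neighbour query cannot be hashed to one shard, so it fans out to all.
results = scatter_gather(shards, (0.0, 0.0), k=2)
nearest_ids = [doc_id for _, doc_id in results]
```

(Because each shard returns its own best k candidates, the merged result matches a brute-force search over the whole collection, whichever shard each vector landed on. Shard balance here depends entirely on how evenly the hash spreads the ids.)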
#
IWDiscordRelay
<c​apjamesg#4492> Presumably to do KNN you’d need to compare against everything?
[manton] joined the channel
#
[manton]
[snarfed] I’m considering adding .well-known/nodeinfo support so Micro.blog is included in stats trackers like https://fediverse.observer/stats. Curious if Bridgy Fed supports this?
[timothy_chambe], [TMichelleMoore], sivoais, [chrisbergr], geoffo, chenghiz_, mro and [snarfed] joined the channel
#
Loqi
[preview] [snarfed] #401 Implement nodeinfo protocol