#dev 2022-10-06

2022-10-06 UTC
KartikPrabhu joined the channel
#
KartikPrabhu
Loqi: any chatter for me?
#
[tantek]4
welcome back KartikPrabhu!
#
barnaby
heya kartik
#
KartikPrabhu
yo barnaby
#
[tantek]4
capjamesg, so I won't forget, can you file an issue asking to promote auto_url_summary from cassis-lab to cassis.js core based on your desire to re-use / port it?
#
GWG
KartikPrabhu: Welc5
#
GWG
Welcome
#
KartikPrabhu
I'll take a Welc5 too
jacky and [fluffy] joined the channel
#
GWG
Sorry, hand still shaky
geoffo, jacky, [tw2113_Slack_] and maxwelljoslyn joined the channel
jacky joined the channel
#
GWG
aaronpk: We were discussing your future plans for indieauth.com at HWC. Do you have any recent thoughts?
#
aaronpk
Nothing new
#
GWG
So, should we be telling random new people not to use it was the question
#
GWG
We were just saying, we don't have anything better
#
aaronpk
if anyone else wants to run a public IndieAuth server for people I would be happy to recommend that
#
GWG
So, no alternative suggestions.
#
GWG
aaronpk: I think angelo was volunteering. So, what advice would you give him?
#
aaronpk
Hmm, I have a lot of ideas about how I would build a new one differently, would it be useful to write them all down so someone can copy that?
#
GWG
aaronpk: I think so. Because that's what we were discussing
#
GWG
I think the question was at the most basic level about avoiding the high level issues, though the low level would make a lot of us happy.
jacky and tbbrown joined the channel
#
aaronpk
I was thinking more about the high level stuff anyway
#
GWG
I would love to read it
#
jacky
I'm curious about the high level too
#
jacky
b/c there's things around states that are great tradeoffs if you don't keep a lot of it (like for just rel-me and things required on fetch)
jacky joined the channel
#
capjamesg
Oof thanks for sharing angelo. I can't support a library that explicitly offers options for that sort of thing.
#
capjamesg
I was only looking at the page on getting URLs from a sitemap and the hashtag / @ sign detection.
#
capjamesg
I had to write this code for my SSG. Now I wonder whether it could be in IndieWeb utils.
mro, jacky and gxt joined the channel
#
@zimmergren
↩️ I love the idea of webmentions. I'm not a CSS magician anymore, so obviously it needs some work - but this is pretty neat, and the engagement is clearer. At the end of each post, it's pulling in any social mentions for that specific URL, mainly from Twitter
(twitter.com/_/status/1577935451427639296)
#
@zimmergren
↩️ I'm using native webmentions, pulling in social comments directly to the published posts. I'm leaning into disabling disqus, but I have thousands of comments there that I really don't want to lose out on - so first figuring out a way to map those to the posts, too.
(twitter.com/_/status/1577935030705348608)
jacky and gRegorLove_ joined the channel
#
capjamesg
[tantek] You can review my implementation here: https://github.com/capjamesg/indieweb-utils/pull/69/files
#
capjamesg
src/indieweb_utils/utils/url_summary.py and tests/test_utils/test_url_summary.py are the relevant files.
jacky, mro, gRegorLove__, tetov-irc, KartikPrabhu, barnaby, gRegor, geoffo and gRegorLove_ joined the channel
#
angelo
capjamesg there's some useful stuff in there no doubt. i just find a robot creator offering users of their robot the ability to disobey other robots kind of ominous.
#
sknebel
its a fairly standard thing with any kind of scraping library
#
sknebel
or automated download tool etc
#
sknebel
e.g. wget has the same afaik
#
sknebel
yep. although I expected worse from the title :D
petermolnar joined the channel
#
barnaby
the archive.org wiki has a strong opinion and some background about ignoring robots.txt being fine https://wiki.archiveteam.org/index.php/Robots.txt
#
sknebel
archiveteam != archive.org
#
capjamesg
archiveteam is an independent collective of archivers.
#
sknebel
but yes. there's plenty well-argued reasons why one might want to ignore robots.txt
#
capjamesg
Although this is super interesting.
#
sknebel
(archive.org also does it in some instances)
#
capjamesg
Ignore robots.txt?
jacky joined the channel
#
capjamesg
Do you know what those cases are?
#
capjamesg
Important websites deemed by some criteria?
Seirdy joined the channel
#
capjamesg
Ah I see.
#
sknebel
yep, that
#
[Murray]1
jamietanna++ was genuinely considering _yesterday_ whether I should build my own feed reader, because I've gotten fed up with the offerings that are out there. But maybe not... 😂
#
Loqi
jamietanna has 6 karma in this channel over the last year (14 in all channels)
#
capjamesg
[Murray] Build a Microsub reader :D
#
jacky
we need more microsub readers
#
capjamesg
Building the reader was a lot easier than the server.
#
capjamesg
(for me)
#
[Murray]1
I guess it's more the server that I'm after, though (although, tbh, having just re-read the /Microsub page, I'm not really clear on what the distinction _is_)
#
[schmarty]
i've been wondering if aaronpk would be open to community additions to Aperture :}
#
sknebel
I never really put my inoreader-micropub-proxy thing into production
#
sknebel
because the clients didnt feel right
#
[Murray]1
I just want something that can give me a list of links to the posts I have not yet read on someone's website 😂 And for that to work well on mobile devices. (And not arbitrarily decide to remove posts older than X, which is my issue with most services)
#
[Murray]1
TheOldReader works well for point 1, but I'm increasingly preferring to read blog posts on my phone, so that means I barely touch my RSS feeds anymore
#
capjamesg
[Murray] The server does the feed processing and the client shows the jf2 from the feed.
#
[Jamie_Tanna]
I was thinking of building my own Microsub server cause I want to do people-first channels rather than feed-first (which I know as a concept we've talked about but not yet done) and even if it then just manipulated Aperture as a backend, I'd need to implement some of the stuff myself
#
capjamesg
My server read Atom, RSS, mf2, jf2, and JSON feeds, moved them into jf2, ready for serving to clients.
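The normalization step described here (many feed formats in, jf2 out) could look roughly like the sketch below for the RSS case only. This is not the code from capjamesg's server; the function names and the subset of mapped fields are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def rss_item_to_jf2(item):
    """Map one RSS <item> element to a jf2-style entry dict (illustrative subset)."""
    entry = {"type": "entry"}
    title = item.findtext("title")
    if title:
        entry["name"] = title.strip()
    link = item.findtext("link")
    if link:
        entry["url"] = link.strip()
    pub_date = item.findtext("pubDate")
    if pub_date:
        try:
            # RSS dates are RFC 822 style; store them as ISO 8601 instead.
            entry["published"] = parsedate_to_datetime(pub_date.strip()).isoformat()
        except (TypeError, ValueError):
            pass  # unparseable date: leave the property out
    description = item.findtext("description")
    if description:
        entry["content"] = {"html": description.strip()}
    return entry


def rss_to_jf2_feed(xml_text):
    """Convert an RSS document (as a string) to a jf2-style feed dict."""
    root = ET.fromstring(xml_text)
    return {
        "type": "feed",
        "children": [rss_item_to_jf2(item) for item in root.iter("item")],
    }
```

A real server would need equivalent mappers for Atom, mf2 and JSON Feed, so that clients only ever see the one jf2 shape.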
#
[Murray]1
well, guess I should take another look at Microsub options then. I do not have the will to get into building a back end for that; I was planning on going a very similar route to that article (feedparser + 11ty/Astro + an edge function) but even that seems too much now 😄
#
sknebel
capjamesg: did you publish your server somewhere?
#
[Murray]1
it's on the Microsub page by the looks of things
#
[Murray]1
unless I'm getting things confused again
#
Loqi
[capjamesg] cinnamon: A social reader built with Python Flask.
#
sknebel
could've thought of checking there :D thx!
#
[Jamie_Tanna]
More about the bigger vision for what I want to build https://www.jvt.me/posts/2021/05/01/social-reader-features/
#
Loqi
[Jamie Tanna] Features I Want In My Social Reader
[ggirelli] joined the channel
#
[tantek]4
capjamesg, interesting restructuring of the auto_url_summary code!
#
[tantek]4
I think part of the reason I didn't break it down to smaller functions e.g. for github and social media the way you did is I was hoping that with all the code inline in one function, more overall patterns would emerge that could help rewrite from a giant if/else to something that looked things up in a static table
#
capjamesg
I broke up the code because linting fails when a function gets too complex.
#
capjamesg
Too many if statements would have caused an issue in that regard.
#
[tantek]4
also part of the reason why I kept in cassis-lab, it felt like a total hack/prototype piece of code that I wouldn't want others re-using/copying because of its unsustainable structure lol
#
capjamesg
Well...
#
capjamesg
About that...
#
capjamesg
I wanted to restructure it too.
#
[tantek]4
linting fails?!? that seems like a very poor way to solve the problem
#
capjamesg
But I couldn't think of an elegant solution.
#
[tantek]4
as in, breaking it up in to more functions obfuscates the hackishness
jacky joined the channel
#
capjamesg
[tantek] It's also about my code style preferences / readability.
#
capjamesg
I'm more than open to an alternate solution.
#
[tantek]4
I'd say that's actually *worse* for the eventual rewrite
#
capjamesg
This is MVP. Not library-ready.
#
[tantek]4
I'm a fan of keeping hackishness of code transparent front & center so it shows up more obviously as something to more "globally" fix
#
capjamesg
I'll not be pushing for this to be done in v0.3.0.
#
[tantek]4
rather than hiding the hackishness amongst a bunch of functions
#
[tantek]4
coding style preference sure, however IMO it's a code-maintenance culture thing too, as well as a way to "signal" where something should/should not go
#
capjamesg
[tantek] I thought about a nested dictionary.
#
capjamesg
Then I started to think about how substitutions would work but that would get ugly.
#
capjamesg
Did you think about alternate data structures for this?
#
jacky
perks up at refactoring talk
#
[tantek]4
capjamesg, still thinking about it
#
[tantek]4
I knew I had blogged about keeping ugly code ugly deliberately before: https://tantek.com/log/2005/11.html#d26t1820
petermolnar joined the channel
#
[tantek]4
in the context of "hack should (or MUST in the RFC2119 sense if you prefer):"
#
[tantek]4
"3. Be ugly. It's actually a *good thing* that a hack be visually ugly from a coding aesthetic point of view in the hopes that the ugliness will be a reminder that the hack *is* a hack, and should incite a tendency for people to a) minimize it's usage, and b) remove it's usage over time."
#
jacky
^ re: hacks, I've def gone heavy with commenting
#
jacky
as I've learned that my memory isn't the best
#
jacky
and even trying to write down elsewhere _why_ and _where_ it exists
#
jacky
question about making an indieauth provider: I'm considering using e-mail as a way to provide sign-in for one's site in the same vein that one would do it with rel-me auth with github/twitter etc
#
jacky
the flow I have in mind (will document on wiki soon) would be that I'd look for e-mail addresses in one's site (namely one marked up with rel-me info) and attempt to send an e-mail to it
#
jacky
it'd be a bit similar to how the pgp key flow works
#
jacky
I know this is something I'd be using (before I get push notification auth going), does anyone else see themselves using something like this (or suggesting it)?
#
jacky
got the idea from micro.blog tbh
#
[schmarty]
IndieAuth.com does this as well
#
sknebel
isnt that just like indieauth.coms method?
#
sknebel
I personally am not a big fan of that method, but thats just personal preference
#
jacky
oh I did _not_ see that on https://indieauth.com/setup wow
#
jacky
just did a ^F
#
sknebel
one implementation hint: build it in a way it works if the link is clicked on a different device!
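One way to read sknebel's hint: tie the emailed link to the login attempt on the server, not to a cookie in whichever browser opens the link. A minimal in-memory sketch follows; all names here (`PENDING`, `start_login`, `confirm_link`, `session_is_authenticated`) are hypothetical and not taken from indieauth.com or any other existing provider.

```python
import secrets
import time

# token -> pending login record; a real service would use a shared store.
PENDING = {}
TOKEN_TTL = 600  # seconds a login link stays valid (assumed value)


def start_login(session_id, email, now=None):
    """Create a one-time token to embed in the emailed link."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    PENDING[token] = {
        "session_id": session_id,
        "email": email,
        "expires": now + TOKEN_TTL,
        "confirmed": False,
    }
    return token


def confirm_link(token, now=None):
    """Called when the link is opened, from any device or browser."""
    now = time.time() if now is None else now
    record = PENDING.get(token)
    if record is None or record["expires"] < now:
        return False
    record["confirmed"] = True
    return True


def session_is_authenticated(session_id, now=None):
    """Polled by the browser that originally started the login."""
    now = time.time() if now is None else now
    return any(
        r["session_id"] == session_id and r["confirmed"] and r["expires"] >= now
        for r in PENDING.values()
    )
```

The browser that started the flow keeps polling `session_is_authenticated`, so it does not matter if the email is opened on a phone while the login began on a laptop.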
#
jacky
sknebel: yeah tbh it's def a stop gap for people who don't have a github account nor want to associate things with their twitter account (personal / work / not-work) etc
#
sknebel
mh, not even necessarily imho
#
sknebel
well the publishing of the address is a bit meh
#
sknebel
could encrypt that if you really wanted
#
jacky
that's a v2-esque kind of thing (to use pgp to potentially encrypt the message)
#
sknebel
i meant the mail address
#
sknebel
so its not published
#
jacky
ahhh yeah yeah
#
sknebel
have a thing on the website that lets the user encrypt the address with your services key
#
sknebel
and then they can put that on the page in a special attribute instead
jacky joined the channel
#
jacky
oh like the inverse of the pgp-flow of indieauth.com
#
sknebel
and so your service can decrypt the address and send the email, but its useless for anything else
#
sknebel
while you dont need to keep a user database in the service
#
jacky
takes a note
jacky joined the channel
#
angelo
aaronpk i'd love to hear your ideas on an indieauth service
gRegorLove_, gRegorLove__ and jacky joined the channel
#
capjamesg
[tantek] This is interesting.
#
capjamesg
I had never thought about deliberately hack-y code as a way to signal code may be replaced.
#
capjamesg
[tantek]++
#
Loqi
[tantek] has 22 karma in this channel over the last year (72 in all channels)
#
barnaby
in taproot, I have a dedicated hacks.php file as a dumping ground for everything which I know isn’t particularly elegant but just wanted to get working
petermolnar and jacky joined the channel
#
capjamesg
This is sort of hacky too but it works :D
#
capjamesg
But it doesn't do error handling.
#
capjamesg
The code now uses a list of lambda rules and applies the transformation.
#
capjamesg
The advantage of this is that I can just have a big dictionary of rules rather than a list of functions.
#
capjamesg
I'd love Python devs here to weigh in on the elegance of this though :D
#
capjamesg
I have never used this design pattern before.
#
capjamesg
cc sknebel [James_Van_Dyne]
#
capjamesg
This is a job for Lisp :D
#
[James_Van_Dyne]
I think you’re better making it “private” functions and adding it into the list so you can type / have comments and so forth. It also allows you to pass / wrap the function if you need to
#
[James_Van_Dyne]
But yeah - lgtm
#
sknebel
capjamesg: could use match statement too maybe, if really new python is fine
#
sknebel
but using a dict like that is kind of an established pattern, yes
#
sknebel
two nits: a) I'd prefer None over "", and b) question if it should terminate as soon as one function returned a string or if it should collect all
#
capjamesg
[tantek] This code eliminates lots of "if domain == whatever" statements by doing a lookup in a dict.
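The linked code is not reproduced in this log. As a rough sketch of the pattern being discussed (a dict keyed by domain in place of a chain of `if domain == ...` checks), assuming hypothetical rule functions and summary wording, and folding in the suggestions above to use named private functions and to return `None` when nothing matches:

```python
from urllib.parse import urlparse


def _summarize_github(path_parts):
    # Hypothetical rule: /user/repo or /user
    if len(path_parts) >= 2:
        return f"a GitHub repository: {path_parts[0]}/{path_parts[1]}"
    if len(path_parts) == 1:
        return f"a GitHub user: {path_parts[0]}"
    return None


def _summarize_twitter(path_parts):
    # Hypothetical rule: /user/status/id
    if len(path_parts) >= 3 and path_parts[1] == "status":
        return f"a tweet by @{path_parts[0]}"
    return None


# One lookup table replaces the if/elif chain on the domain.
SUMMARY_RULES = {
    "github.com": _summarize_github,
    "twitter.com": _summarize_twitter,
}


def url_summary(url):
    """Return a short summary for a known URL shape, or None."""
    parsed = urlparse(url)
    domain = parsed.netloc.lower()
    if domain.startswith("www."):
        domain = domain[4:]
    rule = SUMMARY_RULES.get(domain)
    if rule is None:
        return None
    path_parts = [part for part in parsed.path.split("/") if part]
    return rule(path_parts)


print(url_summary("https://twitter.com/indieweb/status/1"))  # a tweet by @indieweb
```

Note that the per-site logic still lives in separate functions here, which is the part [tantek] objects to further down.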
#
sknebel
"or none" is unneeded in that
#
capjamesg
Haha yes!
#
capjamesg
conciseness++
#
Loqi
conciseness has 1 karma over the last year
#
sknebel
and thus you also dont need the lambda
#
capjamesg
A bit more polished.
[manton] joined the channel
#
[tantek]4
hmm, maybe that's a thing in Python but in general I'm not a fan of the "array of functions to call" method. that feels like it is "merely" obscuring the hackiness underneath a clean-ish looking layer
#
[tantek]4
I'd still push for it to be purely data-driven and all the logic in one function, not split across functions
#
[tantek]4
again, keeping all the logic in one function makes it easier to see it all at once and develop insights about more global optimizations / redesigns
#
capjamesg
[tantek] This does remove the need for the if domain == statements.
#
capjamesg
Because that bit is a dictionary lookup.
#
capjamesg
I do get your point.
#
capjamesg
How would you recommend doing this?
#
capjamesg
Algorithmically, not in actual code.
#
barnaby
in cases like this I usually prefer a long if/elif chain, at least until one or more of the blocks becomes longer than a few lines
#
capjamesg
Reducing this down to its basics, we need to: 1) get parts of a domain and path; 2) interpolate a string with those values.
#
capjamesg
I could do it that way in IndieWeb utils but I wonder what the better solution is if there is one.
#
capjamesg
Here comes angelo with the regex approach!
#
capjamesg
Ah this is amazing.
#
barnaby
having lists of functions is useful when the library user wants/needs to be able to alter the list
#
angelo
updated to pull the list out as a global
#
capjamesg
barnaby I'd adjust this to have a custom_properties list of rules that a user can add.
#
capjamesg
*that a user can add to
#
angelo
you could replace the format string with the name of a function and try: to call function(**match) and fall back to the format string case
#
capjamesg
Then do a for pattern, summary in url_summary_templates + custom_properties:
#
capjamesg
angelo ^
#
angelo
updated my gist for completeness.. something like that
#
capjamesg
Want to submit a PR to IndieWeb Utils? :D
#
capjamesg
I love this solution and it fits in with our library design choices.
#
capjamesg
angelo I updated mine to add a base case.
#
capjamesg
I added more examples too.
#
angelo
i'd call it custom_templates. NEVER provide a list as the default value of a keyword argument in your signature. i'd use urlparse to grab the netloc. ALWAYS minimize use of regex.
#
angelo
those github regex's are going to fail for certain usernames/projects so a relaxing of those patterns is in order
#
angelo
and i'd chop off the protocol as well so it doesn't take up space in the template definitions
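angelo's gist is not included in this log, so the following is only a guess at the shape being described: a global list of (pattern, template) pairs, named regex groups interpolated into a format string, an optional callable in place of the template, `custom_templates` defaulting to `None` instead of a list, `urlparse` for the netloc, and the protocol dropped before matching. The patterns and summary wording are assumptions.

```python
import re
from urllib.parse import urlparse

# (pattern, template) pairs matched against "netloc/path", protocol removed.
# [^/]+ keeps the user/repo groups permissive about allowed characters.
URL_SUMMARY_TEMPLATES = [
    (r"github\.com/(?P<user>[^/]+)/(?P<repo>[^/]+)/pull/(?P<number>\d+)(?:/.*)?",
     "pull request #{number} on {user}/{repo}"),
    (r"github\.com/(?P<user>[^/]+)/(?P<repo>[^/]+)/?",
     "the GitHub repository {user}/{repo}"),
    (r"twitter\.com/(?P<user>[^/]+)/status/\d+",
     "a tweet by @{user}"),
]


def url_summary(url, custom_templates=None):
    """Return a summary built from the first matching template, or None."""
    parsed = urlparse(url)
    netloc = parsed.netloc.lower()
    if netloc.startswith("www."):
        netloc = netloc[4:]
    target = netloc + parsed.path
    # None as the default avoids sharing one mutable list between calls.
    templates = URL_SUMMARY_TEMPLATES + list(custom_templates or [])
    for pattern, template in templates:
        match = re.fullmatch(pattern, target)
        if match is None:
            continue
        groups = match.groupdict()
        if callable(template):
            return template(**groups)  # function instead of a format string
        return template.format(**groups)
    return None


print(url_summary("https://github.com/capjamesg/indieweb-utils/pull/69/files"))
# pull request #69 on capjamesg/indieweb-utils
```

Because built-in templates are tried first, a custom template can add new sites but cannot override a built-in one; reversing the concatenation order would flip that trade-off.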
#
capjamesg
Want to contribute a solution?
#
Loqi
[capjamesg] indieweb-utils: Utilities to aid the implementation of various IndieWeb specifications and functionalities. Built with Python.
jacky joined the channel
#
angelo
where are you using this code?
#
capjamesg
In the library?
#
capjamesg
You can create a new file called src/indieweb_utils/utils/url_summary.py and put your code in there.
#
capjamesg
I can write the test cases if you'd like.
#
capjamesg
Otherwise, test cases would go in tests/test_utils/
#
angelo
i mean what is the use case for calling this function?
#
capjamesg
It lets you get a summary of a page without having to make a HTTP request.
#
angelo
when do you want a summary?
#
capjamesg
You could add reply contexts without having to retrieve the full page / as a fallback.
#
capjamesg
For instance, I might want to reply to a post on Twitter.
#
capjamesg
But instead of embedding the Tweet / copying the text, I could use this function to get a summary.
#
capjamesg
But of course this would work for all programmed cases.
#
capjamesg
I'd use it for displaying replies on a personal website.
#
angelo
i'm just thinking about the ergonomics eg. "Replied to A tweet by alice"
jacky joined the channel
#
angelo
*capitalization
#
capjamesg
'in reply to a tweet by @indieweb'
#
angelo
i need to put it to use before i can librarify it
#
angelo
but if you're using it now, by all means
#
capjamesg
Do you want to contribute it yourself for some extra GitHub points :D
#
capjamesg
I can also add the code and some attribution statement if that's preferred.
#
capjamesg
I think there might be a good solution in combining both our implementations.
#
capjamesg
I see there being an issue if there were say 100 different patterns and we matched against each.
#
capjamesg
We could do a dict lookup for domain then match from there.
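A small sketch of that combination, assuming made-up patterns and wording: the domain picks a short list of path patterns from a dict, so a table with a hundred sites still only tests the few patterns registered for the URL's own domain.

```python
import re
from urllib.parse import urlparse

# domain -> list of (compiled path pattern, template); contents are illustrative.
TEMPLATES_BY_DOMAIN = {
    "twitter.com": [
        (re.compile(r"/(?P<user>[^/]+)/status/\d+"), "a tweet by @{user}"),
    ],
    "github.com": [
        (re.compile(r"/(?P<user>[^/]+)/(?P<repo>[^/]+)/issues/(?P<number>\d+)"),
         "issue #{number} on {user}/{repo}"),
        (re.compile(r"/(?P<user>[^/]+)/(?P<repo>[^/]+)/?"),
         "the GitHub repository {user}/{repo}"),
    ],
}


def url_summary(url):
    """Look up the domain first, then match only that domain's patterns."""
    parsed = urlparse(url)
    for pattern, template in TEMPLATES_BY_DOMAIN.get(parsed.netloc.lower(), []):
        match = pattern.fullmatch(parsed.path)
        if match:
            return template.format(**match.groupdict())
    return None
```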
#
angelo
please add it yourself and i'll circle back when i work on reply contexts
#
capjamesg
Will do!
#
gRegor
[tantek]4, were you waiting on Matt's ok to merge https://github.com/indieweb/rel-me/pull/7? It shouldn't change any of the RelMeAuth functionality
#
Loqi
[gRegorLove] #7 Prepare next release
jacky and tetov-irc joined the channel
#
[tantek]4
angelo, re: use-case, the (original) PHP version of this function is in live use on my site for my reply contexts https://indieweb.org/auto-url-summary#Tantek — you can check out the results in context on my home page composite stream, search for "In reply to" in the page https://tantek.com/
#
[tantek]4
gRegor, perhaps. did you have a preference for how we move forward with what jamietanna asked about tests?
#
gRegor
I can remove the commented out lines
#
[tantek]4
I don't have an opinion either way. I was asking only to see if we had a specific strategy in mind 🙂
jacky joined the channel
#
gRegor
Context: this rel-me is in the dependency chain for indiewebify.me and is pinned to php-mf2 0.2. Once there's a new release we can get indiewebifyme running the current parser
#
gRegor
Running `/vendor/bin/phpunit` works well. Composer scripts let you alias a more complex command, which I guess php-mf2 does: https://github.com/microformats/php-mf2/blob/main/composer.json#L32
#
gRegor
`composer run-script tests` will run that alias
#
[tantek]4
Sounds good.
#
[tantek]4
I think a week is enough time for a heads-up. If there's something he suggests reverting we can handle it then 🙂