#dev 2019-04-28

2019-04-28 UTC
KartikPrabhu, [asuh], [tantek], [Michael_Beckwit, [fluffy] and snarfed joined the channel
#
[fluffy]
oops a while ago I started writing a tool for quickly generating reblogs/replies/etc. for Publ and Jekyll and other Markdown-based publishing engines and then I ended up forgetting to actually write the CLI for it and releasing it
#
[fluffy]
maybe I should fix that
#
[fluffy]
although right now it depends on pandoc and that can be annoying as a dependency. Does anyone know of any Python library for converting HTML to Markdown without shelling out to pandoc?
#
Loqi
[Alir3z4] html2text: Convert HTML to Markdown-formatted text.
voxpelli, BenLubar_, ingoogni and [fluffy] joined the channel
#
[fluffy]
I ended up basically rewriting it tonight, oops
#
sknebel
compared to html2text pandoc has more options to tweak the markdown dialect I think, that can be important for more complex content
ingoogni joined the channel
#
GWG
Zegnat: Was looking at your php-webmention-endpoint-discovery repo
#
Zegnat
Uh-oh
#
GWG
Why uh-oh?
#
Zegnat
I was expecting “I was looking at” to be followed by “and found the following issues” ;)
#
GWG
No.
#
GWG
I'm relooking at everything about webmentions
#
GWG
When pfefferle wrote the original code, and I took off on it subsequently, we kept to the way WordPress did Pingbacks. A webmention is not a pingback, so I'm looking to distance myself from that mindset by looking at non-WordPress webmention code
#
Zegnat
Cool. Note that it has a slightly further evolved branch from one of the IWCs. If you really want to get in deep
#
GWG
I'm looking at all these things in the spec that are underimplemented
#
GWG
Very few people seem to take the advice about per-media rules.
#
GWG
Most seem to do a pure text search
#
Zegnat
Oh, I have a different lib for that
#
GWG
Oh?
#
GWG
As I said, revisiting everything.
#
Loqi
[Zegnat] php-linkextractor: Class for finding all resources an HTML document links to.
#
Zegnat
So when the webmention comes from an HTML document, I can check if it really links to me, rather than doing a pure text search
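A minimal sketch of the difference between a pure text search and checking real links, assuming the source is HTML. This is not the API of php-linkextractor; `LinkCollector` and `source_links_to` are hypothetical names, and only `href` and `src` attributes are considered.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs found in href/src attributes of any element."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Attributes without a value arrive as None, so skip those.
            if name in ("href", "src") and value:
                # Resolve relative URLs against the source document's URL.
                self.links.add(urljoin(self.base_url, value))


def source_links_to(source_html, source_url, target_url):
    """Return True only if the source document really links to the target."""
    collector = LinkCollector(source_url)
    collector.feed(source_html)
    collector.close()
    return target_url in collector.links
```

A source that merely mentions the target URL in running text would pass a text search but fail this check.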
#
GWG
Mostly because of the attempt to unify the bifurcated nature of the implementation.
#
Zegnat
I promise no good performance on that code, it was very much experimental. And I think jkphl wrote a hard-fork of that code himself, as mine was PHP7+ only
#
Zegnat
But might be worth looking at
#
GWG
I'm looking for ideas.
#
GWG
So all is good.
#
Zegnat
I think I wrote that code at IWC Berlin 2 years ago? Or maybe IWC Düsseldorf.
#
GWG
I'm even digging down into the WordPress comment storage code to see if I can do async
#
GWG
Which means snarfed is probably right, that I'm overscoping again
#
Zegnat
No surprises there ;)
#
GWG
My dreams far exceed my reach
#
Zegnat
But I totally understand the problem. Sometimes it is hard to compartmentalise the different parts of software you want to work on.
#
Zegnat
So you end up working on everything simultaneously.
#
GWG
Well, imagine someone worked very hard to separate two things so they worked independently and now you are trying to not just copy the files so they are loaded together, but make them work as one thing.
#
GWG
So, removing the duplicate code and the workarounds to have one seamless whole.
iasai joined the channel
#
GWG
So, I am trying to figure out a new low-level implementation of the duplicate code.
#
GWG
Which takes longer, but sets up something better in the future. And since the two plugin solution still works, there is no rush on merging them specifically.
#
Zegnat
Can you start from scratch quicker? If you want to rethink the low-level stuff anyway? Or is that not a good solution for WordPress things?
#
GWG
Well, I am starting parts from scratch by moving parts of the code into their own functions
#
GWG
So, the code does a HEAD request to check whether the URL is valid and of an approved content type to avoid being tricked into downloading a video file or something large. That can be a separate thing.
#
GWG
As I go, I am adding in proactive parsing instead of parsing as an afterthought
#
Zegnat
Note that the HEAD request cannot guarantee you the size or file type if you are talking about an actively malicious source. It is trivial to report different things on HEAD and GET requests.
#
Zegnat
But in general still a very good sanity check to have. The extra request on your end is probably cheap, and will catch people putting in random URLs.
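The sanity check described here could be sketched as a pure function over the headers returned by a HEAD request. The function name, the default allowed type, and the 1 MB cap are assumptions for illustration (the cap mirrors the example size mentioned later in this log). As noted above, a malicious server can report different values on GET, so this only filters honest mistakes.

```python
def head_looks_acceptable(headers, allowed_types=("text/html",), max_bytes=1_000_000):
    """Decide from HEAD response headers whether a GET seems worth doing."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}

    # Drop parameters such as "; charset=utf-8" before comparing media types.
    media_type = normalized.get("content-type", "").split(";")[0].strip().lower()
    if media_type not in allowed_types:
        return False

    # Content-Length is optional; only reject when it is present and too big
    # or not a number at all.
    length = normalized.get("content-length")
    if length is not None:
        try:
            if int(length) > max_bytes:
                return False
        except ValueError:
            return False
    return True
```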
#
GWG
WordPress is a big target
#
GWG
I wonder if anyone has ever made a checklist of all the recommendations in the spec and seen who follows what
#
GWG
Looking at how WordPress does pingbacks, it limits the retrieval size to 150kb
#
GWG
The spec uses 1mb as an example
#
Zegnat
How does it limit that, GWG? I was looking into limiting curl download size in PHP but it all seemed a little messy
#
GWG
Zegnat, there's an http wrapper API for curl in WordPress
#
GWG
So I never needed to find out
#
Zegnat
Link? I’d love to see how they solved it.
#
Zegnat
One of the things I found was to do your own response streamer, but that still felt a little iffy to me…
#
GWG
I know it uses https://github.com/rmccue/Requests under the hood
#
Loqi
[rmccue] Requests: Requests for PHP is a humble HTTP request library. It simplifies how you interact with other sites and takes away all your worries.
#
Zegnat
“takes away all your worries” ... hmm ... I have a lot of worries though
#
Zegnat
Thanks GWG! Will have a look there later :D
#
Zegnat
Oh interesting, it doesn’t seem to cancel the download if the file is too big, rather it just stops writing those bytes to memory. So that makes it sound like it will drain bandwidth and time.
#
GWG
Well, you do the best you can
#
GWG
Also reading about private webmentions
#
Zegnat
You can actually cancel the request, but then curl errors out. It is overall just a bit weird how to handle protection against huge files, and I do not have a perfect answer *shrug* Interesting seeing how they did it though.
#
Zegnat
I am not sure private webmentions are worth the work. I much prefer working on something like AutoAuth. The webmention itself shouldn’t be responsible for marking something as private.
#
GWG
I am looking at autoauth
#
GWG
The private webmentions page was last updated in 2017
#
GWG
Autoauth looks like it supersedes that
#
sknebel
yes, AutoAuth was motivated by getting a better solution for private Webmention among other things
#
GWG
Maybe it should be linked to on that page
#
GWG
Reading how WordPress responds to private posts with a 404.
#
sknebel
yeah, add a link
#
GWG
WordPress has the idea of password protected posts as well
#
GWG
But I wonder if I should change the behavior of private posts on WordPress to return a 401
#
GWG
I added it to See Also, but not sure if I should edit the spec
#
sknebel
re 401, I just made an issue on AutoAuth to incorporate that - many sites do 404 to not leak any information at all
#
sknebel
e.g. a github repo you don't have access to returns 404
#
sknebel
so it probably makes sense for AutoAuth clients to handle that, even if there's additional cost of logging in and discovering there's nothing there
iasai joined the channel
#
GWG
I considered changing mine. I don't see the point of private posts where you don't know there's a page
#
sknebel
well, the idea generally is that you get told that there is something, or can discover it from feeds etc
#
GWG
And password protected is not something I care about. Though the question is about which to change to my desired behavior
#
GWG
Also, you can have a page with protected elements
#
GWG
Public and private
#
Zegnat
Honestly I think AutoAuth should look at WWW-Authenticate headers and kinda ignore status codes
#
Zegnat
But definitely needs discussing
#
GWG
I may poll some WordPress users
#
sknebel
Zegnat: right now I worded it as "401 or 2xx", because I didn't consider the 404 case (and 403 also needs speccing)
#
GWG
Yet another rabbit hole
#
Zegnat
“Is there WWW-Authenticate? Yes: there may be a different response code if you send an Authorization header along, so try that.” Initially feels like it makes sense. Though it is sad that you need to ignore the meaning of HTTP status codes :(
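The two decision rules being weighed can be put side by side in a small sketch. The standard HTTP response header that carries an authentication challenge is `WWW-Authenticate`. Both function names are hypothetical, and nothing here is taken from the AutoAuth draft itself.

```python
def retry_by_status(status):
    """Status-based rule: only a 401 suggests that credentials might help."""
    return status == 401


def retry_by_challenge(headers):
    """Header-based rule: any response carrying a challenge is worth a retry."""
    return any(
        name.lower() == "www-authenticate" and value.strip()
        for name, value in headers.items()
    )
```

A site that answers 404 for private posts but still sends a challenge header would be retried under the second rule and skipped under the first.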
#
GWG
I have a lot of hardening things I need to do with IndieAuth to make it work better with this
#
GWG
Right now, if you get an Indieauth token, you get carte blanche access to everything unless the endpoint you are accessing stops you. I need to move scopes somehow further up the chain
#
sknebel
really need to come up with something that gets people to try AutoAuth, for now everyone says it's cool but the consuming code is missing
#
GWG
sknebel, you could try to talk me into something
sebsel joined the channel
#
GWG
I jump around from project to project, like a honeybee pollinating flowers
#
GWG
So, I will take on a bigger project and stop to do fixes to smaller projects
#
Zegnat
GWG, you have given me more work. I was looking at some recent bugs on rmccue/Requests to see how stable it is, and think I now may be forced to make a PR already.
#
GWG
There are 29 of them
#
GWG
rmccue does a lot of stuff
leg joined the channel
#
GWG
He's turned over SimplePie to mblaney, for example
#
Zegnat
Soon to be 30 PRs. Possible. Haha.
#
sebsel
I have too many projects too
#
sebsel
Luckily there is an IWC-spree coming :)
#
Zegnat
GWG, opened PR number 30 for the review queue ;) https://github.com/rmccue/Requests/pull/339
#
Loqi
[Zegnat] #339 Ignore locale when creating the HTTP version string from a float
#
GWG
sebsel, I will be at one and remoting into another at least
#
GWG
I am committed to Summit too
#
sebsel
GWG I'm very sad that I have to miss Berlin, but I'll remote that one too!
[tantek] joined the channel
#
[tantek]
what is a markdown dialect?
#
Loqi
It looks like we don't have a page for "markdown dialect" yet. Would you like to create it? (Or just say "markdown dialect is ____", a sentence describing the term)
#
GWG
sebsel, if you change your mind, I have a bed for rent
#
[tantek]
Anyone have good plain text design for /quotation posts?
#
[tantek]
I'm thinking of experimenting
#
sebsel
[tantek] isn't the > there to render a blockquote?
#
sebsel
taken from the quoting of e-mails
#
sebsel
markdown dialect is markdown#Dialects
#
sebsel
hm, too long ago, but there are some on that page
#
[tantek]
sebsel, yes, > at start of line is a good blockquote equivalent for email, usenet, etc. however looks odd outside that context. I'm wondering can we make it even easier to read, in broader plain text contexts, e.g. POSSE tweet
#
[tantek]
current thinking: start post content with “...quote here...” — URL optional @-name
#
[tantek]
actually put optional @-name BEFORE URL for more chance of POSSEing it in-reply-to a tweet of the article itself
#
sebsel
yea, you have this ``` thing, which creates a code-block in Github flavoured markdown. That one can take an optional word to denote the language, e.g. ```php
#
[tantek]
the triple-backtick thing is totally made up and violates Markdown principle 1
#
sebsel
but I guess something like >>> [name] would be way too parser-y, not really user friendly
#
sebsel
yea reading up on that now
#
[tantek]
exactly, I'm strongly rejecting any not user friendly use of punctuation in plain text
#
[tantek]
this is why I'm considering naming my alternative / replacement for Markdown "Markdont"
#
sebsel
as a user of markdown, I don't really care. I am now used to markdown and read `this` as a piece of code now. I use it in all my e-mails to co-workers
#
[tantek]
I care because the whole point of communication is authoring for reading, not yourself
#
[tantek]
for *others* reading
#
sebsel
I should ask whether or not they understand me, probably
#
[tantek]
same reason use of jargon is bad as well as acronyms
#
sebsel
I see the single backticks as just another kind of quotes. And these are just used for code. My assumption would be that most people who understand the code, would understand my use of markdown
#
sebsel
But yes, that is an assumption
#
[tantek]
I prefer code blocks demarcated by comment lines in the syntax of that code
#
[tantek]
E.g. /* PHP *?
#
[tantek]
er /* PHP */
#
[tantek]
that's then both human and machine readable without punctuation abuse
#
[tantek]
triple backtick is total punctuation abuse and frankly you might as well just use markup once you start abusing punctuation
#
[tantek]
ok here goes my experiment
#
@t
“changes in … blood were potent but ‘transient,’ …. So activities would have to be repeated to provide any continuing [#cancer] protection, and it remains unclear how intense or prolonged that exercise ideally would need to be” — @nytimes ... https://tantek.com/t50E1
(twitter.com/_/status/1122551773695074310)
#
[tantek]
also my first use of a [inserted] hashtag
#
[tantek]
now to document my experimentation on /quotation 🙂
#
sebsel
[tantek] triple backtick and single backticks are different stories then
#
[tantek]
sebsel, yes single backtick I'm still considering, thanks to jeremycherfas's clarifications
#
[tantek]
still "feels" counter-intuitive (and hard/subtle to "discover" if you don't already know it)
#
[tantek]
yet is shorter than an inline comment technique like /* PHP */ x=1; /* */
#
Loqi
[Tantek Çelik] “changes in … blood were potent but ‘transient,’ …. So activities would have to be repeated to provide any continuing [#cancer] protection, and it remains unclear how intense or prolonged that exercise ideally would need to be” — @nytim...
#
sebsel
I believe it's also possible to add four spaces for blocks of code, but sometimes they are rendered as blockquotes... no standards and stuff
#
sebsel
if you do that <code> $x = 1; </code> is way shorter!
#
sebsel
or at least clearer in when it ends
#
[tantek]
yet something like /* PHP */ or /* CSS */ also conveys the language
#
sebsel
yes, but not all languages have an inline comment like that
#
sknebel
the CommonMark syntax also only works for multiline sections
#
sebsel
please correct me if I'm wrong, but I think Ruby and Elixir only use # to comment, which takes the rest of the line as comment
[fluffy] joined the channel
#
[fluffy]
[sknebel] pandoc has more options than html2text but unfortunately they’re all pretty awful, I’m finding. At least with regards to `<ul>` and so on.
#
[fluffy]
for inline code you can use backticks i.e. ``` `$x = 1;` ``` (unfortunately Slack’s not-quite-Markdown has no way of properly handling backticks in its own inline backtick thing, really annoying!)
#
sknebel
but not with a specifier for the language
#
[fluffy]
oh, true
#
[fluffy]
in the various Markdown implementations I’ve used, backticks translate directly to <code> while fences translate into <pre><code> and if there’s a language specified that gets further sent through pygments or whatever.
#
sknebel
(re pandoc vs html2text: yeah, lists seem to be a major source of problems. CommonMark and the original Markdown slightly differ there, so it's problematic. had some issues with bridgy publish because of that: html2text does gruber-style, github uses an extended CommonMark)
#
[fluffy]
but I haven’t seen anything that tries to syntax highlight for inline.
#
Loqi
[fluffy-critter] #5 Better engine than pandoc/gfm?
#
sebsel
lol, nice how Slack uses markdown here to even obscure this discussion more https://seblog.nl/temp/media-endpoint/09266c-irc-vs-slack-markdown.jpg
#
[fluffy]
[sebsel] heh, good point
#
[fluffy]
the multi-protocol nature of this chat sure leads to a bunch of unfortunate weirdness.
#
GWG
I just figured out some possible exploits of the WordPress Indieauth code. Wonder if I should stop and fix them
#
GWG
Or wait since it would require an unlikely scenario to work
#
Zegnat
Depends on the severity of the exploit? If it is something that would allow anyone to authenticate as you: probably fix ASAP
#
GWG
Zegnat, it is more if you authenticate as user A you might be able to access info for user B
#
Zegnat
Oh, interesting
#
GWG
Since most WordPress sites using IndieAuth are single user, probably a slim issue, but still wrong
#
GWG
To fix it, I need to update Micropub, Microsub, and IndieAuth as all three have scope checking code
#
GWG
But clearly I need to implement some tests for this
#
GWG
The code checks for permission to edit a post, but not if it is your post to edit
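The gap described here, a scope check without an ownership check, can be illustrated with a small sketch. This is not the WordPress plugin code; `Post`, `may_edit`, and the use of an `update` scope are assumptions for illustration, and the sketch ignores roles that may legitimately edit other users' posts.

```python
from dataclasses import dataclass


@dataclass
class Post:
    post_id: int
    author_id: int


def may_edit(token_user_id, token_scopes, post):
    """Allow an edit only if the token has the scope AND its user owns the post."""
    has_scope = "update" in token_scopes
    # Checking has_scope alone is the bug: user A could then edit user B's post.
    owns_post = post.author_id == token_user_id
    return has_scope and owns_post
```

On a single-user site both checks always agree, which would explain why the missing ownership check goes unnoticed there.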
#
GWG
Will fix
#
GWG
Oh well, more important than webmentions but figured this out while reading about private posts
#
Zegnat
Does the IndieAuth plugin really need an update for that?
#
GWG
Zegnat: For that what?
#
GWG
Well, I'd like to move more of the permissions checks from the endpoints, like Micropub, up to the IndieAuth plugin
#
Zegnat
for the token check? Feels like it shouldn’t care.
#
Zegnat
Aah, that makes sense then
#
GWG
Well, the problem is that the Micropub plugin does have a version of the code that uses an external IndieAuth endpoint.
#
GWG
I was surprised people wanted that.
#
GWG
Not sure why people would still want to use an external Indieauth endpoint over a built in one
#
GWG
Can anyone think of some good reasons?
#
Zegnat
Running multiple websites and wanting to login to the secondary blog with your primary identity
#
sknebel
really only one I can think of would be people that have multiple sites
#
sknebel
but then authenticating with their "home" endpoint (instead of WP login) during the flow would also come close
#
sknebel
and mean that the endpoints don't have to handle an external token endpoint etc
#
sknebel
Zegnat probably has thoughts on that
#
Zegnat
Many thoughts, haha. But currently I am feeling like token and auth endpoints really win from being connected. And as Microsub endpoint you need to be able to always trust the token endpoint. So they may as well be all within the same system for such a usecase.
gRegorLove joined the channel
#
Zegnat
I go back and forth on the whole thing.
#
GWG
So, is it worth having so much redundant code is my question
#
GWG
Because I now have three branches of the same code.
#
GWG
I would love to justify not having 3 versions
#
Zegnat
Rephrase that question to: “is it worth it for Microsub and Micropub to support external auth and token endpoints?” And then find someone who actually needs that ability at all and ask them why they need it.
#
GWG
Well, there is the Authorization header issue
#
Zegnat
But that applies to all the requests, so isn’t specifically a reason to have only the auth endpoints externally.
#
GWG
Maybe this needs a blog post added to Indienews
#
sknebel
what are the 3 versions, out of interest? internal endpoints, external endpoints, ...?
#
GWG
There is a file in Indieauth, Micropub, and Microsub called ___-authorize.php
#
GWG
It contains the code that takes a token and logs you into the site
#
GWG
If you have the IndieAuth plugin, it doesn't load the similar file in the other two plugins
#
GWG
The Microsub one is just a copy of the Micropub one.
#
GWG
The one in IndieAuth verifies tokens directly, the other two use an external verification process.
#
sknebel
that doesn't seem too bad, but is complexity, yes
#
sknebel
the header thing could(TM) be relevant to people using external services, e.g. Aperture?
#
sknebel
also, what was the upgrade path like when you added the internal endpoint? do people that had indieauth.com configured before still use that if they never changed it?
#
GWG
Yes.
#
GWG
It changed.
#
GWG
But now, if you install IndieAuth the plugin, you use the internal endpoint, otherwise the external
#
Zegnat
After a little more thinking, I think I personally like the option of having external endpoints available. But I honestly have a hard time thinking of why anyone who is not a power-user, or otherwise (freakishly) invested in playing with their IndieAuth endpoints, needs that option. Feels like something a WP plugin developer should be able to safely drop.
#
GWG
I think I need to poll the active users.
#
GWG
And keep the other code somewhere, but not bundled if I do it
doubleloop, [Michael_Beckwit and eli_oat joined the channel
#
GWG
Let's see if anyone responds
#
Zegnat
Looks good GWG. And agreed on your observations re web sign-in.
#
Zegnat
First step to making things easier might be to remove that from the IndieAuth plugin.
#
GWG
Zegnat: Other than Indielogin, there isn't much prior art on the web signin protocol
#
GWG
It's unclear about whether that means I need to implement basic rel me auth
#
Zegnat
True. I guess it would just be "Login with IndieAuth" then? If you want to login with GitHub or another identity provider, it would make sense to pick dedicated plugins for that anyway
#
sknebel
just to clarify, there's two scenarios for using another site: one would be only replacing the sign in into wordpress with indieauth - the wordpress site authenticates you using indieauth, but provides its own endpoints to the client. the second is using the endpoints from the other site entirely, disabling the internal ones. (and technically, you could have a hybrid where auth endpoint is one and token endpoint another...)
#
sknebel
agreed with Zegnat's last line about other identity providers
#
GWG
I could move the external IndieAuth stuff into a new web sign-in plugin.
KartikPrabhu joined the channel
#
Zegnat
Lot of mixed cases, sknebel. I am just trying to think about which ones are actually worth supporting by GWG. Some freeform combination could exist, but people using them maybe should just code-dive themselves
#
GWG
That's why I want to hear from users.
#
GWG
The biggest roadblock is the fact that people have trouble with the headers
#
sknebel
Zegnat: I wanted to clarify because of "The only use case presented for allowing an external site was…what if I want to sign into Site A with the credentials of Site B?" - and to me there's two different things in that realm
#
Zegnat
True GWG. But those users can't host micropub or microsub themselves either. So they just need a plugin that can add Link headers / meta elements (like the Aperture plugin)
#
Zegnat
Would a wholly separate plugin make sense for those people? One where you copy and paste external endpoint URLs?
#
GWG
Quite possibly.
#
GWG
That's why I thought of a web sign-in plugin
#
sknebel
(well, I guess micropub can do it with clients that use form-encoded)
#
Zegnat
I feel the need for a whiteboard with this discussion! Haha
#
GWG
Zegnat: Berlin?
#
Zegnat
If I make it there, or even if I end up remotely participating, I am always happy to do brainstorms!
#
Zegnat
Now I will go sleep though. Back to fixing the daily issues at work tomorrow. Cheers!
#
sknebel
(side note: this is the point where the always-hated analytics would come in handy....)
#
GWG
I'm packing now
#
doubleloop
I'm trying bridgy fed, but getting this error from webfinger: Couldn't find a representative h-card on https://solarsailer.doubleloop.net/
#
doubleloop
I needed to add rel=me into the h-card
#
sknebel
that seems like it's just that your WP site didn't understand the notification about the follow
#
doubleloop
Ahh, yup - that's probably it. I should turn on the option for a generic mentions page
#
doubleloop
sknebel++
#
Loqi
sknebel has 43 karma in this channel over the last year (111 in all channels)