#microformats 2022-02-18

2022-02-18 UTC
sarahd[d], capjamesg[d], antrdnv[d], Myst[d], indieweb-irc-bri, [Zeina], [aciccarello], [tw2113_Slack_], ur5us, [davidmead], KartikPrabhu, Loqi__, cygnoir[d], zack[m], diegov, mambang[m], KartikPrabhu1, [jgmac1106], angelo, IntriguedWow[d], [tantek] and jacky joined the channel
#
@jgmac1106
↩️ So what I did was save the audio files to my Internet Archive first, made an HTML page on @glitch, and then used http://Granary.io to convert my microformats into an XML file for podcast feeds
(twitter.com/_/status/1494763197567492100)
barnaby joined the channel
andysylvester, ur5us and angelo joined the channel
#
willnorris
to be clear, that's just my fork. The canonical version is https://dissolve.github.io/mf2-tester/
#
willnorris
thought we could/should eventually move it to a more "official" hostname, either under microformats.org or microformats.io. (I also own microformats.dev)
#
jacky
ah yes :)
#
willnorris
all the REAL work was done by ben_thatmustbeme :)
#
barnaby
very cool tool! it’d be interesting to see how much better php-mf2 0.5.0 is than 0.4.6, which is being tested there
#
jacky
that's a static site; no? I wonder if that could be added at like https://microformats.io/matrix.html or something
#
capjamesg[d]
Wait so there is more work to be done on the Python parser?
#
barnaby
the img/alt parsing fixes a lot of those
#
jacky
I personally stopped at v2 for now until I had a consuming case for v1 (re: mf2)
#
barnaby
and the rest seem to be whitespace issues, which afaik there are also improvements to in 0.5.0
#
willnorris
jacky: technically, sure. Right now it's all built with GitHub Actions, so publishing to GitHub Pages is pretty straightforward. Getting it to publish to whatever serves microformats.org would be a lot more work, auth issues, etc
#
jacky
ah gotcha
#
jacky
wow lol `microformats-v2/h-card/extendeddescription` is kicking a lot of the parsers' butts
#
jacky
hmm actually now I wonder
#
barnaby
jacky: pretty sure that’s just the comparatively new img/alt parsing not being available everywhere
#
jacky
ah gotcha
#
barnaby
which is the case for a lot of those tests
#
capjamesg[d]
Is there Python parser work to be done?
#
willnorris
barnaby: trying now with updated php dependencies. Unfortunately that will overwrite the existing results that include rust, but /shrug
#
barnaby
capjamesg[d]: if 1.1.2 (the tested version according to that matrix) is the latest, then the python parser needs img/alt parsing
#
barnaby
and some changes to whitespace handling
#
barnaby
willnorris: cool, thanks!
#
barnaby
and then there are some other failures which point to larger problems, e.g. an entire nested mf being missing here https://dissolve.github.io/mf2-tester/python/microformats-v1/hreview/item.json.diff.txt
#
capjamesg[d]
Good to know! I don’t know how hard this will be to tackle but I’ll take a look.
#
barnaby
img/alt should be easy, unsure about the others
#
barnaby
it’s been maybe 8 years since I helped tom get started on the python parser, so I’m not entirely familiar with how it works these days
#
capjamesg[d]
It looks like that function is actually already implemented under a feature flag.
#
willnorris
okay, https://willnorris.github.io/mf2-tester/ now reflects v0.5.0 of the php library. Now passes 6 additional test cases.
#
willnorris
I'll send Ben a PR for that as well
#
capjamesg[d]
It’s not on by default for backwards compatibility according to the README.
#
barnaby
great, looks like it’s mostly whitespace stuff remaining now
#
barnaby
capjamesg[d]: yeah img/alt being a breaking change for badly-written consuming code was one of the reasons to bump php-mf2 from 0.4 to 0.5
#
barnaby
(and for me to write an article warning people not to make the mistake which leads to it being a breaking change)
#
barnaby
well, perhaps “badly” is too harsh. “naively” would be more accurate
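
To illustrate the breaking change discussed above: with img/alt parsing, a property parsed from an `<img>` that has alt text becomes a `{"value": ..., "alt": ...}` object instead of a plain URL string. The following is only a sketch of consuming code that tolerates both shapes; the helper name `photo_url` and the sample data are made up for illustration.

```python
def photo_url(prop):
    """Return the URL from a parsed photo value.

    The value may be a plain string (older parsers, or no alt text)
    or a dict with "value" and "alt" keys (img/alt parsing enabled).
    """
    if isinstance(prop, dict):
        return prop.get("value")
    return prop


# Hypothetical parser output before and after img/alt support.
old_style = {"type": ["h-card"],
             "properties": {"photo": ["https://example.com/me.jpg"]}}
new_style = {"type": ["h-card"],
             "properties": {"photo": [{"value": "https://example.com/me.jpg",
                                       "alt": "Me"}]}}

urls = [photo_url(item["properties"]["photo"][0])
        for item in (old_style, new_style)]
print(urls)
```

Code that instead assumes every property value is a string would fail on the second item, which is the "naive" pattern being warned about.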
#
barnaby
willnorris: are the tests in the matrix from https://github.com/microformats/tests? because afaik php-mf2 passes those, unless they’re not included in the default test suite
#
willnorris
yes, this just vendors in microformats/tests
#
barnaby
hmm looks like I need to fix some more stuff in php-mf2 then
#
barnaby
good to know
#
willnorris
I mean, it's also possible there's a bug somewhere in how it's running the tests. Does php-mf2 run the shared test suite as part of its normal testing?
#
barnaby
I thought it did, but it’s been a long time since I worked on the tests, and I didn’t look through them in detail when setting up the new CI recently
#
willnorris
That's what I do for the go client. I have the shared test suite set up as a submodule in the repo, and https://github.com/willnorris/microformats/blob/main/testsuite_test.go runs it as part of the normal testing process
#
barnaby
and differences between how whitespace is handled in e- and p- properties, which is weird because I’d assume that the value key of an e-parsed block would be exactly the same as a p-parsed property
#
willnorris
barnaby: I had some somewhat similar escaping issues in the go client that were causing tests to fail. I suggested fixing it in the tester (https://github.com/dissolve/mf2-tester/pull/4) but folks didn't like that. I ended up doing it in the library (https://github.com/willnorris/microformats/commit/820225570fa984be709885c95180dd1bff0d7dd6)
#
[KevinMarks]
The issue with json having multiple possible representations of unicode is a bit tricky
#
barnaby
I think in this particular case it’s curly quotes which are the problem, so just special-casing a couple of characters isn’t going to cut it
#
barnaby
I do remember needing to do some weird htmlentities and encoding juggling a long time ago to get DOMDocument to handle unicode correctly
angelo joined the channel
#
barnaby
maybe there’s a better way by now, if it wasn’t already improved
#
[KevinMarks]
You can have utf-8 representations or \u representations iirc
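
A small sketch of the point above, using Python's standard `json` module: the same string can be serialized either with literal UTF-8 characters or with `\u` escapes. The two outputs differ as text but decode to the same value, so a harness that compared raw JSON text rather than decoded values would report a difference. The sample string is made up.

```python
import json

# Curly quotes plus an emoji outside the Basic Multilingual Plane.
text = "\u201ccurly\u201d quotes and \U0001F600"

escaped = json.dumps(text)                      # ASCII only, \uXXXX escapes
literal = json.dumps(text, ensure_ascii=False)  # characters kept as-is

same_text = escaped == literal
same_value = json.loads(escaped) == json.loads(literal) == text
print(same_text, same_value)
```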
#
[KevinMarks]
I vaguely remember discussing this a while ago, but it may not be in the parsing issues
#
barnaby
hmm okay it looks like the php-mf2 tests do pass the mf/tests suite internally, but only by subclassing Parser and adding a custom textContent implementation??
#
barnaby
which would explain why the tests pass internally but not in the matrix
#
barnaby
I need to look at this in more detail another time, I think
#
barnaby
I’m more concerned by the apparent official difference between whitespace handling in e-*.value and p-* properties
#
[KevinMarks]
It may be worth another pass through that with different parser implementations, and making some tests with emoji and other non-basic UTF-8 in them.
#
barnaby
definitely +1 on having more tests using non-english characters
#
barnaby
and emoji too — any app which can’t handle emoji in 2022 is on the way to irrelevance ;)
Anders joined the channel
#
barnaby
Anders: did you have a specific use-case in mind for the directory-listing idea?
#
Anders
Fairly specific. As an example, if you run `python3 -m http.server` it will give you a simple file browser over HTML
#
Anders
It wouldn't be too hard to add h-dir, h-file, p-mod-time, p-size, etc to that listing and make it machine readable
#
Anders
Once you have that it would be easy to wrap it in a FUSE mount
#
Anders
Or rclone
#
Anders
So any webserver that implements this could be treated as a read-only filesystem
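
A rough sketch of what such a machine-readable listing could look like. The class names `h-dir`, `h-file`, `p-mod-time` and `p-size` come from the proposal above and are not an existing microformats vocabulary; the markup layout, the use of `u-url`/`p-name`, and the function name `render_listing` are assumptions made for this example.

```python
import html
import os
import tempfile
from datetime import datetime, timezone
from urllib.parse import quote


def render_listing(path):
    """Render one directory as an HTML list with the proposed classes."""
    with os.scandir(path) as it:
        entries = sorted(it, key=lambda e: e.name)
    rows = []
    for entry in entries:
        name = html.escape(entry.name)
        href = quote(entry.name)
        if entry.is_dir():
            rows.append(
                f'<li class="h-dir"><a class="u-url p-name" '
                f'href="{href}/">{name}/</a></li>')
            continue
        stat = entry.stat()
        mtime = datetime.fromtimestamp(
            stat.st_mtime, tz=timezone.utc).isoformat()
        rows.append(
            f'<li class="h-file"><a class="u-url p-name" '
            f'href="{href}">{name}</a> '
            f'<span class="p-size">{stat.st_size}</span> '
            f'<time class="p-mod-time" datetime="{mtime}">{mtime}</time></li>')
    return '<ul class="h-dir">\n' + "\n".join(rows) + "\n</ul>"


# Demo on a throwaway directory with one file and one subdirectory.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "photos"))
    with open(os.path.join(root, "notes.txt"), "w") as f:
        f.write("hello")
    listing = render_listing(root)

print(listing)
```

A client such as a FUSE or rclone backend could then parse one listing per directory instead of scraping server-specific HTML.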
#
barnaby
Anders: sounds interesting, but what do you want to do with that?
#
[KevinMarks]
I have xoxo to json and back in unmung, and python and php xoxo implementations, but it is a bit separate from other microformats
#
barnaby
especially when webDAV exists already (albeit much more complex than what you’re suggesting)
#
barnaby
[KevinMarks]: I meant more applications which actually parse xoxo data and use it for something useful, not just showing that it’s there :D
#
Anders
Simpler WebDAV is basically the idea
#
barnaby
https://microformats.org/wiki/xoxo#Implementations mentions exactly one practical implementation on odeo, which no longer exists
#
Anders
One concrete example that would be useful for me. If you have a large directory tree that changes regularly but only a few files at a time, zipping the whole thing is impractical. Something like this would let you do incremental updates by crawling the tree and checking for files with changed modTimes or sizes
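
The incremental-update idea can be sketched as a comparison of two crawls. This assumes each crawl has already been reduced to a mapping from path to `(mod_time, size)`; the function name `diff_listings` and the sample data are hypothetical.

```python
def diff_listings(previous, current):
    """Compare two crawls, each a dict of path -> (mod_time, size).

    Returns (changed, removed): paths that are new or whose metadata
    differs, and paths that no longer exist.
    """
    changed = sorted(path for path, meta in current.items()
                     if previous.get(path) != meta)
    removed = sorted(path for path in previous if path not in current)
    return changed, removed


last_crawl = {
    "a.txt": ("2022-02-17T10:00:00+00:00", 120),
    "b.txt": ("2022-02-17T10:00:00+00:00", 300),
    "old.txt": ("2022-02-01T08:00:00+00:00", 42),
}
this_crawl = {
    "a.txt": ("2022-02-17T10:00:00+00:00", 120),   # unchanged
    "b.txt": ("2022-02-18T09:30:00+00:00", 310),   # modified
    "c.txt": ("2022-02-18T09:31:00+00:00", 15),    # new
}

changed, removed = diff_listings(last_crawl, this_crawl)
print(changed, removed)
```

Only the paths in `changed` would need to be fetched again, which is the saving over re-downloading or re-zipping the whole tree.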
#
Anders
You could also make the spec a bit more complicated, and add reference links to thumbnails for images. That would let you implement a whole bunch of applications.
#
barnaby
the size and modtime stuff sound like things which are already accessible from a HEAD request and caching headers. I’m not sure I see the value in duplicating that information elsewhere unless it’s important that it’s human-readable as well as machine readable
#
Anders
Another example would be opening files natively in VLC for streaming (assuming the server/FUSE support byte range requests).
#
Anders
The difference is you have to HEAD each file individually, which quickly becomes way too many requests. It's a moot point anyway, because webserver developers are already duplicating that information for the listings I mentioned above. I'm simply advocating doing that in a standard way.
#
Anders
Imagine if you had to `ls -l` each file on your filesystem one by one if you wanted to see the size of everything in a directory