Ivy_Alpha[d], KartikPrabhu1, KartikPrabhu, edburns[d], Seirdy, gRegor, ur5us, [jeremyfelt], P1000[d], gRegorLove_ and IntriguedWow[d] joined the channel; KartikPrabhu left the channel
@CircleReader: So, way back when, I tried encoding hReview microformats into some #WordPress book-blog posts, and even styling them like blocks, but I was never a full-time front-end web dev, just an enthusiastic learner & believer in what has now, apparently, become the #IndieWeb community. (twitter.com/_/status/1471183567438950400)
gRegor: Yeah, a stretch goal of mine for the indieweb gift calendar was to get that released. Would love any input and help. I think it's pretty much ready.
Zegnat: Maybe I was. Though most of the stuff I did last for the PHP parser was making it use the latest version of the test suite and massaging those tests into something that made sense.
Zegnat: I am ... ambivalent about the mbstring merge? My worry is not so much about the code as it is about the maintenance of said code. I personally do not want to maintain a charset sniffer as part of the mf2 parser just so we can get rid of multibyte functions (that I think ship by default with PHP these days?)
Zegnat: I do not know of any PHP tools like ftfy. I also do not think it would help? From https://ftfy.readthedocs.io/en/latest/detect.html it sounds like ftfy already expects Unicode in, and will try not to touch bytes that it deems correct. IIRC the whole problem for the mf2 parser in PHP is that the PHP HTML parser throws a hissy fit on certain encodings, thus we need to force-normalise it wholesale?
Zegnat: Probably because PHP lacks a real HTML parser, it is all handled by whatever version of libxml was pulled in by your distro when PHP was compiled.
[KevinMarks]: PHP encoding is a dice roll by PHP version, as different versions assumed different encodings for bytes by default, and it tends to mix with MySQL, which adds special new subsets of its own too.
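
As a rough illustration of the "force-normalise it wholesale" idea from the discussion above, here is a minimal Python sketch of a crude charset sniffer that decodes raw HTML bytes to Unicode before any parsing happens. It is not the mf2 PHP parser's actual code. The helper name `normalise_to_unicode`, the 1024-byte sniff window, and the windows-1252 fallback are all assumptions made for illustration.

```python
import re

# Hypothetical pattern: finds a charset declared in a <meta> tag.
META_CHARSET = re.compile(rb'<meta[^>]+charset=["\']?([\w-]+)', re.IGNORECASE)


def normalise_to_unicode(raw: bytes, fallback: str = "windows-1252") -> str:
    """Decode raw HTML bytes to str using a crude charset sniff (illustrative only)."""
    # 1. A UTF-8 byte order mark is treated as authoritative.
    if raw.startswith(b"\xef\xbb\xbf"):
        return raw[3:].decode("utf-8", errors="replace")

    # 2. Look for a declared charset near the top of the document.
    match = META_CHARSET.search(raw[:1024])
    if match:
        declared = match.group(1).decode("ascii")
        try:
            return raw.decode(declared)
        except (LookupError, UnicodeDecodeError):
            pass  # unknown or wrong label: keep sniffing

    # 3. Try strict UTF-8, then fall back to a legacy single-byte encoding.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback, errors="replace")
```

For example, `normalise_to_unicode(b"caf\xe9")` is not valid UTF-8, so it falls through to the windows-1252 fallback. Even this small sketch has three decision points and a guessable fallback, which is the kind of maintenance burden the conversation is worried about.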