Loqi: cjwillcock: sknebel left you a message 1 day, 5 hours ago: congrats on your parser progress! It'd be great if you could put a testing page like https://php.microformats.io up, helps with manual testing and sharing results?
sknebel: generally, good HTML parsers have been somewhat of an issue, since HTML5 allows a bunch of stuff older parsers don't get, and e.g. HTML minimizer tools tend to exploit all that's allowed
cjwillcock: I was able to get around libxml not understanding the HTML5 tags by adding the recover and noerror flags - but I wasn't expecting it to have that open exploit, unfixed for 2+ years :/
Zegnat: Then there are definitely limitations, cjwillcock. In fact, we already know those limitations to be out there in the wild because that made us look for a userland one…
Zegnat: Older XML-based parsers try to get it right by finding the location where they need to close the P (before block elements), but they do not know <article> is a block element because it is an HTML5 element
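To make the point about `<article>` concrete, here is a minimal Python sketch of the "close an open `<p>` before a block element" rule. It is not how libxml or any real parser is implemented; the class `PCloser`, the helper `normalise`, and both (abbreviated) block lists are made up for illustration, and attributes are dropped for brevity. The only difference between the two runs is whether the parser's block list knows the HTML5 elements.

```python
from html.parser import HTMLParser

# Abbreviated, illustrative list of block elements an HTML4-era parser knows.
HTML4_BLOCKS = {"div", "ul", "ol", "table", "blockquote", "h1", "h2", "p", "pre"}
# HTML5 adds sectioning elements that must also close an open <p>.
HTML5_BLOCKS = HTML4_BLOCKS | {"article", "section", "nav", "aside", "header", "footer"}


class PCloser(HTMLParser):
    """Re-serialises tags, closing an open <p> before any *known* block element."""

    def __init__(self, blocks):
        super().__init__()
        self.blocks = blocks
        self.out = []
        self.p_open = False

    def handle_starttag(self, tag, attrs):
        # Implied end tag: a known block element ends the current paragraph.
        if self.p_open and tag in self.blocks:
            self.out.append("</p>")
            self.p_open = False
        if tag == "p":
            self.p_open = True
        self.out.append(f"<{tag}>")  # attributes dropped for brevity

    def handle_endtag(self, tag):
        if tag == "p":
            if not self.p_open:
                return  # simplification: drop a </p> we already emitted implicitly
            self.p_open = False
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)


def normalise(html, blocks):
    parser = PCloser(blocks)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


source = "<p>intro<article>body</article>"
print(normalise(source, HTML4_BLOCKS))  # <article> stays nested inside the open <p>
print(normalise(source, HTML5_BLOCKS))  # <p> is closed before <article>
```

With the HTML4-only list the paragraph is never closed, so `<article>` ends up inside it; with the HTML5 list the `</p>` lands where a browser would put it.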
cjwillcock: exactly right. Running the HTML through the tidy extension first resolves it. So I can either internally use the tidy extension - or strip out libxml and replace it with a good HTML5 parser
Zegnat: That’s what I meant. Official language part, or otherwise available as some sort of official extension/plugin. Like how Node offers file system functionality on top of the V8/ECMAScript that powers it
Zegnat: php-mf2 will use the official/native/default DOMDocument parser of the language, unless you provide a userland implementation (which we recommend, because the official one isn’t HTML5-safe)
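The "native by default, userland if provided" pattern described above could be sketched roughly as follows. This is an illustration in Python, not php-mf2's actual code: the function `pick_parser` is hypothetical, and `html5lib` is only an assumed example of a userland HTML5 parser that may or may not be installed.

```python
import importlib.util


def pick_parser(preferred="html5lib", fallback="html.parser"):
    """Return the name of the parser module to use.

    Prefers a userland HTML5 parser when it is installed (assumed here to be
    html5lib), otherwise falls back to the standard-library parser.
    """
    if importlib.util.find_spec(preferred) is not None:
        return preferred
    return fallback


print(pick_parser())  # depends on whether html5lib is installed
```

The design choice is that callers get working behaviour out of the box, while anyone who installs the HTML5-aware parser gets the safer handling without changing their own code.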
@jgmac1106: You know...I really like the minimalist features of Blogger and Classic Theme...
always said I just want a blank HTML box with all the plumbing..in a way I am getting this vibe
My stylesheet, my ideas, my HTML and now with microformats all my metadat… http://bit.ly/2SqCiB0 (twitter.com/_/status/1095461920650547206)