strugee, [jgmac1106], [eddie], [xavierroy], mickael, nitot, barpthewire, KartikPrabhu, [pfefferle], [Vincent], [adam], jgmac1106, [Rose], [kevinmarks] and [tantek] joined the channel
#Loqicjwillcock: sknebel left you a message 1 day, 5 hours ago: congrats on your parser progress! It'd be great if you could put a testing page like https://php.microformats.io up, helps with manual testing and sharing results?
#sknebelgenerally, good HTML parsers have been somewhat of an issue, since HTML5 allows a bunch of stuff older parsers don't get, and e.g. HTML minimizer tools tend to exploit all that's allowed
#cjwillcockI was able to get around libxml not understanding the HTML5 tags by adding the recover and noerror flags - but I wasn't expecting it to have that open exploit, unfixed for 2+ years :/
#sknebelI somehow remembered seeing the maintainer of something else rant about that a few weeks back
#ZegnatThen there are definitely limitations, cjwillcock. In fact, we already know those limitations to be out there in the wild because that made us look for a userland one…
#ZegnatOlder XML based parsers try to get it right by finding the location they need to close the P (before block elements) but they do not know <article> is a block element because it is an HTML5 element
#sknebeland I think the one in HTML knew that, but didn't know that <article> would force the close to happen
#Zegnat<div><p>Something<p>Something else</div> worked, IIRC. But as soon as HTML5 comes in it is game over
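The implicit-close behaviour discussed above can be sketched with Python's standard-library `html.parser`. This is only an illustration of the rule, not how libxml, DOMDocument, or any HTML5 parser is actually implemented. The names `HTML4_BLOCKS`, `HTML5_BLOCKS`, `ParagraphCloser`, and `normalize` are hypothetical, and the element sets are abbreviated assumptions rather than the full lists from either specification.

```python
from html.parser import HTMLParser

# Elements whose start tag closes an open <p>. Both sets are abbreviated,
# illustrative assumptions, not the complete lists from the specs.
HTML4_BLOCKS = {"p", "div", "ul", "ol", "table", "blockquote", "h1", "h2"}
HTML5_BLOCKS = HTML4_BLOCKS | {"article", "section", "nav", "aside", "header", "footer"}

# Void elements never get a closing tag (abbreviated list).
VOID = {"br", "img", "hr", "input", "meta", "link"}


class ParagraphCloser(HTMLParser):
    """Re-serialize markup, closing an open <p> before known block elements.

    Simplifications: attributes are dropped and text is not re-escaped.
    """

    def __init__(self, block_elements):
        super().__init__(convert_charrefs=True)
        self.block_elements = block_elements
        self.stack = []  # names of currently open elements
        self.out = []    # output fragments

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            self.out.append(f"<{tag}>")
            return
        # The rule under discussion: a block element ends an open paragraph,
        # but only if the parser knows the element is a block element.
        if tag in self.block_elements and "p" in self.stack:
            self._close_through("p")
        self.stack.append(tag)
        self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in self.stack:
            self._close_through(tag)

    def handle_data(self, data):
        self.out.append(data)

    def _close_through(self, tag):
        # Pop and close open elements up to and including `tag`.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break


def normalize(html, block_elements):
    parser = ParagraphCloser(block_elements)
    parser.feed(html)
    parser.close()
    # Close anything still open at end of input.
    while parser.stack:
        parser.out.append(f"</{parser.stack.pop()}>")
    return "".join(parser.out)


old_style = normalize("<p>Intro<article>Body</article>", HTML4_BLOCKS)
new_style = normalize("<p>Intro<article>Body</article>", HTML5_BLOCKS)
```

With the HTML4-only set, `<article>` is treated as unknown inline content and ends up nested inside the paragraph. With the HTML5 set, the paragraph is closed first. The `<div><p>Something<p>Something else</div>` case comes out the same under both sets, since `<p>` and `<div>` are known to both.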
nitot joined the channel
#cjwillcockexactly right. Running the html through the tidy extension first resolves it. So I can either internally use the tidy extension - or strip out libxml and replace with a good html5 parser
#cjwillcockhowever, that use of tidy may not work in the case described in the CVE (I'll check it out)
#Loqi[inikulin] parse5: HTML parsing/serialization toolset for Node.js. WHATWG HTML Living Standard (aka HTML5)-compliant.
#sknebelwhat does "have actual html5 parsers" mean? afaik Go is the only one where it is part of the official language project
#ZegnatThat’s what I meant. Official language part, or otherwise available as some sort of official extension/plugin. Like how Node offers file system functionality on top of the V8/ECMAScript that powers it
#sknebelbut the ones we use in php-mf2, mf2py and I think microformats-node are html5 parsers
#Zegnatphp-mf2 will use the official/native/default DOMDocument parser of the language, unless you provide a userland implementation (which we recommend, because the official one isn’t HTML5 safe)
#[kevinmarks]Well, mf2py uses beautiful soup which can use html5lib.
KartikPrabhu, [jgarber], jgmac1106, [schmarty] and tantek joined the channel
#@jgmac1106You know...I really like the minimalist features of Blogger and Classic Theme...
always said I just want a blank HTML box with all the plumbing..in a way I am getting this vibe
My stylesheet, my ideas, my HTML and now with microformats all my metadat… http://bit.ly/2SqCiB0 (twitter.com/_/status/1095461920650547206)