#gRegorLoveIf the u-photo dfn included "if the entry has a content property, that should be used as the description for the photo(s)
#gRegorLove" that implies the u-photo shouldn't be embedded in e-content.
#gRegorLoveI haven't thought about #23 much since the IWC Austin 2020 conversation that touched on it, so don't have a strong opinion
#barnabywaltersas I mentioned in #23, I think that complexity of authoring UIs and consumers (as well as back-compatibility for consumers) are strong arguments for permitting u-photo, -audio, -video etc within e-content
#barnabywaltersif you allow them inside e-content, then any post authoring UI which allows HTML editing of e-content immediately natively supports image posts with re-ordering, alt text etc all via text editing
#barnabywaltersand people can also build dedicated UIs for managing all of that programmatically if they want to, but it’s not required
#gRegorLoveI guess I'm not clear what the outcome of #23 will be other than a possible recommendation for publishers. It's not a parsing spec change, so u-photo will continue to be consumed regardless where it appears.
#gRegorLoveI also publish u-photo inside e-content
#sknebelsee the linked xray issue for why that's trouble for consumers
#barnabywalterswell mostly it’s about the official definition of the u-photo and related u-[main content] properties, right? and how their presence should alter what e-content is used for, if at all
#barnabywaltersI think part of that problem is that consuming HTML is hard. The mf2 parser makes it easier by narrowing down what HTML you have to deal with, but as soon as e-* properties are involved, consumers have to worry about potentially non-trivial transformations if they want to work with it
#barnabywaltersand I’m skeptical about how much this can be reduced by trying to force publishers to use properties in very specific ways
#Loqi[aaronpk] #52 Remove images from posts containing a photo
#barnabywalterse.g. if the consumer wants to use the plaintext e-content value, and sees that there’s a u-photo property, they could replace instances of the u-photo URL in the plaintext content with an empty string
#barnabywaltersor if they want to use the html e-content value, they could parse it and check for an img element with the u-photo url. If they find one, either remove it if they want to display the image themselves, or leave it in and know not to display the image a second time
#barnabywaltersIMO documenting cases like this, and coming up with algorithms, recommendations and software to help consumers handle them is the more productive approach
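A minimal sketch of the HTML case described above: walk the e-content HTML, drop any <img> whose src exactly matches a parsed u-photo URL, and report how many were dropped so the consumer knows whether the photo was embedded. The names `PhotoStripper` and `strip_photos` are hypothetical, the example URL is made up, and the sketch assumes URLs are already resolved to absolute form. It does not cover <picture>, srcset, or relative URLs.

```python
from html.parser import HTMLParser


class PhotoStripper(HTMLParser):
    """Re-emits HTML while skipping <img> tags whose src is a known photo URL."""

    def __init__(self, photo_urls):
        # Keep entities as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.photo_urls = set(photo_urls)
        self.out = []
        self.removed = 0

    def handle_starttag(self, tag, attrs):
        if tag == "img" and dict(attrs).get("src") in self.photo_urls:
            self.removed += 1  # drop the tag, remember that we saw it
        else:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags (<img ... />) get the same treatment.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")


def strip_photos(content_html, photo_urls):
    """Return (html_without_photos, number_of_img_tags_removed)."""
    parser = PhotoStripper(photo_urls)
    parser.feed(content_html)
    parser.close()
    return "".join(parser.out), parser.removed


html = '<p>Sunset <img src="https://example.com/a.jpg" alt="A sunset"> over the bay</p>'
cleaned, removed = strip_photos(html, ["https://example.com/a.jpg"])
```

If `removed` is zero, the photo was not embedded in the content and the consumer can display it separately without duplicating it.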
#aaronpk"could replace instances of the u-photo URL in the plaintext content with an empty string" sounds simple but it is not and it is very error prone
#barnabywaltersbut regarding that issue: I do agree that it’d be worth reviewing how plaintext values are generated, where to imply u-photo
#aaronpkalso see the examples i documented with alt text
#sknebelright, the "several ways" is part of the problem. now everyone gets to implement a long list of special cases, and not all software will implement them identically
#barnabywaltersaaronpk: yeah, consuming HTML is a giant mess, and microformats can’t solve all of the problems
#sknebeland everyone who doesn't do it like xray gets complaints ;)
#aaronpkit is *so close* to solving all the problems tho
#barnabywaltersaaronpk: regarding the suggestion of handling plaintext content by replacing occurrences of the photo url with empty string: where would this not work?
#aaronpksimple but contrived example is if the URL is also actually in the text for some reason
#barnabywaltersin what context is an exact copy of the photo URL going to find its way into the plaintext content
#aaronpklike a blog post that contains code samples
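A small sketch of the plaintext approach and the failure mode raised here. The function name `strip_photo_urls` and the example URL are hypothetical. The first call is the intended case, the second shows a post whose text legitimately mentions the same URL.

```python
def strip_photo_urls(plaintext, photo_urls):
    """Naively remove every occurrence of each u-photo URL from plaintext content."""
    for url in photo_urls:
        plaintext = plaintext.replace(url, "")
    return plaintext


photo = "https://example.com/cat.jpg"

# Intended case: the URL is only there because an <img> without alt text
# was flattened into the plaintext value.
photo_post = strip_photo_urls(f"My cat {photo}", [photo])

# Failure case: the author wrote the URL as part of the text (a code sample),
# and the replacement damages the sentence.
code_post = strip_photo_urls(f"Download it with curl -O {photo} first", [photo])
```

The string replacement cannot tell the two cases apart, which is why it reads as simple but ends up error prone.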
#barnabywalterscode samples are going to look like shit in plaintext anyway, sadly
#barnabywaltersand anyway, a blog post with code samples in isn’t likely to have a u-photo property, as it’s a blog post not a photo post
#aaronpkwell that's the other part of this discussion...which is what exactly does it mean to have a u-photo property and when should a post use it
#barnabywalterswell in that case, I don’t see why having photo URLs show up in plaintext content is any worse than having completely broken code samples show up there
#sknebelwhat is "completely broken" about a plaintext code sample in a plaintext post?
#sknebel(improved whitespace handling would help, but that's also somewhere on the todo pile ;))
#barnabywaltersindentation and formatting are likely to be broken unless the entire plaintext content is presented respecting whitespace, which is likely to cause other whitespace problems when displaying HTML content
#barnabywaltersIMO, for anything other than the most basic content, the plaintext version of e-* properties is a convenience mostly useful for debugging or very basic usage, and any more serious consumer is likely going to have to wrestle with the html and do whatever parsing, sanitizing and processing is necessary for their use-case
#aaronpkthat is probably true, but there's also a huge difference between handing off the HTML to an HTML sanitizer vs going and pulling out individual HTML tags from the document
#sknebeland you can make their job a lot easier, or at least reduce the amount of breakage through cases they haven't covered, by recommending to not put the u- in the e-content
#aaronpkfor example i'm able to throw the e-content HTML at the main PHP HTML sanitizer and trust that the result will be usable without any further DOM fiddling
TallTed joined the channel
#barnabywaltersat least in PHP, it’s not too difficult to parse the HTML into a DOMDocument and e.g. search it for an occurrence of <img> with an src matching the contents of a parsed u-photo property
#barnabywaltersif you’re using php-mf2, then you already have a DOMDocument available, which has done a lot of the hard work of resolving URLs, dealing with encodings, etc
KartikPrabhu joined the channel
#sknebeland another few lines to special case <picture> tags, and ...
#barnabywaltersthis reminds me of a topic I was thinking about a little back when I was actively working on php-mf2, which was how can we improve parsers to make consuming mf2 and HTML even easier
#barnabywaltersone of the things I was thinking about was to have a parsing mode where each property additionally contains a key which maps to the DOMElement it was parsed from, allowing consuming code to use the mf2 output to “reach into” the DOMDocument and get additional information, make changes etc
#barnabywaltersso say you have e-content and a u-photo. You get references to the DOMElements they were parsed from, check to see if the photo was inside the content, and if so, call a method to remove the photo DOMElement from its parent
#barnabywaltersthen you’d be able to get the inner content from the content DOMElement, knowing that it no longer contains the photo element
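A rough sketch of that idea in Python, using the standard library's minidom as a stand-in for PHP's DOMDocument. The `entry` dictionary with an `element` key is a hypothetical shape for the proposed parsing mode, not the output of any existing parser, and the markup is a made-up, well-formed XHTML example.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<article class="h-entry">'
    '<div class="e-content"><p>At the beach</p>'
    '<img class="u-photo" src="https://example.com/beach.jpg" alt="Beach"/>'
    '</div></article>'
)


def find_by_class(root, class_name):
    """Return the first descendant element carrying class_name, or None."""
    for el in root.getElementsByTagName("*"):
        if class_name in el.getAttribute("class").split():
            return el
    return None


# Hypothetical parser output: each property keeps a reference to its source element.
entry = {
    "content": {"element": find_by_class(doc, "e-content")},
    "photo": {
        "value": "https://example.com/beach.jpg",
        "element": find_by_class(doc, "u-photo"),
    },
}


def is_inside(node, ancestor):
    """True if ancestor appears anywhere in node's parent chain."""
    parent = node.parentNode
    while parent is not None:
        if parent is ancestor:
            return True
        parent = parent.parentNode
    return False


def content_without_photo(entry):
    """Detach the photo element if it sits inside the content, then serialize the content."""
    photo_el = entry["photo"]["element"]
    content_el = entry["content"]["element"]
    if is_inside(photo_el, content_el):
        photo_el.parentNode.removeChild(photo_el)
    return "".join(child.toxml() for child in content_el.childNodes)


content_html = content_without_photo(entry)
```

Because the consumer works with element references instead of matching URLs, it does not need to know which tag or attribute the photo value came from.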
#barnabywalters(that’d at least handle the <picture> special casing you brought up…)
#barnabywaltersaaronpk: something like this might help close the *so close* gap you mentioned
#barnabywaltersanother thing which could be useful is to make the function which converts to plaintext more readily available, so it can be called on any DOMElement
#barnabywaltersthat way, consumers can find content, make whatever changes they want to its DOM representation, then call the toPlaintext function on the result
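A sketch of what an exposed plaintext helper could look like, again with minidom standing in for a real DOM. `to_plaintext` here is a hypothetical, heavily simplified function: it substitutes an image's alt (or src when alt is absent) and turns <br> into a newline, and it is not the full text-conversion algorithm from the mf2 parsing spec or from php-mf2.

```python
from xml.dom import minidom


def to_plaintext(node):
    """Simplified plaintext conversion that can be called on any element."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    if node.nodeType != node.ELEMENT_NODE:
        return ""  # skip comments, processing instructions, etc.
    if node.tagName == "img":
        # Prefer alt text when the attribute exists, otherwise fall back to src.
        if node.hasAttribute("alt"):
            return node.getAttribute("alt")
        return node.getAttribute("src")
    if node.tagName == "br":
        return "\n"
    return "".join(to_plaintext(child) for child in node.childNodes)


doc = minidom.parseString(
    '<div class="e-content"><p>At the beach</p>'
    '<img src="https://example.com/beach.jpg" alt="Beach"/></div>'
)
content = doc.documentElement

before = to_plaintext(content)

# The consumer edits the DOM first, then asks for plaintext of the result.
img = content.getElementsByTagName("img")[0]
img.parentNode.removeChild(img)
after = to_plaintext(content)
```

The point of the design is ordering: the consumer changes the DOM first and only then derives plaintext, instead of patching a plaintext string after the fact.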
#barnabywaltersthere are always going to be fewer consumers than publishers, so IMO it makes sense to concentrate complexity at the consumers
#barnabywaltersand IMO the mf2 parsing spec doesn’t have to have all the answers, provided parser implementations give their users sufficient tools
PooPSGTech, [tantek], ben_thatmustbeme, [tw2113_Slack_], [snarfed], [chee], [aciccarello], [jacky], TallTed, barnabywalters and justBull joined the channel; kiroul left the channel