#microformats 2023-06-25
2023-06-25 UTC
joeyg, [aciccarello], sebbu, btrem and JKing joined the channel
# JKing I am writing a test case for handling of <area>, <img>, and <data> in Value Class Pattern processing (seemingly not covered by any extant test cases) and noticed that three out of five implementations I tried (PHP, JS, Python against Go, Ruby) don't behave as I'd expect. Could all three be wrong, or might the parsing documentation need updating?
# JKing It seems from the output like the three don't implement the special cases of those elements at all.
# JKing Given:
# JKing <div class="h-test"><div class="p-name">
# JKing <img class="value" alt="A">
# JKing <area class="value" alt="B">
# JKing <data class="value" value="C">c</data>
# JKing <data class="value">D</data>
# JKing </div></div>
# JKing The value of 'name' should be "ABCD" per the VCP docs, but is "cD" in PHP, JS, and Python.
# JKing How can that be? The very first step at http://microformats.org/wiki/microformats2-parsing#parsing_a_p-_property is to look for a VCP value.
# JKing I don't understand what you mean.
# JKing I'm debugging a processor I just wrote by following the documentation and available tests. Given that there was no test I could find for this part, I wrote the above. My implementation agrees with Ruby and Go, which use "ABCD" as the value for the p-name.
# JKing (and by PHP, JS, Python, Ruby, and Go, I mean the implementations referenced from microformats.io)
[pfefferle] joined the channel
# JKing sebbu: I think you may be misreading the test case and/or misunderstanding what I'm saying. There is indeed a p-name property. All five of the (what I assume are) mature implementations are doing VCP processing, hence no whitespace in the result. They only differ in whether they take information from alt= and value= attributes as stated at http://microformats.org/wiki/value-class-pattern#Basic_Parsing
[schmarty], btrem and [tw2113_Slack_] joined the channel
# Zegnat JKing: that looks to be a bug at least in the PHP parser. The problem is it is using the inner text for all elements with a value class and does not have the check for alt/value attributes as mentioned in the VCP spec. Feel free to open an issue on https://github.com/microformats/php-mf2
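[Editor's note] The difference Zegnat describes can be illustrated with a small sketch. This is not the code of php-mf2 or of any other parser mentioned here. It is a minimal illustration built on Python's standard `html.parser`, and the names `ValueClassParser`, `vcp_value`, and `use_attributes` are hypothetical. It assumes the Basic Parsing rules as discussed above (alt for `<img>` and `<area>`, the value attribute for `<data>` when present, the title attribute for `<abbr>` when present, inner text otherwise) and ignores property scoping, nested value elements, value-title, and the date/time rules.

```python
from html.parser import HTMLParser

# Void elements never get an end tag, so they must not affect depth tracking.
VOID = {"img", "area", "br", "hr", "input", "link", "meta", "wbr"}


class ValueClassParser(HTMLParser):
    """Collects Value Class Pattern pieces from elements with class "value"."""

    def __init__(self, use_attributes=True):
        super().__init__()
        self.use_attributes = use_attributes  # False imitates the inner-text-only bug
        self.parts = []   # collected pieces, in document order
        self._depth = 0   # > 0 while inside a value element whose inner text is wanted
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if self._depth:
            # Already collecting inner text: only track nesting.
            if tag not in VOID:
                self._depth += 1
            return
        a = dict(attrs)
        if "value" not in (a.get("class") or "").split():
            return
        if self.use_attributes and tag in ("img", "area"):
            self.parts.append(a.get("alt") or "")
        elif self.use_attributes and tag == "data" and a.get("value") is not None:
            self.parts.append(a["value"])
        elif self.use_attributes and tag == "abbr" and a.get("title") is not None:
            self.parts.append(a["title"])
        elif tag not in VOID:
            # Fall back to the element's inner text.
            self._depth = 1
            self._buf = []

    def handle_endtag(self, tag):
        if tag in VOID or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.parts.append("".join(self._buf))

    def handle_data(self, data):
        if self._depth:
            self._buf.append(data)


def vcp_value(html, use_attributes=True):
    """Concatenate the value pieces with no separator."""
    parser = ValueClassParser(use_attributes)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


SAMPLE = """<div class="h-test"><div class="p-name">
<img class="value" alt="A">
<area class="value" alt="B">
<data class="value" value="C">c</data>
<data class="value">D</data>
</div></div>"""

print(vcp_value(SAMPLE))                        # ABCD
print(vcp_value(SAMPLE, use_attributes=False))  # cD
```

With the attribute checks enabled, the sketch gives "ABCD" for JKing's test case. With them disabled it gives "cD", which matches the result reported for the PHP, JS, and Python parsers.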
# JKing Zegnat: Okay, thanks for the confirmation.
# Zegnat The fact that multiple parsers are doing this wrong might also be because there is no good test for it in https://github.com/microformats/tests . In the perfect world, if you are implementing a parser yourself, you should be able to just go against the input and outputs from that repo ... so if you have created tests yourself that are not covered there, feel free to PR!
# JKing Yes, I noticed that.
# JKing I'll clean up my test cases and post a patch, then. I have a few.
# JKing That surprised me as well, which is why I wondered if it was intentional.
# JKing Happy to do it. Even with the omissions and vagaries in the documentation and holes in the test suite, it was pretty easy to write a processor from scratch. Two weeks to write and one week to test in my spare time, so not too bad.
# JKing The hardest part of testing was figuring out that multiple implementations were using a different text trimming algorithm (one of your devising, I understand) than what the parsing spec actually mandates. I did eventually find the documentation on the wiki, but you first have to know to look for it.
[tantek] and eitilt joined the channel
# [KevinMarks] Clarifying that would be a community benefit we'd all learn from