#microformats 2022-02-03

2022-02-03 UTC
sarahd[d], ur5us, jacky, Seirdy, KartikPrabhu, [tonz], [fluffy], hans63us[d], Seb[d], aspenmayer[d], [tw2113_Slack_], [Aaron_Klemm], Jeremiah[d], darkkirb, balupton[d], sknebel, [jgmac1106], IWSlackGateway, [KevinMarks], rattroupe[d], Zegnat, JSharp, gRegor, SemihCebraiL[d] and SemihCebraiL4853 joined the channel
#
sknebel
I think the "for each property in the list" bit could be clearer, even sticking to spec-lang, but I've also not found a good phrasing yet
[manton] and jacky joined the channel
#
jacky
heh so
#
jacky
actually wait I asked this before
#
jacky
(if a URL with a path component that's just "/" should be kept or removed)
#
jacky
and the answer was to remove it IIRC
#
[tantek]
you mean independent of microformats? depends on the context IMO
#
jacky
ah in the context of microformats
#
[tantek]
if you're displaying it, like for web sign-in, then no need to have a dangling slash on the end of the domain
#
jacky
this is in the case of parsing
#
jacky
I'm guessing that I should not do any particular massaging there
#
jacky
it's happening because anything that happens to be a string and parsed as a `u-` gets represented as a URL internally
#
sknebel
I'd argue "do whatever your environment does for URL normalization" and I suspect it'll keep the /
#
[tantek]
where does it say to drop it?
#
jacky
one of the test cases doesn't have one
#
jacky
is procuring a link
#
sknebel
sigh. probably old test case
#
sknebel
the url normalization in all cases is "new" (i.e. less than 5 years old)
#
jacky
heh you called it
#
jacky
> 7 years ago
#
jacky
the <a href="https://example.com"> link
#
[tantek]
hah I can see how that happens
#
jacky
the thing is, I feel like that's "valid"
#
sknebel
yes-ish. we dont define what URL normalization means
#
[tantek]
"normalized absolute URL of the gotten value, following the containing document's language's rules for resolving relative URLs"
#
sknebel
but afaik most specs say URLs have to have a path
#
sknebel
so they'll have the slash
#
jacky
the JSON doesn't use a trailing slash at ".items[0].properties.in-reply-to[0].properties.author[0].properties.url[0]"
#
jacky
aka "https://example.com"
#
jacky
wait so that's okay then?
#
jacky
and I should not add a slash (or have my parser avoid adding a slash?)
#
jacky
I was confused there for a bit and thought the opposite
#
sknebel
what does your environment do for URL normalization?
#
sknebel
if you dont add any extra handling, what does happen?
#
sknebel
(I assume you're not implementing URL normalization yourself)
#
jacky
I'm not, I'm using a lib
#
jacky
gets another link
#
sknebel
I'd argue with slash is more correct
#
sknebel
and thats what I expect libs to do
#
sknebel
but if yours does something different it'd be worth a second look
#
jacky
hmmm this is interesting
#
jacky
because the standard is saying "A URL’s path is either an ASCII string or a list of zero or more ASCII strings, usually identifying a location."
#
jacky
and this is _not_ abiding by that
#
jacky
potentially a bug!
#
sknebel
what is not abiding by that?
#
jacky
the Rust library for parsing URLs (which is used by Servo)
#
sknebel
what is it doing
#
jacky
it's adding a trailing / when it's not needed (or provided)
#
sknebel
that's correct
#
sknebel
so we and the spec and the lib agree, have the slash
#
sknebel
so the testcase is wrong
#
jacky
okay lol my fault i was confused as to what part of this is wrong lol
#
jacky
then in that case, I'll update the JSON file for microformats/tests
#
jacky
(unless you mean the test case I'm writing that's incorrect)
#
sknebel
yeah, fix the json
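
[editor's sketch] The conclusion above (keep the "/" after normalization) can be illustrated with a small example. This is only a sketch under stated assumptions: Python's `urllib.parse` does not implement full URL normalization, so the helper name `with_root_path` and the `HTTP_SCHEMES` set are hypothetical and hand-rolled here. It is not the API of the Rust library mentioned in the discussion.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: only http(s) URLs are considered; a real URL library covers more schemes.
HTTP_SCHEMES = {"http", "https"}


def with_root_path(url: str) -> str:
    """Return the URL with an empty path replaced by "/" (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.scheme in HTTP_SCHEMES and parts.netloc and parts.path == "":
        # An http(s) URL with a host but no path gets the root path.
        parts = parts._replace(path="/")
    return urlunsplit(parts)


if __name__ == "__main__":
    print(with_root_path("https://example.com"))
```

With this helper, the `href` from the test case and the value expected in the JSON would compare equal once both carry the trailing slash.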
#
jacky
going to try to run this across more of the tests so I can make this one larger PR versus a bunch of bite size ones
jacky joined the channel
#
[tantek]
would normatively referencing the URL spec help make this more clear?
#
jacky
tbh it could
#
jacky
but I also see how this could be implied
#
jacky
_but_ it doesn't hurt to add
cygnoir[d] joined the channel
#
tantek
edited /microformats2-parsing (-45) "editorial: de-dup issues section, already have links in header, frag to maintain section link"
zachburau[d], gRegor and jacky joined the channel
#
jacky
now time for value class parsing
KartikPrabhu, ben_thatmustbeme, ur5us, jacky and Jack[d] joined the channel
#
jacky
value class datetime parsing is a hell of a trip
#
jacky
lots of cases to consider
#
[KevinMarks]
It is tricky, and easier to use a time element now
#
jacky
a three digit day?
#
jacky
oooh DDD => WXX
#
jacky
wait no
#
jacky
ordinal dates
#
sknebel
day XXX of the year, yes
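
[editor's sketch] An ordinal date (YYYY-DDD) names a day by its position in the year. A minimal conversion to a calendar date might look like the following; the function name `ordinal_to_calendar` is hypothetical, and whether a parser should convert ordinal dates at all or pass them through unchanged is not settled by this discussion.

```python
from datetime import date, timedelta


def ordinal_to_calendar(value: str) -> str:
    """Convert "YYYY-DDD" to "YYYY-MM-DD" (hypothetical helper)."""
    year_text, day_text = value.split("-")
    if len(year_text) != 4 or len(day_text) != 3:
        raise ValueError(f"not an ordinal date: {value!r}")
    year, day_of_year = int(year_text), int(day_text)
    # Day 001 is January 1st, so offset by day_of_year - 1.
    result = date(year, 1, 1) + timedelta(days=day_of_year - 1)
    if result.year != year:
        # Day 000, or a day number past the end of the year.
        raise ValueError(f"day of year out of range: {value!r}")
    return result.isoformat()


if __name__ == "__main__":
    print(ordinal_to_calendar("2022-034"))
```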
#
jacky
hm I'm wondering if I should do this with regexes or just convert some of these to be handled by the datetime parser
#
jacky
hm well I'd only know what part of it to parse it as if it matches one of those formats
#
jacky
prob easier to have regex define what part it's in, combine it in the order of ISO 8601 / RFC 2282 and parse it as that unified string
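
[editor's sketch] The idea described here (use regexes to decide which part each value is, then join the parts in ISO 8601 order) could be sketched as below. Everything named in the code is an assumption for illustration: the three patterns cover only a subset of the formats the value class pattern allows (no am/pm times, no zone attached to a time), and joining date and time with "T" is a choice made for this sketch, not something the discussion specifies.

```python
import re

# Assumed, simplified patterns; the real value class pattern accepts more forms.
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}|\d{4}-\d{3}")  # calendar or ordinal date
TIME_RE = re.compile(r"\d{2}:\d{2}(?::\d{2})?")         # HH:MM or HH:MM:SS
ZONE_RE = re.compile(r"Z|[+-]\d{2}:?\d{2}")             # Z, +HH:MM or +HHMM

PATTERNS = (("date", DATE_RE), ("time", TIME_RE), ("zone", ZONE_RE))


def combine_value_parts(parts):
    """Classify each text fragment, then join as date, time, zone (hypothetical)."""
    found = {"date": None, "time": None, "zone": None}
    for text in parts:
        text = text.strip()
        for kind, pattern in PATTERNS:
            # Keep only the first fragment seen for each kind.
            if found[kind] is None and pattern.fullmatch(text):
                found[kind] = text
                break
    if found["date"] is None:
        return None  # nothing to anchor a datetime on
    result = found["date"]
    if found["time"] is not None:
        result += "T" + found["time"]
        if found["zone"] is not None:
            result += found["zone"]
    return result


if __name__ == "__main__":
    print(combine_value_parts(["14:30", "2022-02-03", "-08:00"]))
```

The combined string can then be handed to whatever datetime parser the environment provides, which keeps the regexes responsible only for classification.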
#
gRegor
php-mf2 uses regex, then normalizes it with PHP's datetime lib: https://github.com/microformats/php-mf2/blob/master/Mf2/Parser.php#L241
#
gRegor
which language are you building a parser in?
#
jacky
Rust atm
#
jacky
got some free time to finish up a PR I had
#
jacky
I might end up doing something similar to https://github.com/microformats/php-mf2/blob/master/Mf2/Parser.php#L741-L797 (like you said)
#
sknebel
/ Not using value-class (phew).
#
[tantek]
jacky++ "hell of a trip" yeah, that's basically the result of years of analyzing existing content publishing practices and coming up with markup patterns to match them as well as reduce chances for introducing errors by providing as much ability to stay DRY as possible
#
[tantek]
VCP is a good example of something that clearly was not designed theoretically, because it doesn't have a simple clean "look" to it that ultimately fails to actually match real world needs 😉
#
jacky
that's def clear from the examples and the format styles
#
jacky
I switched over to rel parsing b/c it was easier
#
jacky
wanted to leave the 'hardest' one for last
#
jacky
dang, is value-title parsing necessary? (lol)
#
jacky
looks like it is, it's in the test suite
#
jacky
groans
#
jacky
and I understand why it's helpful
#
jacky
but I kind of want to leave it out
#
jacky
(agggh)
#
jacky
this all kind of sniffs at a need/want to use something like data-*
#
jacky
but eh
#
jacky
[heh that wasn't that hard to support, lol]
#
[tantek]
not data-* but rather <data>
#
[tantek]
and yes, for the most part value-title should not be necessary for publishers using HTML5
#
[tantek]
it would be good to go back through all value-title use-cases and redo them using <data> just to present that as a tutorial for folks
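
[editor's sketch] To show how a value-title use case maps onto <data>, the following sketch reads the machine-readable value from either form. It is not a microformats parser: the class `MachineValueFinder`, the function `machine_values`, and the two markup samples are hypothetical, and real parsing involves property scoping and other rules that are skipped here.

```python
from html.parser import HTMLParser


class MachineValueFinder(HTMLParser):
    """Collect values from <data value="..."> and class="value-title" elements."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if tag == "data" and attributes.get("value") is not None:
            # HTML5 form: the value lives in the value attribute.
            self.values.append(attributes["value"])
        elif "value-title" in classes and attributes.get("title") is not None:
            # Backcompat form: the value lives in the title attribute.
            self.values.append(attributes["title"])


def machine_values(html: str):
    finder = MachineValueFinder()
    finder.feed(html)
    return finder.values


# Hypothetical samples: the same start date published both ways.
LEGACY = ('<span class="dt-start"><span class="value-title" '
          'title="2022-02-03"> </span>February 3rd</span>')
MODERN = '<data class="dt-start" value="2022-02-03">February 3rd</data>'

if __name__ == "__main__":
    print(machine_values(LEGACY), machine_values(MODERN))
```

Both samples carry the same value, which is the point of the suggested tutorial: the <data> form needs one element where value-title needs two.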
#
[tantek]
eventually it would be good to deprecate value-title as only being necessary for backcompat
#
sknebel
yes, I suspect in practice it is that
#
[tantek]
for the moment we could add some publishing warnings to use <data> instead
#
sknebel
I wonder if we have it documented anywhere outside the parsing spec for mf2
#
sknebel
(people always tell me I can't assume that publishers read the parsing spec ... :P)
#
[tantek]
less documentation of value-title is ok, if our intent is to deprecate it
#
[tantek]
sknebel I believe it has its own page
#
sknebel
right. and yes, the question was if there is documentation where people might stumble over it that needs warnings, or if its "contained" :D
#
tantek
created /value-title (+70) "page for easier linking"
gRegorLove_ joined the channel