#microformats 2019-08-19

2019-08-19 UTC
galaxie, mauz555, Eyes, [tantek], [xavierroy], [KevinMarks], [prtksxna], IWSlackGateway, [fluffy] and [grantcodes] joined the channel
#
Zegnat
Re discussion a couple days ago: I am totally in favour of documenting what normalised URL means in mf2. I agree with the sentiment that it isn’t actually a valid URL without path component.
[jgmac1106], capDiscord[m], [xavierroy] and [jgarber] joined the channel
#
[jgarber]
Zegnat: I’ve opened a PR on the test suite based on the conversation here 👉 https://github.com/microformats/tests/pull/111
#
[jgarber]
Feel free to weigh in there!
[prtksxna], jackjamieson, mauz555 and [tantek] joined the channel
#
[tantek]
Zegnat note that browsers provide URLs without the /
#
[tantek]
Eg when you type in “archive.org” into the URL bar, it goes to https://archive.org (without any trailing slash)
#
[tantek]
So I’m not sure that behavior matches user expectation
#
[tantek]
Also when I type in “tantek.com” or “https://tantek.com” into any web sign-in UI, I expect my identity to be “tantek.com” on the service, not “tantek.com/”
#
[tantek]
Lastly if you look at site URLs in print they all lack the trailing slash
#
[jgarber]
The IndieAuth spec disagrees with you on that particular point, but there are notes that it’s on implementors to sort out the trailing slash in profile URLs, _not_ end users.
#
[tantek]
So if anything we should normalize to the form without the slash because that’s where all the evidence is
#
[tantek]
We should normalize URLs to match user expectations, not some plumbing nerdery
#
[jgarber]
I _think_ the intention is that normalization is an implementor’s concern, not an end user’s concern. So, authors could still mark up URLs as `<a href="https://tantek.com" class="h-card">Tantek</a>`.
#
Loqi
totally
#
[jgarber]
…and parsers would be responsible for whatever normalization behavior we collectively agree is appropriate.
#
[tantek]
Except that implementors will then by default display that normalized URL and break user expectations
#
[tantek]
A spec that causes implementors to break user expectations when they do the “simple” thing is wrong
#
[jgarber]
One could argue then that microformats-adjacent technology is already breaking that expectation (see: IndieAuth).
#
[tantek]
Nope. My sign-in on the wiki is User:Tantek.com not User:Tantek.com/
#
[tantek]
So there is no “already breaking”
#
[tantek]
However if you pick this / normalizing then you’ll lower the barrier to LOTS of other implementors breaking user expectations
#
[jgarber]
One data point: when I enter the literal string `https://sixtwothree.org` into the sign in box on https://indieauth.com, I see a message reading:
#
[jgarber]
> Authenticate using one of the methods below to sign in as https://sixtwothree.org/
#
[tantek]
So if you want to fix anything, fix everything to normalize without the /
#
[tantek]
Even you in your website title drop the trailing slash
#
[jgarber]
Related data points:
#
[jgarber]
Ruby’s URI class has a `normalize` method that will add the trailing slash. The Ruby gem Addressable has a similar method that also adds the trailing slash.
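For illustration, the behavior described here (an empty path becoming `/`) can be sketched in Python. This is not the Ruby `URI` or Addressable code, and it is not what the mf2 spec requires; `add_root_path` is a hypothetical helper name used only for this sketch.

```python
from urllib.parse import urlsplit, urlunsplit


def add_root_path(url: str) -> str:
    """Replace an empty path with "/" on http(s) URLs; leave other URLs alone.

    Hypothetical helper: it mirrors the "add the trailing slash" behavior
    discussed above and nothing more (no case folding, no port handling).
    """
    parts = urlsplit(url)
    if parts.scheme in ("http", "https") and parts.netloc and parts.path == "":
        parts = parts._replace(path="/")
    return urlunsplit(parts)


print(add_root_path("https://sixtwothree.org"))
print(add_root_path("https://tantek.com/about"))
```

Only the bare-domain form changes; a URL that already has a path is returned as it was written.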
#
[jgarber]
_To be clear_, I’m not wild about the trailing slash but I’m trying to figure out where the middle ground or commonalities in behavior are between specs, libraries, and test suites.
#
[jgarber]
…and maybe there isn’t a middle ground. 🤷‍♂️
mauz555 joined the channel
#
[jgarber]
Maybe a solution is to update the microformats wiki/specs/etc. and clarify what the community intends “normalization” to mean.
#
[tantek]
I believe we reference HTML parsing for that
#
[tantek]
Do we “normalize” anything besides URLs?
#
[tantek]
Pretty sure we don’t because typical other uses of “normalization” (Eg numbers and dates) add artificial precision which is a loss of information (about the degree of precision)
#
[tantek]
Eg look at how much code / specs add :00 seconds just to satisfy some datetime validator somewhere (like Atom)
#
[tantek]
That adding of :00 seconds is a lie
#
[jgarber]
The value class pattern page (http://microformats.org/wiki/value-class-pattern) discusses normalization, but that might be related to the Atom pattern noted above.
#
[jgarber]
The “lie” as it were. 🙃
#
[jgarber]
`p-rsvp` values are spec’ed as “Case-insensitive values, normalized to lowercase.”
#
[jgarber]
Truthfully, I’m happy to withdraw the PR on microformats/tests if there’s not support for the change.
zdunn and [schmarty] joined the channel
#
[tantek]
The value class pattern tries hard to maintain the fidelity of the data found, and only does syntactic normalization AFAIK
#
[tantek]
I’d prefer *not* normalizing / vs not on domain URLs
#
[tantek]
That kind of normalizing, where some application has to determine if two things are “the same” is better done in the context of the specific application, instead of upstream in an abstract parser
#
[tantek]
The specific application may do all kinds of other normalizing that apply to its use case of comparing for equivalency
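A minimal sketch of what application-level comparison could look like, assuming an application that wants to treat `https://tantek.com` and `https://tantek.com/` as the same profile. The function name `same_profile` and the choice of which components to compare are assumptions made for this example; the parsed URLs themselves are never rewritten.

```python
from urllib.parse import urlsplit


def _comparison_key(url: str):
    """Build a key used only for comparing; the original string is untouched."""
    parts = urlsplit(url)
    return (
        parts.scheme,           # urlsplit already lowercases the scheme
        parts.hostname or "",   # hostname is returned lowercased
        parts.port,
        parts.path or "/",      # empty path and "/" compare as equal
        parts.query,
    )


def same_profile(a: str, b: str) -> bool:
    """Hypothetical equivalence check for one application's use case."""
    return _comparison_key(a) == _comparison_key(b)


print(same_profile("https://tantek.com", "https://Tantek.com/"))
print(same_profile("https://tantek.com/a", "https://tantek.com/a/"))
```

Because the key exists only at comparison time, the application can still display the URL exactly as the author published it.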
#
[tantek]
With p-rsvp, because it’s an enumerated set of values, it makes sense to normalize to that set
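A small sketch of normalizing to an enumerated set, assuming the set is `yes`, `no`, `maybe`, and `interested`. Returning unrecognized input unchanged is also an assumption of this example, not something stated in the discussion; `normalize_rsvp` is a hypothetical name.

```python
# Assumed enumeration for p-rsvp; check the spec before relying on it.
RSVP_VALUES = {"yes", "no", "maybe", "interested"}


def normalize_rsvp(raw: str) -> str:
    """Lowercase a recognized RSVP value; pass anything else through as-is."""
    value = raw.strip().lower()
    return value if value in RSVP_VALUES else raw


print(normalize_rsvp("YES"))
print(normalize_rsvp("Perhaps"))
```

No information is lost here: every member of the set has exactly one canonical spelling, which is what makes this case different from the trailing slash.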
[jgmac1106] joined the channel
#
KartikPrabhu
[tantek]: there might be "normalisation" in value-class-pattern for dates, but I might be mis-remembering
#
Loqi
KartikPrabhu: Zegnat left you a message 2 weeks, 2 days ago: if you want to see the next step up from /textcontent-parsing: https://github.com/Zegnat/php-innertext
#
KartikPrabhu
actually, datetime parsing in VCP does not have any "normalisation"; it seems to allow for every pattern under the Sol!
#
[tantek]
KartikPrabhu, there's a bunch of syntactic normalization here: http://microformats.org/wiki/value-class-pattern#Date_and_time_values, note the sentence with "the parser assembles the overall datetime value by concatenating"
#
[tantek]
[jgarber] the only date time normalization there that at all resembles the Atom criticism I made was the implied :00 minutes for any time specifying only an hour, e.g. 4am or 5pm, because in those contexts (with the am/pm indicator), the published intended precision *is* 4:00am or 5:00pm, and the dropping of the :00 is for abbreviation purposes, not to imply lack of precision
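The distinction drawn here can be sketched as follows: an am/pm time with only an hour gains `:00` minutes, while seconds are emitted only when the author wrote them. This is an illustrative sketch and not the value class pattern algorithm itself; `expand_ampm` and its regular expression are assumptions made for the example.

```python
import re

# hour, optional :minutes, optional :seconds, then am/pm (a.m./p.m. accepted)
_TIME = re.compile(
    r"^(\d{1,2})(?::(\d{2}))?(?::(\d{2}))?\s*([ap])\.?m\.?$",
    re.IGNORECASE,
)


def expand_ampm(text: str) -> str:
    """Convert e.g. "4am" to "04:00"; return unrecognized input unchanged."""
    match = _TIME.match(text.strip())
    if not match:
        return text
    hour_s, minute, second, meridiem = match.groups()
    hour = int(hour_s)
    if not 1 <= hour <= 12:
        return text
    if meridiem.lower() == "p" and hour != 12:
        hour += 12
    elif meridiem.lower() == "a" and hour == 12:
        hour = 0
    result = f"{hour:02d}:{minute or '00'}"  # implied :00 minutes only
    if second is not None:                   # never invent seconds
        result += f":{second}"
    return result


print(expand_ampm("4am"))
print(expand_ampm("5:30pm"))
```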
#
KartikPrabhu
I also agree that the "trailing slash" normalisation is best left to the specific application
#
[tantek]
Aside: in all the years of date-time precision discussions, I have found exactly *one* that seems to imply a strong :00 seconds when only hours and minutes are provided, which is United "boarding ends" times. They literally close the doors at the time indicated with :00 seconds. Which kinda sucks because it means that if you get to the gate at the boarding ends time (during that minute), you'll likely be denied boarding.
#
[tantek]
so conceptually you have to keep in mind that the last minute to actually board is the posted "boarding ends" time minus one minute
[Lewis_Cowles] joined the channel
#
[tantek]
getting back to moving to using a GitHub repo for the rel registry — I didn't see any opposition or concern with the general idea / proposal
#
[tantek]
would anyone here like to work with me on figuring out the details of the move? (from microformats.org/wiki/existing-rel-values to a new repo in github.com/microformats )
KartikPrabhu, ichoquo0Aigh9ie, [bdesham], [fluffy] and [KevinMarks] joined the channel
#
[KevinMarks]
I can help
KartikPrabhu joined the channel