#microformats 2020-07-26

2020-07-26 UTC
[tantek] joined the channel
#
[tantek]
Pages are datestamped
#
aaronpk
this is why i suggested we collect examples
#
aaronpk
because mine are not
#
[tantek]
They have an actual datetime of when they were published (whether or not it’s displayed), and an actual datetime of when they were last updated (even if not displayed)
#
aaronpk
if it's not displayed, then it wouldn't have microformats properties for those attributes
Tekk_ joined the channel
#
[tantek]
And you can roughly look up both of those via the internet archive
#
[tantek]
Correct this is why dt-published and dt-updated are optional and why they make sense as optional
#
[tantek]
You will not find a meaningful difference between a “page” and a random note permalink.
#
aaronpk
except that everyone who actually publishes these thinks of them as completely different things
#
[tantek]
Certainly not a big enough of a difference to justify a whole new microformat
#
[tantek]
Sure, and that’s usually reflected in the path
#
aaronpk
among other things, which i am suggesting we collect examples of
#
[tantek]
Doesn’t matter if you “think” of them differently, the duck rule still applies
#
[tantek]
Hence why I said good luck finding a meaningful difference
#
aaronpk
i am suggesting lack of published date is likely an indicator, but we need examples
#
aaronpk
so far this holds true for me and chrisaldrich examples
#
[tantek]
I think you’ll find all combinations of presence/absence of published & updated
#
[tantek]
And all of them can be represented unambiguously with h-entry
#
[tantek]
Yup, by path
#
[tantek]
There are date-like paths and paths without dates / years
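A rough way to tell date-like paths from paths without dates or years is a pattern match on the URL path. This is only a sketch: the `DATE_PATH` pattern and the `has_date_like_path` name are hypothetical, and real sites use many other layouts.

```python
import re
from urllib.parse import urlparse

# Hypothetical heuristic: a path segment that looks like a four-digit year,
# optionally followed by up to two short numeric segments (month, day).
DATE_PATH = re.compile(r"/(?:19|20)\d{2}(?:/\d{1,2}){0,2}(?:/|$)")


def has_date_like_path(url: str) -> bool:
    """Return True if the URL path contains a year-like segment."""
    return bool(DATE_PATH.search(urlparse(url).path))
```

A permalink such as `https://example.com/2020/07/26/a-note` matches, while `https://example.com/about` does not. A blog configured without dates in its URLs would not match either, so this can only ever be one signal among several.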
#
aaronpk
and lack of published date
#
aaronpk
i'm talking about showing a published date on the page
#
[tantek]
Arguably presentational since even most such “pages” have precise creation & updated dates from their git repo
#
aaronpk
presentational is the whole point
#
[tantek]
And in their file system
#
aaronpk
internal data storage isn't really relevant
#
[tantek]
It’s absolutely relevant as the presentation changes over time for fashion or stylistic reasons
#
aaronpk
styling changes, but if you stopped showing the content of a note for example, it wouldn't really be a note anymore
#
[tantek]
Even some “blog posts” lack published dates because annoying SEO folks say to not display it because you don’t want your articles to look “old”
#
[tantek]
But you can usually find it buried in the metacrap
#
[tantek]
By default you can pretty much markup pages as <html class=h-entry><title class=p-name>...</title><body class=e-content>...
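To see what that markup carries, here is a minimal sketch using only Python's standard `html.parser`. It is not a conforming microformats2 parser: it only collects the text inside elements classed `p-name` or `e-content` and notes whether `<html>` has `h-entry`. The `RootEntrySketch` and `parse_root_entry` names are made up for illustration.

```python
from html.parser import HTMLParser

WATCHED = ("p-name", "e-content")  # the only properties this sketch looks at


class RootEntrySketch(HTMLParser):
    """Collect text for p-name and e-content; NOT a full mf2 parser."""

    def __init__(self):
        super().__init__()
        self.is_entry = False  # True once <html class="h-entry"> is seen
        self.stack = []        # open elements as (tag, [watched classes])
        self.props = {}        # property class name -> accumulated text

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "html" and "h-entry" in classes:
            self.is_entry = True
        watched = [c for c in classes if c in WATCHED]
        for name in watched:
            self.props.setdefault(name, "")
        self.stack.append((tag, watched))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; this also discards
        # void elements such as <br> that never receive an end tag.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        # Text belongs to every watched property that is currently open.
        for _, watched in self.stack:
            for name in watched:
                self.props[name] += data


def parse_root_entry(markup):
    parser = RootEntrySketch()
    parser.feed(markup)
    parser.close()
    return parser.is_entry, parser.props
```

Feeding it a page marked up as above, with a title of "About me" and a single paragraph in the body, gives an `h-entry` flag plus the name and content text, with no date properties involved.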
#
[tantek]
Especially if you ignore all the SEO junk advice for title elements
#
[tantek]
They’re all entries on your website
#
aaronpk
do you have a better suggestion for distinguishing pages?
#
aaronpk
the distinguishing feature does not need to be in the scope of microformats, but it would be convenient if it were
#
[tantek]
I think there’s too much of a “strongly typed” programmer vibe around that kind of question
#
[tantek]
It matters more why you should treat them differently
#
aaronpk
no this is based entirely on user features
#
[tantek]
Rather than coming up with an abstract set of types
#
aaronpk
but if we're leaving the scope of microformats, this discussion should continue in #indieweb or #indieweb-dev
#
[tantek]
I think we already walked down this discussion years ago when we figured out that properties matter, not some abstract type hierarchy
#
aaronpk
it's not abstract
#
aaronpk
or a hierarchy
#
[tantek]
It is, that reasoning is what gets you to AS object type hierarchy
#
[tantek]
That’s the path you end up with that thinking
#
aaronpk
the problem is specifically that an editing app wants to show lists of things to edit, and people want those lists to be either their time-ordered posts or their pages
#
aaronpk
people want their pages to show up in a different list from time-ordered posts
#
aaronpk
hence my suggestion that lack of published date is an indicator
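As a sketch of that suggestion, an editing app could split already-parsed entries into two lists by whether a `published` value is present. The input is assumed to be in the shape of microformats2 parsed JSON (`type` and `properties` with list values); the `split_for_editor` name is hypothetical, and sorting by the raw string only works if all dates share one ISO 8601 format and timezone.

```python
def split_for_editor(items):
    """Split parsed h-entry items into (posts, pages) by presence of 'published'."""
    posts, pages = [], []
    for item in items:
        has_date = bool(item.get("properties", {}).get("published"))
        (posts if has_date else pages).append(item)
    # Newest first; assumes uniformly formatted ISO 8601 strings.
    posts.sort(key=lambda i: i["properties"]["published"][0], reverse=True)
    return posts, pages
```

Whether a missing published date really identifies a "page" is exactly the open question here, so this encodes one candidate heuristic and nothing more.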
#
[tantek]
Time ordered posts are typically part of a feed. That’s one useful distinction
#
[tantek]
But everything has (physically) creation and updated datetimes. So there’s no getting around that.
#
aaronpk
eh that's not really relevant if those dates aren't shown
#
aaronpk
there's plenty of other properties posts have in internal implementations that we don't consider
#
[tantek]
What’s more relevant is the path of the stuff
#
[tantek]
So show lists of things to edit from the root, and feeds you can discover from there. Done
#
aaronpk
eh, "done" is like saying "it's just" or "simply"
#
[tantek]
Whether something is part of a time ordered feed or not is the interesting part
#
[tantek]
And has nothing to do with displaying dates or not
#
[tantek]
Point is there’s already enough information to make those UI choices
#
[tantek]
You don’t have to make up yet another artificial distinction
#
aaronpk
what happens if someone's set up their blog to not include any date info in the URL? that's an option in wordpress even
#
aaronpk
so there are 3 potential differences: whether a date is displayed, the URL of the page, whether the entry is contained within a feed
#
aaronpk
we need to document examples
#
[tantek]
I think that’s too narrow a framing and is actually missing the larger point of the UI flow and the broader range of possibilities
#
[tantek]
As an editor you start with their root, their domain, their home page and then you have a discovery challenge, not a “type” challenge, and certainly not a “type” challenge between only two types. That’s super narrow thinking IMO
#
aaronpk
that doesn't match the UI of most editors though
#
aaronpk
that's a very filesystem-centric view
#
[tantek]
That kinda seems irrelevant?
#
[tantek]
Content matters not legacy UX
#
aaronpk
not sure how this got turned into "legacy UX"?
#
aaronpk
like i said before, this may not be a microformats issue
#
[tantek]
You said “most editors” = legacy UX
#
[tantek]
Eg if you’d said “most feed readers” perhaps it’d be more obvious how bad a framing that is
#
aaronpk
i would say that FTP editors are legacy UX and "start with the root" is a very FTP editor view of it
#
[tantek]
We literally had to say no, most feed readers are crap that is stuck in the early 2000s or bad versions of 1990s email UI
#
aaronpk
this is a current discussion because people are creating new apps right now and need to solve this problem
#
[tantek]
That’s a pretty big leap
#
[tantek]
Solve it creatively, not by mimicking “most editors” which are likely similarly stuck in 20+ year old thinking
#
[tantek]
Never said FTP, you implied that
#
aaronpk
i'm not about to go tell someone how to design their app tho
#
[tantek]
There are way smarter ways to edit starting from a “home” than hierarchical folders 🤦‍♂️
#
[tantek]
Depends if their app cares about the diversity of sites or not
#
aaronpk
that's why i'm saying we need to document examples
#
aaronpk
look for commonalities
#
aaronpk
this "page" idea is pretty common, let's figure out what makes something a page
#
[tantek]
And you will find further distinctions, different kinds of pages
#
aaronpk
i feel like i'm getting a lot of pushback on the idea of collecting examples and researching this and you're jumping to the conclusion of "just use h-entry" when that doesn't actually solve the problem at hand
#
[tantek]
I’m pushing back because I think you’re assuming page vs post is the important distinction and I’m saying no, page/post vs collection/feed is actually a much more important distinction and there are likely even more such distinctions
#
[tantek]
You’ve framed the problem too narrowly even before you collect examples
#
[tantek]
That’s my point
#
[tantek]
And yes, absent actual failures/needs, page/post is well represented by a root level h-entry, and collection/feed/archive by an h-feed
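An editor could act on that distinction by looking at the type of the first top-level item in a page's parsed microformats2 JSON. This sketch assumes the usual `{"items": [{"type": [...]}]}` shape; the `classify_root` name and the category labels are invented for illustration.

```python
def classify_root(parsed):
    """Map the first top-level item's type to a coarse editor category."""
    items = parsed.get("items", [])
    if not items:
        return "unknown"
    types = items[0].get("type", [])
    if "h-feed" in types:
        return "collection"  # feed, archive, or other list of entries
    if "h-entry" in types:
        return "entry"       # a post or a page, not distinguished here
    if "h-card" in types:
        return "profile"     # the h-card-as-root case
    return "unknown"
```

Note that posts and pages deliberately land in the same category; the split that matters in this sketch is entry versus collection.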
#
[tantek]
(Feel free to ignore the h-card as root case for now 🙈, except as an example in the back of your mind that there are even more distinctions that an editor will likely have to care about)
#
Tekk_
Is there any particular historical reason the spec needed to define implied properties with a whole bunch of implication rules instead of doing the (at least feels right to me) other possibility of just saying "you need to have these things to be valid"?
#
Tekk_
Specifically thinking about the 10 different fallback rules that had to be defined to save the poor author the trouble of having to write class="p-name"
#
[tantek]
yup, historically formats that say "you need to have these things to be valid" are a lot more fragile. See XML, Atom etc.
#
[tantek]
we also learned from some of the early requirements on original microformats, and watching authors get them wrong anyway, despite various validators warning them etc.
#
[tantek]
short version: making requirements like that doesn't actually result in more valid content. so if the goal is valid content, define how to process it rather than making extra work for the author.
#
Tekk_
Fair enough, any reason for not going the other way? (e.g. the fields are optional and supported parsers don't have to jump through 30 different hoops to try and find it because you didn't specify.)
#
[tantek]
all properties are both optional and potentially plural
#
[tantek]
ordinality is another requirement that most formats got horribly wrong it turns out (were based on theory rather than actual real world examples and author publishing behaviors)
#
Tekk_
Oh, fair enough. I didn't really see any must/should/suggested distinctions when looking through the wiki so I wasn't entirely clear on that.
#
[tantek]
we optimized for the common cases of things having a name, image, link
#
Tekk_
Mhm
#
[tantek]
it's such a common pattern of publishing things on the web, that it adds a tremendous amount of efficiency to do the "obvious" thing for those cases.
#
Tekk_
So it's more or less: MUST parse even if 'essential' data is missing, SHOULD try and infer the missing essential data, with a suggested algorithm for doing so?
#
[tantek]
took some iterations on the implied rules (based on publishing and parsing experiences), but I think we're in pretty good shape now, especially for p-name and u-url
#
[tantek]
nope, they're in the parsing algorithm which makes them all MUSTs because there are no optional steps there
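To give a feel for what "fallback rules in the parsing algorithm" means, here is a heavily simplified sketch of a few implied-name steps, tried in a fixed order. It is not the spec's algorithm: the real rules have more steps and extra conditions (for example, they only apply when no explicit name is present), and this version only accepts well-formed XML snippets via `xml.etree.ElementTree`. The `implied_name` function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def implied_name(root):
    """Simplified subset of implied p-name fallbacks, tried in fixed order."""
    # 1. The root itself is an <img> with alt text.
    if root.tag == "img" and root.get("alt"):
        return root.get("alt")
    # 2. The root itself is an <abbr> with a title.
    if root.tag == "abbr" and root.get("title"):
        return root.get("title")
    # 3. The root's only child element is an <img> with alt text.
    children = list(root)
    if len(children) == 1 and children[0].tag == "img" and children[0].get("alt"):
        return children[0].get("alt")
    # 4. Otherwise fall back to the element's text content.
    return "".join(root.itertext()).strip()


card = ET.fromstring('<a class="h-card" href="/"><img alt="Foo Barson" src="me.jpg"/></a>')
name = implied_name(card)
```

Because every step is mandatory and ordered, two parsers given the same markup should arrive at the same name, which is the point of making the steps MUSTs rather than SHOULDs.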
#
Tekk_
In my particular case I'm only interested in writing on the publishing side, so it's not the end of the world for me. Just something I got curious about.
#
Tekk_
Ah
#
[tantek]
better to have as few SHOULDs (and especially MAYs) in format processing/parsing specs
#
Tekk_
Unless it comes to the actual file format itself, apparently ;)
#
[tantek]
flexibility is for authors, not for parser developers
#
Tekk_
If stuff like name or photo can be plural, is there any reason there's not like, an "m-canonical" class (to pick some arbitrary prefix for 'meta')? That'd solve the small h-card vs big h-card problem too.
#
Tekk_
Though I saw some pages suggesting u-uid for the latter.
#
[tantek]
right, u-uid can be used for that
#
Tekk_
But u-uid doesn't make sense for say, a name.
#
Tekk_
If I go by multiple names.
#
[tantek]
u-uid isn't about the name, it's about the object
#
Tekk_
Yeah but having like <div class="h-card"><p class="p-name u-uid">Foo Barson</p></div> doesn't make any sense
#
[tantek]
all the properties are about the object, not each other
#
[tantek]
right because that text content isn't a URL so you're likely to get an unexpected result there
#
[tantek]
like if that was on example.com, the u-uid would be example.com/Foo%20Barson
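The effect can be reproduced with Python's `urllib.parse`: treat the element's text as a relative URL and resolve it against the page URL. This is a sketch of the general behaviour, not of any particular parser; the `resolve_u_value` name is hypothetical, and the exact percent-encoding depends on the URL library in use.

```python
from urllib.parse import quote, urljoin

# Characters left unescaped so that real URLs pass through unchanged.
URL_SAFE = "/:?#[]@!$&'()*+,;=%"


def resolve_u_value(text, base_url):
    """Treat text content as a (possibly relative) URL and resolve it."""
    return urljoin(base_url, quote(text.strip(), safe=URL_SAFE))


uid = resolve_u_value("Foo Barson", "https://example.com/")
# uid == "https://example.com/Foo%20Barson"
```

An actual URL in the same position, such as `https://example.com/me`, comes back unchanged, which is why `u-uid` belongs on a link to the object rather than on its name.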
#
Tekk_
Yeah. So it works for the full h-card use case but u-uid is an overly specific concept
#
[tantek]
it's always applicable to the full object
#
Tekk_
If I'm say an author who uses pen names it makes a perfect amount of sense to say <div class="h-card"><p class="p-name">Michael Jordan</p><p class="p-name">Jules Vernee</p><p class="p-name">Jonathan Frakes</p></div>
#
Tekk_
If I'm commenting on someone's site and that site wants to grab my h-card to know what name to put on the comment, which should it grab?
#
[tantek]
yes you may do that
#
[tantek]
good question!
#
[tantek]
that's a better question for #indieweb-dev since you're asking an IndieWeb specific question "commenting on someone's site and" … "what name to put on the comment"?
#
Tekk_
That was just an example I concocted for a scenario where you might want or need a person's "canonical" name.
#
Tekk_
Just like you might want a canonical URL among their rel=me entries
#
Tekk_
Or heck, if I'm just looking on their page they may have multiple sets of pronouns and I want to know which they prefer
#
[tantek]
pretty sure somewhere in the spec it talks about any specific application needs to "pick one" should (not must) pick the first one
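Under that "pick the first one" convention, a consumer that needs a single value can take the first entry of the property's list. The sketch below assumes the microformats2 parsed JSON shape, where every property is a list in document order; `first_value` is a hypothetical helper name.

```python
def first_value(item, prop, default=None):
    """Return the first value of a plural property, or default if absent."""
    values = item.get("properties", {}).get(prop, [])
    return values[0] if values else default


card = {
    "type": ["h-card"],
    "properties": {"name": ["Michael Jordan", "Jules Vernee", "Jonathan Frakes"]},
}
display_name = first_value(card, "name")
```

With the pen-name card above, `display_name` is "Michael Jordan", so the author controls the preferred name simply by listing it first.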
#
Tekk_
Isn't relying on what order the elements happen to be in a bad idea when you could just trivially make something more specific?
#
[tantek]
nope, because element order typically reflects content order which is both predictable for authors and usually reflects their own importance
#
[tantek]
and nothing trivial about making something else that needs to be parsed
#
Tekk_
Sorry if this comes off as arguing in bad faith but that seems kinda weird in light of the previous discussion?
#
Tekk_
Specifically that it's out of the question to expect implementors to go through that whole fallback process and then gracefully handle the case of "well it turns out they don't even specify 'A Picture Of The Grand Canyon' on their picture so we can't make that their name" but it's a ridiculous burden to have them check which child has a "canonical" attribute attached.
#
Tekk_
Which like, the fallback system isn't the end of the world it just seems entirely reasonable to me to go "If they don't specify a picture in their card then they just don't have a picture and that's okay"
#
jacky
agh I napped during this one :(
#
Tekk_
Hey jacky
#
[tantek]
sorry Tekk, I'm not following. In both cases, the model / requirements are simpler for the author.
#
Tekk_
So the rule of thumb is just "try to correct the author's mistakes no matter how malformatted the data is"?
#
Tekk_
Which, I suppose is already how html and css theoretically work, but that's how you end up with garbage fires like CSS :)
#
[tantek]
not quite. it's more about providing a simpler and predictable model for the author, that deliberately optimizes common cases to not require extraneous markup
#
Tekk_
If you're defining a card as having a name, I wouldn't say that it's extraneous to require them to include a name.
#
[tantek]
which is the opposite of most formats, which attempt to optimize for the most complex edge-cases and then burden all uses with that complexity (see also namespaces, RDF, etc.)
#
jacky
oy this page vs post thing was one I wanted to be involved in
#
jacky
the distinction is something that comes up a _lot_ and is very prevalent in a lot of sites / CMS
#
[tantek]
yeah the insights I had about it are pretty new. and compared to page/post vs collection/feed
#
[tantek]
I feel like "editor" UI needs to be rethought as much as we rethought "feed reader" to come up with "social reader", but that's probably better discussed in #indieweb-dev
#
jacky
I definitely take a lot of my ideas/inspiration on the editor experience from Ghost (which, in turn, is inspired by a mashup of Tumblr and Wordpress's approach)
#
jacky
that said, pages definitely are more distinct and are even made more so by the definition of h-entrys (and how pages don't fit into that)
#
jacky
h-entrys are encouraged to have timing information which is something that pages tend not to have
#
[tantek]
pages often have published and modified dates. e.g. see every mediawiki install, Wikipedia, IndieWeb etc.
[chrisaldrich] joined the channel
#
[tantek]
I'm curious why you think they don't fit — we've been using h-entry markup for various pages for quite some time and they seem to fit just fine
#
jacky
Wikipedia pages are more like articles if you were to run them by PTD, no? And they _have_ to have that kind of messaging due to its need to show a log of changes
#
jacky
whereas a page like https://jacky.wtf/live.html wouldn't need something like that
#
[tantek]
via Internet Archive, all pages have an implied log
#
jacky
I guess now that's a distinction thing, some things have implied + distinct logs and others (the majority that I come across) don't
#
[tantek]
and perhaps the majority now have revisions in databases or git so they have logs, just not surfaced in the presentation by default (but could be)
#
[tantek]
and if they're static files then they also have created/modified dates in the file system, very similar to published/updated dates
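For static files, the modified time is easy to read and format as an ISO 8601 value of the kind `dt-updated` would carry. This is a sketch with a hypothetical `file_updated_iso` name. A portable creation time is harder: on Unix, `st_ctime` is the time of the last metadata change, not creation, so only the modified time is shown.

```python
from datetime import datetime, timezone
from pathlib import Path


def file_updated_iso(path):
    """Return a file's last-modified time as an ISO 8601 UTC string."""
    mtime = Path(path).stat().st_mtime
    stamp = datetime.fromtimestamp(mtime, tz=timezone.utc)
    return stamp.isoformat(timespec="seconds")
```

Whether such a value should be displayed on the page is a separate, presentational choice.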
#
[tantek]
a log isn't essential per se
#
[chrisaldrich]
wow... what a scrollback...
#
[chrisaldrich]
Another difference between a post (article) and a page is that of the difference between the garden and the stream... posts come and go, but use cases for pages in the wild give them a more curated permanence. This difference generally isn't indicated in the metadata, but more often by the fact that they're included in a menu.
#
[tantek]
posts don't come and go, they have permalinks
#
[tantek]
[chrisaldrich] the garden vs stream distinction is what I was trying to get at. whether something is in a feed/archive/collection or not!
#
[tantek]
(plenty on that in the scrollback)
#
[chrisaldrich]
They all have permalinks. But my /about page has more "permanence" and prominence on my site than the post I made on 2016-08-10.
#
[tantek]
that distinction (is it part of a feed or not?) is much more relevant than artificial post vs page distinctions
#
[chrisaldrich]
I'd actually think that there's a case to be made that "pages" would be beneficial to be put into a stream/feed -- especially when they're updated-- but almost no CMSes or systems do this in practice.
#
[chrisaldrich]
Wikis may be the general exception, but in that framing, there aren't people usually using wikis to post notes.
#
jacky
The storage of the log doesn't matter though; that's something that's, once it's visible, highlighted/surfaced _because_ it exists
#
[chrisaldrich]
The real question here is the problem we're trying to solve: how can one specify a page one would like to put into a more prominent space, menu, other in a way that an editor can target it for creation/update?
#
[tantek]
It seems like the only real distinctions are 1. is it part of a (time ordered) feed, and 2. does it have a URL based on its name more than a date
#
[tantek]
neither of those have to do with its actual contents
#
[tantek]
which is why I think the focus on what is a page vs a post (by looking at their contents) is a bit futile
#
jacky
my posts don't use date-centric URLs (so it'd be only the first thing for me)
#
[chrisaldrich]
And maybe I don't even want to distinguish between the two with a url-based date within my system?
#
[chrisaldrich]
WordPress is an abysmal set up in that it has the same general UI for posting both posts and pages, but differentiates between them internally within the database. Even then, I could make a post look like a page by placing it into my menu.
#
[chrisaldrich]
The difference between them is semantic, we just need something to specify that semantic handle so we can do something with them. N'est-ce Pas?
#
jacky
Hm, I didn't consider the semantics angle but that's true as well
[tw2113] joined the channel
#
jacky
like entries lean into the web journaling language of the Web
#
[chrisaldrich]
pages/articles seem to be a superset of all of the various post types... even reads, watches, listens are really just a "bookmark" of a thing that got read, watched, or listened to.
[fluffy] joined the channel
#
[tantek]
or passive responses (bookmark is an active response)
#
KartikPrabhu
I'm still not sure what the "actual" difference is for someone consuming/reading your website
#
aaronpk
It's not a consuming difference, that's why I kept suggesting it may not be a Microformats concern
#
aaronpk
It's really an authoring difference
#
KartikPrabhu
yeah I get that part
#
[tantek]
I think it's primarily a URL path difference. Even in WordPress I believe that's the biggest difference
#
[tantek]
that and pages don't go into feeds AFAIK
[mapkyca] and [chrisaldrich] joined the channel
#
@tomer
↩️ You have a logical error. On the one hand, you write that MDN has a list of all the tags that can go in the field, and then two paragraphs later you suddenly talk about things you didn't find on MDN. Make up your mind. It may be worth adding the following page from http://microformats.org, which also includes extensions: http://microformats.org/wiki/existing-rel-values#HTML5_link_type_extensions
(twitter.com/_/status/1287305862596431873)
gRegorLove, [KevinMarks] and [grantcodes] joined the channel
#
[grantcodes]
There are plenty of examples of posts with top level paths and no date in the URL
#
[KevinMarks]
Like mine
[manton] and [tantek] joined the channel
#
[tantek]
Right, that too
#
[tantek]
So really there’s no relevant difference between “posts” and “pages” except for whether it’s part of a feed (or feeds?) or not
#
[tantek]
Absence / presence of a property (published) is seriously insufficient to justify a whole new microformat
#
[tantek]
I feel like the UI argument of “people want to pick one or the other” is the same mistaken argument as “people want to explicitly pick a photo post vs a note vs an article etc before they even start creating” which we already debunked many years ago as legacy UI
#
[tantek]
If it really is just about does it show a published date or not, then make *that* the option. Or do it automatically (or as a default) if they check
#
[tantek]
[x] include in feed
#
[tantek]
Or perhaps there’s a 2x2 grid of options, whether it’s in a feed or not, whether it displays published/updated or not
#
[tantek]
In that respect a “page” is a lot like an “unlisted” post
#
[tantek]
KevinMarks, do your top level posts without dates in the URL appear in a feed or not?
#
[tantek]
If you don’t want to display the published/updated date then have an option like
#
[tantek]
[x] timeless (don’t show creation, published, updated date times)
#
[tantek]
Or if you really want to call it a page, treat that as shorthand for unlisted + timeless, and default path is the name of the post
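The "page as shorthand" idea can be sketched as two independent options plus a preset. All names here (`PublishOptions`, `default_path`) are hypothetical, and the date-based path for non-timeless posts is an assumption added for contrast, not something stated in the discussion.

```python
from dataclasses import dataclass


@dataclass
class PublishOptions:
    include_in_feed: bool = True  # the "[x] include in feed" checkbox
    timeless: bool = False        # don't show creation/published/updated datetimes

    @classmethod
    def page(cls):
        """'Page' as shorthand for unlisted + timeless."""
        return cls(include_in_feed=False, timeless=True)


def default_path(name, opts, published_date=""):
    """Name-based path for timeless entries; date-prefixed path otherwise (assumed)."""
    slug = "-".join(name.lower().split())
    if opts.timeless or not published_date:
        return f"/{slug}"
    return f"/{published_date.replace('-', '/')}/{slug}"
```

The two booleans give the 2x2 grid: listed or unlisted, dated or timeless. A "page" is just one corner of it, so `default_path("About Me", PublishOptions.page())` yields `/about-me` without any separate page type.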
#
[tantek]
so maybe “page” is just an artificial construct that means unlisted timeless article typically, and in some cases (mediawiki) still dated
jamietanna, strugee, [schmarty] and [KevinMarks] joined the channel
#
[KevinMarks]
Well, it's all manual, so they do if I remember to update the feed.
#
[KevinMarks]
Also, I sometimes forget the date in the article page.
[tantek] and [tw2113] joined the channel