#microformats 2016-02-29

2016-02-29 UTC
fuzzyhorns joined the channel
#
tantek
edited /microformats2-parsing-issues (+342) "/* exclude style elements before parsing */ drop both style and script when parsing"
#
tantek
greetings mf2 parser developers, I've made a proposal to resolve the style (and script) tag issue, please review and provide feedback: http://microformats.org/wiki/microformats2-parsing-issues#exclude_style_elements_before_parsing
#
tantek
tommorris, kylewm, aaronpk, KevinMarks, et al
#
aaronpk
edited /microformats2-parsing-issues (+152) "/* exclude style elements before parsing */ +1"
fuzzyhorns joined the channel
#
tantek
ah, yes it does!
#
aaronpk
I know I said I wasn't going to write actual code for the php mf2 parser, but I just broke that rule
#
aaronpk
but I have it removing script tags now, I think.
#
aaronpk
oh darn nevermind, I need to make it search recursively :(
#
aaronpk
ok yeah this is hard
#
aaronpk
I think I've got it
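
The recursive removal aaronpk describes above can be sketched roughly as follows. This is a minimal Python illustration, not the php-mf2 patch itself: it assumes a well-formed XML/XHTML fragment and uses the standard library's `xml.dom.minidom`; the helper name `strip_excluded` and the sample markup are made up for the example.

```python
from xml.dom import minidom

# Elements the proposal says to drop before parsing.
EXCLUDED = {"script", "style"}

def strip_excluded(node):
    """Recursively remove <script> and <style> elements below `node`."""
    # Iterate over a copy: removing from the live child list while
    # looping over it would skip siblings.
    for child in list(node.childNodes):
        if child.nodeType == child.ELEMENT_NODE:
            if child.tagName.lower() in EXCLUDED:
                node.removeChild(child)
            else:
                # Not excluded itself, but it may contain nested script/style.
                strip_excluded(child)

# Hypothetical sample: one script nested inside a span, one style at the top level.
doc = minidom.parseString(
    '<div class="h-entry"><p class="e-content">Hello'
    '<span><script>alert(1)</script></span>'
    '<style>p { color: red }</style> world</p></div>'
)
strip_excluded(doc.documentElement)
cleaned = doc.documentElement.toxml()
```

Checking only the direct children of the root would miss the `script` nested inside the `span`, which is why the search has to recurse.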
fuzzyhorns joined the channel
#
@ProjectPeachUK
We've #played with #microformats. Love the #idea of #marking up our #business data to #machines as well as #humans! #biznoticeUK #cpp
(twitter.com/_/status/704130279326289920)
davidmead, fuzzyhorns, Zegnat, netweb, dogada, tantek, adactio and Garbee joined the channel
fuzzyhorns, nitot, tantek, TallTed, misa_ and JohnBeales joined the channel
#
tantek
edited /microformats2-parsing-brainstorming (+1177) "/* Parse language information */ additionally id, and consider html-lang parsed property name"
#
misa__
hello channel, just found this line on the microformats wiki: microdata was explicitly dropped by the W3C (and therefore not part of W3C HTML5) due to a lack of interest by anyone to edit the spec and keep it up to date.
#
kylewm
edited /microformats2-parsing-issues (+21) "/* exclude style elements before parsing */ +1"
#
tantek
misa__: welcome, and yes that's a summary result from the W3C HTML Working Group discussions on microdata
#
misa__
I understand that, and just a note here, I am quite inexperienced regarding these concepts so I would appreciate some guidance from more experienced folks, just this seems a bit outdated as microdata blog seems to be quite a busy place...
#
tantek
misa__: it's up to date. there has been no further work on microdata at W3C so it's still just as dead/dropped from a W3C standards perspective
#
tantek
people work on all kinds of things outside W3C
#
tantek
which is fine too, it just means those things are not W3C standards
#
tantek
misa__: if you're interested in the latest work on microdata-like added markup, take a look at microformats2: http://microformats.org/wiki/microformats2
#
@kykyorg
As a follow-up, the video of the day: Leo, Oscar in hand, tells the film elite about global warming: http://kyky.org/microformats/2016-02-29/video
(twitter.com/_/status/704346910224678913)
#
tantek
microformats2 is an even simpler replacement for microdata
#
@kykyorg
And the video of the day: Leo, Oscar in hand, tells the film elite about global warming: http://kyky.org/microformats/2016-02-29/video
(twitter.com/_/status/704347134519349248)
#
misa__
yes I can see that, just deciding what to get going with is somewhat overwhelming, then there is schema.org... would you care to share your thoughts on that in brief?
#
tantek
misa__: schema.org is a Google run effort, with some contributions from Microsoft and Yandex. It's not an open standard.
#
tantek
I sympathize with the sense of being overwhelmed
#
tantek
hence why a lot of us have worked on simplifying things with microformats2
KartikPrabhu joined the channel
#
misa__
so as far as indieweb is concerned this is not something i should care about?
#
misa__
(google's schema.org)
#
tantek
correct, it's been pretty much completely ignored
#
tantek
there has been some use of OGP / Twitter Cards as a fallback for some use-cases, but schema is largely ignored for being overly complex and unnecessary
#
tantek
welcome KartikPrabhu !
#
KartikPrabhu
was just checking logs
#
KartikPrabhu
misa__: you should decide what to use depending on what you'd like to use it for
#
KartikPrabhu
Schema.org has not been very useful for indieweb things mainly due to its complexity of parsing and publishing
#
misa__
indieweb seems very right for me, I can comprehend most of its concepts, just this whole thing with microdata/microformats and the rest got me overwhelmed...
#
tantek
edited /microformats2-parsing-brainstorming (+141) "/* Parse language information */ first instance of id attributes only as a way to de-dup / uniqueify id attrs at parse time"
#
tantek
misa__: microdata is ignorable for indieweb. no one is actively using it.
#
KartikPrabhu
misa__: I would suggest starting with microformats as they are the simplest ones to publish
#
tantek
there may be a few folks publishing a few microdata things experimentally, but it's never gotten any traction in the peer to peer independent web
#
tantek
misa__: glad to hear indieweb seems very right for you! come on by to #indiewebcamp and say hello to discuss indieweb concepts :)
#
KartikPrabhu
schema.org is so horribly over-thought that it is ignorable too
gRegorLove joined the channel
#
tantek
welcome gRegorLove !
#
ben_thatmustbeme
aaronpk: moving that over here
#
KartikPrabhu
amp;dr ;)
#
aaronpk
so notice how I made the value of in-reply-to just the URL, even though it's actually an h-cite on my site
#
aaronpk
and then moved the actual h-cite content outside the main entry
#
aaronpk
my goal with XRay (and jf2) is to reduce the number of exceptions you have to deal with when consuming a page
#
ben_thatmustbeme
hmmm, interesting
tantek joined the channel
#
ben_thatmustbeme
so if you had just an embedded object, would it also be just in "refs"
#
aaronpk
so rather than if(is object) {...} else if(is url) { ... } it's just always a URL, and if you want you can check if there's extra data about the URL
#
aaronpk
not sure about that case yet
#
ben_thatmustbeme
or is refs specific to things like in-reply-to, like-of, etc
#
aaronpk
i'm taking the opposite approach we originally took with jf2
#
aaronpk
i'm explicitly adding things to the output when there's a reason to, rather than trying to map a complete mf2 document to this output
#
tantek
aaronpk, that's how mf2 JSON was built
#
tantek
"explicitly adding things to the output when there's a reason to"
#
tantek
hence the evolution of how the parsed rel values made it in there
#
aaronpk
this is one level above that though
#
tantek
so it will be interesting to see if you come to similar/different conclusions
#
aaronpk
at the author level
#
tantek
so was mf2 JSON
#
aaronpk
e.g. someone can put an h-card anywhere on the page, and it will end up who knows where in the mf2 JSON
#
tantek
started at the HTML author level
#
ben_thatmustbeme
i like the idea of pulling it all out, almost like the refs: section could be completely ignored since you have to fetch the content anyway
#
aaronpk
i'm only interested in that h-card if it has explicit meaning that I can consume
#
tantek
"put an h-card anywhere on the page" - then the h-card likely has different meanings, so it makes sense for it to show up different places in the mf2 JSON
#
ben_thatmustbeme
aaronpk: it looks like you are doing [] specifically for some values but not for others
#
aaronpk
actually tantek.com is a great example. There's an h-card as the last child object of the top-level Tantek h-card
#
tantek
aaronpk - that's not author-centric (as you claimed originally), that's *consuming* centric ("i'm only interested ... if it has explicit meaning that I can consume")
#
aaronpk
I have no idea what that means, so it's not going to show up in the XRay output
#
aaronpk
I didn't say author centric, I said author-level
#
tantek
but you're not doing author-level either, you're doing consuming-level
#
aaronpk
what was your intention of marking up that Rebecca Daniels h-card?
#
tantek
it's a reference to a person
#
aaronpk
(as a child h-card of your top-level one)
#
tantek
as a publisher, it makes sense to markup all your content that's meaningful with established microformats
#
ben_thatmustbeme
can we get back on to the point we were discussing?
#
aaronpk
there's no other references to it though, so it has no context. For example if there was some other object on the page with a u-url of rebeccadanielsphoto.com then I might know what it's for
#
tantek
right, no other context, and that's ok
#
aaronpk
and at that point it would show up in the "refs" list
#
tantek
all you know is, this is a person that is referenced on this page
#
tantek
that's it
fuzzyhorns joined the channel
#
tantek
so e.g. if you have a tool that shows you a list of people on a page, you can display them
#
tantek
(there are such tools like Operator FF add-on)
#
tantek
and that's useful because you can do things like keep a history of where people were mentioned
#
tantek
(e.g. in the browser)
#
tantek
histories of people mentioned are useful for things like search, auto-complete etc.
#
tantek
plenty of applications for even minimal context like that
#
tantek
just maybe not your specific application today
#
ben_thatmustbeme
perhaps JF2 is evolving to a more specific use case of social rather than just general microformats
#
tantek
point is, if it doesn't make sense to your application, you can just ignore it
#
ben_thatmustbeme
microformats has a JSON representation already
#
aaronpk
ben_thatmustbeme: yeah that's kind of what I'm thinking
#
tantek
ben_thatmustbeme: that's how it starts, but as you add more use-cases, you'll likely end up making something very similar
#
aaronpk
the problem is when a property can be either an array or a string, then you end up needing to code exceptions for both cases
#
tantek
e.g. every use-case I listed above for random h-cards on a page *IS* social
#
aaronpk
in mf2 json everything is an array, so most of the time you're doing [0] to get the first. but when jf2 makes a property a string if there's only one value, then you have to do a bunch of checks
#
tantek
aaronpk: precisely why that design decision was made for mf2 json
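
To make the trade-off concrete, here is a small Python sketch of the two consuming styles. The sample dictionaries are hypothetical and only loosely shaped like mf2 JSON and a collapsed jf2-like form; `first_value` is an illustrative helper, not part of any parser.

```python
def first_value(value):
    """Return the first value whether `value` is a list or a bare scalar."""
    if isinstance(value, list):
        return value[0] if value else None
    return value

# mf2 JSON style: every property is a list, so [0] works whenever the key exists.
mf2_entry = {
    "type": ["h-entry"],
    "properties": {"name": ["Hello"], "category": ["a", "b"]},
}
mf2_name = mf2_entry["properties"]["name"][0]

# Collapsed style (assumed shape): a single value becomes a bare string,
# so the consumer has to check the type before indexing.
collapsed_entry = {"type": "entry", "name": "Hello", "category": ["a", "b"]}
collapsed_name = first_value(collapsed_entry["name"])
collapsed_category = first_value(collapsed_entry["category"])
```

Without the `isinstance` check, `collapsed_entry["name"][0]` would silently return `"H"`, the first character of the string, which is the kind of exception handling being complained about.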
#
ben_thatmustbeme
what if we just make this one rule aaronpk, as soon as you hit something that has a specific URL outside the domain context (non-authoritative content) we move it over to refs. Basically we could do that as the only change to the MF2 JSON and see what we get
#
tantek
so consuming code wouldn't have to do "bunch of checks" (or at least fewer)
#
aaronpk
i'm not saying that's bad, just what it is
#
tantek
right
#
tantek
which I'm happy to see the alternative being explored
#
aaronpk
so with the XRay output, I made it vocabulary-aware, so that it's easier to consume when you know what you're consuming
#
aaronpk
e.g. "this is an h-entry. if there is a published date, it will always be a string. if it's a reply, you can find all the URLs it's in reply to in the array 'in-reply-to'"
#
ben_thatmustbeme
so, one of the biggest complaints i keep hearing is the need to check if something is an array, single item, or object
#
aaronpk
also the value of "in-reply-to" will never be an object with this, since if it was an object in the mf2 JSON, that object gets moved down to refs and the URL of the object is the value in the array
#
ben_thatmustbeme
shouldn't author: be a single array item then? couldn't you have multi-author posts?
#
tantek
where do you hear these complaints ben_thatmustbeme ?
#
aaronpk
some of them are from me
#
aaronpk
but i've heard that from others as well
#
ben_thatmustbeme
i have heard others, i do not have citations right now, will try to keep them noted down from now on
#
ben_thatmustbeme
and i do rather agree with them, it is sort of annoying
#
aaronpk
it's very annoying. annoying enough that i'm encapsulating all this logic into XRay so I don't have to do it again
#
aaronpk
I need this for: readers, showing reply context, showing comments/reactions
#
ben_thatmustbeme
again, you are assuming only one author ever?
#
aaronpk
well so far i haven't seen any multi-author posts
#
aaronpk
and even if there was one, 99.9% of all posts i encounter are single author
#
ben_thatmustbeme
notices you aren't processing comments either
#
ben_thatmustbeme
is that just haven't gotten there yet?
#
aaronpk
no not yet. like I said I am only adding things when I want to consume them
#
ben_thatmustbeme
comments could likely just get reduced to a list of urls too
#
ben_thatmustbeme
unless they comment directly on the site
#
ben_thatmustbeme
i know some allow that
#
aaronpk
most likely I'm going to make the "comment" property a list of URLs, and the actual comment objects will live in the "refs" below
#
ben_thatmustbeme
just for comments that don't have a URL, what do you do?
#
aaronpk
if there's no URL for a comment (including no fragment URL) then I'm just going to drop it, since nothing will be able to do anything with that comment anyway
#
ben_thatmustbeme
i don't think that's true, salmention would still work with a comment that doesn't have a url
#
aaronpk
stick a fragment URL on the inline comment and then it's useful again
#
aaronpk
ben_thatmustbeme: in practice, any consuming code trying to handle something that doesn't have a URL isn't going to end up with good results
#
ben_thatmustbeme
i feel like i'm responding just a moment too early to you
#
aaronpk
combine that with tantek's earlier suggestion of XRay returning the object inside the fragment identifier and then fragment comments act just like comments with their own URL
#
ben_thatmustbeme
indeed, i'd love for the mf2 parser to be able to do that directly actually
#
aaronpk
i guess that's a totally fine job for the mf2 parser
#
tantek
yeah!
#
ben_thatmustbeme
still not totally sold on all the items that have been dropped, (location, shortlink, name)
#
aaronpk
name hasn't been dropped
#
aaronpk
but it's only there when it's actually a name
#
aaronpk
e.g. it gets removed if it is the same (or a subset) of the content
#
ben_thatmustbeme
or a subset? that seems... wrong. As most people will reference the title of a post in the content
#
aaronpk
er, prefix
#
aaronpk
it's what's described on comments-presentation
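
A rough Python sketch of the prefix rule as stated in this conversation. It is a simplification and not the exact comments-presentation algorithm from the wiki; the whitespace normalization and the function name `explicit_name` are assumptions made for the example.

```python
def explicit_name(name, content_text):
    """Return `name` only if it looks like a real title, else None.

    The name is treated as implied (and dropped) when it is empty or when
    the content text starts with it, after collapsing runs of whitespace.
    """
    def normalize(text):
        return " ".join(text.split())

    normalized_name = normalize(name)
    normalized_content = normalize(content_text)
    if not normalized_name or normalized_content.startswith(normalized_name):
        return None
    return name

# A note whose "name" is just the start of its content: dropped.
note_name = explicit_name("Hello world", "Hello   world, this is a short note")

# An article whose title differs from the opening of the body: kept.
article_name = explicit_name("My Article", "Today I wrote about parsing.")
```

Using a prefix test rather than a substring test addresses ben_thatmustbeme's objection: a post that merely mentions its own title somewhere in the body still keeps its name.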
#
aaronpk
now all of a sudden "name" is useful again
#
aaronpk
i'll add location soon
#
aaronpk
basically every property on http://microformats.org/wiki/h-entry should show up if present
#
ben_thatmustbeme
all seems to make sense, looks like uid and logo aren't carried over, but those aren't really needed / prefer url over uid and logo is just photo again
#
ben_thatmustbeme
indeed, so there is never any x- prefixes or anything parsed
#
ben_thatmustbeme
trying to think of a good argument for shortlink, i feel like it is needed as an authoritative alternate url
#
aaronpk
btw i'm not sure this is actually the best step for jf2, which is why i've just been building this as an API, but this is how I want to consume pages
#
ben_thatmustbeme
which is different from other redirects
#
ben_thatmustbeme
i know i made a bunch of optimizations with that
#
ben_thatmustbeme
some no, some yes
#
ben_thatmustbeme
i actually really like the refs: idea
#
ben_thatmustbeme
anything non-authoritative becomes SUPER easy to just throw away if you don't want to look at it
#
aaronpk
yeah, it's more like you have two options to find out about a URL that's in the in-reply-to or whatever
#
aaronpk
you can check the refs property, or go fetch the URL yourself
#
ben_thatmustbeme
i may look at just applying that to a straight mf2 json output to see what it looks like
#
ben_thatmustbeme
keeping the "always an array" idea, and just cleaning up all that stuff
#
aaronpk
interesting idea
#
aaronpk
i think the rule would be if the object has a url property, replace the object with that URL and move the object to the refs array
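
The rule aaronpk states here could be sketched in Python like this. It is an illustration over mf2-JSON-shaped dictionaries, not XRay's code; `move_to_refs` is a hypothetical helper, `refs` is assumed to be a dictionary keyed by URL, and objects without a `url` are simply left in place because the conversation does not settle that case.

```python
def move_to_refs(values, refs):
    """Replace embedded objects that have a url with that URL.

    Each replaced object is stored in `refs` under its first url value.
    Plain strings and objects without a url are passed through unchanged.
    """
    flattened = []
    for value in values:
        if isinstance(value, dict):
            urls = value.get("properties", {}).get("url", [])
            if urls:
                refs[urls[0]] = value
                flattened.append(urls[0])
                continue
        flattened.append(value)
    return flattened

# Hypothetical entry: one embedded h-cite and one plain URL in in-reply-to.
entry = {
    "type": ["h-entry"],
    "properties": {
        "in-reply-to": [
            {
                "type": ["h-cite"],
                "properties": {
                    "url": ["https://example.com/post"],
                    "name": ["A post"],
                },
                "value": "https://example.com/post",
            },
            "https://example.org/other",
        ]
    },
}

refs = {}
reply_urls = move_to_refs(entry["properties"]["in-reply-to"], refs)
```

After this step a consumer can treat `in-reply-to` as a plain list of URLs and consult `refs` only when it wants the extra data, instead of fetching the URL itself.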
#
aaronpk
btw gRegorLove do you have a sec to review my PR to the php parser? https://github.com/indieweb/php-mf2/pull/83
#
gRegorLove
Looks good at a glance, without testing. I think the innerText method should remove the script and style, but I'm not aware of any problems explicitly removing them first, either.
#
aaronpk
i think innerText does
#
aaronpk
that's what's used for the "value" property
#
gRegorLove
Oh, you're stripping it from the 'html' value?
#
aaronpk
but the html property is built up with the calls to $node->C14N() which does not remove them
#
gRegorLove
Heh. Forgot my own bug report :)
Calli, fuzzyhorns, Left_Turn, KartikPrabhu and Garbee joined the channel
#
tantek
edited /Template:MicroFormatCopyrightStatement (+53) "or already have submitted, updating purely for temporal prose accuracy"
#
tantek
edited /rel-tag (+85) "note rel-tag incorporated into HTML5"
tantek, fuzzyhorns, uf-wiki-visitor and mkaply joined the channel
#
mkaply
Has anyone else used microformats-shiv? I'm not seeing the results I expect and I'm trying to figure out what I'm doing wrong.
#
mkaply
I'm using it against this page.
#
mkaply
Getting no h-cards
#
aaronpk
you might need to try with actual HTML, not just wiki text
#
aaronpk
also that page looks like it only contains microformats1 objects, not sure if the microformats-shiv library parses those or only microformats2
#
mkaply
It's supposed to parse 1. It's been so long since I've touched this stuff. I guess it's time to bring Operator back from the dead.
#
aaronpk
well if you're literally parsing the wiki URL, you probably won't find anything there, since it's all escaped HTML
#
mkaply
aaronpk: It's parsing the DOM directly, so it should be finding the nodes. Must be something else going on
#
aaronpk
that's what i'm saying though, the HTML on that page is escaped, stuff like &lt;… class="vcard"&gt;
#
aaronpk
it doesn't actually have any microformats in it
fuzzyhorns joined the channel
#
mkaply
I tried tantek's page too. same result. i must be doing something wrong. I'll keep looking
KartikPrabhu, MeanderingCode, fuzzyhorns and tantek joined the channel
#
tantek
mkaply: not sure. maybe ping mixedpuppy? I know he had tests working.
#
mkaply
tantek; i figured it out. I was passing a string as filters. I opened a bug against shiv to handle that.
#
tantek
were you able to get it to work without a filter?
fuzzyhorns and Chordachi joined the channel
#
mkaply
Yes. It's working now. Oddly if I add the filters: ["h-card"], it hangs the browser. But if I don't specify a filter, I get the h-card
#
tantek
that *is* odd. another bug?
#
mkaply
I'm debugging now. It's strange because it does work in our tests.
fuzzyhorns and KartikPrabhu joined the channel