#microformats 2015-06-10

2015-06-10 UTC
KartikPrabhu, elux, fuzzyhorns and tantek joined the channel
#
ben_thatmustbeme
is still lost at times on why p- and not u-
#
ben_thatmustbeme
<a class="p-author h-card" href="http://martin.example.org/">Martin</a>
#
ben_thatmustbeme
why wouldn't that be u-author ?
#
aaronpk
it appears to have the same parsed result either way, not sure
#
tantek
aaronpk: only because the "u-* h-*" parsing fix hasn't happened yet in implementations except for mf2py
#
tantek
ben_thatmustbeme: to answer your question, we may end up allowing either and then expanding the authorship algorithm to handle both
#
tantek
which would then put a slight preference on u-author since by getting the author's URL you can likely get more information than just their name
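For readers following the p- versus u- question: on an `<a>` element, a p-* property reads the element's text, while a u-* property reads its `href`. The sketch below uses only Python's standard library to pull both candidate values out of the example element quoted above. The names `AuthorProbe` and `read_author` are made up for illustration, and the code ignores the nested h-card parsing that tantek mentions, so it is a rough picture rather than a conforming microformats2 parser.

```python
from html.parser import HTMLParser


class AuthorProbe(HTMLParser):
    """Collects the text and href of the first <a> carrying an h-card class."""

    def __init__(self):
        super().__init__()
        self.href = None
        self.text = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and "h-card" in classes and self.href is None:
            self._inside = True
            self.href = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "a":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.text.append(data)


def read_author(html):
    """Return what a p-* reading and a u-* reading would each see."""
    probe = AuthorProbe()
    probe.feed(html)
    probe.close()
    return {"p": "".join(probe.text).strip(), "u": probe.href}


result = read_author(
    '<a class="p-author h-card" href="http://martin.example.org/">Martin</a>'
)
```

Here `result["p"]` is the visible name and `result["u"]` is the URL, which is the trade-off being discussed: the URL can lead to more information than the name alone.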
#
ben_thatmustbeme
so the non-technical answer is really what data you are focusing on as the primary item
#
aaronpk
at some point i need to see some sort of written summary about what's changed in microformats parsing
#
aaronpk
hard to follow the IRC chatter
dym_cx, eschnou, tantek, KevinMarks_ and kez joined the channel
#
hendrick
edited /hcalendar-authoring (+198) "/* Related Pages */"
kez, Zegnat, ChiefRA, csarven, eschnou, chiui, pfefferle, pfefferle_, KevinMarks__, Left_Turn and KevinMarks_ joined the channel
#
csarven
Dear LazyMF , Are @rel self and bookmark changed in mf2?
adactio and KevinMarks_ joined the channel
#
@wpscouts
Author hReview: Add Star Ratings and User Testimonials to WordPress http://wpscouts.com/author-hreview/ via @wpscouts
(twitter.com/_/status/608575044797739008)
pfefferle, KevinMarks__ and glennjones joined the channel
#
KevinMarks
csarven: you mean rel=me ?
#
KevinMarks
rel=self is a weird Atom thing
glennjones, eschnou and pfefferle joined the channel
#
@hirameki
(Dug back up) h:1130 Make Hirameitter support AutoPagerize's Microformats. Do it. from id:fuba http://ryogrid.net/idea/twit/1130
(twitter.com/_/status/608596980428738563)
#
csarven
KevinMarks IIRC from mf1, "self bookmark" was used as a permalink
#
csarven
I suppose in mf2, its equivalent is u-url
netweb, glennjones, KartikPrabhu, pfefferle, TallTed and eschnou joined the channel
#
ben_thatmustbeme
do any implementations of the parser support the p-audio yet?
#
ben_thatmustbeme
or rather u-audio
#
kylewm
csarven: rel-self is an Atom thing, not part of microformats
#
kylewm
rel-bookmark is equivalent of u-url, yes, though there is proposed backcompat parsing for it (supported in mf2py only right afaik)
#
kylewm
right now* afaik
#
kylewm
(oh sorry, I didn't see KevinMarks already replied about rel-self)
#
csarven
I was talking about self *and* bookmark together. It is about the current document's permalink.
#
csarven
"bookmark external" would be for.. external links
#
kylewm
there's no rel-self defined for html at all
#
kylewm
with or without bookmark
eschnou, pfefferle and fuzzyhorns joined the channel
#
csarven
kylewm OKie.. then I'm mistaken. I don't remember why I had self in the first place. Probably pre-2007 stuff.
#
kylewm
csarven: possibly for Pubsubhubbub?
#
csarven
I really can't remember. No big deal :)
KevinMarks_ and KartikPrabhu joined the channel
#
kylewm
KevinMarks_: hullo, seeing a different encoding problem now. when the test case doesn't define a charset, BS4 guesses, and in the case of Tantek Çelik it guesses windows-1252
#
kylewm
"If you happen to know a document’s encoding ahead of time, you can avoid mistakes and delays by passing it to the BeautifulSoup constructor as from_encoding."
#
kylewm
do you think we should pass in utf-8 always? maybe just for all tests?
KartikPrabhu and KevinMarks__ joined the channel
#
KevinMarks__
Will that override actual encoding? There are lots of sites that use non utf8 encodings, especially Chinese and Japanese
#
KevinMarks__
If it just changes default that's good.
#
KevinMarks__
The Unicode dammit stuff tried to deal with guessing encoding
#
kylewm
blargh
#
kylewm
no, it overrides
#
kylewm
yeah but it guesses wrong on these short samples
#
kylewm
even with chardet installed, it guessed wrong for t's name
#
kylewm
>>> soup.original_encoding
#
kylewm
'ISO-8859-2'
#
KevinMarks__
We could just add encoding to the test
#
kylewm
yeah that's what i will propose
#
kylewm
<meta charset="utf-8"> fixes it
#
KevinMarks__
I mean the meta tag for it, not a change to the test runner
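To illustrate why a declared charset in the test file helps: the sketch below decodes the same UTF-8 bytes twice, once with a wrong fallback standing in for a bad guess and once after sniffing a `<meta charset>` declaration. The helpers `sniff_meta_charset` and `decode_html` are hypothetical, written only for this note. This is not how BeautifulSoup detects encodings internally, and the regular expression covers only the simple `<meta charset=...>` form.

```python
import re

# UTF-8 bytes for a short sample with no charset declaration.
RAW = '<p class="h-card">Tantek Çelik</p>'.encode("utf-8")


def sniff_meta_charset(raw):
    """Return the charset named in a <meta charset=...> tag, or None."""
    match = re.search(rb"<meta\s+charset=[\"']?([A-Za-z0-9_-]+)", raw, re.IGNORECASE)
    return match.group(1).decode("ascii").lower() if match else None


def decode_html(raw, fallback="windows-1252"):
    """Decode with the declared charset, else with the (possibly wrong) fallback."""
    return raw.decode(sniff_meta_charset(raw) or fallback)


wrong = decode_html(RAW)                              # no declaration: fallback used
right = decode_html(b'<meta charset="utf-8">' + RAW)  # declaration wins
```

Without the declaration the name comes out garbled; with it, the bytes decode as intended, and real pages that declare a non-UTF-8 charset are still respected.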
#
kylewm
collapsing whitespace and ignoring extraneous keys didn't reduce the number of failing tests as much as I'd hoped
#
KevinMarks__
The real scary case is when the encoding varies within a page
#
KevinMarks__
This used to be a big problem with blogs copy and pasting quotes
#
KevinMarks__
And if they had a windows nonbreaking space char in 0x80 the utf8 decoder would throw an exception
#
kylewm
UnicodeDammit.detwingle :)
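A small sketch of the mixed-encoding failure described above, using only the standard library. A page that is mostly UTF-8 but contains pasted Windows-1252 "smart quote" bytes makes a strict UTF-8 decode raise. The hand-rolled `decode_mixed` below is a hypothetical stand-in that repairs only the offending bytes; it is not BeautifulSoup's `UnicodeDammit.detwingle` implementation, just the same general idea under simplified assumptions.

```python
def decode_mixed(data):
    """Decode as UTF-8, falling back to Windows-1252 for invalid bytes only."""
    out = []
    i = 0
    while i < len(data):
        try:
            out.append(data[i:].decode("utf-8"))
            break
        except UnicodeDecodeError as err:
            # err.start and err.end are offsets into the slice data[i:].
            out.append(data[i:i + err.start].decode("utf-8"))
            bad = data[i + err.start:i + err.end]
            out.append(bad.decode("windows-1252", errors="replace"))
            i += err.end
    return "".join(out)


# UTF-8 text followed by a quote pasted in as Windows-1252 bytes 0x93 ... 0x94.
mixed = "Çelik said ".encode("utf-8") + b"\x93hi\x94"

try:
    mixed.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

repaired = decode_mixed(mixed)
```

Bytes that Windows-1252 leaves undefined are replaced with U+FFFD here rather than raising.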
ben_thatmustbeme, KevinMarks_ and tantek joined the channel
#
ben_thatmustbeme
so in going through as2 there are a couple of things that i think make sense, but just had questions on
#
ben_thatmustbeme
first was supporting multiple sources for audio / video / picture tags
#
tantek
depends on *why* you're publishing multiple sources
#
ben_thatmustbeme
right now they would all get clumped under just "audio" correct?
#
ben_thatmustbeme
well for video its about browser support
#
ben_thatmustbeme
a reader would need both if they want to re-include
#
tantek
if you can answer that question, then we can look at the proper HTML markup, whether it's different formats for different UA support, or different resolutions for devices, or different sizes for bandwidths
#
tantek
nope, for video it's all 3
#
tantek
and that's where we start asking/looking for real world publishing examples of multiple sources of audio / video / picture
#
tantek
to determine how examples should mark them up
#
tantek
or rather, to determine *what* specific problems the examples should show solutions for
#
tantek
rather than attempting to solve an arbitrary m x n x o cubespace of possibilities
#
ben_thatmustbeme
this is where I was testing pin13's support
#
ben_thatmustbeme
more of told me that it doesn't support u-audio grabbing src= value
#
tantek
more of?
#
ben_thatmustbeme
s/more of told me that it/actually it just confirmed that pin13s parser/
#
Loqi
ben_thatmustbeme meant to say: actually it just confirmed that pin13s parser doesn't support u-audio grabbing src= value
#
ben_thatmustbeme
but that shows properly parsed values. the question is how to group the multiple audio files which are just alternates
#
ben_thatmustbeme
that looks like 4 separate files in the parsed version
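A rough illustration of the grouping question: if each `<source>` inside an `<audio>` element carries `u-audio`, a parser that reads `src` for u-* properties ends up with a flat list of URLs under one "audio" key, with nothing saying they are alternates of the same recording. The sketch uses the standard library only; `AudioCollector`, the sample markup, and the example.com URLs are invented for illustration and this is not any particular parser's code.

```python
from html.parser import HTMLParser


class AudioCollector(HTMLParser):
    """Gathers src values from <audio>/<source> elements classed u-audio."""

    def __init__(self):
        super().__init__()
        self.audio = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag in ("audio", "source") and "u-audio" in classes and attrs.get("src"):
            self.audio.append(attrs["src"])


# Hypothetical markup: two encodings of one recording.
SAMPLE = """
<div class="h-entry">
  <audio controls>
    <source class="u-audio" src="http://example.com/talk.mp3" type="audio/mpeg">
    <source class="u-audio" src="http://example.com/talk.ogg" type="audio/ogg">
  </audio>
</div>
"""

collector = AudioCollector()
collector.feed(SAMPLE)
collector.close()
properties = {"audio": collector.audio}
```

The `type` attribute that distinguishes the two encodings is dropped in this flat result, which is the ambiguity being raised.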
KevinMarks__ joined the channel
#
ben_thatmustbeme
added a p-name too so it has something better in there
#
kylewm
possibly you aren't passing url to the parser?
#
tantek
ben_thatmustbeme: "just alternates" is ambiguous
#
tantek
alternates in what dimension(s)?
#
ben_thatmustbeme
alternate encodings of the same video content, they are grouped in an <audio> tag which is intended for that
#
tantek
is anyone publishing that?
#
ben_thatmustbeme
audio tag i am already using though i don't rewrite code for multiple versions yet
#
ben_thatmustbeme
in indieweb not sure anyone is, outside of indieweb it shouldn't be too hard to find some samples
#
KevinMarks__
It's less common with audio, as mp3 is ubiquitous. Rarer to see m4a these days
#
KevinMarks__
Also rarer to see different quality choices for audio based on bandwidth
#
aaronpk
silo example of different encodings of the same video content: youtube
#
tantek
right this is my point - we need examples of multiple sources on real world sites
#
tantek
in order to come up with *practical* examples for such
#
tantek
in my experience I've seen more JS randomness for doing weird format detection stuff
#
tantek
than use of multiple <source> audio or video :/
#
tantek
hence the challenge to actually show a real world example with *markup* for multiple sources
#
aaronpk
i actually used to publish multiple encodings of my videos on my site, but can't find the examples anymore
#
aaronpk
but that was pre-HD days, so it was like 320p vs 480p
#
ben_thatmustbeme
I only do one right now because i haven't built that part out yet. Wasn't going to bother until i fixed up mobilepub
#
tantek
people also do weird UA-detection stuff on the server and then only serve one source etc.
#
tantek
point is - I'd say ditch the multi-source examples for now
#
tantek
rather, ditch the *theoretical* multi-source examples, and if anyone objects, ask for the same kinds of real world example citations that we're asking for above
#
aaronpk
as a publisher, I would much rather publish only full 1080p HD because rendering multiple formats is annoying and takes a long time
#
ben_thatmustbeme
agreed, transcoding is annoying
#
ben_thatmustbeme
though it can be pretty important for mobile
#
ben_thatmustbeme
ideally the format wars can just be relegated to "in the future browser makers will finally just support 1"
KevinMarks_ joined the channel
#
tantek
ben_thatmustbeme: agreed that in theory it can be pretty important for mobile - but if it is actually important, then finding real world examples, and thus documenting them, should be easy
#
aaronpk
that's becoming less and less true tho, because my phone now gets faster internet than my office
#
ben_thatmustbeme
aaronpk: same here, haha, we have DSL in our office
#
ben_thatmustbeme
so i can accept that for now, a single <source> is sufficient
#
ben_thatmustbeme
until need arises
#
ben_thatmustbeme
my second question was on supporting language of an h-entry
#
ben_thatmustbeme
though i know tantek already said (other channel) that it can be difficult to convince publishers to include such information
KevinMarks__ joined the channel
#
ben_thatmustbeme
My last question thus far will likely get relegated to "real world examples" which is what I will be working on soon. It's on having some way to convey a "type" of h-entry or action taken
#
ben_thatmustbeme
I will be publishing any activity i do, as an h-feed
#
tantek
ben_thatmustbeme: yes, my point is that there must be specific incentive for authors to do something before they will, and do it right (keep it up to date)
#
ben_thatmustbeme
but while for posts you can determine things like "like" from the existence of u-like-of, ways to tell the creation of a post from an edit of a post are not so clear
#
tantek
without such incentive, it doesn't matter what others (parser devs, readers, etc.) *want*, the authors will either ignore them, or worse, copy / paste from a template and get it wrong, or neglect it and let it get out of sync
#
ben_thatmustbeme
unless there is a bunch of u-edit / u-create etc
#
tantek
huh? topic switch? confused
#
ben_thatmustbeme
sorry, moved off the second question as it didn't seem to be getting a response
#
tantek
sorry, I'm on the CSS telcon so trying to multitask ;)
#
ben_thatmustbeme
and the language thing was just more of a curiosity
#
tantek
anyway, historically, lang attribute has been pretty crappy
#
tantek
especially lang=en, due to aforementioned template/copy/paste problem
#
tantek
it's the simple proof that authors won't care or will get wrong, things they don't have sufficient incentive to get right
#
ben_thatmustbeme
ahh true, a lot of people just copy paste that
#
tantek
therefore it is better to *not* ask them to do something, than to ask them, and have them get it wrong and provide *noise*
#
tantek
something which lots of wishful thinkers continue to get wrong in terms of asking authors to do extra work
#
ben_thatmustbeme
switching to other question. I will be publishing data on creating / editing / deleting posts. an h-feed of things done, not actual posts.
#
ben_thatmustbeme
i already do this with incoming events (though i have yet to have anyone edit a reply or anything)
#
tantek
edits are actual posts
#
tantek
also - this is probably more of an #indiewebcamp topic so let's take it there
gRegorLove joined the channel
#
KevinMarks__
Language of an entry I'd look at Stephanie Booth for examples
#
tantek
KevinMarks: could you explain author incentive though?
#
KevinMarks__
Steph posts in English and French, and posts a summary in the other language
#
KevinMarks__
And wrote a WordPress plugin for other people to use too
KevinMarks_ joined the channel
#
KevinMarks_
She uses both visible [fr] [en] markup and lang on the hentry
chiui joined the channel
#
tantek
KevinMarks - but why? what's the incentive to do so besides markup geekery?
#
tantek
what code consumes that markup and does anything with it?
#
tantek
says as a fellow markup geek
KevinMarks__ joined the channel
#
KevinMarks__
Steph's deeper point is about perception of language use
#
KevinMarks__
We are all imperfectly multilingual in different ways
#
KevinMarks__
Assuming that a text has a single language is an oversimplification
#
KevinMarks__
Which Google insists on, to the point of telling you not to use more than one language as they treat it as a bug
#
tantek
classic programmer-think: your natural behavior is too complex for our code, change your natural behavior to conform to our limited world model
#
KevinMarks__
They do use rel=alternate hreflang
#
tantek
that's got to be an existing anti-pattern defined somewhere
#
tantek
programmer blinders or something
#
tantek
could you document/cite that on microformats.org/wiki/rel-alternate
#
KevinMarks__
Stephanie gave a tech talk about it at Google (I invited her to)
#
KevinMarks__
Also programmers assuming there is a simple country:language mapping
#
aaronpk
ah if only it were that simple
#
KevinMarks__
Stephanie lives in Switzerland, the French speaking bit
#
KevinMarks__
Many US sites assume that their .ch site should be in German
KevinMarks_ and elf-pavlik joined the channel
#
KevinMarks
the hreflang is already documented there
#
tantek
is the google citation already there?
#
tantek
the support.google citation that is
#
csarven
KevinMarks I'm in Bern, CH (mostly German in this Canton). https://developers.google.com/structured-data/testing-tool/ displays content in English (outer) and German (inner).
#
KevinMarks
Google thinks that they can do better by algorithm than by explicit setting
#
KevinMarks
and a lot of the time they are right
#
KevinMarks
but some of the time they have misdefined the problem
#
KevinMarks
assuming things have a single language is one of those times
KevinMarks__, chiui, dym_cx, glennjones, KevinMarks_, eschnou, elf-pavlik, iwaim, Erkan_Yilmaz and fuzzyhorns joined the channel