#Loqi Social Web WG Face to Face Meeting at MIT (F2F8)
#ben_thatmustbeme tantek: we have 5 clusters of topics to discuss: PR transitions, CR transitions, WD updates or note transitions, group continuity or transition to CG, or other business
#ben_thatmustbeme tantek: ok, i think we have everything through lunch scheduled, let's go ahead and start with that, if we need more time on things we can take more time and move things around
#ben_thatmustbeme eprodrom: there is the ostatus group, activitypub, others
#ben_thatmustbeme ... the activitypub one should probably be closed. the activity streams one could still be open since we have work to do
#ben_thatmustbeme ... well these are at CR, but if there are things we think should be included and incubated, those would be good for the community group.
#cwebber2 the laptop on the table has disconnected apparently
jasnell_ joined the channel
#ben_thatmustbeme ... the incubator group can continue to add extensions and add features to the specs we've defined
#cwebber2 I can't hear anything, though I see the other webcam visually moving still
#ben_thatmustbeme eprodrom: with AS, when we talked about extensibility, we talked about adding it to the namespace, which means there is some document maintenance
#ben_thatmustbeme tantek: so part of it should probably be messaging all those other groups to tell them there is a community group that was not active and they may want to look at this group
#ben_thatmustbeme sandro: I can imagine ppl being annoyed by that because they only want to follow one technology
#ben_thatmustbeme tantek: i don't think anyone will really care since these groups are pretty much dead
#ben_thatmustbeme sandro: if someone complains i think it would be good to keep the group open and maybe just have some major updates
#ben_thatmustbeme tantek: i think that makes it worse as it looks like the discussion is there, but it isn't
#ben_thatmustbeme sandro: do we want to try to do a webinar? tutorials on what exists already, "this is what webmention is, here's how to use it" etc for each of our specs?
#ben_thatmustbeme eprodrom: so we are expecting to go to at least april with our group, and we expect to have a CG that will continue indefinitely
#ben_thatmustbeme tantek: i think we should be clear that everything else should be finished up by january, and the rest is really just for pubsub mainly. it's not like we are trying to cram a bunch of other things in there
#ben_thatmustbeme tantek: and we're not expecting to be doing an F2F after now, right?
#ben_thatmustbeme sandro: I don't think so; the big question is what happens if someone points out a BIG issue with one of our specs that has to go through the whole cycle again
#ben_thatmustbeme eprodrom: for us as a group, would it be fair to say, after jan 1, we have telcons as needed
#ben_thatmustbeme tantek: i'm going to guess we'll want to do them at least monthly?
#ben_thatmustbeme rhiaro: monthly 4am phone calls will be preferable to weekly 4am phone calls for me
#ben_thatmustbeme tantek: presumably we can still get staff contact time for the extension? and the chairs will be able to commit to having time?
#ben_thatmustbeme eprodrom: yes, i can commit to being around for that
#ben_thatmustbeme cwebber2: we can get to this in the AP time, but as AP has the most amount of work to do, i don't want to completely discount work on AP after Jan 1
#ben_thatmustbeme tantek: that's certainly something we need to talk about, and making that kind of request is important, we'll go over that schedule when we get to AP
#ben_thatmustbeme ... anything else on group continuity / incubation group?
#ben_thatmustbeme ... as we discussed, any impact on continuity is something the editor should bring up when we discuss them
#csarven [16:22:41] <ben_thatmustbeme> eprodrom: so we are expecting to go to at least april with our group, and we expect to have a CG that will continue indefinitely --- Is this agreed? April is the extension?
#tantek that's roughly correct. there's a strong preference to wrap up our other specs in January, however we will discuss each spec in particular and figure out its particular needs.
#ben_thatmustbeme <ben_thatmustbeme> it's not like we arbitrarily chose April, we have to give a certain amount of time for IP exclusions; since that period is still going on for pubsub, we need to extend
#ben_thatmustbeme <ben_thatmustbeme> that gives us the opportunity to finish dotting i's and crossing t's on other specs, but we want to be clear that we are not just trying to cram a bunch of other new stuff in by extending
#cwebber2 btw, I assume #social is a pretty welcome place in general to invite people hoping to implement socialwg specs?
#cwebber2 or should I encourage people to join another channel
#rhiaro ... How do we ensure that requests coming from the hub to the subscribers (that are not content) are actually coming from the hub
#rhiaro ... It was suggested that we use a signature mechanism, except we don't have a body, so nothing to sign
#rhiaro ... I think (and aaronpk agrees) the answer is to not provide a signature mechanism but strongly incentivise subscribers to use complex urls that are not guessable
#rhiaro julien: if we go in that direction, which I think is safe in terms of security, the question is why do we even need to sign notifications? If we have https and complex urls, that should be secure enough
#rhiaro julien: I still think we should allow for non-https; that adds complexity, and I think we need the signature for the case where you could have a man-in-the-middle thing where someone could alter the content and you would not know that they had
#rhiaro ... That url is never exposed because it's sent in the POST body over https to the hub
#rhiaro ... With that in mind, totally separate issue: why do we even have signatures on the body?
#rhiaro ... If you assume the URL is secret and nobody can send forged requests to it, why does the hub need to sign the payload?
#rhiaro ... One reason to continue using signatures is it does allow subscribers to not support https if they are willing to take the risk of forged confirmations
#rhiaro ... It works just as well with all the pieces together whether or not subscribers do this
#rhiaro ... I have a hard time saying it's a MUST because of that
#rhiaro ... but if someone is implementing this.. I opened this issue because I was writing the test subscriber, and I was like 'wait a second, what is stopping someone else from making this request?' there wasn't something in the spec to tell me how to protect myself
#rhiaro ... I think just at that point the implementer will go to the spec and see the recommendation to make it a unique URL
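The "unguessable callback URL" approach under discussion can be sketched roughly as follows; the function and parameter names here are illustrative, not from any spec:

```python
import secrets

def make_callback_url(base):
    """Append a high-entropy, unguessable path segment to a
    subscriber's callback endpoint (a capability URL)."""
    # 32 bytes of randomness -> ~43 URL-safe characters
    token = secrets.token_urlsafe(32)
    return f"{base.rstrip('/')}/{token}", token

callback, token = make_callback_url("https://subscriber.example/callbacks")
# The subscriber stores `token` and rejects any incoming notification
# whose request path does not carry it.
```

This is the "complex urls that are not guessable" strategy: since only the hub was ever told the callback URL (over https), a request arriving at it is presumed to come from the hub.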
#rhiaro aaronpk: I can make the test suite measure that too
#rhiaro ... The hub will compute the signature at the byte level, the subscriber will use a different encoding and compute the signature on the string version of this, and get a different result
#rhiaro aaronpk: shouldn't the hub compute it on the string level then?
#rhiaro julien: the hub is the party signing the content, not the publisher
#rhiaro ... it would be more secure for the publisher to sign and the hub just transmit it, and be agnostic of the content
#rhiaro ben_thatmustbeme: but then you're starting to define the publishing
#rhiaro julien: right, that's a completely different thing
#rhiaro sandro: In the current model the hub has to sign it differently for each user (per secret)
#rhiaro eprodrom: one of the problems with capability urls is as soon as you publish anything across security boundaries you've started your clock ticking. Somebody is going to get to it at some point, whether it's 100 years from now or next week
#rhiaro ... And then signatures on the notification not changing, because it is implemented some people are doing things with it, if you don't want to use it you don't have to, and you can always go fetch the original content if you don't trust the hub's payload
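The byte-level point above matters in practice: the hub computes an HMAC over the raw body bytes with the per-subscriber secret (the PubSubHubbub X-Hub-Signature mechanism), so the subscriber must verify against the exact bytes it received, before any decoding. A minimal sketch, with sha1 chosen only for illustration:

```python
import hashlib
import hmac

def sign_payload(secret: bytes, body: bytes) -> str:
    """Hub side: HMAC over the raw bytes, keyed with the
    subscriber's secret, formatted as an X-Hub-Signature value."""
    digest = hmac.new(secret, body, hashlib.sha1).hexdigest()
    return f"sha1={digest}"

def verify(secret: bytes, body: bytes, header: str) -> bool:
    """Subscriber side: recompute over the received bytes and compare
    in constant time. Decoding the body to a string and re-encoding it
    differently is what produces the mismatch discussed above."""
    return hmac.compare_digest(sign_payload(secret, body), header)

secret = b"per-subscriber-secret"
body = "<entry>caf\u00e9</entry>".encode("utf-8")
sig = sign_payload(secret, body)
assert verify(secret, body, sig)
# Re-encoding the same string in a different charset breaks verification:
assert not verify(secret, "<entry>caf\u00e9</entry>".encode("latin-1"), sig)
```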
#ben_thatmustbeme julien, aaronpk, i think i found an issue, but i'm not sure, i feel like i might be missing something
#rhiaro tantek: are any of these SHOULDs currently unimplemented
#rhiaro ... People open issues, I close them, what should I do?
#rhiaro tantek: what we've been doing so far is to try to reach a conclusion that the person who opened the issue has been happy with, and ask them to close it
#rhiaro sandro: We don't want to have the situation where someone feels like they haven't been heard
#rhiaro ... If you do exactly what they ask for it should be fine to close
#rhiaro julien: Same with PRs, I just merge them myself
#rhiaro tantek: You said you're opening PRs. If you get PRs from people who are outside of the WG we actually need to get them to agree to the contributor's agreement before we merge them
#rhiaro julien: okay, I think I merged a couple of things..
#rhiaro sandro: there's the patent and copyright question
#rhiaro ... if it's editorial it could be a copyright issue
#rhiaro ... Right now the discovery steps in the specs say check the http headers and then if it's xml or html then look for the link tag, and then lastly check host-meta
#rhiaro ... The spec doesn't say much about how that part works
#rhiaro ... It's sort of left as go read about host-meta and figure it out
#rhiaro ... I suspect there's not much implementation of it
#rhiaro ... I suggest we drop it if there are no known implementations
#rhiaro julien: I initially agreed and then changed my mind. There's a use case that's very useful - github pages
#rhiaro aaronpk: a hosting environment that doesn't let you set http headers, for document types that don't support embedded links
#rhiaro ... host-meta is a specific part under .well-known
#rhiaro ... In my opinion, the best way to do it for pubsub (and possibly webmention) is to define a pubsub .well-known
#rhiaro ... where the pubsub spec defines the document inside there
#rhiaro tantek: that's not unique to pubsub, adding one more step to discovery
jasnell joined the channel
#rhiaro aaronpk: we solved this for webmention by not supporting .well-known, and whole-domain delegation, at an http header level
#rhiaro ... We don't support per-document discovery on non-xml content types, but the assumption was the majority of the use cases for that could be solved with headers
#cwebber2 that requires that you control the web server, so it wouldn't work with something like github pages right? which is probably fine
#rhiaro tantek: we did roughly define our own discovery guidelines in this group, across all the different approaches
#rhiaro aaronpk: that doesn't solve this particular use case
#rhiaro tantek: I think that was considered, this isn't a new use case
#rhiaro ... There's not enough new information to bring it up from two years ago
#rhiaro eprodrom: Link header and then link tag is going to discover somewhere north of 95% of cases
#rhiaro ... So it might not be worth doing anything more than punting and saying go look around the web and see what other discovery mechanisms there are
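The two-step discovery being described (HTTP Link header first, then an HTML/XML link element) can be sketched like this; the helper names are mine and the parsing is deliberately simplified, not a robust Link-header or HTML parser:

```python
import re

def hub_from_link_header(header):
    """Step 1: look for a rel="hub" target in an HTTP Link header value."""
    for part in header.split(","):
        m = re.search(r'<([^>]+)>\s*;\s*rel="?hub"?', part)
        if m:
            return m.group(1)
    return None

def hub_from_html(html):
    """Step 2 (fallback): look for <link rel="hub" href="..."> in the body."""
    m = re.search(r'<link[^>]*rel="hub"[^>]*href="([^"]+)"', html)
    return m.group(1) if m else None

# Hypothetical responses, for illustration only
header = '<https://hub.example/>; rel="hub", <https://pub.example/feed>; rel="self"'
html = '<html><head><link rel="hub" href="https://hub.example/"></head></html>'
assert hub_from_link_header(header) == "https://hub.example/"
assert hub_from_html(html) == "https://hub.example/"
```

A real subscriber would try the header first and only fetch and parse the body when no header match is found, which is the ordering the spec text describes.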
#rhiaro aaronpk: Here's my concern. It makes writing subscribers harder. As a subscriber it's nice to know when you've checked all the possible ways to find a thing
#rhiaro julien: as discussed a long time ago, the discovery mechanism itself can be extracted into a custom service. It can just be a library that people can reuse
#rhiaro ... I wrote a service a couple of years ago called feed discovery..
#rhiaro ... It's harder to implement, but it's a matter of just using a library that does it
#rhiaro aaronpk: the reason that we have to talk about it is because it is in the spec already
#rhiaro ... The only solution that doesn't break *possible* implementations that we haven't confirmed is to harden the aspect of host-meta that the spec does *already* refer to
#rhiaro ... Only looking for the xml format in the host-meta file
#rhiaro ... If there are implementations they did it that way, cos that's all the spec hinted at
#cwebber2 wonders, will we be breaking for lunch soon?
#rhiaro aaronpk: I would rather drop it, but am okay with restricting to xml, to support the use case that has technically been supported before this group adopted it
#rhiaro julien: I want the mechanism to exist for these people
#rhiaro tantek: cwebber2 do you have an objection to either way?
#rhiaro ... I'm going to declare consensus on aaronpk's proposal, of restricting the scope to what it seems the pubsubhubbub spec intended, and marking it as at risk
#rhiaro ... And indicate in the spec that we know of no known implementations, and if you have an implementation the group strongly requests your input on this issue in particular
#rhiaro ben_thatmustbeme: I think the most important feedback is from subscribers, to know that they're all checking for it
#rhiaro ... If nobody is checking for it, what's the point of specifying it?
#rhiaro aaronpk: we can solve it a better way in a future version
#rhiaro tantek: we're asking for an extension for this spec in particular, so if anyone decides to look through with a fine-tooth comb, one thing they will look for is narrowing scope, not adding new features
#Loqi [@sandhawke] Any workaround for sites (@github) which don't let you set HTTP Link headers? I find myself sadly wanting .well-known/extra-http-headers.txt
#rhiaro ... So.. what do we need to get to CR? Have we covered all issues?
#rhiaro julien: medium has its own hub for all of the feeds (superfeedr)
#rhiaro ... We don't have an API for reading, we just have feeds
#rhiaro ... I can see where we might eventually have some things not available as feeds to be available through pubsub. I'm fighting hard against auth for that
#rhiaro aaronpk: Add it to the test tool to check if it's being done, so we can keep track of it
#rhiaro ... If the actual diffing mechanism is not in the spec, how do people know what to do and what to expect from the payload
#rhiaro ... Can we say here is where to go to learn about what to expect?
#rhiaro sandro: in practice, when people get the new version of an rss feed, they don't actually get it, they get a stripped-down version that only has the new stuff?
#rhiaro sandro: sounds like it's not in conformance to the spec
#rhiaro ... I imagine it says the fat ping is the content that is being published
#rhiaro ... It should say it's either the content being published, or the subset appropriate for that media type
julien joined the channel
#julien <p>A content distribution request is an HTTP [[!RFC7231]] POST request from hub to the subscriber's callback URL. The HTTP body of the POST request MUST include the payload of the notification. This request MUST have a <samp>Content-Type</samp> Header corresponding to the <samp>Content-Type</samp> of the topic, and SHOULD contain the full contents of the topic URL. The hub MAY reduce the payload to a diff between two consecutive versions if its format al[CUT]
#rhiaro aaronpk: the point is that now you can't even tell what the payload is going to be
#rhiaro ... "subset or diff" are not actual spec words (not defined)
#wilkie RSS/Atom is fairly straightforward, too. if you know what RSS is, then you can tell when an entry is an entry you've never seen before. So you just always treat the incoming data from PuSH as a subset and just take what you need.
#rhiaro sandro: so, for formats that are a set of items, this may be reduced to only the changed items
#rhiaro aaronpk: to better define the 'diffing mechanism' in generic terms?
#rhiaro eprodrom: what about defining the diff for... it's such a common use case that we've got, using rss items and atom entries, it seems worthwhile to.. it would take two sentences to define it
#rhiaro aaronpk: the problem is that for the content types that aren't those, what is expected?
#rhiaro eprodrom: it could be a subset. For rss it's an item, for atom it's an entry, for AS2 it could be a single activity
#wilkie if you let people know that a subset based on the content type is how it is expected to work, I don't think people will be confused as to that purpose
#wilkie if you understand the content type (RSS or AS2) you won't find a subset surprising or hard to deal with, basically
#rhiaro sandro: if the topic is a json document and the top level is an object and one key value has changed, can you just send that key value?
#wilkie yeah, a general diff may just be too complicated to get right. noting that a formalized subset defined elsewhere is to be expected seems to be a good note in the spec, giving RSS/Atom as an example.
#rhiaro julien: you have a feed with ten entries, all new, you get ten. You have a feed with 100, 10 new, you get 10. The subscriber doesn't know
#rhiaro eprodrom: I think we always did exactly one
#rhiaro sandro: I could have a completely broken implementation without realising it. I might think the hub is broken because it's sending only some of it
#rhiaro ... Every consumer has to be written with this awareness, that if it's RSS/Atom it might not be getting the full content
#rhiaro aaronpk: the alternative is that you assume you're getting the complete feed and don't dedup
#rhiaro sandro: I could be building a subsystem, a fetch module, you give it a url and it gives you back the content, and I want to add pubsub discovery rather than a polling mechanism
#rhiaro ... not rss/atom. A web mirroring system. It turns out that pubsubhubbub on two media types does something different than pure mirroring
#rhiaro cwebber2: if there's an expectation that you have to jump back to correctly get the content, if it's something that was private I guess it has to be both private and transient for that to be an issue
#rhiaro ... But if that's an issue there's not any guarantee you can fetch it again
#cwebber2 so, the reason I raised this was we realized if you had private *and* transient data flying across the wire, you won't be able to fetch it again... which is possible in activitypub
#rhiaro sandro: if I get an html page I didn't know has h-entries on it, but the hub knows that and trims it, but I don't know it has h-entry, I'm screwed
#rhiaro julien: yeah. Subscribers SHOULD NOT care whether the feed is truncated or full. For RSS and Atom
#rhiaro aaronpk: subscribers should not make any assumptions about whether the feed has been truncated or not
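The subscriber behaviour being agreed here — treat every fat ping as a possibly-partial set of entries and deduplicate by entry id, so truncated and full feeds are handled identically — can be sketched as follows (names are illustrative, not from the spec):

```python
def new_entries(seen_ids, entries):
    """Process a fat ping without assuming it is the full feed or a
    strict diff: keep only entries we have not seen before, by id."""
    fresh = [e for e in entries if e["id"] not in seen_ids]
    seen_ids.update(e["id"] for e in fresh)
    return fresh

seen = set()
# First ping: full feed with two entries -> both are new
first = new_entries(seen, [{"id": "a"}, {"id": "b"}])
# Later ping: truncated feed repeating "b" plus one new entry -> only "c"
later = new_entries(seen, [{"id": "b"}, {"id": "c"}])
assert [e["id"] for e in first] == ["a", "b"]
assert [e["id"] for e in later] == ["c"]
```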
#rhiaro tantek: reducing the scope to two specific content types
#rhiaro ... The only potential situation where we may want to reconsider the feature is with AS2 feeds, because they are intended to work like RSS/Atom feeds, only json
#rhiaro sandro: we could do an at-risk thing around as2 and get some experience
#rhiaro aaronpk: An AS2 object has the same content type as a Collection
#cwebber2 wonders how much time spent on diffing topic
#rhiaro aaronpk: we're going to describe what RSS/Atom are doing for fat pings, describing actual things people may be receiving, and say they are there for legacy reasons, and say you must not modify the topic URL for any other content types
#KevinMarks2 The discussion of feeds of items in pubsub
#aaronpk oh, sure, but it's a new feature from the perspective of PubSub and we are trying to not add new features if possible
#wilkie I think PuSH just needs to operate at the item level and not the feed level and that "fat pings" are just the sending of multiple yet distinct entries instead of trying to syndicate a feed, which is apparently impractical/unnecessary
#cwebber2 we can narrow this down fast with some dice rolls
#ben_thatmustbeme PROPOSED: We reject all pre-existing names before today, and we will pick from between WebSub, WebSubscribe, and WebFollow barring any new better names
#ben_thatmustbeme RESOLVED: We reject all pre-existing names before today, and we will pick from between WebSub, WebSubscribe, and WebFollow barring any new better names
#ben_thatmustbeme PROPOSED: add aaronpk as a co-editor for PubSubHubbub / PubSub / whatever we name it
#sandro aaronpk: my testing tool can't always tell if the right end result happened, so it asks the user, with a check box, like "Does the photo now appear...."
#sandro .. the test tool becomes the consistent payload sent to the server, then have the human check the box, if you can't tell
#sandro cwebber2: Diaspora-- we had Jason Robinson for a while -- I think he lost some time, and was disappointed we didn't do signatures, but we did address some things, so they COULD implement them
#sandro rhiaro: dinner at VeggieGalaxy, then people can hop on the T to party or wherever
#KevinMarks I was going to suggest "flowpast" as a pubsubhubbub name, but looks like Google expired the domain on me
#strugee eprodrom btw, since you're here - heard recently from cwebber2 that you were intending to be part of the AP implementation in pump.io, is that still accurate? I started work on it yesterday but can hold off if you want...
#cwebber2 strugee: eprodrom said he'd stay out of your way so it can be an independent implementation :)
#strugee excellent. I read some of the log but must have missed that part
#cwebber2 eprodrom might have just said it on the call :)
#sandro strugee, eprodrom says: "please thunder forward as fast as you can!"