#gRegor`I think Dan's concern was basically telling people "copy the link to my note, go to your site, paste it as the in-reply-to, write your note, publish"
#gRegor`But I think that's further along in the /generations
#KartikPrabhu_in fact the silo situation is worse. It says "make an account, fill in your details, sacrifice your first born, upload picture, invite friends, then comment"
#gRegor`He's already not very keen on webmention due to spam/DDOS potential, so I think he wants a really solid UI before he implements it. Different priorities, I guess.
#KartikPrabhu_comparing a developing system to a developed and adopted one
#ben_thatmusthmm, perhaps it's time to move off hostt.net. Again I cannot connect to brid.gy due to DNS issues, and their 'create new ticket' form returns a 404 error D:
gr0k, mlncn, fmarier, musigny and squeakytoy2 joined the channel
#reedstrmLove them, except for the join messages. Just installed Stylish in Firefox to add user custom CSS to suppress them. Such a pleasure to work with well marked up HTML. Anyone have a better way to do this?
#aaronpk_someone suggested I add a checkbox to hide them
#danlykeOf course this is also when I chose to run an "apt-get upgrade" on that server :-) Will both add more debug info, and have an answer, shortly...
#danlykeAnd obviously something else is going on in my network, because my home server just went all wonky and my colo server is acting like it's got way more load than it should have.
#ben_thatmustbemei had it there too... i started all my templates off of sempress, and there is definitely some confusion still in there. I need to fully rework them.
#gRegor`Right, I'm saying it should only be in the <data>, not both.
#gRegor`The "rsvp" key in the parsed data should only be "yes" or "no", case-sensitive too
#@tpost-Facebook #CyborgCamp session I predicted all here: 10y: have+use their "site">"cell" 20y: no "cell" like no pager (ttk.me t4YZ2) (twitter.com/_/status/520628777731489792)
stream7, RichardLitt and shaners joined the channel
#reedstrmParse all the things! No that's backwards: All the things, parse this! Now that sounds rude. *sigh*
#reedstrmapparently suffering from Fridayitis myself!
#shanersMy thinking is that on my video posts, the <video> tag has a placeholder attribute. Feels to me like that is a place for a .u-photo. But the .u-* parsing rules don't say anything about looking in @placeholder for a URL.
#shanerswhich, in my case, is on my `video > source` tag.
#shanersbut the placeholder image, eg the poster frame, is an image. so, u-photo feels right there.
#KartikPrabhu_shaners the whole source and srcset things are not in the mf2 parsing rules yet. maybe document your use case, which will prompt an addition to the rules
#emmakshaners: oh, i think i misunderstood your original question
#shanersemmak did you do a design iteration with your avatar in your sidebar in the About section? So it's once on every page (home/permalink/post with comments).
#danlykeso totally abstract q: If we're going to be including source pages on Webmentions, has anyone thought through license implications? I'm kinda assuming that we're cool as long as we're playing, but...
#danlykeie: on the one hand, a webmention can be thought of as assumed consent, on the other hand, the first time a site that's been webmentioned switches to paid advertising, and the original referencing site has a CC attribution non-commercial license on it...
#kylewmdanlyke: is it any different for facebook and ogp: and twitter and their metadata cards?
#danlykekylewm, probably not. And probably not really worth worrying about. Just a "huh, I'm spidering their site and republishing it, this feels weird" vibe.
#gRegor`On the plus side, if the target site supports webmention properly, the original note author should be able to remove it from the target site by returning 410 and sending a follow-up wm
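The removal flow gRegor` describes can be sketched roughly as below, assuming the usual form-encoded `source`/`target` POST. The endpoint URL, the function names and the action labels are hypothetical, and nothing is actually sent over the network here.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_webmention(endpoint, source, target):
    """Build (but do not send) a webmention POST for source -> target."""
    body = urlencode({"source": source, "target": target}).encode("ascii")
    return Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def action_for_source_status(status_code):
    """Hypothetical receiver-side rule applied when re-fetching the source."""
    if status_code == 410:
        return "delete"   # source is gone: drop the displayed mention
    if status_code == 200:
        return "update"   # source still exists: re-parse and refresh it
    return "ignore"       # anything else: leave the stored copy alone


# The author deletes the note (it now returns 410) and re-sends the mention.
req = build_webmention(
    "https://target.example/webmention",   # hypothetical endpoint
    "https://author.example/notes/1",
    "https://target.example/post",
)
```

On receiving the follow-up mention, the target would re-fetch `source` and pass the HTTP status to something like `action_for_source_status`.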
#danlykegRegor-phone parsed the h-entry, there's still some room for grossness and I need to expand the tags my HTML parser allows in user input text to modern standards, but I'm now including your reply text at http://www.flutterby.com/archives/comments/20399.html
#carmennever been in MSFT's NERD either, despite living for years within 22LR projectile distance of Kendall Sqr in Chelsea, the land that's 1/5th the price of Cambridge
#pdurbin:) I've been to NERD a couple times. I'm glad I checked that the event got moved to Stata. I was planning on the Berkman Center.
#danlyke!tell kylewm semantic-wise, I'm trying to figure out how to determine which portion of your page to excerpt. Your h-entry is the whole body, I guess I'm looking for the e-content that's not inside the p-in-reply-to?
#kylewmit's a little tricky to do correctly without a proper mf2 parser
#Loqikylewm: danlyke left you a message 14 minutes ago: semantic-wise, I'm trying to figure out how to determine which portion of your page to excerpt. Your h-entry is the whole body, I guess I'm looking for the e-content that's not inside the p-in-reply-to?
#kylewmyou want the e-content that is a child of the h-entry but not a child of any other h-* class... if that makes sense
#danlykeOkay, I need to put two-way linking into my parse trees so I can tell if any of the parents of this node aren't of a type... Or put some more smarts in my node finder to allow negative searches for 'h-\w+'
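The "negative search" idea can be sketched with nothing but Python's standard-library HTML parser: keep a stack of open elements, and only treat an `e-content` as the reply text when the single open `h-*` ancestor is the `h-entry`. This is a minimal illustration, not a full mf2 parser; the class and function names are made up for the example.

```python
import re
from html.parser import HTMLParser

# Elements that never get a closing tag, so they must not go on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}
ROOT_CLASS = re.compile(r"^h-[a-z0-9-]+$")  # roughly the 'h-\w+' search


class EntryContentFinder(HTMLParser):
    """Collect text of e-content owned directly by the outer h-entry."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stack = []      # (tag, h-* classes on that element, is_capture)
        self.capturing = 0   # number of open e-content elements we own
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        # The negative search: look at every h-* ancestor still open.
        open_roots = [roots for _, roots, _ in self.stack if roots]
        owned = len(open_roots) == 1 and "h-entry" in open_roots[0]
        is_capture = owned and "e-content" in classes
        self.capturing += int(is_capture)
        roots_here = [c for c in classes if ROOT_CLASS.match(c)]
        self.stack.append((tag, roots_here, is_capture))

    def handle_endtag(self, tag):
        if tag in VOID or not any(t == tag for t, _, _ in self.stack):
            return  # stray close tag: ignore it
        while self.stack:
            t, _, is_capture = self.stack.pop()
            self.capturing -= int(is_capture)
            if t == tag:
                break

    def handle_data(self, data):
        if self.capturing:
            self.chunks.append(data)


def entry_content(html):
    finder = EntryContentFinder()
    finder.feed(html)
    finder.close()
    return "".join(finder.chunks).strip()


page = """<body class="h-entry">
  <div class="p-in-reply-to h-cite"><p class="e-content">Original note</p></div>
  <div class="e-content">My <b>reply</b> text</div>
</body>"""
print(entry_content(page))  # My reply text
```

The `e-content` inside the `h-cite` is skipped because two `h-*` ancestors are open when it starts.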
#gRegor`I have no idea the quality/upkeep of this, but http://buzzword.org.uk/swignition/ might be a good start at least for parsing mf2 with Perl, danlyke.
#gRegor`5 years old, looks like. But if it's parsing microformats already, might be a good start for mf2.
#kylewmjust naively, I'd guess it'd be a lot easier to write a new mf2 parser than to "upgrade" a mf1 parser
#gRegor`Yeah? I've not really dug into the parsing rules/differences
#kylewmmy understanding is that microformats had a lot of special cases and context-specific rules
#kylewmthe microformats wiki gives pretty decent pseudo code for writing a new mf2 parser...
#kylewmor you could just shell out to php or python :)
#danlykeit's relatively easy for me to parse an HTML block and find specific classes within that block. I think I just need to add some exclude capabilities.
#danlykeBut the problem I'm having with microformats is that the semantics seem to be way more complex than the application of them warrants.
#joskarWhile on the subject of parsers: when parsing e-* properties, are there some guidelines for sanitizing the HTML? (preventing XSS etc)
#gRegor`Part of the problem with just looking for class names is the different prefixes inform how to parse the value. Is that what you mean by overly complex semantics?
#danlykejoskar: My CMS parses user input down to a fairly limited subset of HTML3 and strips any attributes that look like they could be Javascripted.
#joskardanlyke: Ok, so you just have a whitelist of tags and attributes and discard the rest?
#gRegor`joskar: I don't display the html value of e-content, just the plaintext value.
#danlykegRegor` yeah. It'd be great if I could just do positive searches, but I'm mixing negative and positive searches (ie: find h-entry, prune all other h-* classes, find the remaining *-content), when that's the main thing I want for a given Webmention.
#danlykejoskar yes. Pretty much treat it like user input HTML generally, ie: You want to let the users do some HTML (bbcode/other proprietary markup must die!!!1!!elevenses!!) but you don't want 'em breaking the page or adding malicious code, so...
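A whitelist approach like the one danlyke and joskar describe might look roughly like this in Python. The allowed tag list, the URL scheme list and the names are assumptions chosen for the example, not danlyke's actual rules, and the sketch does not re-balance unclosed tags.

```python
from html import escape
from html.parser import HTMLParser

# Assumed whitelist for illustration; a real site would pick its own.
ALLOWED_TAGS = {"a", "b", "i", "em", "strong", "p", "br",
                "blockquote", "code", "pre"}
SAFE_SCHEMES = ("http://", "https://", "mailto:")
DROP_WITH_CONTENT = {"script", "style"}


class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip += 1
            return
        if self.skip or tag not in ALLOWED_TAGS:
            return  # unknown tag: drop the tag, keep its text
        kept = ""
        if tag == "a":
            # href is the only attribute kept, and only with a safe scheme.
            href = dict(attrs).get("href") or ""
            if href.strip().lower().startswith(SAFE_SCHEMES):
                kept = f' href="{escape(href, quote=True)}"'
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip = max(0, self.skip - 1)
        elif not self.skip and tag in ALLOWED_TAGS and tag != "br":
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip:
            self.out.append(escape(data, quote=False))


def sanitize(html):
    parser = WhitelistSanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


dirty = ('<p onclick="evil()">Hi <a href="javascript:alert(1)">x</a> '
         '<a href="https://example.com/page">ok</a>'
         '<script>alert(1)</script><img src=x onerror=y></p>')
print(sanitize(dirty))
# <p>Hi <a>x</a> <a href="https://example.com/page">ok</a></p>
```

Everything not on the list is discarded rather than escaped, which matches the "whitelist of tags and attributes and discard the rest" description.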
#danlykegRegor` just pulling the text and throwing away all the markup.
#danlykeI've seen so much bit rot over the years that there's no way I'm gonna use a parser someone else is hosting. That's what I'm trying to get *away* from.
#danlykeie: my Perl system uses lots of regex, so I just put a regex library in C++ and got a little smarter about my string handling.
#danlykeI did start a lex based parser too, may continue that for the HTML parts, but the HTML parser is just a "parse the string subsets of the markdown parser and change the allowed tags" thing.
#danlykeI thought "oh, microformats is all class based, it's easy for me to say 'get a list of all tags of type X with class of Y from this tree'" (ie: the ways you'd play with the DOM), but I haven't done "unless they're underneath a tag of type Z" yet.
#kylewmthe parsing algorithm is a recursive descent sort of thing... oh i'm at an h-class, grab all its properties, recursing on nested h-classes
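The recursive descent kylewm describes could be sketched like this, using `xml.etree.ElementTree` on a well-formed snippet for brevity. It is a heavily simplified illustration under stated assumptions: the real mf2 parsing rules have more to them (implied properties, the value-class pattern, HTML output for `e-*`), and every helper name here is hypothetical.

```python
import xml.etree.ElementTree as ET

PROPERTY_PREFIXES = ("p-", "u-", "dt-", "e-")


def classes(el):
    return (el.get("class") or "").split()


def root_types(el):
    return [c for c in classes(el) if c.startswith("h-")]


def value_of(el, prop_class):
    text = "".join(el.itertext()).strip()
    if prop_class.startswith("u-"):
        # The prefix decides how the value is read: u-* prefers a URL.
        return el.get("href") or el.get("src") or text
    return text  # simplification: p-, dt- and e- all fall back to text


def parse_item(el):
    """At an h-* element: grab its properties, recursing on nested h-*."""
    item = {"type": root_types(el), "properties": {}, "children": []}
    for child in el:
        collect(child, item)
    return item


def collect(el, item):
    props = [c for c in classes(el) if c.startswith(PROPERTY_PREFIXES)]
    if root_types(el):
        nested = parse_item(el)  # recurse: a nested microformat
        for p in props:
            item["properties"].setdefault(p.split("-", 1)[1], []).append(nested)
        if not props:
            item["children"].append(nested)
        return  # its contents belong to the nested item, not to us
    for p in props:
        name = p.split("-", 1)[1]
        item["properties"].setdefault(name, []).append(value_of(el, p))
    for child in el:
        collect(child, item)


markup = """<div class="h-entry">
  <div class="p-in-reply-to h-cite">
    <a class="u-url" href="http://example.com/note">note</a>
    <p class="e-content">Original</p>
  </div>
  <div class="e-content">My reply</div>
</div>"""
entry = parse_item(ET.fromstring(markup))
print(entry["properties"]["content"])  # ['My reply']
```

Because `collect` returns as soon as it hits a nested `h-*`, the cited note's `e-content` ends up under `in-reply-to` rather than on the entry itself.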
#kylewmtantek queries his mf2 data store with xpath, if i understand correctly
#kylewmbut obviously he knows what subset of mf2 markup he is using
#danlykeyeah, this stuff should be easy (in fact, easier in the C++ version than the Perl version), I just need to grind through the spec to figure out the right things. And rewrite the Flutterby.com stuff in C++ (right now only the Flutterby.net stuff is)