kylewmparzzix: you have to read the next sentence of the description ;) "...but unstable and undocumented and probably not of much use to anyone but the author, at least for now."
gRegor`"indieweb" is a plurality. Lots of people using different server software, blog software, but interacting with a core set of protocols / building blocks. It's really flexible that way.
parzzixgRegor`, not exactly. But I like to keep it simple, I'm not a coder by any means. And the idea of throwing things under one umbrella easily is appealing.
gRegor`I'm totally guessing, but I presume the Known update process should be relatively painless. Upload new files, maybe run a PHP script that updates the database.
dlykekylewm, re webmentions: Yeah, I'm already looking for mentions to blog entries from sites in my OPML feed reader. I want to expand this to find friend's OPML files (or maybe I just need to semi-intelligently spider links in sites I read looking for more RSS feeds)
dlykekylewm also, Webmention looks like a horrendous DDOS amplification attack vector, and given that the guys in the next cube over spend much of their days figuring out how to mitigate amplification attacks, that's a concern.
barnabywaltersbut yeah building a webmention spam/abuse-prevention proxy is one of the goals of Shrewdness, and one of the false starts made before building it as mentioned on indiewebcamp.com/Shrewdness
kylewmdlyke: but I definitely like the idea of getting out in front of some of this. could you elaborate on the DDOS issue? basically I send a thousand webmentions to a thousand servers with the 'source' all set as the attack target, and then those thousand servers all try to GET at once?
tantekthat DDOS issue has not been a problem in practice, since e.g. Pingback has the same vulnerability, and that particular issue has not been a problem with Pingback
dlykeYeah, I went and looked at my logs yesterday... added a whole bunch more IP addresses to my ufw rules, but I definitely don't want to recreate the {track,ping}back disaster.
dlykeBut I'm light-weight spidering (checking for changes in as light a weight way as HTTP allows) 196 RSS feeds daily for mentions to my blog, there's no reason that couldn't be a few thousand.
dlykeThe vector for spam seems largely to be from the lack of an introduction protocol. If you just do mention discovery by friend-of-a-friend RSS feeds, you have an opt-in spam prevention system, rather than opt-out.
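[The "as light a weight way as HTTP allows" polling dlyke describes is usually done with HTTP conditional GETs. A sketch of one way to do it, assuming per-feed state (etag, last_modified) is persisted between polls; the User-Agent string and function names are made up:]

```python
from urllib.request import Request, urlopen
from urllib.error import HTTPError

def conditional_headers(state):
    """Build conditional-GET headers from the previous poll's saved
    validators; the server answers 304 if the feed hasn't changed."""
    headers = {"User-Agent": "feed-poller-sketch"}  # hypothetical UA
    if state.get("etag"):
        headers["If-None-Match"] = state["etag"]
    if state.get("last_modified"):
        headers["If-Modified-Since"] = state["last_modified"]
    return headers

def poll(url, state):
    """Fetch a feed only if it changed; return body bytes, or None on 304."""
    req = Request(url, headers=conditional_headers(state))
    try:
        with urlopen(req, timeout=10) as resp:
            state["etag"] = resp.headers.get("ETag")
            state["last_modified"] = resp.headers.get("Last-Modified")
            return resp.read()
    except HTTPError as e:
        if e.code == 304:
            return None  # unchanged since the last poll
        raise
```

A 304 response carries no body, so polling a few thousand unchanged feeds costs only headers.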
dlykebret, having been through this with three different protocols before (Referer tracking, trackback, pingback, and, yes, all with "make sure the linked page actually references your page"), I'm uninterested in doing much more coding to recreate the mistakes of the past.
dlykeI don't have anything against y'all doing webmention, aside from what names the ops guys in that next cube over will be calling it if it ever gains traction, I'm just interested in a different way to build that network of discussion.
kylewmKartikPrabhu: i agree insofar as solving a problem that doesn't exist is premature optimization, but if webmention is a "better version" of a protocol that does have those problems, we know they're coming :)
dlykeAs I said, what I'm doing right now is spidering my OPML file for RSS feeds, checking for mentions of my site in those RSS feeds. What I'd *like* to do is auto-discover the next ring out of RSS (and, yes, Atom) feeds.
barnabywalterswould the suggested fix in the original wordpress bug report (of only fetching if the webmention request came from the same host as the source URL) work?
bretdlyke: overall design is simple, but totally fine. what would make that better is actually presenting useful information from that link in the thread, like the actual comment or conversation thread if appropriate
dlykebret, yes: I should probably coordinate with the few people who converse with me that way to put some sort of excerpt/mention tag in their site so I can easily figure out a good excerpt to grab. Or just grab the whole damned thing.
dlykeI think the real question is: What does the auto-discovery mechanism get you? RSS sucks because it's polled, but Webmention+whitelist is really just recreating NNTP, but less elegantly.
dlykeKartikPrabhu if we reinvented open SMTP gateways but said "It's okay, because HELO and EHLO are now deprecated in favor of OLEH", we'd all be rolling our eyes.
dlykeKartikPrabhu NNTP is Network News Transfer Protocol, a system for distributing articles that underlies the (alas, now no longer usable because of spam) Usenet discussion network which used to be the backbone of Internet discussions, but is also used in many private discussion networks.
dlykeKartikPrabhu re "... didn't understand any of that ...", I hate to be an old fart, but "...something something learn from history condemned to something..." [grin]
tantekdlyke the approach I've been taking is to be very upfront about documenting the expected vulnerability, while still building upon the tech since it is very simple to build upon
Loqitantek meant to say: dlyke the approach I've been taking is to be very upfront about documenting the expected vulnerabilities, while still building upon the tech since it is very simple to build upon
dlykeKartikPrabhu well, my webmention implementation is also really half-baked, and I'm also looking for a compelling reason to finish it vs pursuing alternate mechanisms which don't explicitly recreate the problems of the previous systems.
barnabywaltersit looks like getting the request host (at least from my VPS) is totally unreliable, but the IP is reliable, so looking up the IP of the source domain and comparing it to the client IP might work
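[The check barnabywalters describes — resolve the source URL's host and compare it to the IP the webmention request came from — might be sketched like this (Python; the function name is made up, and the `resolve` parameter exists so the DNS lookup can be swapped out):]

```python
import socket
from urllib.parse import urlparse

def source_matches_client(source_url, client_ip, resolve=socket.gethostbyname):
    """Check that the host in the webmention source URL resolves to the
    IP address the webmention request actually came from."""
    host = urlparse(source_url).hostname
    if not host:
        return False  # unparseable source URL
    try:
        return resolve(host) == client_ip
    except OSError:
        return False  # DNS failure: treat as no match
```

As jonnybarnes notes below, this breaks for legitimate senders whose receiving code runs on a different machine than the one serving the source page (shared hosting, CDNs, or services like Bridgy sending on your behalf), so it can only be one signal among several.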
danlyketantek did you mean "post" or "host"? I'm North Bay, but could probably work out technical issues for "post". Nobody's gonna come up to Petaluma if I offered to host, though.
danlyketantek: Ah, yeah. See my earlier comments about half-baked webmention implementation... But, good counter-example to my assertion that friend-of-friend trust web discovery is sufficient.
ben_thatmustbemefor syndicate-to i could see it being not much of an issue, it's rare that they will have a fragment, fragmention, media query, or anything like that
danlyketantek so spidering http://indiewebcamp.com/irc-people still means I need to dig through the /User:... pages looking for likely URLs that might host RSS feeds. Any brainstorms for not just spidering every damned link on someone's user page there?
jonnybarnesahhh, I get it, without this waterpigs.co.uk would make a request from the source url, so someone could use all these webmention endpoints in a DDOS attack
jonnybarnesas in if someone gets a load of computers to all make requests to your endpoint to make your endpoint make *loads* of requests to the source?
jonnybarnesalso barnabywalters, how are you checking ip address/hostname? as in my vps has several domains pointed at its ip address, could that cause a hiccup?
danlykebret, re effort to attack a single site: you'd be amazed at how much effort the unwashed masses will go through to make an individual's life hell, if, for whatever reason, they choose to pick on a person.
barnabywaltersI don’t know much about how DDOS attacks work, but I’m amazed that the cited wordpress pingback attack is actually a big deal, because the attacker has to send as many requests as are sent by the network
danlykebarnabywalters, the problem is the amplification: If you can find a big file on a target site, you're only sending a few hundred bytes to the intermediate sites, but each of those can ask the target site to serve a few megabytes (if that target site is hosting, say, a video).
barnabywaltersso that particular aspect could maybe be prevented by clients doing a HEAD request and checking for a text/html content type before fetching the full content
danlykebarnabywalters: yes, but the Content-Length of flutterby.com's index.html is currently 46131 bytes, if the initial POST request can be made in 400 bytes, that's a 100 to 1 amplification right there.
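[barnabywalters' HEAD-before-GET idea, plus a size cap to blunt the amplification ratio danlyke points out, could be sketched like this (Python; MAX_BYTES and both function names are my own; note the source controls its own HEAD response, so a receiver still needs to cap how many bytes it actually reads during the GET):]

```python
from urllib.request import Request, urlopen

MAX_BYTES = 1_000_000  # hypothetical cap on how much we are willing to fetch

def head_looks_fetchable(content_type, content_length, max_bytes=MAX_BYTES):
    """Decide from HEAD response headers whether a source is worth GETting."""
    if not (content_type or "").startswith("text/html"):
        return False  # skip videos, large binaries, etc.
    if content_length is not None and int(content_length) > max_bytes:
        return False  # advertised size exceeds our cap
    return True

def safe_to_fetch(url, max_bytes=MAX_BYTES):
    """Issue a real HEAD request and apply the checks above (does I/O)."""
    with urlopen(Request(url, method="HEAD"), timeout=10) as resp:
        return head_looks_fetchable(
            resp.headers.get("Content-Type"),
            resp.headers.get("Content-Length"),
            max_bytes,
        )
```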
danlykeIf you hit someone's dynamically generated page, you can not only peg their bandwidth, but also their CPU to unusable levels (happened to me when some guy in Russia was spidering a friend's site hosted on one of my colo servers).
KartikPrabhuany recommendations on installing and running a local dev copy of Wordpress on Linux? The internet at large does not seem to be good at actual recommendations
danlyketantek ah, seeing the "h-feed" now on some of those linked pages. Seems way easier to get the <link rel="alternate" Atom & RSS feeds than debugging yet another parser...
danlykeKartikPrabhu uh? Install your favorite distro (I use Ubuntu at home, SL6 at work, both suck in different ways), they probably have a default Apache package, install WordPress under that?
waterpigs.co.ukcreated /DDOS (+2342) "Stubbed page with definition, webmention example, potential solutions, example code, myself as indieweb example" (view diff)
danlykeben_thatmustbeme (and others doing SSL/https), what's the cheapest way to get SNI certs? I hate paying the extortion money to the CAs, but see that SSL is in my future...
danlykealanpearce, thanks, a friend suggested that on my blog, but didn't link it and whatever I was typing in was redirecting to Trustico. I'll wade into the StartSSL thing and see about getting that working.
KartikPrabhuor any other Wordpress person? I have a local site setup now, but the only way to install themes is to upload a zip file through the wp interface...
Loqibarnabywalters: bret left you a message 49 minutes ago: you also managed to break your "Written a response to this post? Let me know the URL:" box :(:(
barnabywaltersI’d probably implement it by encrypting the current time, then decrypting it and making sure it’s not from more than X seconds/minutes/hours in the past
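[What barnabywalters describes doesn't strictly need encryption: an HMAC-signed timestamp is tamper-evident and lets the endpoint verify its own age. A minimal sketch of that variant (the secret, token format, and function names are all made up for illustration):]

```python
import base64
import hashlib
import hmac
import time

SECRET = b"replace-with-a-real-secret"  # hypothetical signing key

def mint_token(now=None, secret=SECRET):
    """Sign the current unix time so an endpoint URL carrying it can expire."""
    ts = str(int(now if now is not None else time.time())).encode()
    sig = hmac.new(secret, ts, hashlib.sha256).hexdigest().encode()
    return base64.urlsafe_b64encode(ts + b"." + sig).decode()

def check_token(token, max_age=3600, now=None, secret=SECRET):
    """True if the token is authentic and not older than max_age seconds."""
    try:
        ts, sig = base64.urlsafe_b64decode(token.encode()).split(b".", 1)
        age = (now if now is not None else time.time()) - int(ts)
    except ValueError:
        return False  # malformed token
    expected = hmac.new(secret, ts, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(sig, expected):
        return False  # signature doesn't match: forged or wrong key
    return 0 <= age <= max_age
```

The receiver keeps no per-token state; everything it needs to reject an expired or forged endpoint is in the token itself.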
Mark87that's better, but what about also allowing different expirations for trusted parties. For instance, if a request comes in from bridgy, you might give them a weeklong, cacheable endpoint
aaronpke.g. for webmention.io, I could provide you a way to sign in and get a secret which you use to sign JWT tokens. that way your blog could encode the expiration date
aaronpkyou could then compare the target_url sent in the webmention request against the encoded target and toss it out right away as spam if it doesn't match
Mark87I'm a little skeptical on the expiring endpoints. If I'm an attacker, the expirations prevent me from building a long-lived list of endpoints, but i can still build a list of urls that have endpoints. Presumably I can troll that entire list to get the latest endpoint list and then launch my attack. Expiring the endpoints just adds an extra step
barnabywaltersit prevents the attack which was used on wordpress, which was “I know the wordpress URL structure and I can get a list of wordpress sites”
reedstrmagreed aaronpk - been there, done that, on a submit-a-bug web form. Had to leave it open to not-authenticated, so had to add every trick we could think of to avoid spambots. encrypted timestamp was one of those.
mkoI'm not saying that it's not a good idea. I actually like the idea of them. I'm just not sure it solves the problem well-enough to be the first line of defense.
reedstrmbtw, my experience w/ spam runs is that _any_ error code will send them off to the next target, even one that just requires a refetch to solve.
ben_thatmustbemehmmm, i just realized, if I ask for post access when a person logs in i can easily check if they have a micropub endpoint and then let them reply to my posts without leaving my site
aaronpkyeah, same. that's basically the "greylisting" email spam technique. if someone is sending you an email you first reply back with "come back later" and 80% of spam bots just go away, but real mail tries again.
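[The greylisting technique aaronpk describes needs only a little state; a toy sketch (Python, assuming an in-memory dict and a 5-minute delay — real SMTP greylisting keys on the client IP / sender / recipient triple and persists state across restarts):]

```python
import time

SEEN = {}  # key -> unix time we first saw it

def greylist(key, delay=300, now=None, seen=SEEN):
    """Defer the first delivery attempt for a new key; accept once the
    sender retries after `delay` seconds. Most spam bots never retry."""
    t = now if now is not None else time.time()
    first = seen.setdefault(key, t)
    return "accept" if t - first >= delay else "defer"
```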
reedstrm(unlike the damn poorly coded spiders, who've been known to spin hard on a 4XX error to the point that I had to block it at the network stack layer, and was considering talking to upstream about a border blockade)
danlykereedstrm I have pages and pages of 404 and 5xx hits from the same IP address attempting to spam. I have yet to see evidence of error checking in spamming bots.
aaronpkyou'd have to have a really good reason and I would have to trust you in person. I *may* be willing to issue you a short-lived access token, but even that is not likely
KartikPrabhugRegor`: the local WP install could not find a theme I put in the themes folder... turns out I had to symlink it to somewhere else and all that
aaronpkalso the bookmarklet is super useful! with it you can select text on a page, click the button, then it fills in the url, title and content on quill
Reykjavik___http://indiewebcamp.com/Homesteading is what i was more or less talking about. I'm pretty new to ruby but im trying to learn more about posting and then syndicating elsewhere
Reykjavik___i guess maybe if i were to explain it better, im more of a front end guy and designer and im trying to get my feet wet in more dev stuff. and parsing info has been something ive always been meaning to figure out
ben_thatmustbemeeveryone tends to check the headers and the <head> data for those links so if it was a problem i don't think anyone would have even noticed
IanVellosaHi Guys, after listening to the TWIT podcast I thought I'd come and play, but I'm having a few issues getting started. I'm trying to use dyndns and host my own server, but I think the http://indieauth.com server is having issues resolving my domain name. Has anyone else tried using dyndns before? Searching the WIKI and googling around, I've not been able to find anything.
IanVellosaHi gRegor, I'm getting an error come back "Error retrieving: http://www.vellosa.com" which makes me think it's not resolving the domain name even
IanVellosaI've also been trying to use a service http://ismywebsiteupnow.com/ which tests the site from a number of locations, and only about a third of them work for me
KartikPrabhugRegor`: re /Getting_Started revision: maybe this "Connect with indieweb experts and pioneers in our chat room" could be changed to suggest "people who have already set indieweb up" or something instead of "experts and pioneers"
LoqiWelcome to news about the IndieWeb where recent notable articles about the IndieWeb are cited and linked to keep you up to date http://indiewebcamp.com/new
danlykeOkay, some choice things to say about people who tag their links rel="alternative" rather than rel="alternate", or don't put semantic information in their links at all and just say "You can read <a href="/atom.xml">my RSS feed</a>" or similar, but the pages from which I could make a good guess at finding RSS and Atom that are linked as participants from http://indiewebcamp.com/irc-people are now checked for inbound links to flutterby.com, and
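[The feed discovery danlyke is doing — collecting <link rel="alternate"> feed URLs, while tolerating the rel="alternative" typo he grumbles about — might be sketched like this (Python stdlib; class and function names are my own):]

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}

class FeedLinkFinder(HTMLParser):
    """Collect feed URLs from <link rel="alternate"> tags, also accepting
    the common rel="alternative" mistake."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rels = set((a.get("rel") or "").lower().split())
        if rels & {"alternate", "alternative"} and a.get("type") in FEED_TYPES:
            # resolve relative hrefs against the page URL
            self.feeds.append(urljoin(self.base_url, a.get("href", "")))

def discover_feeds(html_text, base_url):
    """Parse a page and return absolute feed URLs found in its <link> tags."""
    finder = FeedLinkFinder(base_url)
    finder.feed(html_text)
    return finder.feeds
```

Bare "<a href="/atom.xml">my RSS feed</a>"-style links carry no machine-readable type, which is exactly why they frustrate autodiscovery.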
danlykebarnabywalters: I had to adjust my parser to try Latin-1 as a fallback. But part of my impetus for rewriting the Flutterby.net CMS in C++ is trying to get character set issues right. Between Apache and Perl and PostgreSQL, everything thinks it's an expert...