#dev 2022-06-10

2022-06-10 UTC
jeremycherfas, jjuran, gRegor, cjw6k_, mro and gRegorLove_ joined the channel
#
@fakebaldur
“The Spam Has Arrived - Miriam Eric Suzanne” Trackbacks, pingbacks, webmentions—they all seem to handwave their way through the spam problem. https://www.miriamsuzanne.com/2022/06/09/the-spam/
(twitter.com/_/status/1535201942254891009)
crimsonkinda and tetov-irc joined the channel
#
[KevinMarks]1
well, we knew it was an issue
#
[aciccarello]
When I get webmentions working on my site, I'm probably going to do an allowlist.
#
[aciccarello]
It does appear CSS tricks gets pulled into a lot of content-stealing sites
#
[aciccarello]
I didn't see in that article if the spam was pingbacks or webmention
#
[aciccarello]
I can't find the developer who I'm thinking of who also had an issue when they got featured on css tricks
#
sknebel
I would assume pingbacks, but of course the issue is the same fundamentally
#
[aciccarello]
Interesting to see Chris Coyier's take on scrapers
#
crimsonkinda
does it mention non-profit versus for-profit scraping?
#
IWDiscordGateway
<capjamesg> crimsonkinda Can you give an example of non-profit scraping?
#
crimsonkinda
Yesterday there was chatter about Beautiful Soup
#
crimsonkinda
the library is under some rights reserved and a Creative Commons license
#
crimsonkinda
the "Beautiful Soup to scrape a Chinese medical site for information about COVID-19, making it easier for researchers to track" information around ... that
#
[aciccarello]
I think CSS tricks is talking about scrapers who just steal blog posts and display it on their site for seo/advertising money.
#
crimsonkinda
it's strange given it's web design and called CSS
#
crimsonkinda
it's not really css zen garden is it
#
Loqi
[memorable] Beautiful Soup
#
crimsonkinda
really there shouldnt be any problem because we should have Email to communicate in a polite manner any problems clearly...
#
crimsonkinda
scrape is a term used because often the HTML is ... often written quickly
#
crimsonkinda
CSS Zen Garden is a great resource for re-thinking it correctly
mro joined the channel
#
Loqi
capjamesg has 34 karma in this channel over the last year (85 in all channels)
#
sknebel
capjamesg++
#
Loqi
capjamesg has 35 karma in this channel over the last year (86 in all channels)
#
sknebel
ok but nothing there has anything to do with the content-stealing sites that were the topic of the discussion
#
crimsonkinda
sknebel nice UX of a chat bot
#
IWDiscordGateway
<capjamesg> sknebel I wondered why I just got Karma there 😂
#
IWDiscordGateway
<capjamesg> crimsonkinda This is interesting.
#
crimsonkinda
describe content-stealing ... it should be clear that if content is online, it can be cited correctly and then linked back. The original author should be able to email and issue a soft-take-down of content
#
sknebel
and copying content wholesale is not "citing"
#
crimsonkinda
https://www.citationpod.com/ .... this is how people see it
#
crimsonkinda
dont follow
#
crimsonkinda
the Chat is about, "dealing-with-content-scrapers" from the URL
#
crimsonkinda
alternate solutions include using a log-in screen...
#
crimsonkinda
to make it clear the data requires a "subscription" membership
#
crimsonkinda
accounting of the browse -- but i dont know your chat
#
crimsonkinda
about copying content wholesale or the content issues you are having
#
[schmarty]
crimsonkinda: it's not reasonable to expect the operators of CSS tricks to contact sites that republish their content without permission.
#
crimsonkinda
User Login has been spoken about before on: https://github.com/ory/kratos
#
Loqi
[nottennis] Show HN: Ory Kratos – Open-source identity server written in Go
#
Loqi
[ory] kratos: Next-gen identity server (think Auth0, Okta, Firebase) with Ory-hardened authentication, MFA, FIDO2, profile management, identity schemas, social sign in, registration, account recovery, passwordless. Golang, headless, API-only - without templating or theming headaches. Available as a cloud service.
#
crimsonkinda
goes on about permissions
#
crimsonkinda
from alternate log-ins such as Google
#
[schmarty]
crimsonkinda: scraping prevention as a technical issue is unrelated to republishing of content without permission. Publicly available content is still protected by copyright
#
[schmarty]
Regardless, this discussion is getting out of scope. The more IndieWeb-relevant issue is how Miriam can deal with the unwanted mentions from sites republishing the CSS Tricks article
#
aaronpk
This happens to me when people write a blog post that links to OAuth.net, then I get a flood of notifications of spam blogs that copy the post
#
crimsonkinda
and notifications is something you can turn off or ignore?
#
aaronpk
I mostly just ignore them
#
aaronpk
what I should be doing is blocking those domains from future notifications
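
A minimal sketch of the domain blocking described above, assuming a webmention receiver that checks each source URL before sending a notification. The `BLOCKED_DOMAINS` set and the function names are hypothetical, and the domains are placeholders rather than real spam sites. The allowlist approach mentioned earlier in the discussion would be the same check inverted.

```python
from urllib.parse import urlparse

# Hypothetical blocklist; placeholder domains, not real spam sites.
BLOCKED_DOMAINS = {"spam-blog.example", "scraper.example"}


def source_domain(source_url: str) -> str:
    """Return the lowercase hostname of a webmention source URL ("" if none)."""
    host = urlparse(source_url).hostname or ""
    return host.lower()


def is_blocked(source_url: str, blocked=BLOCKED_DOMAINS) -> bool:
    """True if the source host is a blocked domain or a subdomain of one."""
    host = source_domain(source_url)
    return any(host == d or host.endswith("." + d) for d in blocked)


if __name__ == "__main__":
    for url in ("https://spam-blog.example/copied-post",
                "https://aaronparecki.com/notes"):
        print(url, "blocked" if is_blocked(url) else "ok")
```

Matching on the hostname suffix with a leading dot catches subdomains without also matching unrelated hosts that merely end in the same characters.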
#
crimsonkinda
i sympathize with that situation... it can be difficult
#
aaronpk
Is anyone maintaining a list of these kinds of spam sites? Seems like akismet might have a list already
#
[aciccarello]
I remember seeing a list from someone else who had an issue after posting on css-tricks but I can't find it 😞
cjw6k, AramZS and mro joined the channel
#
@TerribleMia
The other day, I posted an article about implementing webMentions on my site. Today, I’m battling a constant stream of spam in my mentions. My simple precautions for display have had one false positive, and one false negative so far… https://www.miriamsuzanne.com/2022/06/09/the-spam/
(twitter.com/_/status/1535279368741724161)
mro joined the channel
#
crimsonkinda
going back to the "indieweb yet" https://www.miriamsuzanne.com/2022/06/04/indiweb/
#
crimsonkinda
can social platforms and personal sites be different answers to different problems
#
Loqi
[Miriam Eric Suzanne] Am I on the IndieWeb Yet?
mro joined the channel
#
omz13
Are there any sites that generate feeds per the jf2feed+json spec?
mambang[m], mro, gRegor and jacky joined the channel
#
jacky
putting the finishing touches on this post about changes I made to my site
#
jacky
mainly about implementing some of the fields at https://github.com/indieweb/micropub-extensions/issues/4#issuecomment-412349552= (found it potentially useful for generating feeds)
#
Loqi
[grantcodes] I've updated my implementation so I support a bunch of new features expanding on what @EdwardHinkle already mentioned When using `q=source` there is now a lot of other parameters available: - `post-type=type` - Filters the returned posts to on...
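
A rough sketch of how a client might build the filtered query quoted above, assuming the endpoint accepts `post-type` alongside `q=source` as that comment describes. The endpoint URL and the `source_query_url` helper are hypothetical, and a real request would also need an access token.

```python
from urllib.parse import urlencode


def source_query_url(endpoint: str, post_type: str) -> str:
    """Build a q=source query URL filtered by post type.

    `endpoint` is a hypothetical Micropub endpoint URL; `post-type` is the
    filter parameter named in the quoted comment, not a core Micropub field.
    """
    params = {"q": "source", "post-type": post_type}
    # Append to an existing query string if the endpoint already has one.
    separator = "&" if "?" in endpoint else "?"
    return endpoint + separator + urlencode(params)


if __name__ == "__main__":
    print(source_query_url("https://example.com/micropub", "note"))
```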
#
gRegor
what is jf2feed?
#
Loqi
It looks like we don't have a page for "jf2feed" yet. Would you like to create it? (Or just say "jf2feed is ____", a sentence describing the term)
#
gRegor
do you mean jsonfeed, omz13?
jacky joined the channel
#
gRegor
what is jf2
#
Loqi
jf2 is a W3C Note and a JSON Post Serialization Format of microformats2 that is optimized for h-entry consuming code, as compared to the standard microformats JSON representation https://indieweb.org/jf2
#
[snarfed]
hmm jf2feed docs were at https://dissolve.github.io/jf2/#jf2feed, but not any more...?
#
Loqi
[snarfed] #109 support JF2
mro joined the channel
#
gRegor
Ah! :)
#
gRegor
Updated on /jf2feed
#
omz13
it's annoying that the names of things are confusing... jsonfeed is not jf2feed
#
gRegor
Aye, figured that out now. I had forgotten about jf2
#
gRegor
naming is hard, haha
jacky and [grantcodes] joined the channel
#
[grantcodes]
Still really love that query filtering idea [jacky], there's a lot of potential with it that has yet to be fulfilled
#
jacky
me too! I've kept it in the corner of my mind and I def see it being useful in social readers (or even if someone's signed into someone else's site, could use it to find what was linked to by the visitor)
tbbrown and jamietanna joined the channel
#
[manton]
It would be nice to merge JSON Feed and JF2 at some point. JF2 is used more in APIs, JSON Feed more in traditional feed files.
#
[manton]
I’m not aware of any JF2 feeds in the wild.
jacky joined the channel
#
[snarfed]
see now you're just showing off
#
[manton]
I stand corrected! 🙂
jacky joined the channel
#
omz13
aaronpk your notes.jf2 has "items" instead of "children" to contain the entries?!
#
aaronpk
it's a feed, not an item with children
#
aaronpk
children is only for child objects of another object
#
aaronpk
same reason the top-level property of a parsed microformats feed is "items"
#
[KevinMarks]1
This is the tension between microformats (which are arbitrarily nestable) and "feeds" which are linear
#
omz13
jf2feed uses "children" not "items"
#
omz13
feed (collection?) uses "items"
#
omz13
my head hurts!
#
aaronpk
if the top-level item were an h-feed then it would have children
#
aaronpk
this is true with microformats json even before simplifying it to jf2
#
omz13
the jf2 spec talks "children" not "items"
jacky joined the channel
#
aaronpk
the example here is a feed item with children https://jf2.spec.indieweb.org/#jf2feed_example
#
aaronpk
mine doesn't follow that, it's just a bunch of entries
#
aaronpk
that's the same difference between tantek.com and aaronparecki.com, tantek's is an h-card with child items, mine is just a bunch of items
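
To make the two shapes in this exchange concrete, here is a small sketch of a consumer that accepts either one: a feed object whose entries sit under "children" (as in the spec example linked above) or a flat document with a top-level "items" list. The sample documents and the `feed_entries` helper are illustrative assumptions, not taken from either site.

```python
def feed_entries(doc: dict) -> list:
    """Return the entry objects from a JF2-style document.

    Prefers "children" (a feed object with child entries) and falls back
    to a top-level "items" list (a flat list of entries).
    """
    for key in ("children", "items"):
        value = doc.get(key)
        if isinstance(value, list):
            return [entry for entry in value if isinstance(entry, dict)]
    return []


# Illustrative sample data only.
nested = {"type": "feed", "name": "Notes",
          "children": [{"type": "entry", "content": "hello"}]}
flat = {"items": [{"type": "entry", "content": "hello"}]}

if __name__ == "__main__":
    print(feed_entries(nested) == feed_entries(flat))
```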
#
omz13
you have a rel=application/jf2feed+json and then return something that isn't quite what I'm expecting
#
aaronpk
clearly we need a jf2 feed validator
#
omz13
I have something that analyzes a body and tries to make sense of what is there... guess I'll have to add some jf2feed sniffing/linting to it
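
One possible shape for the sniffing/linting mentioned here, as a hedged sketch: it only checks the points raised in this conversation (a top-level "type" of "feed", and entries under "children" rather than "items"). The `lint_jf2feed` name and the warning texts are made up for illustration, and this is nowhere near a full validator.

```python
def lint_jf2feed(doc: dict) -> list:
    """Return a list of human-readable warnings for a parsed jf2feed body."""
    warnings = []
    if doc.get("type") != "feed":
        warnings.append('top-level "type" is not "feed"')
    if "items" in doc and "children" not in doc:
        warnings.append('entries are under "items", expected "children"')
    children = doc.get("children", [])
    if not isinstance(children, list):
        warnings.append('"children" is not a list')
        children = []
    for index, child in enumerate(children):
        if not isinstance(child, dict) or "type" not in child:
            warnings.append(f'child {index} has no "type"')
    return warnings


if __name__ == "__main__":
    for warning in lint_jf2feed({"items": [{"type": "entry"}]}):
        print("warning:", warning)
```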
#
omz13
Some of its analysis/sniffing is exposed in toolbox under fetch public resource
Jamietanna1 joined the channel
#
[grantcodes]
I also have JF2 feed. Quite likely to be invalid though 😅 https://backend.grant.codes/micropub/plugin/feeds/jf2
#
[snarfed]
do any of our protocols use jf2feed?
#
[snarfed]
JF2 itself seems important, it's used in microsub etc. otherwise though, I wonder how important jf2feed is, if we already have h-feed (and JSONFeed)
jacky joined the channel
#
gRegor
Yeah I was wondering what the uses were
#
[tantek]
I think it's a chance to simplify and converge jf2 & jsonfeed
#
[tantek]
simplifying is a use-case
#
[tantek]
(less work for more people)
tetov-irc joined the channel