corlaez429 are, I think, impossible to implement for static websites. What if we take a more aggressive stand and just block Google altogether until they change their crawling decisions? (maybe never, but I feel everyone can implement this and it could create more buzz)
@ton_zylstra↩️ that was also the point: make WP compliant with microformats2 and the classes for post kinds, so that themes and blocks can work with them by default. If you then turn on Webmention, suddenly 40% of the web is an open social platform, outside the silos. (twitter.com/_/status/1570685363835863045)
jjuran, gRegorLove_, mro_ and mro joined the channel
[tonz][snarfed] wrt ‘it may add up to significant bandwidth for big sites’, an example given was a single-page WordPress site w 4 files, getting single-digit visitors per week, content not changing, still getting 600K hits from bots and crawlers per month. In part because of all those unnecessary URLs (and crawlers being wasteful themselves). A big site saw Google's crawler hit their site every 2 seconds to index the whole thing. So it’
jordemort(moving from #indieweb) i'm working on a client-side search engine for my static site using sql.js and had the thought: what if there was something like <link rel="sqlsite" href="/path/to/sqllite.db" /> that served up a sqlite database with all of a site's posts indexed in some sort of agreed-upon schema? (prolly based on h-entry)
jordemortthen it'd be easy to build some "standard" client-side search javascript, or CLI tools to search sites that implemented it, or metasearch engines where you could pick sets of sites to search together
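A rough sketch of what such a shared index could look like, using Python's built-in sqlite3 module rather than sql.js. The `entries` table and its columns are purely hypothetical (no agreed-upon schema exists); they are only loosely modelled on h-entry property names, and `build_index` and `search` are illustrative helpers.

```python
import sqlite3

# Hypothetical schema: one row per post, columns named after h-entry properties.
SCHEMA = """
CREATE TABLE IF NOT EXISTS entries (
    url       TEXT PRIMARY KEY,  -- u-url
    name      TEXT,              -- p-name
    published TEXT,              -- dt-published as an ISO 8601 string
    content   TEXT               -- e-content flattened to plain text
);
"""


def build_index(conn, posts):
    """Create the (assumed) table and load posts given as dicts."""
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT OR REPLACE INTO entries (url, name, published, content) "
        "VALUES (?, ?, ?, ?)",
        [(p["url"], p["name"], p["published"], p["content"]) for p in posts],
    )
    conn.commit()


def search(conn, term):
    """Substring match on name and content, newest first.

    Note: % and _ inside `term` act as LIKE wildcards in this sketch.
    """
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT url, name FROM entries "
        "WHERE name LIKE ? OR content LIKE ? "
        "ORDER BY published DESC",
        (pattern, pattern),
    )
    return rows.fetchall()
```

The same SQL should work unchanged from sql.js in the browser, since the database file is the shared artifact; a metasearch tool would just run `search` against several sites' files.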
[schmarty]1federated and cross-site search, where each site hosts their own search and some tool makes multiple requests and aggregates results, has come up a few times but it has a lot of variables and i don't know that anyone has made a real go of it.
jordemortunrelated, except that i'm doing my indexing by parsing my mf2 metadata: did i miss it or is there no standard way to mark up tags/categories in h-entry?
Loqitags or tagging refers to categorizing or labeling content, your own or others' (tag-reply), with words, phrases, names, or other information, optionally linked to specific people, events, locations, such as the practice of tagging posts being about certain people (person-tag), like tagging people or other items where (area-tag) they're depicted in a photo https://indieweb.org/tags
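On the tags question: h-entry does define a `p-category` property for tags/categories. A minimal sketch of pulling those values out of a page with only the standard library follows; `CategoryParser` and `extract_categories` are hypothetical names, and this simply matches on the class name, so a full microformats2 parser would be the safer choice for real indexing.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class CategoryParser(HTMLParser):
    """Collects the text content of every element with class "p-category"."""

    def __init__(self):
        super().__init__()
        self.categories = []
        self._depth = 0    # > 0 while inside a p-category element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1   # nested element inside a category
            return
        classes = (dict(attrs).get("class") or "").split()
        if "p-category" in classes:
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.categories.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def extract_categories(html):
    parser = CategoryParser()
    parser.feed(html)
    return parser.categories
```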
angelo_re: <lastmod> vs ETag/Last-Modified; one request to a sufficiently marked up sitemap will allow a bot to pinpoint the four documents out of thousands that need to be re-crawled. conditional requests (If-None-Match / If-Modified-Since) still require every document to be hit, even if each response is just a 304.
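The sitemap approach can be sketched roughly like this: one fetch of the sitemap, then compare each `<lastmod>` against the time of the previous crawl. `urls_changed_since` is a hypothetical helper, and treating entries without `<lastmod>` as changed is an assumption of this sketch, not something the sitemap protocol requires.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def _parse_lastmod(value):
    """Parse a date or date-time <lastmod>; naive values are assumed UTC."""
    dt = datetime.fromisoformat(value.strip().replace("Z", "+00:00"))
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt


def urls_changed_since(sitemap_xml, last_crawl):
    """Return the <loc> values whose <lastmod> is newer than last_crawl.

    Entries with no <lastmod> are returned too (assumption: unknown
    freshness means the crawler has to check them).
    """
    root = ET.fromstring(sitemap_xml)
    changed = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", namespaces=SITEMAP_NS)
        if loc is None:
            continue
        lastmod = url.findtext("sm:lastmod", namespaces=SITEMAP_NS)
        if lastmod is None or _parse_lastmod(lastmod) > last_crawl:
            changed.append(loc.strip())
    return changed
```

Only the URLs returned here would need a follow-up request, instead of one conditional request per document.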
[snarfed]jordemort adoption is usually the biggest challenge with any idea like this. there's a big established ecosystem around the existing adopted standards (HTML etc) and big established search engines. they may not be perfect, but they're fully adopted
[snarfed]new ideas like this will struggle to get more than a few sites to adopt them at the beginning, so the resulting search engines' indexes will be unusably incomplete, so people won't use them much, so other publishers won't be incentivized to adopt
[snarfed](and this is all still just considering centralized search. federated search, ie send the query to all/many nodes and compile the results, I don't even know how to begin thinking about, so much of it seems so intractable. I'm honestly curious how the fediverse does it, if at all)
@wikipediachain↩️ XSL > Web Ontology Language > Semantic HTML > RDF/XML > DOAP > Rule-based system > Simple Knowledge Organization System > Agora (web browser) > IndieAuth > XPointer > XSL > WebXR > Oculus Rift S > Hack (programming language) > List of mergers and acquisitions by Meta Platforms (twitter.com/_/status/1570825778585010178)
angeloso i just began consuming opensearch files; see https://indieweb.rocks/adactio.com left column below his card; if you try a search you'll see that you wind up on his site and that the results are marked up
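For anyone unfamiliar with consuming OpenSearch files, a minimal sketch: read the description document, find the `Url` element for HTML results, and fill in its template. `search_url` is a hypothetical helper and the example.com template in the usage note is made up; only the `{searchTerms}` parameter and the `{name?}` optional-parameter syntax come from the OpenSearch 1.1 description format.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import quote_plus

OPENSEARCH_NS = {"os": "http://a9.com/-/spec/opensearch/1.1/"}


def search_url(description_xml, terms, result_type="text/html"):
    """Build a search URL from an OpenSearch description document.

    Returns None when no Url element of the requested type is present.
    """
    root = ET.fromstring(description_xml)
    for url in root.findall("os:Url", OPENSEARCH_NS):
        template = url.get("template")
        if url.get("type") == result_type and template:
            filled = template.replace("{searchTerms}", quote_plus(terms))
            # Optional parameters such as {startPage?} are replaced
            # with an empty string in this sketch.
            return re.sub(r"\{[^{}]*\?\}", "", filled)
    return None
```

With a description whose template is `https://example.com/search?q={searchTerms}`, calling `search_url(xml, "indie web")` yields a URL on the site itself, which matches the behaviour described above where the search lands on the author's own results page.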
angelothat said, they do have sitemaps with <lastmod>: https://indieweb-test.tumblr.com/sitemap1.xml ; while i expect tumblr users don't do much updating of old posts, i feel confident that finding the few that do would be doable just by watching the sitemaps
[KevinMarks]Also I need to write tests for the different post and composite cases as I think I may have some of them wrong. Then we can evangelize other theme authors once we show value
jacky, tetov-irc, nertzy, gRegor and geoffo joined the channel