[gRegorLove] #101 Update validate-h-card
This was a nice problem to have. Had to fix my webmention comments on my posts because people were sharing useful links. All fixed now: https://darn.es/you-should-add-a-generator-tag-to-your-eleventy-site/#comments
<capjamesg> KevinMarks I already expose last_crawled dates on IndieWeb Search. Search for "last_crawled" in this JSON file:
<capjamesg> To access, append &format=results_page_json to the end of any query.
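A minimal sketch of appending that parameter to a query URL. Only `format=results_page_json` comes from the message above; the host name and the `query` parameter in the example are placeholders, not the real IndieWeb Search endpoint.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def with_json_format(url: str) -> str:
    """Return `url` with format=results_page_json set in its query string."""
    parts = urlparse(url)
    # Keep existing parameters, dropping any earlier "format" value.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "format"
    ]
    params.append(("format", "results_page_json"))
    return urlunparse(parts._replace(query=urlencode(params)))


# Hypothetical search URL, used only for illustration.
print(with_json_format("https://search.example/results?query=webmention"))
```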
<capjamesg> I'm not sure if this data is fully backfilled.
<capjamesg> Is a first_crawled helpful?
capjamesg I use https://cloud.google.com/logging/ to capture Bridgy's logs, but there are lots of similar monitoring/observability tools. New Relic, Datadog, Splunk, Honeycomb, etc
a good monitoring tool is worth its weight in gold
<capjamesg> snarfed Do you serve those logs from the brid.gy web interface (i.e. the ones that show whether retrieving a Tweet was successful)?
<capjamesg> Fascinating! That might solve this issue for me!
<capjamesg> How do you store logs in terms of what goes into each file?
it's a service, so it doesn't use a file abstraction. log entries come in many different types and have different metadata attached: project, module, URL for HTTP requests, module and line number for Python log messages, etc. you can query by those
all of these tools are similar
(in Python specifically, GCP attaches a `logging.Handler` that bundles up log messages and sends them to the GCP Logging service over the network to be stored)
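To illustrate the general idea of a handler that bundles messages before shipping them, here is a small sketch using only the standard library. It is not GCP's implementation: `BatchingHandler`, its `send` callable, and the `sent_batches` list standing in for the remote service are all made up for this example.

```python
import logging


class BatchingHandler(logging.Handler):
    """Buffer formatted log messages and pass them to `send` in batches."""

    def __init__(self, send, batch_size=3):
        super().__init__()
        self.send = send              # callable that ships a list of messages
        self.batch_size = batch_size
        self.buffer = []

    def emit(self, record):
        self.buffer.append(self.format(record))
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # Ship whatever is buffered, then start a fresh buffer.
        if self.buffer:
            batch, self.buffer = self.buffer, []
            self.send(batch)


sent_batches = []  # stands in for the remote logging service
logger = logging.getLogger("crawler-demo")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output away from the root logger
logger.addHandler(BatchingHandler(sent_batches.append, batch_size=2))

logger.info("fetched %s", "https://example.com/")
logger.info("parsed %d links", 12)
logger.info("done")  # stays buffered until the next flush
```

A real handler would send each batch over the network instead of appending to a list. The standard library's `logging.handlers.MemoryHandler` offers similar buffering in front of another handler.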
<capjamesg> [snarfed] Wow. This makes logging so much easier.
<capjamesg> Context: I'm planning on making IndieWeb Search crawl logs public.
Ahhhh interesting!
<capjamesg> [snarfed] Are POST requests rate limited?
<capjamesg> Never mind. The issue I was having was caused by instantiating the logger too many times.
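The log doesn't say exactly what went wrong here. One common form of this problem in Python is attaching a new handler every time a logger is set up, so each message is emitted several times. A sketch of configuring a logger once and reusing it (the `get_crawl_logger` helper and the logger name are hypothetical):

```python
import logging


def get_crawl_logger(name="crawl-demo"):
    """Return a logger that is configured once, however often this is called."""
    logger = logging.getLogger(name)  # same object for the same name
    if not logger.handlers:           # attach a handler only the first time
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger


first = get_crawl_logger()
second = get_crawl_logger()
print(first is second, len(first.handlers))  # True 1
```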
<capjamesg> This is going to be really difficult...
<capjamesg> I have a system through which I can query a domain and return logs. But, it takes 7 seconds to retrieve 200 lines of logs from Google Cloud.
capjamesg yup, latency can vary widely. the big thing I did to improve that was a narrow timestamp window
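A sketch of what a narrow timestamp window could look like as a Cloud Logging filter string. The `timestamp` comparisons follow the Logging query language; the `jsonPayload.domain` field and the `narrow_filter` helper are assumptions about how crawl logs might be structured, not something stated in this conversation.

```python
from datetime import datetime, timedelta, timezone


def narrow_filter(domain: str, end: datetime, window: timedelta) -> str:
    """Build a filter limited to `window` before `end` (assumed field names)."""
    end = end.astimezone(timezone.utc)
    start = end - window
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # RFC 3339 timestamps in UTC
    return (
        f'jsonPayload.domain="{domain}" '
        f'AND timestamp>="{start.strftime(fmt)}" '
        f'AND timestamp<="{end.strftime(fmt)}"'
    )


# Illustrative values only.
end = datetime(2021, 11, 1, 12, 0, tzinfo=timezone.utc)
print(narrow_filter("example.com", end, timedelta(hours=1)))
```

The resulting string would then be passed as the filter when listing entries. For a long range such as three months, splitting the range into several small windows and querying each is one possible approach, though whether that is fast enough would need measuring.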
Another thing to look at for performance logging is honeycomb.io
neat, indieweb specs in spambot names
Is this webmention spam? 💁‍♂️🦋
Maybe not spam bot, looks like a twitter account to test gatsby webmention, though the profile URL doesn't work now. https://twitter.com/gtsb_webmention
yeah I noticed that too. also been inactive since 2020
<capjamesg> "Is this webmention spam?" 😂
<capjamesg> [snarfed] Thank you for sharing!
<capjamesg> Based on my intended use case, it might be quite difficult to work with GCP.
<capjamesg> (i.e. see all crawl logs for a domain in the last three months)
<capjamesg> The alternative is for me to store them locally.
<capjamesg> But that limits my capacity to search them without starting an ELK sort of thing 😂
understood! lots of other alternatives too, mentioned above
Thanks for the review, jacky. Updated https://github.com/indieweb/indiewebify-me/pull/101. Think someone with write access also needs to make an approving review, then it can be merged.
[gRegorLove] #101 Update validate-h-card
Soon I want to get back to the migration from Silex (no longer supported) to symfony/flex. I think that will make the code a lot easier to work with, plus Twig templates.
Heh, started on that for the 2019 IndieWeb Challenge. Some things happened since then.
<capjamesg> Has anyone played around with fly.io?
i just migrated a Rails app to fly.io. i also used to use fly.io as a custom domain proxy in front of a glitch.com app (they have since made custom domains a part of glitch so i dropped that)
gRegor I'm taking a look at that indiewebify pr. the whitespace changes at the same time as code changes are making it harder to review 😛
Thanks. There's a setting at the top of the Files tab to ignore whitespace
*Files Changed tab
Or I think query string ?w=1 does it
there's Hide/Show whitespace but not ignore
Sorry, hide is what I meant
reviewed. is that enough for you to move forward with it?
I think so. If it looks good it can be merged and I think it deploys
Just saw your comment. Want me to work on some tests first?
I tried it with a handful of regulars' sites. Not sure I've ever actually run the indiewebify-me test suite though, haha
I think the key is to have tests that exercise each of the if/elseif/else clauses that you changed
phpunit 4.8.7, ouch
e.g. a site with only one h-card, not representative
they're on phpunit 9 now XD
a site with a representative h-card
a site with multiple h-cards, none representative
makes sense
a site with multiple h-cards, with one representative
that way you can test to see if you're seeing the expected messages that you wrote code for in each of those cases
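The four cases listed above can be sketched as a table-driven test. This is in Python purely for illustration (the project itself uses PHP and phpunit), and `h_card_message` with its message strings is a hypothetical stand-in for the validator's branching, not the code from the PR.

```python
import unittest


def h_card_message(total: int, representative: int) -> str:
    """Hypothetical stand-in for the validator's if/elseif/else logic."""
    if total == 0:
        return "no h-card found"
    elif total == 1 and representative == 0:
        return "one h-card, not representative"
    elif total == 1:
        return "representative h-card found"
    elif representative == 0:
        return "multiple h-cards, none representative"
    else:
        return "multiple h-cards, one representative"


class HCardMessageTest(unittest.TestCase):
    # (total h-cards, representative h-cards, expected message)
    CASES = [
        (1, 0, "one h-card, not representative"),
        (1, 1, "representative h-card found"),
        (3, 0, "multiple h-cards, none representative"),
        (3, 1, "multiple h-cards, one representative"),
    ]

    def test_each_branch(self):
        for total, representative, expected in self.CASES:
            with self.subTest(total=total, representative=representative):
                self.assertEqual(h_card_message(total, representative), expected)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(HCardMessageTest)
result = unittest.TextTestRunner().run(suite)
```

One row per branch keeps it obvious which case broke when a message changes.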
Yeah, I tested on my copy with the links in the first post for those scenarios, but would be good to have them in the bundled tests
my guess is you can find most of those examples among our existing indieweb sites
