Hardening dev3lop.com With a Fleet of Agents

How we hardened dev3lop.com with a coordinated fleet of specialized AI agents — a shared-memory brain, area locks, adversarial verification, and re-runnable scripts — plus the bug-hunting prompts behind each one.

dev3lop Team

[Image: Dev3lop's Tigerblade mascot leading a fleet of autonomous AI coding agents that hardened dev3lop.com]

dev3lop.com scores 100 across the board on Lighthouse. It is also a site that was dragged through two full migrations — WordPress to React, then React to Astro — and every migration leaves a basement. Underneath the perfect scores sat dead image hotlinks, a redirect file nobody could regenerate, serverless functions that only pretended to store data, and a small chat backend on a $6 droplet whose nightly backup had never once run.

We cleaned a lot of that up in a single sprint. The interesting part is not the bug list — it is how the work got done. We did not point one general-purpose agent at the repo and hope. We ran a coordinated fleet of specialized agents, each with a narrow domain, a shared memory, and a hard rule that nothing happens off the record.

This post is the field report: how the fleet is wired, the one habit that saved us from shipping confident nonsense, what actually shipped, and — for each agent — the exact bug-hunting prompts you can copy and run against your own codebase.

A site that scored 100s and still had a basement

The headline numbers hid the debt. Some of what was actually true:

  • 351 blog images were hot-linked from the old wp-content paths. WordPress is gone, so every one of them 404’d in production — broken-image icons scattered across years of posts.
  • The relay chat backend let any signed-in user read and modify any room’s boards and cards by guessing an id, and its WebSocket upgrades did no origin check at all, so any website could open an authenticated socket to it.
  • The droplet’s nightly SQLite backup had never produced a single file — it ran sqlite3 inside a container that has no sqlite3, against a filename mangled by a Terraform escaping bug.
  • The contact form’s API endpoint returned hardcoded mock data, and a 4,030-line redirect file was hand-edited with no job anywhere that could regenerate it.

None of that shows up in a performance score. It shows up at 3am, or in a security report, or when a customer clicks a dead link.

The shape of the work: a fleet, a brain, and no ghost commands

Three design choices made the sprint work.

One narrow agent per area

Instead of one agent that “knows the whole repo,” we defined six, each owning a slice it is allowed to touch: site-hardener (the Astro site), relay-hardener (the chat backend), api-hardener (the Netlify functions), infra-hardener (the droplet and deploy pipeline), debt-sweeper (repo cruft and the redirect pipeline), and seo-ranger (rankings). Each has a charter that spells out its scope and, just as importantly, what it must hand off to another agent rather than touch itself.

A shared brain

The agents run in separate sessions and would otherwise have amnesia between them. So they share a small brain: a local SQLite database with full-text search and a sqlite-vec vector index over an embedding model, holding architecture notes, a task backlog, an activity log, and a SERP-rank table. A new agent session starts by reading the brain — task list for its area, note search for anything a past agent learned — claims a task, and logs what it did. Findings become durable instead of evaporating when a session ends.

Area locks so agents never collide

Because more than one session can run at once, each agent acquires a lock on its area before it works. If relay-hardener holds area:relay, a second relay session that boots up gets told the area is taken and stops. Cross-cutting files — the root package.json, netlify.toml — take an extra named lock while they are edited. Two agents never stomp the same file.
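The locking rule above can be sketched in a few lines. This is a minimal in-memory illustration, assuming a hypothetical AreaLocks class and made-up session ids; the post does not say how the fleet stores its locks, so none of these names should be read as the repo's own.

```typescript
// Hypothetical in-memory lock table; all names here are illustrative.
type LockResult = { ok: true } | { ok: false; heldBy: string };

class AreaLocks {
  // lock name (for example "area:relay") -> session id holding it
  private holders = new Map<string, string>();

  acquire(lock: string, sessionId: string): LockResult {
    const heldBy = this.holders.get(lock);
    if (heldBy !== undefined && heldBy !== sessionId) {
      // The area is taken: the second session is told who holds it and stops.
      return { ok: false, heldBy };
    }
    this.holders.set(lock, sessionId);
    return { ok: true };
  }

  release(lock: string, sessionId: string): boolean {
    // Only the current holder may release the lock.
    if (this.holders.get(lock) !== sessionId) return false;
    this.holders.delete(lock);
    return true;
  }
}

const locks = new AreaLocks();
const first = locks.acquire("area:relay", "session-a");
const second = locks.acquire("area:relay", "session-b");
```

Here `first` succeeds and `second` is refused with `heldBy` set to "session-a". A cross-cutting file such as netlify.toml would simply be one more named lock taken on top of the area lock.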

No ghost commands

Every shell command an agent runs is written to a re-runnable script first, under a versioned fablescripts/ directory, before it executes. There are no invisible one-off commands. Anyone can read exactly what touched the repo and run it again. The image recovery, the SERP checker, the backup job, the live-site probes — all of it is on disk as scripts, not lost in a terminal scrollback.

[Image: a dev3lop hardening agent in a watchful, ready stance, representing the adversarial verification pass that tries to refute every finding before it ships]

Trust, but verify your own agents

Here is the habit that mattered most. An agent that reads code and reports a bug is confident by construction — it found a thing, it wrote it up, it sounds sure. That confidence is not evidence. So before any high-severity finding entered the plan, a second wave of agents was sent in with one instruction: try to refute this.

That adversarial pass paid for itself immediately. Several of the scariest-sounding findings were wrong:

  • “React is dead weight, remove it”: refuted. Five pages server-render shared components from the workspace UI package; ripping React out would have broken the build. (No client-side React ships, so it costs the visitor nothing; the original finding had the cost backwards.)
  • “The contact form is silently losing submissions”: refuted. The contact page has no form at all; it links out to an external app. Nothing was being lost.
  • “Nightly backups are filling the disk”: wrong, and wrong in the more dangerous direction. The backups were not filling the disk; they had never run. The real bug was worse than the reported one.

The lesson generalizes: a single agent’s finding is a hypothesis, not a fact. Make a different agent attack it before you act. We caught three confident, plausible, completely wrong conclusions in one pass — and found a worse bug hiding behind one of them.

What actually shipped

By the end of the sprint, live in production and verified against the real site and API:

  • Relay authorization + WebSocket origin fix. Board and card routes now resolve their owning room and run a real access check; WebSocket upgrades reject foreign origins. Regression tests cover both. Verified live: the endpoints stay gated and the backend came back healthy after the deploy.
  • 351 dead blog images recovered. 198 pulled back from the Wayback Machine into local assets; the 153 that were gone for good had their broken references removed. Zero wp-content image tags remain anywhere in the built site.
  • A backup that finally runs. The nightly job was rewritten to run host-side, back up the live SQLite database WAL-safely, verify its own output with an integrity check, gzip it, prune old copies, and leave a loud marker if it ever fails.
  • A modern, balanced footer. The old footer collapsed to a lopsided single column; the new one is genuinely multi-column at every width, with a call-to-action band and a tidy contact block — without changing the site’s look.

Every change went out as a focused commit, and a production poller confirmed each one landed. Now, the fleet itself.

[Image: a single specialized AI agent scanning its slice of the codebase, coordinated with the rest of the fleet through shared memory and area locks]

The fleet — and how each agent becomes tooling you can run

Each agent below gets the same treatment: what it owns, what it found this sprint, how it generalizes into a reusable tool you could run on your own repo, and a set of copy-paste bug-hunting prompts grounded in real code paths. The prompts are the useful part — point them at the equivalent files in your stack.

site-hardener — the Astro marketing site

Keeps the Astro marketing site fast, indexable, and migration-clean — broken assets, SEO regressions, content-collection drift, and CSP gaps — without ever regressing the synthwave look.

The site-hardener owns apps/site plus the shared packages/ui and packages/seo, and its biggest win this sprint was recovering migration rot the WordPress -> React -> Astro move left behind: 351 dead wp-content blog images that 404’d in production. fablescripts/site/recover_wp_images.mjs walks every .md/.mdx file under apps/site/src/content (blog, services, work, agents — not just blog) for ![alt](https://dev3lop.com/wp-content/uploads/...) image syntax, asks the Wayback availability API for the closest snapshot, downloads the raw <ts>id_ form (no toolbar wrapper), and saves it under apps/site/public/images/blog/ — 198 recovered, 153 unrecoverable. The companion rewrite_image_refs.mjs then applies the .image-recovery.json map to the markdown: recovered URLs get rewritten to their local /images/blog/... path, the 153 unrecoverable ones get their image token stripped (it only rendered a broken-image icon). It also shipped a modern multi-column apps/site/src/components/Footer.astro (CTA band, link columns, contact block, accessible rating link) without touching the look. What it left open and flagged: duplicate blog slugs (the -2-suffixed files plus the collision notes already sitting in TODO-REDIRECTS.md), dead wp-content hyperlinks (the recovery pass only fixed ![]() image syntax, not []() links — those still point at 404s), and the CSP in the repo-root netlify.toml still carrying 'unsafe-inline' on script-src and style-src. It’s worth noting apps/site carries genuine duplication of its own: BaseLayout.astro hand-rolls its <head> meta/OG/Twitter/canonical block instead of reusing packages/seo’s SEOHead.astro, so the two drift (BaseLayout lacks og:image:width/height, hreflang alternates, article:* tags, and the auto-built Organization+LocalBusiness JSON-LD).
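The Wayback lookup step described above can be sketched as two pure functions. This is not the code in recover_wp_images.mjs; the function names, the response interface, and the sample image URL are assumptions made for illustration. It relies only on the public availability endpoint and the `id_` raw-snapshot URL form that the summary mentions.

```typescript
// Subset of the Wayback availability response that this sketch reads.
interface AvailabilityResponse {
  archived_snapshots?: {
    closest?: { available: boolean; timestamp: string; status: string };
  };
}

// Build the availability lookup URL for an original asset URL.
function availabilityUrl(originalUrl: string): string {
  return `https://archive.org/wayback/available?url=${encodeURIComponent(originalUrl)}`;
}

// Return the raw snapshot URL (the "id_" form, no toolbar wrapper),
// or null when no usable snapshot exists.
function rawSnapshotUrl(originalUrl: string, res: AvailabilityResponse): string | null {
  const closest = res.archived_snapshots?.closest;
  if (!closest || !closest.available || closest.status !== "200") return null;
  return `https://web.archive.org/web/${closest.timestamp}id_/${originalUrl}`;
}

// Hypothetical image URL and response, for illustration only.
const sampleImage = "https://dev3lop.com/wp-content/uploads/2020/01/example.png";
const sampleHit = rawSnapshotUrl(sampleImage, {
  archived_snapshots: { closest: { available: true, timestamp: "20200115000000", status: "200" } },
});
const sampleMiss = rawSnapshotUrl(sampleImage, { archived_snapshots: {} });
```

A null result corresponds to the "unrecoverable" bucket: the image reference gets stripped rather than rewritten.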

As reusable tooling. site-hardener generalizes cleanly because every Astro/Next marketing site that survived a CMS migration carries the same debt: dead asset hotlinks, slug collisions, hand-rolled <head> tags that drift from the SEO source of truth, a hand-edited redirect map that never runs in CI, and a CSP loosened to 'unsafe-inline' that never got tightened back. Packaged as a marketplace template, a team points it at their own repo and it runs the same passes — content-collection schema validation, Wayback asset recovery, duplicate-slug and canonical audits, redirect-map linting, CSP tightening — emitting a portable findings report plus re-runnable scripts under fablescripts/, and it self-checks that no page’s visual appeal regresses before claiming a fix.

Bug-hunting prompts for this domain:

Dead wp-content hyperlinks (not just images)

The Wayback recovery in fablescripts/site/recover_wp_images.mjs only matched markdown IMAGE syntax ![alt](...wp-content...) (its regex is /!\[[^\]]*\]\((https:\/\/dev3lop\.com\/wp-content\/uploads\/[^)\s]+)\)/g). Grep all of apps/site/src/content/**/*.{md,mdx} for markdown LINK syntax ](https://dev3lop.com/wp-content/...) and bare <a href> pointing at wp-content, /wp-admin, ?p=<id>, /wp-json, or /feed/. For each target, check whether it resolves against a rule in apps/site/public/_redirects or an existing route; report every link that falls through to the /* /404.html 404 catch-all at line 4030. These are dead outbound links the image pass never touched.
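As a starting point for that grep, here is a small sketch that separates markdown links from markdown images. The function name findWpContentLinks is hypothetical, and the pattern is a loosened variant of the image regex quoted above, not something taken from the repo. It does not handle bare `<a href>` tags or images nested inside links.

```typescript
// Collect wp-content URLs that appear in LINK syntax [text](url),
// skipping IMAGE syntax ![alt](url), which the recovery pass already handled.
function findWpContentLinks(markdown: string): string[] {
  // Group 1 captures an optional leading "!", group 2 the target URL.
  const pattern = /(!?)\[[^\]]*\]\((https:\/\/dev3lop\.com\/wp-content\/[^)\s]+)\)/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    if (match[1] === "") found.push(match[2]); // no "!" prefix means a plain link
  }
  return found;
}

// Hypothetical snippet of post content.
const sampleMarkdown =
  "![chart](https://dev3lop.com/wp-content/uploads/a.png) and " +
  "[the PDF](https://dev3lop.com/wp-content/uploads/b.pdf)";
const deadLinkCandidates = findWpContentLinks(sampleMarkdown);
```

Each URL returned is only a candidate; it still has to be checked against the redirect rules and existing routes before it is reported as dead.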

Duplicate blog slugs and canonical collisions

apps/site/src/pages/blog/[slug].astro builds paths in getStaticPaths from entry.slug (params: { slug: entry.slug }) and sets canonical: https://dev3lop.com/blog/${entry.slug}/. Across the 945 files in apps/site/src/content/blog/, find slug collisions: the ~11 filenames carrying a trailing numeric suffix (e.g. ...-2.md), any frontmatter that would derive the same slug as another file, and the collision cases already noted in TODO-REDIRECTS.md. Report any two entries that would emit the same /blog/<slug>/ URL or canonical, and whether apps/site/public/_redirects already 301s one onto the other.
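A rough first pass at the filename side of this audit might look like the sketch below. It assumes, for illustration only, that the slug is the filename minus its extension; the real derivation in the content collection may differ, and the function name and sample filenames are made up. Stripping a trailing numeric suffix produces candidates, not proof: a legitimate slug that ends in a number will also be grouped and needs a manual look.

```typescript
// Group content filenames by a normalized base slug and return groups
// with more than one member as collision candidates.
function findSlugCollisionCandidates(files: string[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const file of files) {
    const base = file
      .replace(/\.(md|mdx)$/, "") // drop the extension
      .replace(/-\d+$/, ""); // drop a trailing numeric suffix such as "-2"
    const members = groups.get(base) ?? [];
    members.push(file);
    groups.set(base, members);
  }
  const candidates = new Map<string, string[]>();
  groups.forEach((members, base) => {
    if (members.length > 1) candidates.set(base, members);
  });
  return candidates;
}

// Hypothetical filenames.
const slugCandidates = findSlugCollisionCandidates([
  "data-engineering.md",
  "data-engineering-2.md",
  "tableau-tips.md",
]);
```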

Head-tag drift between BaseLayout and packages/seo

apps/site/src/layouts/BaseLayout.astro hand-rolls its own <head> meta/OG/Twitter/canonical/robots block instead of importing packages/seo/src/components/SEOHead.astro. Diff the two tag-by-tag: BaseLayout is missing og:image:width/og:image:height (SEOHead lines 72-73), the hreflang alternates (en + x-default), and the article:published_time/article:modified_time/article:author tags. Note that BaseLayout only injects whatever generic seo.jsonLd prop a page passes in (line 70), whereas SEOHead auto-builds the combined Organization + LocalBusiness JSON-LD via buildOrganizationSchema/buildLocalBusinessSchema — so BaseLayout pages ship no structured data unless the caller remembers to. Enumerate which apps/site/src/pages/*.astro use BaseLayout (e.g. index, about, contact, products/*, intel/stack/*) and report which SEO tags each silently loses.

Redirect map never runs in CI + dead rules

apps/site/public/_redirects is 4030 lines (~3992 non-comment rules) and is generated by scripts/heal-redirects.ts (repo root), but it is committed by hand and the generator is wired into no GitHub Actions workflow — confirm it appears in neither .github/workflows/ci.yml nor deploy-relay.yml. Then lint the file: the /* /404.html 404 catch-all is at line 4030, so any rule below it is unreachable; also find rules whose source equals their destination, 301 chains (A->B where B->C), and rules whose target is itself the /410 or /404 page. List each dead/unreachable rule with its line number.
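The lint pass in that prompt can be sketched as a single function over the file's text. This is an illustrative linter, not scripts/heal-redirects.ts; the names and the sample rules are assumptions. It relies on the fact that Netlify evaluates `_redirects` top to bottom and the first matching rule wins, so anything after the `/*` catch-all can never fire.

```typescript
interface RedirectRule { line: number; from: string; to: string }
interface RedirectFinding { line: number; kind: "unreachable" | "self" | "chain" }

function lintRedirects(text: string): RedirectFinding[] {
  const rules: RedirectRule[] = [];
  text.split("\n").forEach((raw, index) => {
    const trimmed = raw.trim();
    if (trimmed === "" || trimmed.startsWith("#")) return; // skip blanks and comments
    const [from, to] = trimmed.split(/\s+/);
    if (from !== undefined && to !== undefined) rules.push({ line: index + 1, from, to });
  });

  const sources = new Set(rules.map((rule) => rule.from));
  const findings: RedirectFinding[] = [];
  let catchAllSeen = false;
  for (const rule of rules) {
    if (catchAllSeen) {
      findings.push({ line: rule.line, kind: "unreachable" }); // below the catch-all
    } else if (rule.from === rule.to) {
      findings.push({ line: rule.line, kind: "self" }); // source equals destination
    } else if (rule.from !== "/*" && sources.has(rule.to)) {
      findings.push({ line: rule.line, kind: "chain" }); // A->B where B is itself redirected
    }
    if (rule.from === "/*") catchAllSeen = true;
  }
  return findings;
}

// Hypothetical six-line redirects file.
const redirectFindings = lintRedirects(
  ["# legacy", "/old /new 301", "/new /newer 301", "/same /same 301", "/* /404.html 404", "/late /x 301"].join("\n"),
);
```

For the sample input this reports a chain on line 2, a self-redirect on line 4, and an unreachable rule on line 6.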

Content-collection schema integrity

Validate every entry against the zod schemas in apps/site/src/content/config.ts. The shared seoSchema requires title and description (both non-optional) and defaults noindex to false; flag blog/services/work/agents/pages/intel entries missing either, with a description over ~160 chars, an ogImage/heroImage path that doesn’t exist under apps/site/public/, a canonical that fails the z.string().url() check, or noindex: true on a page still linked from Footer.astro or the sitemap. For the agents collection also confirm the required spec.version/spec.model/spec.instructions are present so the marketplace spec JSON stays portable.

CSP unsafe-inline and broken-asset audit on the built site

The CSP header in the repo-root netlify.toml allows 'unsafe-inline' on both script-src and style-src, and data: on script-src so Astro’s <ClientRouter /> view transitions can insert their data:application/javascript, sentinel. Run pnpm --filter site build, then scan apps/site/dist for: inline <script>/<style> and inline style=/on*= handlers that block moving to a nonce/hash CSP; <img>/<a href> pointing at /images/blog/... or wp-content paths that don’t exist on disk; and what relies on the data: script-src exception. Output the minimal CSP that would still pass and list every asset reference that 404s in the built output.


relay-hardener — the Relay chat backend

Hardens the Relay chat server (apps/agents) — Hono routes, jose sessions, WebSockets, SQLite, and Redis pub/sub — against authorization (IDOR), CSRF, cross-site WebSocket hijacking, missing rate limits, and unsafe secret defaults.

relay-hardener works the cross-origin trust boundary of Relay: a Hono + Node + SQLite + Redis app on a $6 droplet whose SPA lives on a different origin (dev3lop.com/relay/app) and authenticates with a SameSite=None jose cookie. This sprint it closed a boards/cards IDOR — routes in apps/agents/server/routes/boards.ts authenticated the user with requireSession but keyed off boardId/cardId and never checked room membership, so any logged-in user could read or mutate boards and cards in invite-only rooms by guessing ids. The fix added requireBoardAccess/requireCardAccess in server/routes/collab-utils.ts that resolve board/card -> room (stmts.roomById.get(board.room_id)) and run the same whitelist gate (isUserAllowed via gateRoom) as requireRoomAccess, backed by a regression test (server/__tests__/boards-access.test.ts) that asserts a non-member gets 403 WHITELIST_BLOCKED on every board/card route. It also closed cross-site WebSocket hijacking: WS upgrades skip CORS, so server/origins.ts now exposes isAllowedWsOrigin and both registerRoomsWebSocket (routes/rooms.ts, /api/ws/rooms/:slug) and registerUserWebSocket (user-realtime.ts, /api/ws/user) reject a present-but-unlisted Origin with close code 4403 before auth, asserted in server/__tests__/ws-origin.test.ts. Still open and explicitly tracked: no CSRF defense on state-changing POSTs (the SameSite=None cookie auto-attaches cross-site and nothing checks an Origin/Sec-Fetch-Site header or token), a SESSION_SECRET dev-default fallback in server/session.ts (?? 'dev-secret-change-in-production-32chars!!') that silently signs prod sessions if the env var is unset, and no per-user rate limiting on the message POST or WS send path.

As reusable tooling. Every Relay finding generalizes to any cross-origin SPA-plus-API with cookie auth, which is most B2B products. As a marketplace template, relay-hardener ships as a portable spec that maps a team’s own router files, session/cookie module, and WebSocket upgrade handlers, then runs a fixed battery: enumerate every id-keyed route and prove it re-derives the resource owner before acting (IDOR), confirm SameSite=None/credentialed-CORS endpoints carry an origin or token check (CSRF), confirm WS upgrades gate on an Origin allowlist before auth (CSWSH), and grep secret-loading for ?? '<literal>' and ?? '' fail-open patterns. Teams point it at their repo and get back the same IDOR/CSRF/CSWSH/rate-limit report plus drop-in regression tests in the style of boards-access.test.ts and ws-origin.test.ts.

Bug-hunting prompts for this domain:

Find the next IDOR after the boards fix

In apps/agents/server/routes, every router that resolves a resource by a non-slug id (boardId, cardId, notificationId, threadId, channelId) must re-derive the owning room and run isUserAllowed (via gateRoom), not just requireSession. The boards fix did this via requireBoardAccess/requireCardAccess in routes/collab-utils.ts. Audit notifications.ts especially: GET and POST /threads/:threadId/messages call stmts.threadById.get(…) and only check requireSession; they never resolve thread -> room -> isUserAllowed, so any authenticated user can read or post into a thread in an invite-only room by guessing the threadId. Confirm this, then check the /:id/status, /:id/saved, /:id/visibility handlers: they gate on canTouchNotification (target/actor/public visibility), so verify whether that check is sufficient or a room-membership gate is also needed. Note that conversations.ts is already owner-scoped at the SQL layer (getConversation/deleteConversation take session.userId), so confirm that scoping holds rather than assuming a hole. For each real hole, write the room-resolving gate and a regression test mirroring server/__tests__/boards-access.test.ts.
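The shape of a room-resolving gate for threads can be sketched without the real database layer. Everything below is hypothetical: requireThreadAccess, the in-memory maps, and the NOT_FOUND code stand in for whatever the Relay routers actually use. Only the 403 WHITELIST_BLOCKED outcome mirrors what the boards regression test is described as asserting.

```typescript
// Hypothetical in-memory stand-ins for the thread and room tables.
interface ThreadRow { id: string; roomId: string }
interface RoomRow { id: string; allowedUserIds: Set<string> }

type GateResult = { status: 200 } | { status: 403 | 404; code: string };

// Resolve thread -> room -> membership before any read or write happens.
function requireThreadAccess(
  threads: Map<string, ThreadRow>,
  rooms: Map<string, RoomRow>,
  userId: string,
  threadId: string,
): GateResult {
  const thread = threads.get(threadId);
  if (thread === undefined) return { status: 404, code: "NOT_FOUND" };
  const room = rooms.get(thread.roomId);
  if (room === undefined) return { status: 404, code: "NOT_FOUND" };
  // Being signed in is not enough: the user must be allowed in the owning room.
  if (!room.allowedUserIds.has(userId)) return { status: 403, code: "WHITELIST_BLOCKED" };
  return { status: 200 };
}

const demoThreads = new Map<string, ThreadRow>([["t1", { id: "t1", roomId: "r1" }]]);
const demoRooms = new Map<string, RoomRow>([["r1", { id: "r1", allowedUserIds: new Set(["alice"]) }]]);
const memberGate = requireThreadAccess(demoThreads, demoRooms, "alice", "t1");
const outsiderGate = requireThreadAccess(demoThreads, demoRooms, "mallory", "t1");
```

A regression test in the boards-access style would assert the outsider result on every thread route, not just one.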

Prove the CSRF exposure on state-changing POSTs

The Relay session cookie is set with sameSite: 'None' in apps/agents/server/session.ts (setSessionCookie, only when IS_PROD) because the SPA on dev3lop.com is cross-origin to the api.dev3lop.com server. That means the browser auto-attaches d3_session on cross-site requests, and nothing in server/index.ts or the routers checks an anti-CSRF token or the Origin/Sec-Fetch-Site header on POST/PATCH/DELETE. The CORS middleware in index.ts uses origin:(origin)=>(isAllowedOrigin(origin)?origin:null) with credentials:true. That only restricts who can READ a credentialed response; it does not stop a forged request from executing. Enumerate every mutating route in server/routes (POST /api/rooms/:slug/messages in rooms.ts, the boards/cards mutations in boards.ts, /logout, whitelist add/remove, access-request approve/deny) and confirm each is forgeable from an attacker page. Propose an Origin-allowlist check (reusing isAllowedOrigin from origins.ts) or a double-submit token applied as Hono middleware, and note that GET endpoints are safe from CSRF only as long as they have no side effects.
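The Origin-allowlist option can be reduced to one decision function, which a middleware would call before any handler runs. This is a sketch under assumptions: the function name is hypothetical, it is not wired into Hono here, and rejecting a mutating request with no Origin header is a policy choice made for this example rather than something the post prescribes.

```typescript
const MUTATING_METHODS = new Set(["POST", "PUT", "PATCH", "DELETE"]);

// Decide whether a request may proceed, based only on method and Origin.
function passesOriginCheck(
  method: string,
  origin: string | undefined,
  allowedOrigins: ReadonlySet<string>,
): boolean {
  // Non-mutating methods are not covered by this check.
  if (!MUTATING_METHODS.has(method.toUpperCase())) return true;
  // Assumption: a cookie-authenticated mutation without an Origin header is refused.
  if (origin === undefined) return false;
  // Exact string match, so lookalike hosts do not slip through.
  return allowedOrigins.has(origin);
}

const csrfAllowlist = new Set(["https://dev3lop.com"]);
const sameSitePost = passesOriginCheck("POST", "https://dev3lop.com", csrfAllowlist);
const forgedPost = passesOriginCheck("POST", "https://attacker.example", csrfAllowlist);
```

Because the check keys on the request itself rather than on the response headers, it blocks the forged write that credentialed CORS alone lets through.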

Audit secret fail-open and dev defaults

In apps/agents/server/session.ts the signing key is new TextEncoder().encode(process.env.SESSION_SECRET ?? 'dev-secret-change-in-production-32chars!!'). If SESSION_SECRET is unset in prod, the server boots silently and signs real jose sessions with a public, in-repo secret, so anyone can forge a d3_session for any userId. Classify the other env defaults precisely: GITHUB_CLIENT_SECRET in routes/auth.ts is ?? '' (auth just breaks, fail-closed-ish), REDIS_URL in redis.ts is ?? 'redis://localhost:6379' (safe localhost default), APP_PUBLIC_URL in origins.ts is ?? 'https://dev3lop.com' (a sane prod default but it also drives the CORS/WS allowlist, so a wrong value silently widens trust). Only SESSION_SECRET’s literal fallback is a must-fail-fast. Add a startup assertion that throws when NODE_ENV === 'production' and SESSION_SECRET is missing or equals the known dev default, so a misconfigured deploy crashes instead of running insecure. Cross-check .github/workflows/deploy-relay.yml and docker-compose.yml to confirm the var is actually injected on the droplet.
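The startup assertion that prompt asks for could look roughly like this. The function name is hypothetical, and the environment is passed in as a plain object so the sketch stays self-contained; in the server it would presumably be called with process.env before the session module builds its key.

```typescript
// The known dev default quoted above; a production boot must never use it.
const DEV_DEFAULT_SECRET = "dev-secret-change-in-production-32chars!!";

// Throw at startup instead of silently signing sessions with a public secret.
function assertSessionSecret(env: Record<string, string | undefined>): void {
  if (env.NODE_ENV !== "production") return; // dev and test may keep the default
  const secret = env.SESSION_SECRET;
  if (!secret || secret === DEV_DEFAULT_SECRET) {
    throw new Error("SESSION_SECRET must be set to a non-default value in production");
  }
}

// Small helper for the examples: report whether the assertion throws.
function bootFails(env: Record<string, string | undefined>): boolean {
  try {
    assertSessionSecret(env);
    return false;
  } catch {
    return true;
  }
}
```

Failing the boot makes the misconfiguration visible in the deploy, which is the point: a crashed container is noticed, a forged session is not.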

Re-verify the WebSocket origin gate and its ordering

apps/agents closed cross-site WebSocket hijacking by checking isAllowedWsOrigin (server/origins.ts) in both upgrade handlers: registerRoomsWebSocket in server/routes/rooms.ts (/api/ws/rooms/:slug) and registerUserWebSocket in server/user-realtime.ts (/api/ws/user), closing with code 4403 on a foreign Origin. Verify the policy holds end to end: (1) the Origin check runs BEFORE requireRoomAccess/getSession so an unlisted origin is rejected pre-auth (confirm by reading the handler order: rooms.ts checks isAllowedWsOrigin before requireRoomAccess, user-realtime.ts before getSession); (2) isAllowedWsOrigin returns true for a MISSING origin (non-browser clients), so confirm a real browser can never reach the API with an absent Origin header through Caddy, and note apps/agents/Caddyfile only sets X-Real-IP/X-Forwarded-For/X-Forwarded-Proto via header_up and does NOT strip or rewrite Origin, so a browser handshake always carries it; (3) ALLOWED_ORIGINS is the comma-split of APP_PUBLIC_URL and a lookalike like dev3lop.com.evil.example is rejected (already asserted in server/__tests__/ws-origin.test.ts). Then add an e2e that opens /api/ws/rooms/:slug from a disallowed Origin and asserts the 4403 close.
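The policy in points (2) and (3) can be illustrated with a stand-in for the real check. The functions below are not the code in server/origins.ts; their names and the trailing-slash trimming are assumptions. What they reproduce from the description is the behaviour: a comma-split allowlist, a missing Origin treated as a non-browser client, and exact matching so lookalike hosts fail.

```typescript
// Build the allowlist from a comma-separated APP_PUBLIC_URL-style value.
function parseAllowedOrigins(appPublicUrl: string): Set<string> {
  return new Set(
    appPublicUrl
      .split(",")
      .map((entry) => entry.trim().replace(/\/+$/, "")) // assumption: tolerate trailing slashes
      .filter((entry) => entry !== ""),
  );
}

// Stand-in for the WebSocket origin gate described above.
function isAllowedWsOriginSketch(origin: string | undefined, allowed: ReadonlySet<string>): boolean {
  if (origin === undefined) return true; // missing Origin: treated as a non-browser client
  return allowed.has(origin); // exact match only, no suffix or prefix matching
}

const wsAllowlist = parseAllowedOrigins("https://dev3lop.com, https://www.dev3lop.com/");
const lookalikeAllowed = isAllowedWsOriginSketch("https://dev3lop.com.evil.example", wsAllowlist);
```

Exact set membership is what defeats the lookalike: any check built on startsWith or includes would accept it.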

Stress the message + WS send path for missing rate limits

Relay has no per-user rate limiting. POST /api/rooms/:slug/messages (roomsRouter.post('/:slug/messages') in server/routes/rooms.ts) and the WebSocket { type: 'send' } handler in registerRoomsWebSocket both publish first and persist asynchronously: they call redisPub.publish on KEYS.roomChannel and then queueMicrotask(() => persistRoomMessage(…)) to run the SQLite insert, with no throttle, on a 1GB droplet with SQLite in WAL mode. A single authenticated user (or a script reusing one stolen SameSite=None cookie) can loop sends and fan out unbounded Redis traffic + DB writes to every connected client. Quantify the blast radius, then design a per-user (and per-IP, using the X-Real-IP Caddy sets) token bucket, applied to both the HTTP POST and the WS send frame so it can’t be bypassed by switching transport. Also check routes/uploads.ts POST /room/:slug: it runs Sharp compression (compressToSmallWebp loops up to 36 encode passes) with a 16MB MAX_INPUT_BYTES cap but no per-user upload-rate cap, another CPU-exhaustion vector on the same box.
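A token bucket of the kind that prompt proposes fits in a small class. The capacity of 5 and refill of 1 token per second are placeholder numbers, and allowSend is a hypothetical helper; the clock is passed in so the behaviour is deterministic. The one design point carried over from the text is that a single bucket per user is shared by both transports.

```typescript
class TokenBucket {
  private tokens: number;
  private lastMs: number;
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full
    this.lastMs = nowMs;
  }

  // Refill for the elapsed time, then try to spend one token.
  tryTake(nowMs: number): boolean {
    const elapsedSeconds = Math.max(0, nowMs - this.lastMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastMs = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// One bucket per user, consulted by the HTTP POST and the WS send frame alike.
const sendBuckets = new Map<string, TokenBucket>();

function allowSend(userId: string, nowMs: number): boolean {
  let bucket = sendBuckets.get(userId);
  if (bucket === undefined) {
    bucket = new TokenBucket(5, 1, nowMs); // placeholder limits
    sendBuckets.set(userId, bucket);
  }
  return bucket.tryTake(nowMs);
}
```

A per-IP limit would be a second map keyed on the X-Real-IP value, checked alongside the per-user one.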

Validate untrusted input reaching SQLite and the upload store

Relay has no zod schemas; every router hand-rolls validation via cleanText/cleanVisibility in routes/collab-utils.ts and cleanCardPriority/cleanCardBody in boards.ts. Audit the gaps: in routes/boards.ts the POST /:boardId/cards handler takes body.columnId, body.notificationId, body.threadId, and body.assigneeUserId straight into stmts.insertCard with only a typeof check — confirm columnId is validated to belong to THIS board (statusFromColumn does findIndex then Math.max(0,index), so an unknown columnId silently falls back to the first column’s status rather than erroring) and assigneeUserId is a real user in the room, not an arbitrary id. In routes/uploads.ts confirm safeFileName (regex-strip + slice(0,120)) only sanitizes original_name for display and that the on-disk path is ${crypto.randomUUID()}.webp under UPLOAD_DIR — so original_name never reaches the filesystem — and that GET /api/uploads/:id (intentionally unauthenticated capability URL) reads attachment.storage_path from SQLite, not from the URL, so it cannot be coerced to read outside the upload dir. Flag any body field written to SQLite without a bounded allowlist and add the missing cleaners or a shared zod layer.


api-hardener — the Netlify serverless functions

Audits the apps/api Netlify Functions for the gap between what they claim to do and what they actually do — fake persistence, missing input validation, no CORS/OPTIONS handling, unbounded params, raw error leakage, and unauthenticated paths into LLM routing/inspection logic.

api-hardener read all five handlers in apps/api/src/functions and found the headline problem is honesty, not crashes. leads.ts ships a mockLeads array with the real Netlify Identity check commented out (lines 36-42), and rum.ts pushes Web Vitals into an in-process const metrics: MetricPayload[] = [] that evaporates on every cold start — both carry comments promising persistence (“in production, store in DB”) that the runtime can’t keep, since netlify.toml [functions] bundles these with esbuild onto Netlify’s ephemeral filesystem (no SQLite, no droplet — that lives in apps/agents). egress-report.ts parseInts a days query param with no clamp or NaN guard and feeds it into Array.from({ length: Math.ceil(days * 12) }), so ?days=100000000 is an unbounded allocation. Across every handler there is no zod validation (a repo-wide grep returns nothing), no Access-Control-* header or OPTIONS handling, and no rate limiting — and prompt-gateway.ts / agent-route.ts accept anonymous POSTs that JSON.parse arbitrary bodies into inspectOutbound/decideRoute from @dev3lop/agents-core, with prompt-gateway.ts letting a caller spread a policyOverride over DEFAULT_GATEWAY_POLICY. Three handlers also echo error.message straight into the response body. The one thing consistently done right is method allow-listing — every handler guards event.httpMethod and returns 405.

As reusable tooling. As a marketplace template, api-hardener ships as a portable spec a team points at their own serverless directory (Netlify Functions, Lambda, Cloudflare Workers, or Vercel). It encodes a checklist that travels: flag handlers whose comments promise a database the runtime’s filesystem can’t keep, demand a schema validator at every JSON.parse boundary, require explicit numeric-bound and request-body-size gates, escalate any unauthenticated route that fans into shared business logic or a metered third-party API, and catch raw error.message passthrough. Teams run it on a PR diff or a whole functions folder and get back a triaged list of “lies, missing guards, and abuse vectors” scoped to their stack rather than generic OWASP boilerplate.

Bug-hunting prompts for this domain:

Mocks pretending to persist

Read every handler under apps/api/src/functions (leads.ts, rum.ts, egress-report.ts, prompt-gateway.ts, agent-route.ts). For each, decide whether a caller would reasonably believe their data is durably stored or read from a real source. Flag the mockLeads constant in leads.ts, the in-process const metrics: MetricPayload[] = [] array in rum.ts (it accepts POSTs, console.logs, pushes to memory, caps at 1000, and drops everything on cold start), and the mockPromptLogs/mockDetections generators in egress-report.ts that fabricate randomized PromptLog rows. Cross-reference netlify.toml [functions] (directory = "apps/api/src/functions", node_bundler = "esbuild", ephemeral FS — the SQLite/droplet stack lives in apps/agents, not here) to prove there is no persistence layer, and list each endpoint that silently drops or fabricates data versus a comment claiming a DB.

Unbounded numeric and payload bounds

In apps/api/src/functions/egress-report.ts, trace the days query param: it is read via parseInt(event.queryStringParameters?.days ?? '7', 10) with no clamp, NaN guard, or upper bound, then used in Array.from({ length: Math.ceil(days * 12) }) inside mockPromptLogs before buildEgressReport. Confirm ?days=100000000 triggers a ~1.2B-element allocation, and that a non-numeric days yields NaN (so Math.ceil(NaN * 12) produces a length-0 array, not an error). Then audit leads.ts (minScore/limit via parseInt(params.x || 'n', 10), also unclamped and NaN-prone) and the POST bodies in prompt-gateway.ts/agent-route.ts (payload.estimatedInputTokens, maxLatencyMs, etc. accepted with only ?? defaults) for missing size/length/range limits. Propose explicit min/max clamps, a NaN fallback, and a max request-body size for each.
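The clamp and NaN fallback proposed there can be a single helper. The name clampInt and the 1 to 90 range for days are assumptions chosen for the example; the handlers' real limits are a product decision.

```typescript
// Parse an integer query param, fall back on NaN, and clamp to [min, max].
function clampInt(raw: string | undefined, fallback: number, min: number, max: number): number {
  const parsed = parseInt(raw ?? "", 10);
  if (Number.isNaN(parsed)) return fallback; // missing or non-numeric input
  return Math.min(max, Math.max(min, parsed));
}

// Hypothetical bounds for the days param: default 7, allowed range 1 to 90.
const hugeDays = clampInt("100000000", 7, 1, 90);
const junkDays = clampInt("abc", 7, 1, 90);
const slotCount = Math.ceil(hugeDays * 12);
```

With the clamp in place the oversized request is cut down to the maximum window, so the array built from `slotCount` stays small no matter what the caller sends.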

Missing schema validation at JSON.parse

Find every JSON.parse(event.body) / JSON.parse(event.body || '{}') boundary in apps/api/src/functions (rum.ts, prompt-gateway.ts, agent-route.ts). Note that prompt-gateway.ts casts body.request as OutboundRequest and body.policy as Partial<GatewayPolicy> with only a hand-rolled if (!request?.id || !request?.clientId || !request?.messages) truthiness check, agent-route.ts reads payload.useCase/payload.fallbackOrder/etc. behind ?? defaults, and rum.ts checks only body.name and typeof body.value !== 'number'. Confirm a repo-wide grep -rn 'zod' apps/api/src returns nothing — there is no schema validator anywhere. Identify the type-confusion surface this creates (an attacker controls the shape of objects passed into @dev3lop/agents-core), and draft a zod schema per endpoint that rejects malformed input with a 400 before it reaches the core library.

No CORS or OPTIONS handling, only method gates

Audit CORS and HTTP-method handling across apps/api/src/functions. Confirm via grep that no handler sets any Access-Control-* header or handles an OPTIONS preflight (none exist), so the only access control present is the event.httpMethod !== 'GET'/'POST' 405 guard at the top of each handler. Note these functions are served same-origin under dev3lop.com/api/* (per the netlify.toml redirects to /.netlify/functions/:splat), while the Relay backend on api.dev3lop.com is a separate origin with its own Caddy/Origin-allowlist gate. The real gap is not browser CORS (absent headers already block cross-origin browser reads) but that non-browser clients — curl, scripts, server-to-server — hit these endpoints with no auth or origin check at all. Recommend a shared CORS/OPTIONS helper with an explicit origin allowlist (mirroring the Origin-allowlist relay-hardener added for WebSocket upgrades in apps/agents), and flag any endpoint that would leak data to an arbitrary origin once auth is added.

Unauthenticated paths into LLM routing/inspection logic

Trace the abuse path for apps/api/src/functions/prompt-gateway.ts and agent-route.ts: both accept anonymous POSTs, JSON.parse the body, and call inspectOutbound/decideRoute from @dev3lop/agents-core — PII-inspection and provider-routing logic (the route’s fallbackOrder defaults to ['openai', 'anthropic', 'xai', 'gemini', 'open_source']). Confirm there is no auth, no API key, no per-client rate limiting, and no quota, so an attacker can replay these endpoints freely. Note that prompt-gateway.ts accepts a caller-supplied policyOverride spread over the default policy (const policy = { ...DEFAULT_GATEWAY_POLICY, ...policyOverride }), letting a request weaken the gateway’s own PII/blocking policy — a policy-bypass, not just compute abuse. Write up both the function-level compute/cost-amplification risk and the policy-override bypass, and propose an auth + per-client rate-limit gate that runs before any @dev3lop/agents-core call.

Disabled auth and raw error-message leakage

In apps/api/src/functions/leads.ts, confirm the Netlify Identity check is commented out (lines 36-42, // if (!clientContext?.user) { return 401 }) so the mockLeads payload — company, domain, score, status, reason — is served to anyone who GETs the endpoint; assess what re-enabling that check requires (a real Netlify Identity JWT verification, not just uncommenting). Separately, audit error handling: egress-report.ts, prompt-gateway.ts, and agent-route.ts return error instanceof Error ? error.message : 'unknown' directly in the JSON response body, leaking internal failure detail, while leads.ts and rum.ts correctly return a generic 'Internal server error' after console.error. Recommend implementing real auth on leads, and replacing the raw-message passthrough in the other three with a generic client message plus server-side console.error logging — matching the pattern leads/rum already use.


infra-hardener — the DigitalOcean droplet + deploy pipeline

An agent that audits the single-box droplet, cloud-init, Docker/Caddy stack, and SSH deploy pipeline for the failures that only show up at 3am: backups that never ran, disks and RAM that quietly fill, and health checks that lie.

infra-hardener owns the $6 DigitalOcean droplet that runs Relay behind Caddy at api.dev3lop.com, plus everything that builds and ships to it: infra/digitalocean (Terraform + cloud-init.yaml.tftpl), apps/agents/Dockerfile / docker-compose.yml / Caddyfile, and .github/workflows/deploy-relay.yml.

Its headline find was a backup that had never run once: the nightly SQLite job ran sqlite3 inside the node:22-alpine relay container (which has no sqlite3) against a $$-mangled destination filename from a Terraform templatefile, so it failed silently every night and produced zero backups. We replaced it with a host-side, WAL-safe relay-backup-sqlite (written under write_files in cloud-init.yaml.tftpl, cron’d at 0 3 * * *) that resolves the relay_data docker volume via docker volume inspect, runs sqlite3 .backup against agents.db, verifies with PRAGMA integrity_check, gzips, prunes past 14 days, and drops a LAST_RUN_FAILED marker on failure. This was also the agent that corrected the sprint’s worst mis-diagnosis: the original theory was “nightly backups are filling the disk,” when in fact they had never run at all. The theory was wrong, and wrong in the more dangerous direction.

Still open and documented:

  • No Docker log rotation or image prune on a 25GB disk that rebuilds from source (build: context: ../..) on every deploy.
  • No container memory limits on a 1GB box (only Redis is capped, at 128mb).
  • ssh_allowed_cidrs defaulting to 0.0.0.0/0 despite a “keep this tight in production” comment (with cloud-init also running ufw allow 22/tcp).
  • CI that SSHes in as root.
  • A /api/health handler that returns ok: true even when Redis is unavailable and never opens SQLite at all, so the compose healthcheck and the deploy probe both report green over a degraded backend.

As reusable tooling. As a marketplace template, infra-hardener ships as a portable spec JSON that points at four globs — a Terraform/cloud-init dir, a Dockerfile, a compose file, and a deploy workflow — and runs the same playbook against any team’s single-VM stack: prove the backup actually wrote a restorable file, model disk and RAM exhaustion on the real box size, and check that the health endpoint fails when a dependency fails. The high-leverage move it encodes is the restore drill — most teams have a backup script and zero evidence it produces a file you can sqlite3 open — and the templatefile-escaping class of bug ($$, ${}, %{}) that silently breaks generated shell scripts. Teams run it on their own repo, get a ranked list of “this will page you at 3am” findings, and can escalate to consulting for the box-specific tuning.

Bug-hunting prompts for this domain:

Prove the backup is restorable, not just that it ran

Open infra/digitalocean/cloud-init.yaml.tftpl and read the relay-backup-sqlite script written under write_files, plus the cron line echo "0 3 * * * root /usr/local/bin/relay-backup-sqlite" > /etc/cron.d/relay-backup in runcmd. Trace the full path: confirm the script runs on the host (where sqlite3 is installed via the packages block) and not inside the node:22-alpine relay container, that it resolves the real volume via docker volume inspect relay_data -f '{{ .Mountpoint }}', and that it backs up $${VOL}/agents.db, the actual DB name set in apps/agents/server/db.ts (DB_PATH = path.join(DATA_DIR, 'agents.db')). Verify the templatefile escaping is correct: this is a Terraform templatefile, so braced shell vars must be written $${VAR} (an unescaped ${VAR} is consumed by Terraform as an interpolation), while $(...) passes through literally and must not be doubled, because $$( reaches the shell as $$ (the PID) followed by a subshell and mangles the filename, the kind of $$ mangling the original broken version had. Then check the gap that matters more: nothing ever reads the LAST_RUN_FAILED marker or alerts on it, and there is no restore drill. Write the missing job that takes the newest agents-*.db.gz from /var/backups/relay, gunzips it, runs PRAGMA integrity_check and a SELECT count(*) against the real tables, and fails loudly if the backup is empty or unopenable.

Model disk exhaustion on a 25GB box that rebuilds from source

The droplet is s-1vcpu-1gb (25GB disk — see the droplet_size description in infra/digitalocean/variables.tf) and both .github/workflows/deploy-relay.yml and the relay-refresh script in cloud-init.yaml.tftpl run docker compose ... up -d --build on every deploy, where the relay service has build: context: ../.., dockerfile: apps/agents/Dockerfile (builds from source, no pinned image, no registry). Count every unbounded writer: dangling images from each rebuild (no docker image prune anywhere), the default Docker json-file log driver on the caddy/relay/redis containers in apps/agents/docker-compose.yml (no logging: max-size/max-file block on any service — note that Caddy’s own access log already rotates via roll_size 10mb/roll_keep 5 in Caddyfile, but the container’s stdout/stderr does not), /var/log/relay-backup.log (the backup script appends to it forever with no rotation), and 14 days of gzipped backups. Estimate steady-state disk growth and write the per-service compose logging: blocks plus a post-deploy docker image prune -f that stop it.
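The steady-state estimate this prompt asks for reduces to simple arithmetic once each unbounded writer has a rate. The sketch below is a linear model with hypothetical names (`DiskInputs`, `daysUntilFull`). Every rate is an input the auditor has to measure, not a number taken from the droplet.

```typescript
interface DiskInputs {
  diskGb: number;                   // total disk size
  usedGb: number;                   // current usage
  deploysPerDay: number;            // how often --build runs
  danglingImageGbPerDeploy: number; // left behind by each rebuild without a prune
  logGbPerDay: number;              // unrotated container and backup logs
}

// Days until the disk is full under linear growth.
// Returns Infinity when nothing grows and 0 when the disk is already full.
function daysUntilFull(inputs: DiskInputs): number {
  const growthPerDay =
    inputs.deploysPerDay * inputs.danglingImageGbPerDeploy + inputs.logGbPerDay;
  const freeGb = inputs.diskGb - inputs.usedGb;
  if (freeGb <= 0) return 0;
  if (growthPerDay <= 0) return Infinity;
  return freeGb / growthPerDay;
}
```

The 14 days of gzipped backups are left out of the growth term on purpose: pruning bounds them, so they belong in `usedGb` once the retention window has filled.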

Find the missing memory limits on a 1GB box

Audit apps/agents/docker-compose.yml for memory limits on the 1GB droplet, which runs caddy + relay + redis (plus the host and a from-source docker build). Only redis is capped (--maxmemory 128mb --maxmemory-policy allkeys-lru); caddy and relay have no mem_limit or deploy.resources.limits. The cloud-init runcmd adds a 2G swapfile (fallocate -l 2G /swapfile), which hides an OOM as thrash instead of a clean kill. Determine what happens when the Node/Hono relay process leaks or a docker compose up --build runs concurrently with serving traffic: which container does the kernel OOM-killer target, and does it take caddy (TLS termination for api.dev3lop.com, and the only thing with ports: 80/443 published) or redis (room message fan-out via pub/sub) down with it? Propose explicit per-service memory limits that still leave headroom for the from-source build step, since the build runs on the same box that serves production.

Audit health-check honesty end to end

Read the /api/health handler in apps/agents/server/index.ts and the healthcheck block for the relay service in apps/agents/docker-compose.yml, plus the post-deploy probe docker compose ... exec -T relay wget -qO- http://localhost:3741/api/health in both .github/workflows/deploy-relay.yml and the relay-refresh script in cloud-init.yaml.tftpl. The handler catches the redis.ping() failure and still returns ok: true with checks.redis: 'unavailable', and despite the comment // Health check — includes Redis + Neon status it checks neither SQLite nor Neon — it only ever sets checks.server: 'ok' and a redis status. That means: (1) the compose healthcheck (wget on :3741/api/health) reports healthy while Redis is down, so caddy — which has depends_on: relay: condition: service_healthy — starts in front of a degraded backend; and (2) the deploy probe passes as long as the HTTP layer answers 200, so a deploy that left the DB unwritable is reported green. Fix the endpoint to actually open the SQLite DB (a cheap PRAGMA quick_check or SELECT 1) and return a non-200 when a hard dependency is down, and make the deploy probe assert on the JSON body (ok/checks), not just the wget exit code.
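A rough sketch of the "honest" version of both halves follows: a report builder that turns any failed hard dependency into a non-200, and a probe that asserts on the JSON body. The names `buildHealthReport` and `probePasses` are hypothetical, and which dependencies count as hard is an assumption the team would decide.

```typescript
type CheckStatus = "ok" | "unavailable";

interface HealthReport {
  ok: boolean;
  statusCode: number;
  checks: Record<string, CheckStatus>;
}

// Every hard dependency must report "ok" for the endpoint to return 200.
// A dependency missing from `checks` counts as failed.
function buildHealthReport(
  checks: Record<string, CheckStatus>,
  hardDeps: readonly string[],
): HealthReport {
  const ok = hardDeps.every((dep) => checks[dep] === "ok");
  return { ok, statusCode: ok ? 200 : 503, checks };
}

// Deploy probe: assert on the JSON body, not just that something answered.
function probePasses(rawBody: string): boolean {
  try {
    const parsed: unknown = JSON.parse(rawBody);
    return (
      typeof parsed === "object" &&
      parsed !== null &&
      (parsed as { ok?: unknown }).ok === true
    );
  } catch {
    return false;
  }
}
```

Treating an unreported dependency as failed closes the gap where a check is simply never run, which is the SQLite case described above.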

Close the firewall and deploy-credential gaps

Cross-check the two firewalls and the deploy identity. In infra/digitalocean/main.tf the digitalocean_firewall.relay inbound rule for port 22 uses var.ssh_allowed_cidrs, but variables.tf defaults that to ["0.0.0.0/0", "::/0"] with the comment ‘Keep this tight in production’ — so by default SSH is open to the entire internet at the cloud edge, and the cloud-init runcmd also runs ufw allow 22/tcp on the host, so neither layer constrains it. Then check outputs.tf: github_actions_secrets hardcodes RELAY_DROPLET_USER = "root", and .github/workflows/deploy-relay.yml SSHes in with secrets.RELAY_DROPLET_USER to run docker compose up --build — so CI logs in as root even though cloud-init provisions a dedicated relay user (groups: [docker, sudo]) that could run compose without root. Propose a tightened ssh_allowed_cidrs (bastion/office/GitHub-Actions ranges), switch the deploy identity to the non-root relay user, and flag that fail2ban is listed in packages but never configured for sshd anywhere in write_files/runcmd — it is installed but doing nothing.

Stress-test deploy and boot resilience for divergence and rollback

Walk the deploy and first-boot paths for failure modes. In .github/workflows/deploy-relay.yml and the relay-refresh script in cloud-init.yaml.tftpl, both run git pull --ff-only origin main followed by docker compose ... up -d --build, and the workflow sets script_stop: true. Determine: (1) if the droplet’s git checkout has diverged or has local edits, --ff-only hard-fails and the deploy aborts, leaving the box on an unknown commit with restart: unless-stopped still serving the old image; (2) there is no image tag or registry and the relay service builds from source (build: context: ../..), so a commit that compiles but crashes at runtime leaves a crash-looping container with no previous-good image to fall back to; (3) the relay-compose.service systemd unit is Type=oneshot with ExecStart=/usr/bin/docker compose up -d --build, so the box’s boot path runs a from-source build that needs the network for pnpm install — model what happens if a reboot coincides with a registry or GitHub outage and the build can’t fetch dependencies. Propose building and tagging an image once (deploy by digest), and a git reset --hard origin/main guard so the pull can’t wedge the box on a dirty checkout.


debt-sweeper — repo technical debt

Hunts down the repo’s accumulated cruft — tracked binary bloat, dead and duplicate content, and hand-edited artifacts that drift from their source scripts — and decides what to archive versus delete.

In this sprint debt-sweeper found a ~42MB WordPress export (dev3lop.WordPress.2026-04-24.xml, 41.9MB) and ~8.5MB of Lighthouse output (14 reports/lh-*.json and reports/trace-*.json files) committed straight into the working tree, plus root strays like redirection-404-April 23, 2026.csv and redirection-dev3lop-com-april-23-2026.json that belong under data/ or .gitignore — and reports/ is not ignored at all even though it is pure generated output. The bigger finding was process drift: apps/site/public/_redirects is a 4,029-line file that scripts/build-redirects.ts (wired as pnpm build:redirects) and scripts/heal-redirects.ts (pnpm heal:redirects) are meant to generate, but it has been hand-edited and committed, and .github/workflows/ci.yml — which already runs typecheck, lint, the agents test suite, a Playwright e2e job, Lighthouse CI, a lychee link check, and an image-size check — has no step that rebuilds the redirects or diffs them against their sources, so the file and the generator have silently diverged. It also surfaced 6 duplicate from-path rules (/testimonials, /testimonies, /team-page, /our-values, /contact-dev3lopcom, and /contact-dev3lopcom/) where Netlify takes first-match and the later rule is dead, on top of the broken-target and chain-flattening work already flagged in TODO-REDIRECTS.md.

As reusable tooling. Packaged as a marketplace template, debt-sweeper is a read-only repo auditor a team points at their own monorepo: it greps git ls-files for tracked files over a size threshold, flags generated artifacts that have no CI step regenerating them, and detects duplicate or shadowed rules in routing files like Netlify _redirects. The portable spec ships as config (size budgets, glob patterns for “should be ignored”, and a map from each generated file to its generator script) plus a CI drift check it can install — so the same agent that finds the debt also leaves behind the guardrail that stops it recurring, and every finding comes with an explicit archive-vs-delete recommendation rather than a blind git rm.

Bug-hunting prompts for this domain:

Find tracked binary and generated bloat

Run git ls-files -z | xargs -0 du -k | sort -rn | head -40 in the dev3lop repo (the NUL-delimited form matters because tracked names such as redirection-404-April 23, 2026.csv contain spaces, which a plain xargs would split into separate arguments). For every tracked file over ~500KB, classify it: source asset, generated artifact, or one-time import dump. Flag dev3lop.WordPress.2026-04-24.xml (41.9MB), everything under reports/ (the lh-*.json and trace-*.json Lighthouse output, ~8.5MB across 14 files), and root strays redirection-404-April 23, 2026.csv and redirection-dev3lop-com-april-23-2026.json. For each, decide archive-vs-delete: is it reproducible from a script (delete + gitignore), a historical import (move out of the working tree), or a real source asset (keep)? Cross-check against .gitignore: it ignores node_modules/, files/, dist/, *.log, data/*.sqlite, and coverage/ but NOT reports/, even though that directory is pure generated Lighthouse output. Propose the exact .gitignore lines plus a git rm --cached plan that does not rewrite history.

Detect _redirects drift from its generator

In dev3lop, apps/site/public/_redirects (4,029 lines) is supposed to be produced by scripts/build-redirects.ts (pnpm build:redirects) and healed by scripts/heal-redirects.ts (pnpm heal:redirects), but .github/workflows/ci.yml has no step that rebuilds redirects or diffs them — its jobs are lint/typecheck, the agents test suite, a Playwright e2e job, a build, Lighthouse CI, a lychee link-check, and an image-size-check, none of which touch the redirect generator. Run pnpm build:redirects into a temp path and diff its output against the committed apps/site/public/_redirects. Report every rule present in the committed file but not in the generated output (hand-edits the script would clobber) and vice versa. Then write a CI job for ci.yml that regenerates the file and fails on a non-empty git diff, so the artifact can never silently diverge from its source again.
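The two-way comparison in this prompt can be sketched as a set difference over normalized rules. The helper names `parseRules` and `diffRules` are hypothetical, and the parser assumes the plain whitespace-separated `from to status` layout with `#` comments.

```typescript
// Parses _redirects text into normalized rule strings, skipping blank lines
// and # comments and collapsing runs of whitespace to a single space.
function parseRules(text: string): string[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && line.charAt(0) !== "#")
    .map((line) => line.split(/\s+/).join(" "));
}

interface Drift {
  onlyCommitted: string[]; // hand-edits the generator would clobber
  onlyGenerated: string[]; // rules the committed file is missing
}

// Reports rules present on one side but not the other.
function diffRules(committed: string, generated: string): Drift {
  const committedRules = parseRules(committed);
  const generatedRules = parseRules(generated);
  const committedSet = new Set(committedRules);
  const generatedSet = new Set(generatedRules);
  return {
    onlyCommitted: committedRules.filter((rule) => !generatedSet.has(rule)),
    onlyGenerated: generatedRules.filter((rule) => !committedSet.has(rule)),
  };
}
```

This deliberately ignores rule order, so it complements rather than replaces the CI `git diff` check: with first-match semantics a reordering can change behavior without changing the set of rules.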

Find dead and duplicate redirect rules

Audit apps/site/public/_redirects in dev3lop for rules that never fire. Netlify uses first-match, so: (1) extract all duplicate from-paths with awk '/^\//{print $1}' apps/site/public/_redirects | sort | uniq -d — there are exactly 6 (/testimonials, /testimonies, /team-page, /our-values, /contact-dev3lopcom, and /contact-dev3lopcom/) where the second occurrence is dead. (2) Find rules shadowed by an earlier splat rule (e.g. an /about-us* wildcard above a more specific /about-us-and-business-intelligence rule). (3) Cross-reference the 410 kill-list rules and the /* /404.html 404 fallback. Reconcile findings against the MongoDB-tutorial DELETE list and other cleanup items in TODO-REDIRECTS.md, and produce a deduplicated, ordered rule list plus the count of rules safe to remove.
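Steps (1) and (2) above can be sketched as a single first-match pass. The code below is a simplified model, not Netlify's matcher: `extractFromPaths` and `findDeadRules` are hypothetical names, a trailing `*` is treated as a plain prefix match, and placeholders, query conditions, and forced statuses are ignored.

```typescript
interface RedirectRule {
  from: string;
  line: number; // 1-based line number in the file
}

// Extracts the first column of every rule line that starts with "/".
function extractFromPaths(text: string): RedirectRule[] {
  const rules: RedirectRule[] = [];
  text.split("\n").forEach((raw, idx) => {
    const line = raw.trim();
    if (line.charAt(0) !== "/") return;
    const from = line.split(/\s+/)[0];
    if (from === undefined) return;
    rules.push({ from, line: idx + 1 });
  });
  return rules;
}

// A rule is dead when an earlier rule has the same from-path, or an earlier
// splat rule ("/prefix*") already matches it. First match wins.
function findDeadRules(rules: RedirectRule[]): RedirectRule[] {
  const dead: RedirectRule[] = [];
  rules.forEach((rule, i) => {
    const shadowed = rules.slice(0, i).some((earlier) => {
      if (earlier.from === rule.from) return true;
      if (earlier.from.charAt(earlier.from.length - 1) === "*") {
        const prefix = earlier.from.slice(0, -1);
        return rule.from.indexOf(prefix) === 0;
      }
      return false;
    });
    if (shadowed) dead.push(rule);
  });
  return dead;
}
```

The pairwise scan is quadratic, which is acceptable for a file of a few thousand rules. Reporting line numbers keeps each finding traceable to the exact rule to remove.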

Reconcile redirect targets against real routes

For dev3lop, build the set of real published URLs by scanning apps/site/src/pages/ (Astro routes) and the generated blog slugs, then load every redirect target (the second column) from apps/site/public/_redirects. Report any redirect whose to target resolves to a path that no longer exists or that itself 301s again (a chain that should be flattened to a single hop). Watch specifically for the dev3lop.dom-instead-of-dev3lop.com typo class called out in TODO-REDIRECTS.md (it lists this as a broken-target class to review in the 244-rule redirection-dev3lop-com-april-23-2026.json import — confirm whether any survive in the committed _redirects), and for stale /services/general/* targets that wpToNewMapping in scripts/build-redirects.ts later moves under /services/workflow-automation/ (e.g. the Knime and Alteryx consulting pages). Output a table of broken-target rules with a suggested corrected target or a recommendation to 410 the source.
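Chain and broken-target detection comes down to following each target through the redirect map until it stops. The sketch below assumes site-relative targets only and an exact-match route set. `resolveFinalTarget`, `auditTargets`, and the finding shape are hypothetical names.

```typescript
// Follows hops through the redirect map until reaching a path that is not
// itself redirected. Returns null when the hops form a loop.
function resolveFinalTarget(start: string, redirects: Map<string, string>): string | null {
  const seen = new Set<string>();
  let current = start;
  for (;;) {
    if (seen.has(current)) return null;
    seen.add(current);
    const next = redirects.get(current);
    if (next === undefined) return current;
    current = next;
  }
}

interface TargetFinding {
  from: string;
  to: string;
  issue: "chain" | "loop" | "broken";
  suggested: string | null; // flattened single-hop target, when one exists
}

// Classifies every rule whose target loops, dead-ends, or redirects again.
function auditTargets(
  redirects: Map<string, string>,
  realRoutes: ReadonlySet<string>,
): TargetFinding[] {
  const findings: TargetFinding[] = [];
  redirects.forEach((to, from) => {
    const finalTarget = resolveFinalTarget(to, redirects);
    if (finalTarget === null) {
      findings.push({ from, to, issue: "loop", suggested: null });
    } else if (!realRoutes.has(finalTarget)) {
      findings.push({ from, to, issue: "broken", suggested: null });
    } else if (finalTarget !== to) {
      findings.push({ from, to, issue: "chain", suggested: finalTarget });
    }
  });
  return findings;
}
```

Absolute targets on other hosts would show up as "broken" under this model, so a real audit would filter those out or check them separately.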

Audit scripts/ for orphaned one-shot tooling

Inventory scripts/ and the root package.json scripts in dev3lop. For each .ts/.mjs (import-wp-xml.ts, scrape-wp.ts, heal-redirects.ts, build-redirects.ts, clean-mdx-junk.ts, launch-checklist.ts, build-data-products.ts, sync-relay-app.mjs, and the three files under scripts/page speed/), determine whether it is still wired to an npm script (note import:wp, build:wp-url-map, heal:redirects, scrape:wp, build:redirects, data:products, and the perf* set all exist) and whether its inputs still exist in the tree. Flag migration one-shots that consumed the now-removable 41.9MB WordPress XML (import-wp-xml.ts, scrape-wp.ts) and will never run again, scripts whose output is committed but whose source data is gone, and the directory named scripts/page speed/ with a literal space — which forces quoted invocation in every perf* package.json script and is fragile in shell and CI. Recommend archive (move to scripts/archive/ or out of the build path) vs delete for each, and note any that should become a documented, re-runnable command instead of a buried script.

Find data-dump files that should be gitignored, not committed

In dev3lop, scan for one-time exports or local scratch checked into git: the root dev3lop.WordPress.2026-04-24.xml, redirection-404-April 23, 2026.csv, redirection-dev3lop-com-april-23-2026.json, and everything under reports/. Compare against .gitignore, which already ignores files/, data/*.sqlite, *.log, and coverage/, and identify the categories it MISSES — notably the entire reports/ Lighthouse output and the root-level redirect CSV/JSON dumps. Propose the precise .gitignore additions, a git rm --cached command list, and for each removed file a one-line note on where the canonical copy should live (e.g. move the historical WordPress import to an out-of-tree archive, keep data/wp-url-map.json if a script still reads it) so nothing load-bearing is lost.


seo-ranger — search rankings

seo-ranger maps every target keyword to a single canonical page, tracks where dev3lop.com actually ranks via a polite real-browser SERP check, and files content-gap and thin/duplicate-page fixes as tasks for site-hardener.

seo-ranger builds the rank-tracking habit dev3lop never had. fablescripts/seo/build_keywords.sh derives one keyword per services/** page title (55 service pages, lowercased) and appends hand-curated head terms from keywords-manual.tsv (so regeneration never clobbers manual entries), then serp_check.sh drives Safari over Apple Events — DuckDuckGo by default, one engine per run, jittered 8-15s delays — and writes ranks into the shared brain’s serp table for trend queries. It is observation of our own domain, not scraping at scale. Crucially it is recommendation-only: it never edits apps/site itself, it hands concrete page-path fixes to site-hardener as --area site tasks. Walking the 945-post blog collection it surfaced exactly the kind of work it is built to catch: real duplicate-content liabilities like the -2-suffixed WordPress import dupes (working-sessions-reduce-miscommunication-in-analytics-projects-2.md and its original collide on title) and a sitemap whose filter only excludes /admin/ while pages set noindex in frontmatter, both of which became hand-offs rather than silent edits.

As reusable tooling. As a marketplace template, seo-ranger ships as a portable spec: a keyword-derivation step pointed at the team’s own content collection, a polite real-browser SERP checker (no paid rank-API, no headless-scraper TOS risk), and a brain-backed history table — all driven from fablescripts/-style re-runnable scripts so every run is auditable. A team drops in their own domain and head terms, points build_keywords.sh at their content directory, and gets the same recommendation-only loop: rank trends in, prioritized content/meta/dedup tasks out. The hard separation between “ranger observes and recommends” and “hardener edits” is the reusable pattern teams want — it keeps an autonomous SEO agent from rewriting pages on its own.

Bug-hunting prompts for this domain:

noindex pages leaking into the sitemap

In apps/site/astro.config.ts the @astrojs/sitemap integration’s filter only excludes URLs containing ‘/admin/’ (filter: (page) => !page.includes('/admin/')). But pages can set seo.noindex: true and BaseLayout honors it — see apps/site/src/layouts/BaseLayout.astro lines ~49-52 which emit <meta name="robots" content="noindex, nofollow"> — and apps/site/src/pages/404.astro, 410.astro, and dashboard/index.astro all set noindex. Enumerate every route/content entry that is noindex’d or robots-disallowed (cross-check apps/site/public/robots.txt, which Disallows /admin/, /api/, and /_astro/) and confirm whether each is excluded from the generated sitemap-index.xml. Any page that is noindex but still listed in the sitemap is a contradictory crawl signal — list each offending URL and propose a precise filter predicate that also covers /api/ and the noindex’d routes. Recommendation-only: emit findings as a site-hardener task with exact paths, do not edit the config.
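One candidate for the tighter filter predicate is sketched below. The path lists are illustrative assumptions: `NOINDEX_PATHS` would in practice be derived from the pages and frontmatter that set seo.noindex, and the exact built paths depend on the site's output format. `includeInSitemap` is a hypothetical name.

```typescript
// Hypothetical lists for illustration; derive the real ones from the
// routes and content entries that set seo.noindex.
const NOINDEX_PATHS: readonly string[] = ["/410/", "/dashboard/"];
const EXCLUDED_PREFIXES: readonly string[] = ["/admin/", "/api/"];

// Intended for use as a sitemap filter: return false to drop a URL.
// Accepts either a full URL or a bare path.
function includeInSitemap(pageUrl: string): boolean {
  const path = pageUrl.replace(/^https?:\/\/[^/]+/, "");
  if (EXCLUDED_PREFIXES.some((prefix) => path.indexOf(prefix) === 0)) {
    return false;
  }
  return NOINDEX_PATHS.indexOf(path) === -1;
}
```

Matching on a path prefix rather than a substring means a post such as /blog/admin/tips/ stays in the sitemap, which the current includes('/admin/') check would drop.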

duplicate-content WordPress import dupes

The blog collection apps/site/src/content/blog/ has 945 posts, 11 of which carry a numeric suffix from the WordPress import. Three are true -2 collision dupes (working-sessions-reduce-miscommunication-in-analytics-projects-2.md, the-saas-you-picked-yesterday-will-be-more-expensive-tomorrow-2.md, using-analytics-to-measure-brand-sentiment-across-channels-2.md); the rest are year-suffixed (e.g. -2025, -2023) and may be legitimately distinct, so classify carefully. For each -2 slug, find the un-suffixed original, diff their title: (they collide on title) and body, and classify as true duplicate, near-duplicate, or distinct. For true/near duplicates, check apps/site/public/_redirects (a 3,954-rule generated file) for existing redirects on either slug and recommend a canonical winner plus a 301 from the loser. Write the result as a site-hardener task listing slug pairs and the exact redirect lines to add — do not delete files yourself.

keyword cannibalization — many pages, one query

Run fablescripts/seo/build_keywords.sh to regenerate fablescripts/seo/keywords.tsv, then detect cannibalization where build_keywords.sh’s lowercasing of service titles produces an exact collision with a manual head term. Concretely, services/analytics-bi/tableau.md lowercases to ‘tableau consulting’ and services/analytics-bi/power-bi-consulting-services.md to ‘power bi consulting’ — both also appear verbatim in keywords-manual.tsv — so two source pages now target one query, and dozens of blog posts overlap that service-page intent. For each contested keyword, list every competing page in apps/site/src/content/ (services/, blog/, work/**), identify which one is canonical for that term, and recommend internal-link or noindex/consolidation moves so only one page ranks. Output as a prioritized site-hardener task; this agent recommends, site-hardener edits.
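Detecting contested keywords is a group-by on the normalized term. The sketch below assumes each keyword row can be reduced to a keyword plus the source that produced it. The `KeywordRow` shape and `findCannibalized` are hypothetical, since the real keywords.tsv column layout is not shown here.

```typescript
interface KeywordRow {
  keyword: string;
  source: string; // page path or manual-list file that produced the keyword
}

// Groups rows by lowercased, trimmed keyword and returns only the keywords
// claimed by two or more distinct sources.
function findCannibalized(rows: KeywordRow[]): Map<string, string[]> {
  const byKeyword = new Map<string, string[]>();
  rows.forEach((row) => {
    const key = row.keyword.trim().toLowerCase();
    const sources = byKeyword.get(key) ?? [];
    if (sources.indexOf(row.source) === -1) sources.push(row.source);
    byKeyword.set(key, sources);
  });
  const contested = new Map<string, string[]>();
  byKeyword.forEach((sources, key) => {
    if (sources.length > 1) contested.set(key, sources);
  });
  return contested;
}
```

This only catches exact collisions after normalization. Finding the blog posts that overlap a service page's intent still needs the page-by-page review the prompt describes.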

canonical URL correctness for blog and services

Audit canonical generation. apps/site/src/pages/blog/[slug].astro line 48 hardcodes canonical: https://dev3lop.com/blog/${entry.slug}/, BaseLayout (apps/site/src/layouts/BaseLayout.astro line 26) computes currentUrl = seo.canonical || new URL(Astro.url.pathname, siteUrl).href, AND packages/seo/src/utils/canonical.ts exports a buildCanonicalUrl() with its own trailing-slash logic that is NOT imported anywhere in apps/site (dead/parallel code — confirm and flag). Verify: (1) every page type resolves to one canonical matching the URL the sitemap and _redirects serve (astro.config.ts uses build.format ‘directory’, so trailing slash is expected); (2) blog [slug].astro’s seo object omits noindex, so a post with frontmatter seo.noindex:true is silently still indexed even though the content config (apps/site/src/content/config.ts seoSchema, noindex default false) defines the field — confirm the omission and recommend passing noindex: entry.data.seo?.noindex; (3) no two posts emit the same canonical. Report mismatches as a site-hardener task with exact file and line.

thin and orphaned pages with no ranking potential

Find thin/orphaned content across apps/site/src/content/. Flag blog and service entries whose rendered body is under ~300 words, whose seo.description (or top-level description) is missing or duplicated across pages, or that no other page links to (orphans). Cross-reference apps/site/src/content/config.ts (seoSchema: title required, description required, keywords default [], noindex default false) to catch entries relying on a thin/boilerplate description, and check whether each thin page appears in sitemap-index.xml and in fablescripts/seo/keywords.tsv. For each, recommend expand, merge, noindex (via seo.noindex), or 410 — apps/site/src/pages/410.astro already exists as the gone pattern. Deliver as a ranked site-hardener task — recommendation only.

SERP checker rank-attribution and parser drift

Review the SERP rank pipeline for silent mis-attribution. fablescripts/seo/parse_serp.mjs unwraps DuckDuckGo /l/?uddg= links, drops engine-noise hosts via the NOISE regex array, dedupes by host.replace(/^www\./,'') + pathname in DOM order, and reports the first organic index where host.replace(/^www\./,'').endsWith(target). Stress-test: (1) that endsWith match would also match any host that merely ends in the target string, such as ‘notdev3lop.com’ or ‘evil-dev3lop.com’ (it would not match ‘dev3lop.com.evil.com’, which only a looser substring check would let through); confirm and tighten to a host-equality or dot-boundary check; (2) when the target is absent it prints \t\t${organic.length} and serp_check.sh records total-only with no rank, but a stale/empty Safari DOM (CAPTCHA or consent wall from serp_query.jxa.js) would log rank-missing as if dev3lop dropped, so add a sanity floor on organic count before recording; (3) verify serp_check.sh honors etiquette (default N=5, sleep $((8 + RANDOM % 8)) jitter, one engine) and that it exits 1 with the ‘Allow JavaScript from Apple Events’ hint when Apple Events JS is disabled. Report concrete fixes to the seo-ranger scripts; this is the one area seo-ranger owns directly.
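A boundary-aware replacement for the endsWith check, plus the sanity floor from item (2), might look like the sketch below. `hostMatchesTarget` and `shouldRecordRank` are hypothetical names, and counting subdomains as a match is an assumption about the intended behavior.

```typescript
// True when host is the target domain itself or a subdomain of it.
// A leading "www." is ignored on both sides.
function hostMatchesTarget(host: string, target: string): boolean {
  const h = host.toLowerCase().replace(/^www\./, "");
  const t = target.toLowerCase().replace(/^www\./, "");
  return h === t || h.endsWith("." + t);
}

// Sanity floor: refuse to record a rank (or a "not found") when the page
// yielded too few organic results to be a real SERP.
function shouldRecordRank(organicCount: number, minOrganic: number): boolean {
  return organicCount >= minOrganic;
}
```

Requiring either equality or a leading dot is what rules out look-alike hosts. If subdomains should not count toward the rank, the check reduces to plain equality.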


An autonomous AI agent mid-stride running an automated code-review and hardening pass across dev3lop.com

From an internal fleet to agent tooling you can buy

We already run a workspace agent marketplace — portable templates with spec JSON and consulting to deploy them safely. The hardening fleet in this post is the same idea pointed inward, and every agent in it is a candidate template.

The packaging is straightforward because the agents were already built to be portable. Each one is a charter plus a set of fablescripts/-style scripts and a spec that points at a handful of globs — a Terraform directory, a Hono routes folder, a content collection, a redirect file. To run infra-hardener against your own single-VM stack, you change four paths and it runs the same playbook: prove the backup restores, model disk and RAM exhaustion on your real box size, check that your health endpoint actually fails when a dependency fails. To run site-hardener on your post-migration marketing site, you point it at your content directory and it runs Wayback asset recovery, duplicate-slug and canonical audits, redirect-map linting, and a CSP-tightening pass.

But the more valuable product is not any single agent — it is the operating model underneath them, which is what most teams reaching for “AI agents on our codebase” actually get wrong:

  • One narrow agent per area, with a charter that says what to hand off. An agent that can touch everything will eventually touch the wrong thing. seo-ranger recommends; site-hardener edits. That hard separation is a feature, not a limitation.
  • Shared memory so findings survive the session. Without it, agent number two re-discovers what agent number one already knew, and neither leaves a trail.
  • Locks so parallel agents don’t collide, and no ghost commands so every action is a re-runnable, reviewable script.
  • Adversarial verification before action. Make a second agent try to refute every serious finding. It is the cheapest insurance you can buy against shipping confident nonsense.

That is the thing worth productizing: not a clever prompt, but a disciplined way to put a fleet of agents to work on a real codebase without them lying to you or stepping on each other.

What’s next

The sprint closed a backlog item, not the backlog. Still queued, and tracked in the brain:

  • Relay: CSRF protection on state-changing POSTs (the session cookie is SameSite=None for the cross-origin SPA, so an origin or token check is needed), a hard fail-fast when SESSION_SECRET is missing in production, and per-user rate limiting on message posts and WebSocket sends.
  • API: decide each Netlify function’s fate — wire real persistence or delete the mock — and add zod validation, method allow-listing, CORS, and numeric bounds to whatever stays.
  • Infra: apply the backup fix to the live droplet and run a restore drill, add Docker log rotation and image pruning, and set container memory limits on the 1GB box.
  • Debt: a CI step that regenerates the redirect file and fails on drift, and an honest archive-vs-delete pass over the tracked binary bloat.

Each of those is a ticket with an agent already assigned and a prompt already written — several of them are in the lists above.

If you have a site that survived a CMS migration, or a small backend you have not had time to harden, this is exactly the kind of work we do — as a fleet of agents, or as humans, or both. Start a project and tell us where your basement is.