Hydra — white-hat security training

Repo: MillionOnMars/hydra · Active tactic: TwinSpires meeting this week

One-page handout (for the meeting): /hydra-twinspires-overview.html — print- and screenshot-friendly.

Editor's note. This page is groomed from a PR inventory across Lumina5, Polaris, and VibesWire (April 7 → May 14, 2026). All numbers and findings link to the actual merged work.

What it is

Hydra is Bike4Mind's proprietary AI-guided security framework — a coordinated stack of three specialized agents built on the same general-purpose agentic AI substrate that powers everything else in the B4M platform: Deep Agents at bike4mind.com, OptiHashi for the IonQ engagement, the Polaris analyst intelligence work for The Futurum Group, and the rest of the product portfolio.

hydra-poc.mjs — 149 attack tests, broad reconnaissance probes
hydra-full.mjs — authenticated IDOR + endpoint sweep + payload fuzzing
hydra-heartbeat.mjs — continuous diff scanner with persistent state, auto-files GitHub issues with LLM-enriched context
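
The heartbeat's "continuous diff with persistent state" idea can be sketched roughly as follows. This is an illustration only, not the contents of hydra-heartbeat.mjs: the finding shape, the fingerprint function, and the state layout are all assumptions.

```javascript
// Hypothetical sketch of a diff-against-persisted-state step.
// The finding shape ({ endpoint, category, severity }) is an assumption.

// Stable key so the same finding is recognized across runs.
function fingerprint(finding) {
  return `${finding.category}:${finding.endpoint}`;
}

// Compare the current scan with the fingerprints saved by the previous run.
// Returns the findings that are new, plus the state to persist for next time.
function diffFindings(previousFingerprints, currentFindings) {
  const seen = new Set(previousFingerprints);
  const fresh = currentFindings.filter((f) => !seen.has(fingerprint(f)));
  const nextState = [...new Set(currentFindings.map(fingerprint))].sort();
  return { fresh, nextState };
}

// Example: one finding was already known, one is new.
const previous = ["no-baseapi:/api/debug/database"];
const current = [
  { endpoint: "/api/debug/database", category: "no-baseapi", severity: "CRITICAL" },
  { endpoint: "/api/external-image", category: "ssrf", severity: "CRITICAL" },
];
const { fresh, nextState } = diffFindings(previous, current);
console.log(fresh.map(fingerprint)); // only the new finding would be filed as an issue
```

Because the persisted state holds only the current run's fingerprints, a finding that is fixed and later reappears would count as new again, which is the behavior a regression detector probably wants.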

The methodology: map every API endpoint; sample endpoints at random (Monte Carlo) so the audit is not steered by what reviewers already expect to find; reason over the codebase to name emergent vulnerability patterns (e.g. “No-BaseApi endpoints”); audit every instance of each pattern; then bake detection into the heartbeat scanner so the same class can't come back.
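
The sampling step might look something like the sketch below. It assumes "Monte Carlo" here means drawing a uniform random subset of the endpoint inventory for deep review; the function names, the seed, and the placeholder route names are illustrative, and the batch size of 25 is arbitrary.

```javascript
// Hypothetical sketch: draw a uniform random sample of endpoints for review.
// mulberry32 is a small, widely used seeded PRNG, chosen so a run can be replayed.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Partial Fisher-Yates shuffle: pick k distinct endpoints without replacement.
function sampleEndpoints(endpoints, k, seed) {
  const rand = mulberry32(seed);
  const pool = endpoints.slice();
  const n = Math.min(k, pool.length);
  for (let i = 0; i < n; i++) {
    const j = i + Math.floor(rand() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, n);
}

// Placeholder inventory sized like the 622-endpoint map; route names are made up.
const inventory = Array.from({ length: 622 }, (_, i) => `/api/route-${i}`);
const batch = sampleEndpoints(inventory, 25, 42);
console.log(batch.length); // 25
```

Sampling without replacement gives every endpoint the same chance of landing in a review batch, regardless of how suspicious it looks up front.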

Hydra is the realized version of what was sketched as “Mythos Agentic Security” in earlier strategy notes — B4M's productionized continuous defense layer, built on the platform we already own.

The headline result — what every other scanner missed

Before Hydra ran, the B4M app was already protected by seven mainstream security tools, commercial and open-source, running continuously:

Tool	Category	What it caught
Semgrep	Static SAST (pattern rules)	✗ Missed every finding
OWASP ZAP	Dynamic web scanner	✗ Missed every finding
Snyk	SCA + code scanning	✗ Missed every finding
AWS Inspector	Workload vulnerability mgmt	✗ Missed every finding
AWS GuardDuty	Account / behavior monitoring	✗ Missed every finding
Gitleaks	Secret scanning	✗ Missed every finding
AWS WAF	Runtime request filtering	✗ Missed every finding
🐉 Hydra	AI-guided application-logic auditor	✓ 9+ CRITICALs in one afternoon

Hydra found 5 production-live CRITICALs across 622 endpoints. The heartbeat scanner provides continuous monitoring, new findings are auto-filed as GitHub issues, and coverage scales with the platform.

Hydra does not replace any of those tools — it covers the layer none of them can see: application logic. “Should this endpoint exist?” “Does this user actually own this resource?” “Does the JWT actually bind the OAuth flow to this browser?” Pattern matchers don't answer those questions. Hydra does. Both layers are needed.

At a glance

Metric	Number
Security PRs merged	35+ (Lumina5: 17 merged + 2 open · Polaris: 17 · VibesWire: 1)
Hydra-tagged GitHub issues filed	40+ across two issue-tracked repos (all closed)
Distinct vulnerability categories	12
P0 / CRITICALs surfaced	9+
Time period	April 7 → May 14, 2026 (most concentrated remediation: April 8–30)
Continuous coverage	Heartbeat scanner runs 24/7; new findings auto-file as GitHub issues with LLM-enriched context

What Hydra surfaces — the 12 categories

Ranked by frequency across all three repos:

No-BaseApi endpoints (auth bypass) — by far the dominant pattern. API files that skip the standard auth middleware. Polaris created a dedicated label for it; both repos now ship CI checks that hard-fail any PR introducing a new one.
IDOR / missing ownership checks — app-files, leads, business-links, signal runs, datasets, keep/command, user-profile. Multiple P0s in this class.
Debug endpoints in production — the pattern: dev-only handler shipped to prod with no auth.
Info disclosure in unauthenticated endpoints — S3 bucket names, MongoDB hostname, GitHub integration status, AWS account ID, SQS ARNs, org name/domain in invite responses.
SSRF — open proxy + IPv4-mapped IPv6 bypass + RFC 2544 benchmark range + image cache with no SSRF guard.
OAuth hijacking — GitHub OAuth callback derived userId solely from the state JWT; if leaked, attacker could link their GitHub to a victim's account.
Credential / secret leakage to logs — VibesWire test handler logging two API keys to CloudWatch on every invocation; emergency-login over-logging.
Input validation gaps — regex injection, missing Zod, NoSQL operator injection.
Rate-limit gaps and bypasses — emergency-login (no limiter), comment-creation (X-Forwarded-For precedence bug), password reset.
Missing security headers — HSTS, Permissions-Policy, CSP, X-Content-Type-Options, X-Frame-Options, Referrer-Policy. Found in every repo.
Stripe / billing fraud (red-team extension) — chargeback arbitrage, subscription cycling, missing Stripe idempotency on payment_intent, voice-session credit reservation, transcription upload abuse, admin loginAs without MFA. Modeled at ~$20K/wk worst-case exposure on Lumina5.
Architecture cleanups — helper modules misplaced under pages/api/ (route crawlers ignore them, scanners trip on them), TOCTOU race on invite codes, JWT_SECRET fallback to 'your-secret-key' replaced with fail-closed.
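
As an illustration of the last item, a fail-closed secret lookup could look like the sketch below. The function name is hypothetical, and rejecting the old placeholder value explicitly is an added assumption rather than something the PR is known to do.

```javascript
// Hypothetical sketch of a fail-closed secret lookup.
// `env` is passed in so the function is easy to test; a real handler would pass process.env.
function requireJwtSecret(env) {
  const secret = env.JWT_SECRET;
  // Fail closed: a missing or placeholder secret is an error, never a silent default.
  if (typeof secret !== "string" || secret.length === 0 || secret === "your-secret-key") {
    throw new Error("JWT_SECRET is not configured; refusing to sign or verify tokens");
  }
  return secret;
}

// The fail-open shape that the text describes replacing:
//   const secret = process.env.JWT_SECRET || 'your-secret-key';
```

With the fallback in place, a deployment that loses its environment variable keeps running and signs tokens with a publicly guessable key; with the fail-closed version it stops instead.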

Standout wins

1. `/api/debug/database` — 9 months of live MongoDB hostname disclosure

Lumina5 PR #7708. An unauthenticated endpoint returned the production MongoDB Atlas hostname (cluster0.sreox.mongodb.net), database name, and SST stage to any caller. It was created in June 2025 during a debugging session and touched by three developers, yet nobody noticed auth: false. Confirmed live: curl https://app.bike4mind.com/api/debug/database returned the cluster hostname. The file was deleted entirely. Nine months. Three reviewers. Every commercial scanner. A ~$6 Hydra pass found and killed it in one afternoon.

2. `/api/external-image` — SSRF + the IPv4-mapped IPv6 bypass

Lumina5 PRs #7725 + #7741. An endpoint that fetched any URL on the internet and uploaded to S3 with zero auth. Initial fix added an SSRF block list. Then Hydra ran a second pass and discovered https://[::ffff:127.0.0.1]/ bypassed every check — Node's URL parser returns [::ffff:7f00:1] which matched neither the IPv4 regex nor the IPv6 prefix list. Loopback was reachable. AWS metadata service (169.254.169.254) was reachable. Fix: hostname normalization, redirect re-check, ReadableStream reader that aborts at 10 MB so a malicious server can't OOM the Lambda.
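
The hostname-normalization part of the fix can be sketched as follows. This is a minimal illustration under assumptions: the function names and the exact block list are invented here and are not taken from the PRs. It relies on the WHATWG URL parser (the one Node exposes as the global URL), which serializes an IPv4-mapped IPv6 literal in hex groups.

```javascript
// Hypothetical sketch: normalize a URL's host before checking it against blocked ranges.

// Convert dotted IPv4 text to a 32-bit unsigned integer, or null if it is not IPv4.
function ipv4ToInt(host) {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return null;
  const octets = m.slice(1).map(Number);
  if (octets.some((o) => o > 255)) return null;
  return ((octets[0] << 24) | (octets[1] << 16) | (octets[2] << 8) | octets[3]) >>> 0;
}

// Unwrap an IPv4-mapped IPv6 host such as "::ffff:7f00:1" into "127.0.0.1".
function unwrapMappedIPv4(host) {
  const m = /^::ffff:([0-9a-f]{1,4}):([0-9a-f]{1,4})$/i.exec(host);
  if (!m) return host;
  const hi = parseInt(m[1], 16);
  const lo = parseInt(m[2], 16);
  return [hi >> 8, hi & 255, lo >> 8, lo & 255].join(".");
}

// [network, prefix length] pairs for ranges the fetcher should never reach.
const BLOCKED_V4 = [
  ["0.0.0.0", 8], ["10.0.0.0", 8], ["127.0.0.0", 8], ["169.254.0.0", 16],
  ["172.16.0.0", 12], ["192.168.0.0", 16],
  ["198.18.0.0", 15], // RFC 2544 benchmark range
];

function isBlockedHost(rawUrl) {
  // The URL parser lowercases the host and keeps the brackets around IPv6 literals.
  const host = unwrapMappedIPv4(new URL(rawUrl).hostname.replace(/^\[|\]$/g, ""));
  const ip = ipv4ToInt(host);
  if (ip !== null) {
    return BLOCKED_V4.some(([net, bits]) => {
      const mask = (0xffffffff << (32 - bits)) >>> 0;
      return ((ip & mask) >>> 0) === ((ipv4ToInt(net) & mask) >>> 0);
    });
  }
  if (host.includes(":")) {
    // IPv6 literals: loopback, unspecified, unique-local (fc00::/7), link-local (fe80::/10).
    return host === "::1" || host === "::" ||
      /^f[cd][0-9a-f]{2}:/.test(host) || /^fe[89ab][0-9a-f]:/.test(host);
  }
  return host === "localhost";
}

console.log(isBlockedHost("https://[::ffff:127.0.0.1]/")); // true
```

The sketch only covers literal hosts. A hostname that resolves to an internal address, or a redirect that points at one, would need the same check applied to the resolved address and to each redirect target, which is what the redirect re-check in the fix addresses.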

3. GitHub OAuth account-linking hijack

Lumina5 PR #7741 (closing #7733). The OAuth callback derived userId entirely from the state JWT. If that state token leaked through any side channel — URL logs, browser history, corporate proxy caches — an attacker could complete the OAuth flow in a different browser and silently link their GitHub account to the victim's B4M account. Fix: short-lived gh_oauth_uid cookie (HttpOnly; Secure; SameSite=Lax; Max-Age=600) binding the OAuth flow to the originating browser; callback cross-checks the cookie against the state token's userId and rejects mismatches.
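
A rough sketch of the cookie cross-check is shown below. Only the cookie name and attributes come from the description above; the function names, the way the cookie is parsed, and the assumption that the state JWT has already been verified before this check runs are all illustrative.

```javascript
// Hypothetical sketch of the browser-binding check.
const COOKIE_NAME = "gh_oauth_uid";

// Value for the Set-Cookie header sent when the OAuth flow starts.
function buildBindingCookie(userId) {
  return `${COOKIE_NAME}=${encodeURIComponent(userId)}; HttpOnly; Secure; SameSite=Lax; Max-Age=600`;
}

// Read one cookie from a raw Cookie request header.
function readCookie(cookieHeader, name) {
  for (const part of (cookieHeader || "").split(";")) {
    const idx = part.indexOf("=");
    if (idx === -1) continue;
    if (part.slice(0, idx).trim() !== name) continue;
    try {
      return decodeURIComponent(part.slice(idx + 1).trim());
    } catch {
      return null; // malformed percent-encoding
    }
  }
  return null;
}

// In the callback: accept only if the cookie's userId matches the userId
// taken from the (already verified) state token.
function isSameBrowser(stateUserId, cookieHeader) {
  const cookieUserId = readCookie(cookieHeader, COOKIE_NAME);
  return typeof stateUserId === "string" && stateUserId.length > 0 && cookieUserId === stateUserId;
}
```

For brevity the sketch stores the raw userId in the cookie. Anyone who has read the userId out of a leaked state token could forge such a cookie in their own browser, so a production version would presumably sign the cookie value or use a random nonce that the state token commits to.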

4. VibesWire API-key leak to CloudWatch

VibesWire PR #10. The /api/test handler logged both Resource.B4mApiKey.value AND Resource.GuardianApiKey.value to CloudWatch on every invocation. HTTP response was innocuous ({ok:true}) so Hydra's HTTP scanner rated it MEDIUM. The CRITICAL was caught on a code-review pass during the fix. Handler deleted, route removed from sst.config.ts, both API keys rotated. The lesson: scanners attack from outside, code review attacks from inside — both are necessary.

5. `baseApi()` enforcement in CI — the process change

Lumina5 PR #8097 + Polaris PR #3652. Nothing in either repo prevented a new API file from being added without auth. The 9-month-old debug endpoint had been invisible. Now both repos run a structural scan in CI (scripts/check-no-baseapi.sh) that hard-fails any PR introducing a pages/api/*.ts file without baseApi() unless it's on an allowlist with a comment explaining the alternative auth mechanism. Wired into husky pre-commit too. Allowlist is 9 entries on Lumina5 and 5 on Polaris, each documented. Hydra didn't just find the bugs — it changed the process so the same bug class can't reach production again.
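
The logic of such a structural scan can be sketched in a few lines. The real check is a shell script whose contents are not shown here, so everything below is an assumption: the in-memory file map, the allowlist shape, the regex, the example routes, and the way baseApi() is called in the sample sources.

```javascript
// Hypothetical sketch of the structural check.
// files: { [path]: source text }; allowlist: { [path]: reason for the alternative auth }
function findUnprotectedRoutes(files, allowlist) {
  const violations = [];
  for (const [path, source] of Object.entries(files)) {
    const isApiRoute = path.startsWith("pages/api/") && path.endsWith(".ts");
    if (!isApiRoute) continue;
    if (/\bbaseApi\s*\(/.test(source)) continue; // wrapped in the standard middleware
    const reason = allowlist[path];
    if (typeof reason === "string" && reason.trim() !== "") continue; // documented exception
    violations.push(path);
  }
  return violations.sort();
}

// Made-up inputs; the baseApi() call shape is a guess.
const files = {
  "pages/api/users/[id].ts": "export default baseApi().get(handler);",
  "pages/api/debug/database.ts": "export default async function handler(req, res) {}",
  "pages/api/webhooks/stripe.ts": "export default async function handler(req, res) {}",
  "lib/util.ts": "export const x = 1;",
};
const allowlist = { "pages/api/webhooks/stripe.ts": "verified by webhook signature" };
const violations = findUnprotectedRoutes(files, allowlist);
console.log(violations); // only pages/api/debug/database.ts is reported
```

A CI job would exit non-zero when the list is non-empty. A text match like this can be fooled by a comment that mentions baseApi(, so it is a tripwire for accidental omissions rather than proof that a route is authenticated.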

Honorable mention: the red-team threat model

Lumina5 PRs #8224 (model) + #8265 (fixes). #8224 is a 457-addition threat-model PR that prosecutes B4M's own billing flow for chargeback fraud, agent farms, and credit-system abuse, modeling ~$20K/wk worst-case exposure. The follow-up PR, #8265, ships Stripe idempotency, voice-session credit reservation, transcription file-size limits, admin loginAs MFA step-up, per-tier agent caps, and absolute anomaly thresholds.

What got hardened, per repo

Polaris (TFG)

17 merged Hydra-related PRs (#3257–#3674 window). Concentration: April 9–30. Polaris created a Hydra-specific label taxonomy: hydra:idor, hydra:no-baseapi, hydra:architecture, hydra:info-disclosure. 22+ tagged issues, all closed.

Highlights:

signal-progress endpoint hardening with run-owner checks (#3257)
leads/[id] ownership scoping (#3258, #3600)
app-files mutation scoping (#3259)
business-links DELETE admin gate (#3260)
MCP server endpoint auth (#3286)
invite generate hardening with uniform 404 + dedupe (#3333)
datasets IDOR via unconditional CASL read (#3337)
input validation hardening across 6 LiveOps fixes (#3476)
Hydra cleanup with CI heartbeat for baseApi enforcement (#3652)
strip orgName/domain from invite-generate (#3654)
HSTS header on all responses (#3674)

Lumina5

17 merged Hydra-related PRs + 2 open (#7708–#8298 window). The original target. 20+ Hydra-tagged issues filed, all closed.

Highlights:

original Hydra security sweep with 8 fixes (#7708)
app-files scoped to owner (#7724)
P0 endpoints wrapped in baseApi for SSRF + GitHub MCP exposure + OAuth hijack (#7725)
MCP/github status protected (#7726)
OAuth session cross-check + JWT fallback removal + IPv6 SSRF bypass closed + redirect re-check + streaming size cap (#7741)
checkBlockedIP on emergency-login (#7757)
serverConfig split into public + auth'd endpoints (#7794)
emergency-login behind feature flag (#7795)
/api/users/:id restricted to public DTO for non-owner non-admin (#7858)
CI baseApi enforcement (#8097)
billing-fraud red-team output (#8265)
RFC 2544 benchmark IP range blocked in SSRF protection (#8298)

Open: SecOps Triage / Active Defense integration (#8375), red-team threat model doc (#8224).

VibesWire / BedrockNews

1 bundled remediation PR (#10) — Hydra adapted to attack vibeswire.com production in ~40 minutes (hydra-vibeswire.mjs, 587 lines). 147 tests / 5 attack heads / 77-second wall-clock sweep.

Findings: 1 CRITICAL (API keys to CloudWatch, caught on review), 3 HIGH (debug endpoints leaking DynamoDB / AWS account ID, X-Forwarded-For rate-limit bypass, stored XSS in comment body), 6 LOW (missing security headers, 500-on-bad-cursor, log-injection vector).
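
The text does not say exactly how the VibesWire limiter derived its key. A common form of an X-Forwarded-For bypass is keying the limiter on the leftmost entry of the header, which the client controls. The sketch below shows one way to avoid that, assuming a known number of trusted proxies in front of the handler; the function name and parameters are hypothetical.

```javascript
// Hypothetical sketch: choose the rate-limit key from the request.
// `trustedProxyCount` is an assumed deployment parameter: how many proxies you
// control append to X-Forwarded-For before the request reaches the handler.
function rateLimitKey(headers, socketAddress, trustedProxyCount) {
  const raw = headers["x-forwarded-for"] || "";
  const hops = raw.split(",").map((h) => h.trim()).filter(Boolean);
  // Entries to the left of the trusted hops are client-supplied and spoofable,
  // so count from the right: take the entry added by the outermost trusted proxy.
  if (trustedProxyCount > 0 && hops.length >= trustedProxyCount) {
    return hops[hops.length - trustedProxyCount];
  }
  // No trusted proxy, or a header shorter than expected: use the socket peer.
  return socketAddress;
}

// Two requests from the same client with different spoofed leftmost entries.
const first = rateLimitKey({ "x-forwarded-for": "1.1.1.1, 203.0.113.7" }, "10.0.0.5", 1);
const second = rateLimitKey({ "x-forwarded-for": "2.2.2.2, 203.0.113.7" }, "10.0.0.5", 1);
console.log(first === second); // true: both map to the same limiter bucket
```

A limiter keyed on the leftmost entry would have put those two requests in different buckets, letting an attacker reset the limit by changing one header.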

What Hydra is not

Not Claude Mythos. Hydra is an approximation of the gated Glasswing methodology built on publicly available models. Say so plainly on the first slide.
Not zero-day discovery. The value is application-logic coverage at scale, not novel CVEs.
Not a replacement for signature tools. Semgrep, ZAP, Gitleaks, the WAF all stay. Hydra catches what those layers can't see.
Not done. The heartbeat scanner runs continuously; new findings file new issues; the threat model gets new sections; the SecOps Triage / Active Defense pipeline (#8375 open) operationalizes it further.

Connection to other docs

OpenClaw security lessons — the architectural commitments Hydra exists to validate (hard execution boundaries, outbound-only by default, no ambient credentials, untrusted content quarantine).
B4M V3 Blueprint — the package / blueprint boundaries Hydra probes for divergence; auth and security must be packages, not blueprints, because of findings like the No-BaseApi pattern.
Mission — Security as Architectural Property — the operating principle Hydra enforces in practice.
Mission — AI-Augmented Review — Hydra is the same pattern applied to security: AI catches structural classes, humans review the spicy edge cases.
Connections page — Hydra connects to every B4M product as the validation layer. Its findings improve the substrate that every other Forge and Engine product runs on.
Active tactic: Hydra → TwinSpires meeting — this week.
Aegis — the defensive sibling. Same operation, inverted output: agentic vuln sweep across the top-N OSS packages with responsible disclosure as the deliverable. Parked in Lab pending an ROI model that pencils. The shield to Hydra's sword.
