Hydra — white-hat security training
Repo: MillionOnMars/hydra · Active tactic: TwinSpires meeting this week
One-page handout (for the meeting):
/hydra-twinspires-overview.html (print- and screenshot-friendly)
Editor's note. This page is groomed from a PR inventory across Lumina5, Polaris, and VibesWire (April 7 → May 14, 2026). All numbers and findings link to the actual merged work.
What it is
Hydra is Bike4Mind's proprietary AI-guided security framework — a coordinated stack of three specialized agents built on the same general-purpose agentic AI substrate that powers everything else in the B4M platform: Deep Agents at bike4mind.com, OptiHashi for the IonQ engagement, the Polaris analyst intelligence work for The Futurum Group, and the rest of the product portfolio.
- hydra-poc.mjs: 149 attack tests, broad reconnaissance probes
- hydra-full.mjs: authenticated IDOR + endpoint sweep + payload fuzzing
- hydra-heartbeat.mjs: continuous diff scanner with persistent state; auto-files GitHub issues with LLM-enriched context
The methodology: map every API endpoint, sample with Monte Carlo to break reviewer priors, reason over the codebase to name emergent vulnerability patterns (e.g. “No-BaseApi endpoints”), audit every instance of the pattern, then bake detection into the heartbeat scanner so the same class can't come back.
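The sampling step of the methodology can be sketched as a seeded random draw over the endpoint inventory, so reviewers inspect routes they would not have picked themselves. This is a minimal illustration, not Hydra's actual sampler: the function names, the seeding scheme, and the idea of sampling without replacement are all assumptions.

```javascript
// Hypothetical sketch: seeded random sampling of an endpoint inventory.
// mulberry32 is a small, widely used PRNG; a fixed seed makes a run reproducible.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Draw up to `count` distinct endpoints uniformly at random
// (partial Fisher-Yates shuffle on a copy; the input is not mutated).
function sampleEndpoints(endpoints, count, seed) {
  const pool = endpoints.slice();
  const rand = mulberry32(seed);
  const n = Math.max(0, Math.min(count, pool.length));
  for (let i = 0; i < n; i++) {
    const j = i + Math.floor(rand() * (pool.length - i));
    const tmp = pool[i];
    pool[i] = pool[j];
    pool[j] = tmp;
  }
  return pool.slice(0, n);
}
```

Recording the seed alongside the findings lets a later run audit exactly the same sample.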
Hydra is the realized version of what was sketched as “Mythos Agentic Security” in earlier strategy notes — B4M's productionized continuous defense layer, built on the platform we already own.
The headline result — what every other scanner missed
Before Hydra ran, the B4M app was already protected by seven mainstream security tools, commercial and open-source, running continuously:
| Tool | Category | What it caught |
|---|---|---|
| Semgrep | Static SAST (pattern rules) | ✗ Missed every finding |
| OWASP ZAP | Dynamic web scanner | ✗ Missed every finding |
| Snyk | SCA + code scanning | ✗ Missed every finding |
| AWS Inspector | Workload vulnerability mgmt | ✗ Missed every finding |
| AWS GuardDuty | Account / behavior monitoring | ✗ Missed every finding |
| Gitleaks | Secret scanning | ✗ Missed every finding |
| AWS WAF | Runtime request filtering | ✗ Missed every finding |
| 🐉 Hydra | AI-guided application-logic auditor | ✓ 9+ CRITICALs in one afternoon |
Hydra found 5 production-live CRITICALs across 622 endpoints. The heartbeat scanner provides continuous monitoring, new findings are auto-filed as GitHub issues, and coverage scales with the platform.
Hydra does not replace any of those tools — it covers the layer none of them can see: application logic. “Should this endpoint exist?” “Does this user actually own this resource?” “Does the JWT actually bind the OAuth flow to this browser?” Pattern matchers don't answer those questions. Hydra does. Both layers are needed.
At a glance
| Metric | Number |
|---|---|
| Security PRs merged | 35+ (Lumina5: 17 merged + 2 open · Polaris: 17 · VibesWire: 1) |
| Hydra-tagged GitHub issues filed | 40+ across two issue-tracked repos (all closed) |
| Distinct vulnerability categories | 12 |
| P0 / CRITICALs surfaced | 9+ |
| Time period | April 7 → May 14, 2026 (most concentrated remediation: April 8–30) |
| Continuous coverage | Heartbeat scanner runs 24/7; new findings auto-file as GitHub issues with LLM-enriched context |
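The continuous-coverage row describes a diff scanner: each run compares current findings against persisted state and files only what is new. A minimal sketch of that comparison follows. The finding shape, the fingerprint scheme (category plus endpoint), and the function names are hypothetical and not taken from hydra-heartbeat.mjs.

```javascript
// Hypothetical sketch of a heartbeat diff step.
// A finding is assumed to look like { endpoint, category, severity }.
function fingerprint(finding) {
  return `${finding.category}|${finding.endpoint}`;
}

// Returns the findings that were absent from the previous run, plus the
// fingerprints to persist for the next run.
function diffFindings(previousFingerprints, currentFindings) {
  const previous = new Set(previousFingerprints);
  const current = new Set();
  const fresh = [];
  for (const finding of currentFindings) {
    const fp = fingerprint(finding);
    if (current.has(fp)) continue; // skip duplicates within one run
    current.add(fp);
    if (!previous.has(fp)) fresh.push(finding);
  }
  return { fresh, nextState: [...current] };
}
```

Persisting only the current run's fingerprints is a design choice in this sketch: a finding that was fixed and later reappears is reported again instead of being silently treated as known.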
What Hydra surfaces — the 12 categories
Ranked by frequency across all three repos:
- No-BaseApi endpoints (auth bypass): by far the dominant pattern. API files that skip the standard auth middleware. Polaris created a dedicated label for it; Lumina5 and Polaris now ship CI checks that hard-fail any PR introducing a new one.
- IDOR / missing ownership checks: app-files, leads, business-links, signal runs, datasets, keep/command, user-profile. Multiple P0s in this class.
- Debug endpoints in production: dev-only handlers shipped to prod with no auth.
- Info disclosure in unauthenticated endpoints: S3 bucket names, MongoDB hostname, GitHub integration status, AWS account ID, SQS ARNs, org name/domain in invite responses.
- SSRF: open proxy + IPv4-mapped IPv6 bypass + RFC 2544 benchmark range + image cache with no SSRF guard.
- OAuth hijacking: the GitHub OAuth callback derived userId solely from the state JWT; if that token leaked, an attacker could link their GitHub to a victim's account.
- Credential / secret leakage to logs: VibesWire test handler logging two API keys to CloudWatch on every invocation; emergency-login over-logging.
- Input validation gaps: regex injection, missing Zod validation, NoSQL operator injection.
- Rate-limit gaps and bypasses: emergency-login (no limiter), comment-creation (X-Forwarded-For precedence bug), password reset.
- Missing security headers: HSTS, Permissions-Policy, CSP, X-Content-Type-Options, X-Frame-Options, Referrer-Policy. Found in every repo.
- Stripe / billing fraud (red-team extension): chargeback arbitrage, subscription cycling, missing Stripe idempotency on payment_intent, voice-session credit reservation, transcription upload abuse, admin loginAs without MFA. Modeled at ~$20K/wk worst-case exposure on Lumina5.
- Architecture cleanups: helper modules misplaced under pages/api/ (route crawlers ignore them, scanners trip on them), TOCTOU race on invite codes, JWT_SECRET fallback to 'your-secret-key' replaced with fail-closed.
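The fail-closed replacement for the JWT_SECRET fallback in the last category can be sketched as follows. This is an illustration only: the helper name and the extra placeholder values beyond 'your-secret-key' are assumptions, not the code from the merged PR.

```javascript
// Hypothetical helper: resolve the JWT signing secret or refuse to continue.
// Only 'your-secret-key' comes from the source; the other placeholders are assumed.
const KNOWN_PLACEHOLDERS = new Set(["your-secret-key", "changeme", "secret"]);

function requireJwtSecret(env) {
  const secret = env.JWT_SECRET;
  if (typeof secret !== "string" || secret.trim() === "") {
    // Fail closed: no silent fallback to a default value.
    throw new Error("JWT_SECRET is not set; refusing to fall back to a default");
  }
  if (KNOWN_PLACEHOLDERS.has(secret)) {
    throw new Error("JWT_SECRET is a known placeholder value");
  }
  return secret;
}
```

Calling requireJwtSecret(process.env) once at startup turns a misconfigured deployment into a visible crash instead of a service that signs tokens with a guessable key.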
Standout wins
1. /api/debug/database — 9 months of live MongoDB hostname disclosure
Lumina5 PR #7708. An unauthenticated endpoint returned the production MongoDB Atlas hostname (cluster0.sreox.mongodb.net), database name, and SST stage to any caller. It was created in June 2025 during a debugging session and touched by three developers, none of whom noticed auth: false. Confirmed live: curl https://app.bike4mind.com/api/debug/database returned the cluster hostname. The file was deleted entirely. Nine months. Three reviewers. Every scanner in the stack. A ~$6 Hydra pass found and killed it in one afternoon.
2. /api/external-image — SSRF + the IPv4-mapped IPv6 bypass
Lumina5 PRs #7725 + #7741. An endpoint that fetched any URL on the internet and uploaded to S3 with zero auth. Initial fix added an SSRF block list. Then Hydra ran a second pass and discovered https://[::ffff:127.0.0.1]/ bypassed every check — Node's URL parser returns [::ffff:7f00:1] which matched neither the IPv4 regex nor the IPv6 prefix list. Loopback was reachable. AWS metadata service (169.254.169.254) was reachable. Fix: hostname normalization, redirect re-check, ReadableStream reader that aborts at 10 MB so a malicious server can't OOM the Lambda.
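The hostname-normalization part of that fix can be sketched as below. It unwraps the IPv4-mapped form that the URL parser emits and runs the embedded address through the same IPv4 block list. All function names and the exact block list are assumptions for illustration, not the merged code. The sketch only inspects literal hosts; it does not resolve DNS names or follow redirects, which the real fix handles separately.

```javascript
// Hypothetical sketch of hostname normalization for an SSRF guard.
// Blocked IPv4 ranges as [network, prefixLength]; the list is illustrative.
const BLOCKED_V4 = [
  ["0.0.0.0", 8],
  ["10.0.0.0", 8],
  ["127.0.0.0", 8], // loopback
  ["169.254.0.0", 16], // link-local, includes the AWS metadata address
  ["172.16.0.0", 12],
  ["192.168.0.0", 16],
  ["198.18.0.0", 15], // RFC 2544 benchmark range
];

function ipv4ToInt(ip) {
  const p = ip.split(".").map(Number);
  return ((p[0] << 24) | (p[1] << 16) | (p[2] << 8) | p[3]) >>> 0;
}

function isBlockedV4(ip) {
  const value = ipv4ToInt(ip);
  return BLOCKED_V4.some(([net, bits]) => {
    const mask = (0xffffffff << (32 - bits)) >>> 0;
    return ((value & mask) >>> 0) === ((ipv4ToInt(net) & mask) >>> 0);
  });
}

// The URL parser serializes ::ffff:127.0.0.1 as ::ffff:7f00:1, so the
// embedded IPv4 address has to be rebuilt from the last two hex groups.
function mappedV4(host) {
  const m = /^::ffff:([0-9a-f]{1,4}):([0-9a-f]{1,4})$/i.exec(host);
  if (!m) return null;
  const hi = parseInt(m[1], 16);
  const lo = parseInt(m[2], 16);
  return [hi >> 8, hi & 0xff, lo >> 8, lo & 0xff].join(".");
}

function isBlockedUrl(rawUrl) {
  let host;
  try {
    host = new URL(rawUrl).hostname.toLowerCase();
  } catch {
    return true; // unparseable input is rejected
  }
  if (host.startsWith("[") && host.endsWith("]")) {
    host = host.slice(1, -1);
    const v4 = mappedV4(host);
    if (v4) return isBlockedV4(v4);
    // Assumed IPv6 block list: unspecified, loopback, fc00::/7, fe80::/10.
    return host === "::" || host === "::1" || /^f[cd]/.test(host) || /^fe[89ab]/.test(host);
  }
  if (/^\d{1,3}(\.\d{1,3}){3}$/.test(host)) return isBlockedV4(host);
  return host === "localhost" || host.endsWith(".localhost");
}
```

Running the same check again on every redirect target, as the fix describes, closes the gap where an allowed host answers with a 302 to an internal address.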
3. GitHub OAuth account-linking hijack
Lumina5 PR #7741 (closing #7733). The OAuth callback derived userId entirely from the state JWT. If that state token leaked through any side channel — URL logs, browser history, corporate proxy caches — an attacker could complete the OAuth flow in a different browser and silently link their GitHub account to the victim's B4M account. Fix: short-lived gh_oauth_uid cookie (HttpOnly; Secure; SameSite=Lax; Max-Age=600) binding the OAuth flow to the originating browser; callback cross-checks the cookie against the state token's userId and rejects mismatches.
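The cookie cross-check can be sketched as two small functions: one that builds the binding cookie when the flow starts, and one that compares it with the state token at callback time. The function names are hypothetical, the state payload is assumed to be already signature-verified, and Path=/ is an added assumption so that the callback route receives the cookie.

```javascript
// Hypothetical sketch of binding an OAuth flow to the originating browser.
const COOKIE_NAME = "gh_oauth_uid";

// Set-Cookie value issued when the flow starts. Attributes follow the fix
// described above; Path=/ is an assumption not stated in the source.
function buildBindingCookie(userId) {
  return `${COOKIE_NAME}=${encodeURIComponent(userId)}; HttpOnly; Secure; SameSite=Lax; Max-Age=600; Path=/`;
}

// Parse a Cookie request header into a prototype-free lookup object.
function parseCookies(header) {
  const out = Object.create(null);
  for (const part of (header || "").split(";")) {
    const idx = part.indexOf("=");
    if (idx === -1) continue;
    const name = part.slice(0, idx).trim();
    const raw = part.slice(idx + 1).trim();
    try {
      out[name] = decodeURIComponent(raw);
    } catch {
      out[name] = raw; // keep malformed values as-is; they will not match
    }
  }
  return out;
}

// `statePayload` is assumed to be the verified payload of the state JWT.
function isCallbackBound(cookieHeader, statePayload) {
  const cookieUid = parseCookies(cookieHeader)[COOKIE_NAME];
  if (!cookieUid || !statePayload || !statePayload.userId) return false;
  return cookieUid === String(statePayload.userId);
}
```

A leaked state token alone no longer completes the flow in this model, because the attacker's browser does not hold the matching cookie.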
4. VibesWire API-key leak to CloudWatch
VibesWire PR #10. The /api/test handler logged both Resource.B4mApiKey.value AND Resource.GuardianApiKey.value to CloudWatch on every invocation. HTTP response was innocuous ({ok:true}) so Hydra's HTTP scanner rated it MEDIUM. The CRITICAL was caught on a code-review pass during the fix. Handler deleted, route removed from sst.config.ts, both API keys rotated. The lesson: scanners attack from outside, code review attacks from inside — both are necessary.
5. baseApi() enforcement in CI — the process change
Lumina5 PR #8097 + Polaris PR #3652. Nothing in either repo prevented a new API file from being added without auth. The 9-month-old debug endpoint had been invisible. Now both repos run a structural scan in CI (scripts/check-no-baseapi.sh) that hard-fails any PR introducing a pages/api/*.ts file without baseApi() unless it's on an allowlist with a comment explaining the alternative auth mechanism. Wired into husky pre-commit too. Allowlist is 9 entries on Lumina5 and 5 on Polaris, each documented. Hydra didn't just find the bugs — it changed the process so the same bug class can't reach production again.
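The logic of that structural scan can be restated as a pure function over file paths and contents. The real check is a shell script (scripts/check-no-baseapi.sh) whose contents are not shown here, so everything below is a hypothetical re-statement: the function name, the textual match on baseApi(, and the decision to include nested routes are assumptions.

```javascript
// Hypothetical re-statement of the CI check as a pure function.
// `files` maps a repo-relative path to its source text; `allowlist` holds
// paths that use a documented alternative auth mechanism.
function findUnprotectedApiFiles(files, allowlist) {
  const allowed = new Set(allowlist);
  const violations = [];
  for (const [path, source] of Object.entries(files)) {
    if (!path.startsWith("pages/api/") || !path.endsWith(".ts")) continue;
    if (allowed.has(path)) continue;
    // Textual check only: a commented-out baseApi( call would also match.
    if (!/\bbaseApi\s*\(/.test(source)) violations.push(path);
  }
  return violations.sort();
}
```

A CI wrapper would exit non-zero whenever the returned list is non-empty, which is what makes the check a hard failure instead of a warning.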
Honorable mention: the red-team threat model
Lumina5 PRs #8224 (model) + #8265 (fixes). 457-addition threat-model PR prosecuting B4M's own billing flow for chargeback fraud, agent farms, credit-system abuse — modeling ~$20K/wk worst-case exposure. Follow-up PR ships Stripe idempotency, voice-session credit reservation, transcription file-size limits, admin loginAs MFA step-up, per-tier agent caps, absolute anomaly thresholds.
What got hardened, per repo
Polaris (TFG)
17 merged Hydra-related PRs (#3257–#3674 window). Concentration: April 9–30. Polaris created a Hydra-specific label taxonomy: hydra:idor, hydra:no-baseapi, hydra:architecture, hydra:info-disclosure. 22+ tagged issues, all closed.
Highlights: signal-progress endpoint hardening with run-owner checks (#3257), leads/[id] ownership scoping (#3258, #3600), app-files mutation scoping (#3259), business-links DELETE admin gate (#3260), MCP server endpoint auth (#3286), invite generate hardening with uniform 404 + dedupe (#3333), datasets IDOR via unconditional CASL read (#3337), input validation hardening across 6 LiveOps fixes (#3476), Hydra cleanup with CI heartbeat for baseApi enforcement (#3652), strip orgName/domain from invite-generate (#3654), HSTS header on all responses (#3674).
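Several of these fixes share one shape: load the record, then confirm the caller owns it before returning or mutating anything. A minimal sketch of that guard follows, borrowing the uniform-404 idea from the invite hardening. The record shape, the role name, and the function name are hypothetical and do not reflect the Polaris code.

```javascript
// Hypothetical ownership guard; record shape and role names are assumptions.
function authorizeRecordAccess(record, user) {
  if (!user) return { status: 401 };
  // A missing record and a record owned by someone else return the same
  // status, so a caller cannot probe which ids exist.
  if (!record) return { status: 404 };
  const isOwner = record.ownerId === user.id;
  const isAdmin = user.role === "admin";
  if (!isOwner && !isAdmin) return { status: 404 };
  return { status: 200, record };
}
```

The important property is that the check runs on the server against the stored ownerId, never against an id supplied in the request body.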
Lumina5
17 merged Hydra-related PRs + 2 open (#7708–#8298 window). The original target. 20+ Hydra-tagged issues filed, all closed.
Highlights: original Hydra security sweep with 8 fixes (#7708), app-files scoped to owner (#7724), P0 endpoints wrapped in baseApi for SSRF + GitHub MCP exposure + OAuth hijack (#7725), MCP/github status protected (#7726), OAuth session cross-check + JWT fallback removal + IPv6 SSRF bypass closed + redirect re-check + streaming size cap (#7741), checkBlockedIP on emergency-login (#7757), serverConfig split into public + auth'd endpoints (#7794), emergency-login behind feature flag (#7795), /api/users/:id restricted to public DTO for non-owner non-admin (#7858), CI baseApi enforcement (#8097), billing-fraud red-team output (#8265), RFC 2544 benchmark IP range blocked in SSRF protection (#8298). Open: SecOps Triage / Active Defense integration (#8375), red-team threat model doc (#8224).
VibesWire / BedrockNews
1 bundled remediation PR (#10). Hydra was adapted to attack vibeswire.com production in ~40 minutes (hydra-vibeswire.mjs, 587 lines): 147 tests / 5 attack heads / 77-second wall-clock sweep.
Findings: 1 CRITICAL (API keys to CloudWatch, caught on review), 3 HIGH (debug endpoints leaking DynamoDB / AWS account ID, X-Forwarded-For rate-limit bypass, stored XSS in comment body), 6 LOW (missing security headers, 500-on-bad-cursor, log-injection vector).
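The source does not say exactly how the X-Forwarded-For bypass worked. One common form is a limiter that keys on the left-most X-Forwarded-For entry, which the client controls. The sketch below shows a safer derivation under that assumption; the function name and the trustedProxyCount parameter are hypothetical.

```javascript
// Hypothetical sketch: choose the client IP used as a rate-limit key.
// `trustedProxyCount` is the number of proxies the operator runs in front of
// the app (an assumption of this sketch).
function clientIpForRateLimit(socketIp, xForwardedFor, trustedProxyCount) {
  if (!xForwardedFor || trustedProxyCount <= 0) return socketIp;
  const hops = xForwardedFor
    .split(",")
    .map((hop) => hop.trim())
    .filter(Boolean);
  // Each trusted proxy appends the address it saw. Entries further left were
  // supplied by the client and can be forged.
  const index = hops.length - trustedProxyCount;
  return index >= 0 ? hops[index] : socketIp;
}
```

With one trusted proxy, a request carrying a forged header such as "1.2.3.4" arrives as "1.2.3.4, <real client>", and the limiter keys on the real client address.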
What Hydra is not
- Not Claude Mythos. Hydra is a publicly-available-model approximation of the gated Glasswing methodology. Honest framing on first slide.
- Not zero-day discovery. The value is application-logic coverage at scale, not novel CVEs.
- Not a replacement for signature tools. Semgrep, ZAP, Gitleaks, the WAF all stay. Hydra catches what those layers can't see.
- Not done. The heartbeat scanner runs continuously; new findings file new issues; the threat model gets new sections; the SecOps Triage / Active Defense pipeline (#8375 open) operationalizes it further.
Connection to other docs
- OpenClaw security lessons — the architectural commitments Hydra exists to validate (hard execution boundaries, outbound-only by default, no ambient credentials, untrusted content quarantine).
- B4M V3 Blueprint — the package / blueprint boundaries Hydra probes for divergence; auth and security must be packages, not blueprints, because of findings like the No-BaseApi pattern.
- Mission — Security as Architectural Property — the operating principle Hydra enforces in practice.
- Mission — AI-Augmented Review — Hydra is the same pattern applied to security: AI catches structural classes, humans review the spicy edge cases.
- Connections page — Hydra connects to every B4M product as the validation layer. Its findings improve the substrate that every other Forge and Engine product runs on.
- Active tactic: Hydra → TwinSpires meeting — this week.
- Aegis — the defensive sibling. Same operation, inverted output: agentic vuln sweep across the top-N OSS packages with responsible disclosure as the deliverable. Parked in Lab pending an ROI model that pencils. The shield to Hydra's sword.