PILLAR_MTH-01 LIVE STATUS // PUBLISHED · METHODOLOGY V1.0 · 2026-05-07

Methodology · MCP Security Scorecard

Adoption posture, not certification. What we look at, what we don't, how disputes work.

THE PROMISE

Adoption posture, not certification

Snyk and Socket cover npm packages. There's no equivalent for MCP servers. The MCP Security Scorecard fills that gap with an opinionated adoption posture per server, based on public-source review. The point isn't to gate — it's to help builders make adoption decisions explicit instead of vibes-based.

This is not a formal security audit, and we don't claim to find every vulnerability. Each scorecard is a snapshot: a methodology version, a confidence level, source citations, and a recommended action. The dispute mechanism is described below.

VERDICT BANDS

Four adoption postures

Band | What it means
ADOPT | Corp- or community-maintained · code reviewable · narrow blast radius · current dep tree.
ADOPT WITH LIMITS | Workable with documented mitigations: scoped tokens, sandbox, human approval on writes.
REVIEW FIRST | Gaps you must understand before adopting (opaque secrets handling, unsigned releases, broad default permissions).
DO NOT USE FOR SENSITIVE WORK | Abandoned · unresolved active incidents · or known exfiltration vectors.

Verdicts are scoped to a use-case (e.g. local read-only repo exploration vs production write access). ADOPT for one use-case can be REVIEW FIRST for another.
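
One way to picture use-case scoping is a lookup keyed by use-case rather than a single verdict per server. The sketch below is illustrative only: the `Band` enum, the `verdicts` mapping, and `verdict_for` are hypothetical names, and the two entries simply mirror the example in the paragraph above.

```python
from enum import Enum


class Band(Enum):
    """The four adoption postures, ordered from most to least permissive."""
    ADOPT = 1
    ADOPT_WITH_LIMITS = 2
    REVIEW_FIRST = 3
    DO_NOT_USE_FOR_SENSITIVE_WORK = 4


# Hypothetical verdicts for a single server, keyed by use-case.
verdicts = {
    "local read-only repo exploration": Band.ADOPT,
    "production write access": Band.REVIEW_FIRST,
}


def verdict_for(use_case: str) -> Band:
    """Return the verdict for a use-case; an unscored use-case has no verdict."""
    if use_case not in verdicts:
        raise KeyError(f"no verdict recorded for use-case: {use_case!r}")
    return verdicts[use_case]
```

Raising on an unscored use-case, instead of falling back to a default band, keeps a verdict for one use-case from being read as a verdict for another.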

WHAT WE MEASURE

22 dimensions per server

  • Licence. MIT / Apache-2.0 / etc.
  • Security policy. Disclosure contact + handling commitment.
  • Signed releases. Yes / no / partial.
  • Install methods. npm / Docker / binary / source.
  • Code review. Yes / partial / no / unknown.
  • Secrets handling. Scoped / env-only / leaky / unknown.
  • Credential storage. OS keychain / file / memory / unknown.
  • OAuth handling. Scoped / broad / n/a.
  • Sandbox model. Container / subprocess / process / none.
  • Filesystem access. Scoped / home / root / none.
  • Shell access. Yes / no / sandboxed.
  • Network egress. None / allow-list / open.
  • Telemetry. None / opt-in / opt-out / forced.
  • Data residency. Local / cloud / mixed.
  • Default permissions. Minimal / moderate / broad.
  • Destructive tools. Which tools delete / modify, and is human approval supported?
  • Prompt-injection exposure. Low / medium / high.
  • Supply chain surface. Direct + transitive dep counts.
  • Dependency CVE surface. Last scanned, severity.
  • Rate-limit risk. Resource exhaustion potential.
  • Maintainer type + bus factor. Corp / indie / community / abandoned · primary maintainers.
  • Issue response + incident history. Median response time in days · past incidents, with sources.

Unknowns reduce confidence. They never silently pass. A scorecard with many unknowns ships at lower confidence and the verdict reflects that.
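
As a rough sketch of how unknowns could feed into confidence, the function below counts measurable dimensions and maps the count to a level. The thresholds (19 and 14) and the function name are assumptions made for illustration; the methodology does not publish exact cut-offs.

```python
TOTAL_DIMENSIONS = 22  # number of dimensions listed above


def confidence_level(ratings: dict[str, str]) -> str:
    """Map per-dimension ratings to a confidence level.

    A dimension rated "unknown", or missing from `ratings` entirely,
    counts as unmeasured, so it can never silently pass.
    """
    unknown = sum(1 for value in ratings.values() if value == "unknown")
    missing = max(TOTAL_DIMENSIONS - len(ratings), 0)
    measurable = TOTAL_DIMENSIONS - unknown - missing

    # Illustrative thresholds only (assumed, not part of the methodology).
    if measurable >= 19:
        return "high"
    if measurable >= 14:
        return "medium"
    return "low"
```

For example, a scorecard with 17 rated dimensions and 5 marked "unknown" would land at "medium" under these assumed thresholds.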

WHAT WE DON'T MEASURE

Where this is NOT a substitute

  • Source code audit. We read public docs, package.json, and recent commits, and we look for obvious patterns. We do not run code-level static analysis or fuzz testing.
  • Penetration testing. We don't probe deployed instances.
  • Threat modelling for your specific deployment. Your network, your data classification, your compliance — those are yours to map.
  • Cryptographic review. We note signed-releases yes/no but don't verify key infrastructure.
  • Compliance attestation. For SOC2 / ISO / HIPAA / GDPR, we link to vendor claims if present, but we don't certify.

If your context demands any of the above, treat the scorecard as an orienting starting point, not a substitute.

EVIDENCE & FRESHNESS

Every scorecard carries

  • Methodology version — the rubric used at time of review.
  • Confidence level — high / medium / low based on number of measurable dimensions.
  • Last reviewed + next review due dates. Stale badge if overdue.
  • Sources cited — public links for every claim.
  • Reviewed by — person whose adoption posture this represents.
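
The fields above could be carried in a small record, with the stale badge derived from the review dates. This is a minimal sketch: the `Scorecard` class, its field names, and the sample values are hypothetical, and "stale" is assumed to mean that today is past the next-review-due date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Scorecard:
    server: str
    methodology_version: str   # rubric used at time of review
    confidence: str            # "high" / "medium" / "low"
    last_reviewed: date
    next_review_due: date
    sources: tuple[str, ...]   # public links backing each claim
    reviewed_by: str

    def is_stale(self, today: date) -> bool:
        """True once the next review is overdue (assumed meaning of 'stale')."""
        return today > self.next_review_due


# Hypothetical example values, for illustration only.
card = Scorecard(
    server="example-server",
    methodology_version="1.0",
    confidence="medium",
    last_reviewed=date(2026, 5, 7),
    next_review_due=date(2026, 8, 7),
    sources=("https://example.com/security-policy",),
    reviewed_by="example reviewer",
)
```

Passing `today` in explicitly, rather than calling `date.today()` inside the method, keeps the staleness check easy to test.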

DISPUTE / CORRECT

If you maintain an MCP we've scored

Open an issue. Every scorecard page has a "Report a correction" link that pre-fills a GitHub issue with the slug. Reviewers respond within seven calendar days.

Report a scorecard correction

Disputes that change a verdict are merged into the next methodology version. The page records the version history publicly.
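
A pre-filled correction link can be built with GitHub's new-issue URL, which accepts `title` and `body` query parameters. The sketch below shows one way to do that; the repository path `example-org/mcp-scorecards`, the title format, and the body text are placeholders, not the project's actual values.

```python
from urllib.parse import quote, urlencode

# Hypothetical repository; the real tracker location is not stated here.
REPO = "example-org/mcp-scorecards"


def correction_issue_url(slug: str) -> str:
    """Build a new-issue URL pre-filled with the scorecard slug."""
    query = urlencode(
        {
            "title": f"Scorecard correction: {slug}",
            "body": f"Scorecard: {slug}\n\nWhat is wrong, with a public source:\n",
        },
        quote_via=quote,  # encode spaces as %20 rather than '+'
    )
    return f"https://github.com/{REPO}/issues/new?{query}"
```

Calling `correction_issue_url("example-server")` yields a link that opens the new-issue form with the slug already in the title and body.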

VERSION HISTORY

Methodology versions

Version | Date | Notes
1.0 (planned) | Session 2 · 2026-05 | Initial public methodology. Scorecards begin shipping for the 5 most-used MCPs.