Security

Skillsmith protects your development environment through automated security scanning, trust verification, and a multi-tier defense system.

Security Boundaries

Skillsmith operates within a layered security model:

Trusted Zone

Claude model safety guardrails
Claude Code runtime sandbox
OS-level file access controls

Semi-Trusted Zone (Skillsmith)

Curated skill index
Quality scoring
Static security analysis
Trust tier verification
Conflict detection

Untrusted Zone

GitHub repositories
Third-party skill authors
Community registries

What We Scan For

Every skill is scanned for security issues before being made available. Findings are categorized by severity:

Critical Severity (Blocks Installation)

These patterns always block installation:

Category	Detected Patterns
Jailbreak Attempts	`"ignore previous instructions"`, `"developer mode"`, `"bypass safety"`, `"act as an AI without restrictions"`
Malicious URLs	External domains not on the allowlist (github.com, anthropic.com, claude.ai are allowed)
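
The two critical checks above can be pictured as a phrase match plus a domain allowlist. The following is a minimal sketch, not Skillsmith's actual scanner: the function names are hypothetical, the URL regex is deliberately simple, and treating subdomains of an allowed domain as allowed is an assumption.

```python
import re
from urllib.parse import urlparse

# Phrases and domains come from the table above; the matching logic is illustrative.
JAILBREAK_PHRASES = [
    "ignore previous instructions",
    "developer mode",
    "bypass safety",
    "act as an ai without restrictions",
]
ALLOWED_DOMAINS = {"github.com", "anthropic.com", "claude.ai"}

URL_RE = re.compile(r"https?://[^\s)>\"']+")


def find_jailbreak_phrases(text: str) -> list[str]:
    """Return the listed phrases that occur in text (case-insensitive)."""
    lowered = text.lower()
    return [p for p in JAILBREAK_PHRASES if p in lowered]


def is_allowed_host(host: str) -> bool:
    """Assumption: an allowed domain also covers its subdomains."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)


def find_disallowed_urls(text: str) -> list[str]:
    """Return URLs whose host is not on the allowlist."""
    flagged = []
    for url in URL_RE.findall(text):
        host = urlparse(url).hostname or ""
        if not is_allowed_host(host):
            flagged.append(url)
    return flagged
```

Comparing the parsed hostname, rather than searching the raw URL for an allowed string, keeps a look-alike host such as `github.com.evil.example` from slipping through.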

High Severity (Requires Confirmation)

These patterns require explicit user confirmation before proceeding:

Category	Detected Patterns
SSRF Patterns	`file://`, `gopher://`, `ldap://` protocols, localhost references, private IP ranges
Sensitive File Access	`.env`, `.pem`, `.key`, `credentials`, `secrets`
Dangerous Commands	`rm -rf`, `curl`/`wget` to unknown domains, `eval`/`exec` with dynamic input
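
As an illustration of the SSRF row, the sketch below flags the listed schemes, localhost references, and private IP literals using only the Python standard library. The function name `is_ssrf_risk` is hypothetical, and the sketch does not resolve DNS, so a hostname that resolves to a private address would not be caught here.

```python
import ipaddress
from urllib.parse import urlparse

# Schemes listed in the table above.
BLOCKED_SCHEMES = {"file", "gopher", "ldap"}


def is_ssrf_risk(url: str) -> bool:
    """Flag URLs using risky schemes or pointing at local/private hosts."""
    parsed = urlparse(url)
    if parsed.scheme.lower() in BLOCKED_SCHEMES:
        return True
    host = (parsed.hostname or "").lower()
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # a hostname, not an IP literal; no DNS lookup is attempted
    return ip.is_private or ip.is_loopback or ip.is_link_local
```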

Medium Severity (Warning Only)

These patterns generate warnings but don't block installation:

Category	Detected Patterns
Obfuscation	High entropy content, possible base64 payloads, unusual character sequences
Permission Keywords	References to sudo, root, admin, system modification commands
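
"High entropy content" is usually measured with Shannon entropy over the characters of a token. The sketch below shows the calculation; the 4.5 bits-per-character threshold and the 32-character minimum length are assumptions chosen for illustration, not Skillsmith's actual settings.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character; 0.0 for empty input."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical threshold; a real scanner would tune this against sample data.
ENTROPY_THRESHOLD = 4.5


def looks_obfuscated(token: str, min_length: int = 32) -> bool:
    """Flag long tokens whose character distribution is close to random."""
    return len(token) >= min_length and shannon_entropy(token) > ENTROPY_THRESHOLD
```

Short strings are skipped because entropy estimates on a handful of characters are too noisy to be meaningful.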

Threat Model

Skillsmith actively mitigates the following threats:

Threat	Severity	Mitigation	Status
Malicious SKILL.md	Critical	Pattern scanning, trust tiers	Active
Prompt injection	Critical	Pattern detection, entropy analysis	Active
Typosquatting	High	Levenshtein distance, character substitution detection	Active
Dependency hijacking	Medium	URL allowlist	Active
Author key compromise	Medium	Anomaly detection	Planned
Supply chain attack	High	Registry signing	Planned
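
The typosquatting mitigation names two techniques: Levenshtein distance and character substitution detection. A minimal sketch of both follows. The look-alike table, the maximum distance of 2, and the function names are assumptions for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


# Hypothetical look-alike substitutions; a real table would be larger.
LOOKALIKES = str.maketrans({"0": "o", "1": "l", "3": "e", "5": "s"})


def possible_typosquat(candidate: str, known: list[str], max_distance: int = 2) -> list[str]:
    """Return known names that candidate resembles but does not equal."""
    norm = candidate.lower().translate(LOOKALIKES)
    hits = []
    for name in known:
        target = name.lower()
        if candidate.lower() == target:
            continue  # an exact match is the real skill, not a squat
        if norm == target or levenshtein(norm, target) <= max_distance:
            hits.append(name)
    return hits
```

For example, with a hypothetical known skill named `code-review`, the candidate `c0de-review` normalizes to the same string and is reported as a likely squat.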

PII Detection

Skillsmith automatically scans skill content for personally identifiable information (PII) patterns before installation. Detected patterns include:

Email addresses — hardcoded recipient addresses or credential patterns
Phone numbers — embedded contact information
API keys and tokens — leaked credentials in skill code
Credential patterns — passwords, secret keys, and authentication tokens

PII findings are reported in scan results with severity levels. Critical PII findings (leaked credentials) block installation; other PII findings surface as warnings for your review.
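
The pattern categories above can be sketched with regular expressions. These patterns are deliberately simple and hypothetical; they will miss many real cases and are not the patterns Skillsmith ships. The severity mapping follows the paragraph above: credentials are critical, other PII is a warning.

```python
import re

# Illustrative patterns only.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"(?:\+?\d{1,3}[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}"),
    "credential": re.compile(
        r"(?i)\b(api[_-]?key|secret|password|token)\b\s*[:=]\s*['\"]?([^\s'\"]{8,})"
    ),
}

# Leaked credentials block installation; other PII surfaces as a warning.
SEVERITY = {"email": "warning", "phone": "warning", "credential": "critical"}


def scan_pii(text: str) -> list[tuple[str, str]]:
    """Return a (kind, severity) pair for each pattern that matches."""
    return [(kind, SEVERITY[kind]) for kind, rx in PII_PATTERNS.items() if rx.search(text)]


def blocks_installation(findings: list[tuple[str, str]]) -> bool:
    """True when at least one finding is critical."""
    return any(severity == "critical" for _, severity in findings)
```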

Risk Trend Analysis

Skillsmith tracks how each skill's risk score changes over time. This helps detect supply chain attacks in which a previously safe skill becomes compromised.

20-point increase: Warning threshold. An unusual rise in risk score is flagged for review.
35-point increase: Critical threshold. A significant rise in risk score triggers an investigation.
40-point boundary crossing: A skill whose risk score crosses the 40-point boundary from "safe" to "risky" territory is flagged automatically.

Risk history is tracked per skill version, enabling comparison across releases.
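
The three thresholds can be expressed as a comparison between two consecutive versions' scores. This is a sketch under stated assumptions: higher scores mean higher risk, the thresholds are inclusive, and "safe" means a score below 40. None of these details are confirmed by the text above, and `assess_trend` is a hypothetical name.

```python
WARNING_DELTA = 20
CRITICAL_DELTA = 35
RISKY_BOUNDARY = 40  # assumption: scores at or above this count as "risky"


def assess_trend(previous_score: int, current_score: int) -> list[str]:
    """Return the flags raised by moving from previous_score to current_score."""
    flags = []
    delta = current_score - previous_score
    if delta >= CRITICAL_DELTA:
        flags.append("critical")
    elif delta >= WARNING_DELTA:
        flags.append("warning")
    # The boundary check is independent of the size of the increase.
    if previous_score < RISKY_BOUNDARY <= current_score:
        flags.append("boundary-crossing")
    return flags
```

Under these assumptions, a skill moving from 35 to 42 raises only the boundary flag, while one moving from 10 to 45 raises both the critical and boundary flags.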

Dual-Scanner Architecture

Every skill assessment runs through two complementary security scanners:

AIDefence — Detects prompt injection, social engineering, and behavioral threats. Specializes in catching attempts to manipulate the AI agent through skill content.
SecurityScanner — Detects SSRF, jailbreak patterns, data exfiltration, privilege escalation, and structural threats. Catches technical attack vectors in skill code.

Running both scanners on every assessment provides defense in depth — each scanner catches threat categories the other may miss.
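
The structural idea is that both scanners always see the same content and their findings are merged, so neither can short-circuit the other. The sketch below uses two stub functions as hypothetical stand-ins; they check a single pattern each and are far simpler than the real AIDefence and SecurityScanner.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Finding:
    scanner: str
    category: str
    severity: str  # "critical", "high", or "medium"


Scanner = Callable[[str], list[Finding]]


def aidefence_stub(content: str) -> list[Finding]:
    """Hypothetical stand-in for AIDefence: one prompt-injection phrase."""
    if "ignore previous instructions" in content.lower():
        return [Finding("AIDefence", "prompt-injection", "critical")]
    return []


def security_scanner_stub(content: str) -> list[Finding]:
    """Hypothetical stand-in for SecurityScanner: one SSRF scheme."""
    if "file://" in content:
        return [Finding("SecurityScanner", "ssrf", "high")]
    return []


def assess(content: str, scanners: list[Scanner]) -> list[Finding]:
    """Run every scanner on the same content and merge the findings."""
    findings: list[Finding] = []
    for scan in scanners:
        findings.extend(scan(content))
    return findings
```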

Best Practices for Users

1. Always Check Trust Tier

Before installing a skill, verify its trust tier:

Official/Verified: Generally safe for production use
Community: Review skill content before installing
Unverified: Only install if you personally trust the author

2. Review Unverified Skills

For unverified skills, always:

Read the SKILL.md content for suspicious instructions
Look for unusual URLs or command patterns
Check the author's GitHub profile and other projects

3. Use Validation Before Manual Install

Run `skillsmith validate <path>` to scan locally downloaded skills before installing them manually.

4. Keep Skillsmith Updated

New security patterns are added regularly. Update with: `npx @skillsmith/mcp-server@latest`

Best Practices for Authors

To ensure your skill passes security scans:

Avoid external URLs unless necessary. Prefer documented APIs (github.com, npm registry).
Never request sensitive file access. Don't read `.env` files or credential stores.
Be explicit about permissions. Document which files you read or write and which commands you execute.
Submit for verification. Verified skills get more visibility and trust. See Trust Tiers.

What Happens When Scans Fail

Skills that fail security scans are placed in quarantine. The severity of findings determines what happens next:

Critical findings: Skill cannot be installed
High findings: Manual review and explicit user confirmation are required before installation
Medium findings: Skill can be installed with warnings displayed
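
The rule above amounts to acting on the most severe finding. A minimal sketch, assuming the outcome labels shown here (they are hypothetical names, not Skillsmith's API):

```python
from enum import IntEnum


class Severity(IntEnum):
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


# Outcome labels are illustrative.
ACTIONS = {
    Severity.CRITICAL: "block",
    Severity.HIGH: "manual-review",
    Severity.MEDIUM: "install-with-warnings",
}


def decide(findings: list[Severity]) -> str:
    """Pick the action for the most severe finding; no findings means install."""
    if not findings:
        return "install"
    return ACTIONS[max(findings)]
```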

Learn more about the quarantine system →

Reporting Security Issues

Vulnerabilities in Skillsmith

If you discover a vulnerability in Skillsmith itself:

Contact form: Report a security issue
GitHub: Security Advisories

Malicious Skills

To report a malicious or suspicious skill:

Open an issue on GitHub
Include the skill ID and specific concern
We investigate reports and blocklist confirmed malicious skills within 24 hours