Security
Skillsmith protects your development environment through automated security scanning, trust verification, and a multi-tier defense system.
Security Boundaries
Skillsmith operates within a layered security model:
Trusted Zone
- Claude model safety guardrails
- Claude Code runtime sandbox
- OS-level file access controls
Semi-Trusted Zone (Skillsmith)
- Curated skill index
- Quality scoring
- Static security analysis
- Trust tier verification
- Conflict detection
Untrusted Zone
- GitHub repositories
- Third-party skill authors
- Community registries
What We Scan For
Every skill is scanned for security issues before being made available. Findings are categorized by severity:
Critical Severity (Blocks Installation)
These patterns always block installation:
| Category | Detected Patterns |
|---|---|
| Jailbreak Attempts | "ignore previous instructions", "developer mode", "bypass safety", "act as an AI without restrictions" |
| Malicious URLs | External domains not on the allowlist (github.com, anthropic.com, claude.ai are allowed) |
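As an illustration of the two critical checks above, here is a minimal Python sketch. It is not Skillsmith's actual scanner. The function names (`is_allowed_host`, `find_critical`) are hypothetical, and two behaviors are assumptions: matching is a case-insensitive substring search, and subdomains of allowlisted domains are treated as allowed.

```python
import re
from urllib.parse import urlparse

# Phrases and domains copied from the table above; a real scanner
# presumably uses a larger, maintained set.
JAILBREAK_PHRASES = [
    "ignore previous instructions",
    "developer mode",
    "bypass safety",
    "act as an AI without restrictions",
]
ALLOWED_DOMAINS = {"github.com", "anthropic.com", "claude.ai"}

# Rough URL matcher, good enough for a sketch.
URL_RE = re.compile(r"https?://[^\s)\"'<>]+", re.IGNORECASE)


def is_allowed_host(host: str) -> bool:
    """True if host is an allowlisted domain or a subdomain of one (assumption)."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)


def find_critical(text: str) -> list[str]:
    """Return critical findings: jailbreak phrases and non-allowlisted URLs."""
    findings = []
    lowered = text.lower()
    for phrase in JAILBREAK_PHRASES:
        if phrase.lower() in lowered:
            findings.append(f"jailbreak: {phrase}")
    for url in URL_RE.findall(text):
        host = urlparse(url).hostname or ""
        if not is_allowed_host(host):
            findings.append(f"url: {host}")
    return findings
```

Comparing the parsed hostname, rather than searching the raw URL for an allowed domain, keeps lookalikes such as github.com.evil.io from passing the check.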
High Severity (Requires Confirmation)
These patterns require explicit user confirmation before proceeding:
| Category | Detected Patterns |
|---|---|
| SSRF Patterns | file://, gopher://, ldap:// protocols, localhost references, private IP ranges |
| Sensitive File Access | *.env*, *.pem, *.key, *credentials*, *secrets* |
| Dangerous Commands | rm -rf, curl/wget to unknown domains, eval/exec with dynamic input |
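The SSRF row can be sketched with Python's standard library. This is an assumed approach, not the scanner's real logic: `ssrf_reason` is a hypothetical helper that checks only the URL's scheme and literal host, and does not resolve hostnames.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlparse

# Schemes listed in the table above.
RISKY_SCHEMES = {"file", "gopher", "ldap"}


def ssrf_reason(url: str) -> Optional[str]:
    """Return why a URL looks like an SSRF risk, or None if no rule matches."""
    parsed = urlparse(url)
    if parsed.scheme.lower() in RISKY_SCHEMES:
        return f"risky scheme: {parsed.scheme.lower()}"
    host = (parsed.hostname or "").lower()
    if host == "localhost" or host.endswith(".localhost"):
        return "localhost reference"
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal; DNS resolution is out of scope for this sketch.
        return None
    if addr.is_loopback:
        return "loopback address"
    if addr.is_private:
        return "private IP range"
    return None
```

Because hostnames are not resolved, a public name that points at an internal address would pass this check. A production scanner would need to handle that case separately.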
Medium Severity (Warning Only)
These patterns generate warnings but don't block installation:
| Category | Detected Patterns |
|---|---|
| Obfuscation | High entropy content, possible base64 payloads, unusual character sequences |
| Permission Keywords | References to sudo, root, admin, system modification commands |
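"High entropy content" is commonly measured with Shannon entropy in bits per character. The sketch below shows that general technique. The 20-character minimum and the 4.5-bit threshold are illustrative assumptions, not Skillsmith's actual settings, and `high_entropy_tokens` is a hypothetical name.

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Shannon entropy of a string, in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def high_entropy_tokens(text: str, min_len: int = 20, threshold: float = 4.5) -> list[str]:
    """Return whitespace-separated tokens that are long and look random.

    min_len and threshold are assumed values chosen for illustration.
    """
    return [
        token
        for token in text.split()
        if len(token) >= min_len and shannon_entropy(token) > threshold
    ]
```

Ordinary words are short and draw on few distinct characters, so they fall below the length cutoff or score low. Base64 payloads draw on up to 64 symbols and can approach 6 bits per character.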
Threat Model
Skillsmith actively mitigates the following threats:
| Threat | Severity | Mitigation | Status |
|---|---|---|---|
| Malicious SKILL.md | Critical | Pattern scanning, trust tiers | Active |
| Prompt injection | Critical | Pattern detection, entropy analysis | Active |
| Typosquatting | High | Levenshtein distance, character substitution detection | Active |
| Dependency hijacking | Medium | URL allowlist | Active |
| Author key compromise | Medium | Anomaly detection | Planned |
| Supply chain attack | High | Registry signing | Planned |
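The typosquatting mitigation names two techniques, Levenshtein distance and character substitution detection. A minimal sketch of both follows. The substitution map, the distance cutoff of 1, and the skill names in the usage note are assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance (insert, delete, substitute), one row at a time."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


# Common look-alike substitutions (an assumed, deliberately small map).
SUBSTITUTIONS = str.maketrans({"0": "o", "1": "l", "3": "e", "5": "s"})


def possible_typosquat(name: str, known_names: list[str], max_distance: int = 1) -> list[str]:
    """Return the known names that `name` may be imitating."""
    candidate = name.lower()
    lowered = [k.lower() for k in known_names]
    if candidate in lowered:
        return []  # an exact match is the real skill, not a squat
    normalized = candidate.translate(SUBSTITUTIONS)
    return [
        known
        for known, low in zip(known_names, lowered)
        if normalized == low or levenshtein(candidate, low) <= max_distance
    ]
```

For example, with hypothetical known skills commit-helper and docker-tools, both c0mmit-helper (digit substitution) and commit-helpr (one deletion) would be reported as imitating commit-helper.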
PII Detection
Skillsmith automatically scans skill content for personally identifiable information (PII) patterns before installation. Detected patterns include:
- Email addresses — hardcoded recipient addresses or credential patterns
- Phone numbers — embedded contact information
- API keys and tokens — leaked credentials in skill code
- Credential patterns — passwords, secret keys, and authentication tokens
PII findings are reported in scan results with severity levels. Critical PII findings (leaked credentials) block installation; other PII findings surface as warnings for your review.
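The split between blocking and warning-only PII findings can be sketched with a few regular expressions. These patterns are simplified assumptions, not the ones Skillsmith ships, and `scan_pii` and `blocks_install` are hypothetical names.

```python
import re

# kind -> (pattern, severity). Only leaked credentials are critical,
# following the rule described above.
PII_PATTERNS = {
    "email": (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "warning"),
    "phone": (re.compile(r"\+?\d[\d\s().-]{8,}\d"), "warning"),
    "credential": (
        re.compile(
            r"\b(?:api[_-]?key|secret|token|password)\b\s*[:=]\s*['\"]?[^\s'\"]{8,}",
            re.IGNORECASE,
        ),
        "critical",
    ),
}


def scan_pii(text: str) -> list[dict]:
    """Return one finding per regex match, tagged with kind and severity."""
    findings = []
    for kind, (pattern, severity) in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append({"kind": kind, "severity": severity, "match": match.group(0)})
    return findings


def blocks_install(findings: list[dict]) -> bool:
    """Critical PII findings block installation; the rest are warnings."""
    return any(f["severity"] == "critical" for f in findings)
```

Patterns this loose produce false positives. A long run of digits inside a token can match the phone pattern, for instance, which is one reason non-critical findings are surfaced for review rather than blocked.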
Risk Trend Analysis
Skillsmith tracks security posture changes for skills over time using risk trend analysis. This helps detect supply chain attacks where a previously safe skill becomes compromised.
- 20-point increase — Warning threshold. Unusual risk score change flagged for review.
- 35-point increase — Critical threshold. Significant risk change triggers investigation.
- 40-point boundary crossing — A skill crossing from "safe" to "risky" territory is automatically flagged.
Risk history is tracked per skill version, enabling comparison across releases.
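The thresholds above can be read as a comparison between two consecutive versions' risk scores. The sketch below rests on several assumptions the text does not confirm: higher scores mean more risk, the thresholds are inclusive, and "safe" versus "risky" is split at a score of 40. The name `assess_trend` is hypothetical.

```python
WARNING_DELTA = 20    # 20-point increase: warning
CRITICAL_DELTA = 35   # 35-point increase: critical
RISKY_BOUNDARY = 40   # assumed dividing line between "safe" and "risky"


def assess_trend(previous: int, current: int) -> list[str]:
    """Compare two versions' risk scores and return the flags raised."""
    flags = []
    delta = current - previous
    if delta >= CRITICAL_DELTA:
        flags.append("critical")
    elif delta >= WARNING_DELTA:
        flags.append("warning")
    # Flag a move from below the boundary to at or above it,
    # regardless of how large the jump was.
    if previous < RISKY_BOUNDARY <= current:
        flags.append("boundary-crossing")
    return flags
```

Under these assumptions, a release that moves from 30 to 45 raises only the boundary flag, because the 15-point change is below the warning threshold but still crosses into risky territory.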
Dual-Scanner Architecture
Every skill assessment runs through two complementary security scanners:
- AIDefence — Detects prompt injection, social engineering, and behavioral threats. Specializes in catching attempts to manipulate the AI agent through skill content.
- SecurityScanner — Detects SSRF, jailbreak patterns, data exfiltration, privilege escalation, and structural threats. Catches technical attack vectors in skill code.
Running both scanners on every assessment provides defense in depth — each scanner catches threat categories the other may miss.
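One way to picture the dual-scanner flow is a loop that runs every scanner on the same content and records which scanner reported each threat category. The two toy functions below are stand-ins only. They do not reflect the real AIDefence or SecurityScanner interfaces, which this page does not describe.

```python
from collections import defaultdict
from typing import Callable

Scanner = Callable[[str], list[dict]]


def toy_ai_defence(content: str) -> list[dict]:
    """Stand-in for AIDefence: flags a single prompt-injection phrase."""
    if "ignore previous instructions" in content.lower():
        return [{"category": "prompt-injection", "severity": "critical"}]
    return []


def toy_security_scanner(content: str) -> list[dict]:
    """Stand-in for SecurityScanner: flags a single SSRF scheme."""
    if "file://" in content:
        return [{"category": "ssrf", "severity": "high"}]
    return []


def run_all(content: str, scanners: dict[str, Scanner]) -> dict[str, set[str]]:
    """Run every scanner and map each category to the scanners that reported it."""
    by_category = defaultdict(set)
    for name, scan in scanners.items():
        for finding in scan(content):
            by_category[finding["category"]].add(name)
    return dict(by_category)
```

With this shape, content that mixes an injection phrase with a file:// URL yields two categories, each reported by a different scanner. That is the defense-in-depth property described above.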
Best Practices for Users
1. Always Check Trust Tier
Before installing a skill, verify its trust tier:
- Official/Verified: Generally safe for production use
- Community: Review skill content before installing
- Unverified: Only install if you personally trust the author
2. Review Unverified Skills
For unverified skills, always:
- Read the SKILL.md content for suspicious instructions
- Look for unusual URLs or command patterns
- Check the author's GitHub profile and other projects
3. Use Validation Before Manual Install
Use skillsmith validate <path> to run security scans on locally downloaded skills.
4. Keep Skillsmith Updated
New security patterns are added regularly. Update with:
npx @skillsmith/mcp-server@latest
Best Practices for Authors
To ensure your skill passes security scans:
- Avoid external URLs unless necessary. Prefer documented APIs (github.com, npm registry)
- Never request sensitive file access. Don't read .env files or credential stores
- Be explicit about permissions. Document what files you read/write and what commands you execute
- Submit for verification. Verified skills get more visibility and trust. See Trust Tiers
What Happens When Scans Fail
Skills that fail security scans are placed in quarantine. The severity of findings determines what happens next:
- Critical findings: Skill cannot be installed
- High findings: Manual review required before installation
- Medium findings: Skill can be installed with warnings displayed
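The rule that the most severe finding decides the outcome can be written as a small lookup. This sketch mirrors the three bullets above. The `Action` enum and the `install_action` function are hypothetical names, and the "allow" result for a clean scan is an assumption.

```python
from enum import Enum


class Action(Enum):
    BLOCK = "cannot be installed"
    REVIEW = "manual review required"
    WARN = "install with warnings"
    ALLOW = "install"  # assumed outcome when a scan reports nothing


def install_action(severities: list[str]) -> Action:
    """Pick the outcome from the most severe finding present."""
    found = set(severities)
    if "critical" in found:
        return Action.BLOCK
    if "high" in found:
        return Action.REVIEW
    if "medium" in found:
        return Action.WARN
    return Action.ALLOW
```

Checking severities from most to least severe means a single critical finding blocks installation even when every other finding is medium.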
Learn more about the quarantine system →
Reporting Security Issues
Vulnerabilities in Skillsmith
If you discover a vulnerability in Skillsmith itself:
- Contact form: Report a security issue
- GitHub: Security Advisories
Malicious Skills
To report a malicious or suspicious skill:
- Open an issue on GitHub
- Include the skill ID and specific concern
- We investigate reports and blocklist confirmed malicious skills within 24 hours
Related Documentation
- Trust Tiers - Understand skill verification levels
- Quarantine System - What happens to flagged skills
- Privacy Policy - How we handle your data