Buttercup CRS: AI-Powered Vulnerability Discovery and Patching System
┌─────────────────────────────────────────────────────────────────┐
│ Analysis Summary                                                 │
├─────────────────────────────────────────────────────────────────┤
│ Type: Project                                                    │
│ Purpose: AI-Powered Vulnerability Discovery and Patching System  │
│ Primary Language: Python + JSON + YAML                           │
│ LOC: 125K                                                        │
│ Test Files: 110                                                  │
│ Architecture: Python                                             │
│ Confidence: High                                                 │
└─────────────────────────────────────────────────────────────────┘
Analyzed: 008bb9cd from 2025-10-03
Buttercup is a Cyber Reasoning System (CRS) developed by Trail of Bits for the DARPA AI Cyber Challenge (AIxCC). The system automates the discovery and patching of software vulnerabilities in open-source C and Java repositories through AI/ML-assisted fuzzing campaigns built on OSS-Fuzz. When vulnerabilities are found, Buttercup analyzes them and uses a multi-agent, AI-driven patcher to repair them automatically.
The system consists of five core components: an Orchestrator that coordinates the overall workflow, a Seed Generator for creating fuzzing inputs, a Fuzzer for vulnerability discovery, a Program Model for code analysis, and a Patcher for generating security fixes.
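The flow between these components can be pictured as a simple pipeline: the Orchestrator drives a task through seed generation, fuzzing, analysis, and patching. The sketch below is purely illustrative; every class and function name is a hypothetical stand-in, since the real components run as separate services rather than in-process calls.
from dataclasses import dataclass, field

# Illustrative stand-ins only; these are not Buttercup's actual interfaces.
@dataclass
class ChallengeTask:
    task_id: str
    repo_url: str
    language: str  # "c" or "java"

@dataclass
class PipelineResult:
    seeds: list[bytes] = field(default_factory=list)
    crashes: list[bytes] = field(default_factory=list)
    patches: list[str] = field(default_factory=list)

def run_pipeline(task: ChallengeTask) -> PipelineResult:
    """Orchestrator role: drive a task through the other four stages."""
    result = PipelineResult()
    result.seeds = generate_seeds(task)                 # Seed Generator
    result.crashes = run_fuzzing(task, result.seeds)    # Fuzzer (OSS-Fuzz campaign)
    findings = analyze_crashes(task, result.crashes)    # Program Model
    result.patches = generate_patches(task, findings)   # Patcher (multi-agent LLM)
    return result

# Stub stages so the sketch runs end to end.
def generate_seeds(task: ChallengeTask) -> list[bytes]:
    return [b"seed-input"]

def run_fuzzing(task: ChallengeTask, seeds: list[bytes]) -> list[bytes]:
    return []  # crashing inputs (proofs of vulnerability) would appear here

def analyze_crashes(task: ChallengeTask, crashes: list[bytes]) -> list[dict]:
    return [{"crash": c.hex(), "location": "unknown"} for c in crashes]

def generate_patches(task: ChallengeTask, findings: list[dict]) -> list[str]:
    return []  # unified diffs proposed by the patcher agents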
Quick Start
git clone --recurse-submodules https://github.com/trailofbits/buttercup.git
cd buttercup
make setup-local
make deploy
make send-libpng-task
Time to first vulnerability scan: ~10 minutes (requires third-party AI API keys)
Alternative Approaches
| Solution | Setup Complexity | AI Integration | Target Languages | Cost Model |
|---|---|---|---|---|
| Buttercup | High | Native LLM | C, Java | Pay-per-API-call |
| CodeQL | Medium | None | 20+ languages | Enterprise license |
| Semgrep | Low | Limited | 30+ languages | Freemium |
| Snyk | Low | Basic | 10+ languages | SaaS subscription |
| SonarQube | Medium | None | 25+ languages | Self-hosted/SaaS |
Architecture and Implementation
The system uses a distributed architecture with Kubernetes orchestration. The competition API serves as the central coordination point, implementing a FastAPI-based REST interface with Pydantic models for type safety.
Task management follows a structured approach with clearly defined data models:
class TaskInfo(BaseModel):
    task_id: str
    name: str | None = None
    project_name: str
    status: str  # active, expired
    duration: int
    deadline: str
    challenge_repo_url: str | None = None
    challenge_repo_head_ref: str | None = None
    challenge_repo_base_ref: str | None = None
    fuzz_tooling_url: str | None = None
    fuzz_tooling_ref: str | None = None
    povs: list[dict[str, Any]] = []
    patches: list[dict[str, Any]] = []
    bundles: list[dict[str, Any]] = []
File: orchestrator/src/buttercup/orchestrator/ui/competition_api/main.py:64-78
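Because the competition API is built on FastAPI, a model like TaskInfo can be returned directly from a typed route handler and serialized automatically. The snippet below is a minimal sketch rather than Buttercup's actual endpoint: the /tasks/{task_id} route and the in-memory registry are assumptions, and the model is abbreviated to a few of the fields shown above.
from typing import Any

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

class TaskInfo(BaseModel):
    # Abbreviated version of the model above, for illustration.
    task_id: str
    name: str | None = None
    project_name: str
    status: str  # active, expired
    duration: int
    deadline: str
    povs: list[dict[str, Any]] = []
    patches: list[dict[str, Any]] = []
    bundles: list[dict[str, Any]] = []

# Hypothetical in-memory registry standing in for the real task store.
_tasks: dict[str, TaskInfo] = {}

@app.get("/tasks/{task_id}", response_model=TaskInfo)
def get_task(task_id: str) -> TaskInfo:
    """Return task metadata; FastAPI validates the response against the model."""
    task = _tasks.get(task_id)
    if task is None:
        raise HTTPException(status_code=404, detail="unknown task")
    return task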
The artifact management system uses a file-based approach with organized directory structures. The implementation handles different artifact types through a unified interface:
def save_artifact(
    task_id: str,
    artifact_type: str,
    artifact_id: str,
    content: str | dict,
    is_base64: bool = False,
) -> bool:
    """Save an artifact to the appropriate directory structure."""
    try:
        run_dir = get_run_data_dir()
        task_dir = run_dir / task_id / artifact_type
        task_dir.mkdir(parents=True, exist_ok=True)
        if artifact_type == "bundles":
            file_path = task_dir / f"{artifact_id}.json"
File: orchestrator/src/buttercup/orchestrator/ui/competition_api/main.py:158-172
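A brief usage sketch for the function above; the identifiers and bundle fields are invented, and the expected path simply follows the run_dir / task_id / artifact_type layout visible in the excerpt.
# Hypothetical call against the save_artifact() excerpt above.
ok = save_artifact(
    task_id="task-0001",
    artifact_type="bundles",
    artifact_id="bundle-7",
    content={"pov_id": "pov-3", "patch_id": "patch-2"},  # illustrative fields
)

# Expected layout under the run data directory:
#   <run_data_dir>/task-0001/bundles/bundle-7.json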
The system implements singleton patterns for core services with lazy initialization:
def get_database_manager() -> DatabaseManager:
    """Get database manager singleton."""
    global _database_manager
    if _database_manager is None:
        settings = get_settings()
        _database_manager = DatabaseManager(settings.database_url)
    return _database_manager
File: orchestrator/src/buttercup/orchestrator/ui/competition_api/main.py:140-154
Performance Characteristics
System Requirements:
- CPU: 8 cores minimum
- Memory: 16 GB RAM
- Storage: 100 GB available space
- Network: Stable internet for AI API calls
Component Distribution:
- Total codebase: 124,911 lines across 544 files
- Primary language: Python (73,185 lines)
- Configuration: JSON (24,340 lines) + YAML (16,338 lines)
- Test coverage: 110 test files
Runtime Dependencies:
- Kubernetes cluster for orchestration
- Redis for task registry and state management (see the sketch after this list)
- Third-party LLM APIs (OpenAI, Anthropic, Google)
- Docker containers for component isolation
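To make the Redis dependency concrete, the sketch below shows the kind of task-state bookkeeping it could hold, using the standard redis-py client. The key names and fields are assumptions for illustration, not Buttercup's actual schema.
import redis  # redis-py

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def register_task(task_id: str, project_name: str, deadline: str) -> None:
    """Record a task hash and mark it active (illustrative key layout)."""
    r.hset(f"task:{task_id}", mapping={
        "project_name": project_name,
        "status": "active",
        "deadline": deadline,
    })
    r.sadd("tasks:active", task_id)

def expire_task(task_id: str) -> None:
    """Flip a task to expired and drop it from the active set."""
    r.hset(f"task:{task_id}", "status", "expired")
    r.srem("tasks:active", task_id)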
Best for: Organizations needing automated vulnerability discovery and patching for C/Java codebases with budget for AI API consumption.
Security Architecture
Credential Management:
- API keys for third-party LLM services stored as Kubernetes secrets (see the sketch after this list)
- Database credentials managed through environment variables
- No hardcoded secrets in codebase
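As a concrete illustration of that flow, the sketch below assumes the Kubernetes secrets surface inside a pod as ordinary environment variables; the variable names are assumptions, not Buttercup's actual configuration keys.
import os

def load_llm_keys() -> dict[str, str]:
    """Collect whichever provider keys are present in the environment."""
    candidates = {
        "openai": "OPENAI_API_KEY",        # assumed variable names
        "anthropic": "ANTHROPIC_API_KEY",
        "google": "GEMINI_API_KEY",
    }
    keys = {provider: os.environ[var] for provider, var in candidates.items() if var in os.environ}
    if not keys:
        raise RuntimeError("no LLM API keys found in the environment")
    return keys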
Network Security:
- Component isolation through Kubernetes networking
- External API calls limited to configured LLM providers
- Web UI exposed through controlled port forwarding
Audit Logging:
- Comprehensive logging through SigNoz deployment
- LLM usage tracking via optional LangFuse integration
- Task execution traces for compliance monitoring
Transport Security:
- HTTPS for all external LLM API communications
- Internal component communication over Kubernetes service mesh
When to Use Buttercup CRS
The evidence suggests this project is a good fit for:
- Research organizations participating in AI security challenges requiring automated vulnerability discovery and patching capabilities
- Security teams with budget for LLM API consumption who need to process C/Java codebases at scale
- Organizations with Kubernetes infrastructure looking to integrate AI-driven security analysis into existing workflows
Consider alternatives when:
- Working primarily with languages other than C and Java (limited target language support)
- Operating under strict budget constraints (requires ongoing LLM API costs)
- Needing immediate deployment without complex setup (high infrastructure requirements with Kubernetes dependency)