Buttercup CRS: AI-Powered Vulnerability Discovery and Patching System

┌─────────────────────────────────────────────────────────────────┐
│ Analysis Summary                                                │
├─────────────────────────────────────────────────────────────────┤
│ Type: Project                                                   │
│ Purpose: AI-Powered Vulnerability Discovery and Patching System │
│ Primary Language: python + json + yaml                          │
│ LOC: 125K                                                       │
│ Test Files: 110                                                 │
│ Architecture: python                                            │
│ Confidence: High                                                │
└─────────────────────────────────────────────────────────────────┘

Analyzed: 008bb9cd from 2025-10-03

Buttercup is a Cyber Reasoning System (CRS) developed by Trail of Bits for the DARPA AI Cyber Challenge (AIxCC). The system automates the discovery and patching of software vulnerabilities in open-source C and Java repositories through AI/ML-assisted fuzzing campaigns built on OSS-Fuzz. When vulnerabilities are found, Buttercup analyzes them and uses a multi-agent, AI-driven patcher to repair them automatically.

The system consists of five core components: an Orchestrator that coordinates the overall workflow, a Seed Generator for creating fuzzing inputs, a Fuzzer for vulnerability discovery, a Program Model for code analysis, and a Patcher for generating security fixes.

Quick Start

git clone --recurse-submodules https://github.com/trailofbits/buttercup.git
cd buttercup
make setup-local
make deploy
make send-libpng-task

Time to first vulnerability scan: ~10 minutes (requires third-party AI API keys)

Alternative Approaches

Solution    Setup Complexity   AI Integration   Target Languages   Cost Model
Buttercup   High               Native LLM       C, Java            Pay-per-API-call
CodeQL      Medium             None             20+ languages      Enterprise license
Semgrep     Low                Limited          30+ languages      Freemium
Snyk        Low                Basic            10+ languages      SaaS subscription
SonarQube   Medium             None             25+ languages      Self-hosted/SaaS

Architecture and Implementation

The system uses a distributed architecture with Kubernetes orchestration. The competition API serves as the central coordination point, implementing a FastAPI-based REST interface with Pydantic models for type safety.

Task management follows a structured approach with clearly defined data models:

from typing import Any

from pydantic import BaseModel

class TaskInfo(BaseModel):
    task_id: str
    name: str | None = None
    project_name: str
    status: str  # active, expired
    duration: int
    deadline: str
    challenge_repo_url: str | None = None
    challenge_repo_head_ref: str | None = None
    challenge_repo_base_ref: str | None = None
    fuzz_tooling_url: str | None = None
    fuzz_tooling_ref: str | None = None
    povs: list[dict[str, Any]] = []
    patches: list[dict[str, Any]] = []
    bundles: list[dict[str, Any]] = []

File: orchestrator/src/buttercup/orchestrator/ui/competition_api/main.py:64-78
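As a usage sketch, a trimmed model in the same style shows the type safety the Pydantic layer provides; `TaskDemo` and its field values are illustrative, not part of Buttercup:

```python
# Illustrative subset of a task model like TaskInfo above (not the
# actual Buttercup class): Pydantic validates and coerces fields on
# construction, rejecting malformed payloads early.
from typing import Any

from pydantic import BaseModel

class TaskDemo(BaseModel):
    task_id: str
    project_name: str
    status: str  # active, expired
    duration: int
    deadline: str
    povs: list[dict[str, Any]] = []

task = TaskDemo(
    task_id="t1",
    project_name="libpng",
    status="active",
    duration="3600",  # numeric string is coerced to int
    deadline="2025-10-04T00:00:00Z",
)
```

Invalid payloads raise `pydantic.ValidationError`, which FastAPI translates into a 422 response for REST clients.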

The artifact management system uses a file-based approach with organized directory structures. The implementation handles different artifact types through a unified interface:

def save_artifact(
    task_id: str,
    artifact_type: str,
    artifact_id: str,
    content: str | dict,
    is_base64: bool = False,
) -> bool:
    """Save an artifact to the appropriate directory structure."""
    try:
        run_dir = get_run_data_dir()
        task_dir = run_dir / task_id / artifact_type
        task_dir.mkdir(parents=True, exist_ok=True)

        if artifact_type == "bundles":
            file_path = task_dir / f"{artifact_id}.json"
        # ... remaining artifact types are handled later in the source file

File: orchestrator/src/buttercup/orchestrator/ui/competition_api/main.py:158-172
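A self-contained sketch of the same directory layout (`<run_dir>/<task_id>/<artifact_type>/<artifact_id>`) shows how the unified interface could dispatch on content type; the helper name and the `.bin` extension are assumptions, not Buttercup's actual code:

```python
# Hedged sketch of file-based artifact storage as described above.
# Dict content is serialized as JSON; string content is written as
# bytes, optionally base64-decoded first.
from __future__ import annotations

import base64
import json
from pathlib import Path

def save_artifact_sketch(
    run_dir: Path,
    task_id: str,
    artifact_type: str,
    artifact_id: str,
    content: str | dict,
    is_base64: bool = False,
) -> Path:
    """Write one artifact under <run_dir>/<task_id>/<artifact_type>/."""
    task_dir = run_dir / task_id / artifact_type
    task_dir.mkdir(parents=True, exist_ok=True)
    if isinstance(content, dict):
        path = task_dir / f"{artifact_id}.json"
        path.write_text(json.dumps(content))
    else:
        data = base64.b64decode(content) if is_base64 else content.encode()
        path = task_dir / f"{artifact_id}.bin"
        path.write_bytes(data)
    return path
```

Keeping everything under one run directory makes cleanup and per-task inspection trivial, at the cost of requiring shared storage across components.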

The system implements singleton patterns for core services with lazy initialization:

def get_database_manager() -> DatabaseManager:
    """Get database manager singleton."""
    global _database_manager
    if _database_manager is None:
        settings = get_settings()
        _database_manager = DatabaseManager(settings.database_url)
    return _database_manager

File: orchestrator/src/buttercup/orchestrator/ui/competition_api/main.py:140-154
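The pattern can be reproduced in a few lines of standard-library Python; `DatabaseManager` here is a stand-in class, not Buttercup's, and the default URL is illustrative:

```python
# Stdlib sketch of the lazy-singleton pattern shown above: the service
# is created on first request and every later call returns the same
# instance.
from __future__ import annotations

class DatabaseManager:
    def __init__(self, url: str) -> None:
        self.url = url

_database_manager: DatabaseManager | None = None

def get_database_manager(url: str = "sqlite:///app.db") -> DatabaseManager:
    """Create the manager on first call, then reuse the cached instance."""
    global _database_manager
    if _database_manager is None:
        _database_manager = DatabaseManager(url)
    return _database_manager
```

Note that this check-then-set is not thread-safe; a multithreaded service would guard the `None` check with a lock or use a module-level initializer.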

Performance Characteristics

System Requirements:

  • CPU: 8 cores minimum
  • Memory: 16 GB RAM
  • Storage: 100 GB available space
  • Network: Stable internet for AI API calls

Component Distribution:

  • Total codebase: 124,911 lines across 544 files
  • Primary language: Python (73,185 lines)
  • Configuration: JSON (24,340 lines) + YAML (16,338 lines)
  • Test coverage: 110 test files

Runtime Dependencies:

  • Kubernetes cluster for orchestration
  • Redis for task registry and state management
  • Third-party LLM APIs (OpenAI, Anthropic, Google)
  • Docker containers for component isolation
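A Redis-backed task registry like the one listed above could store per-task state in a hash; the `task:<id>` key schema is an assumption, and `FakeRedis` is an in-memory stand-in so the sketch runs without a server (real code would call `redis.Redis` with the same `hset`/`hgetall` methods):

```python
# Hedged sketch of a Redis task registry. FakeRedis implements only the
# subset of the redis-py API used below.
class FakeRedis:
    """In-memory stand-in for redis.Redis (hash commands only)."""
    def __init__(self):
        self._data = {}
    def hset(self, key, mapping):
        self._data.setdefault(key, {}).update(mapping)
    def hgetall(self, key):
        return dict(self._data.get(key, {}))

def register_task(r, task_id: str, project: str) -> None:
    # One hash per task keeps fields independently updatable.
    r.hset(f"task:{task_id}", mapping={"project": project, "status": "active"})

def task_status(r, task_id: str) -> str:
    return r.hgetall(f"task:{task_id}").get("status", "unknown")
```
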

Best for: Organizations needing automated vulnerability discovery and patching for C/Java codebases with budget for AI API consumption.

Security Architecture

Credential Management:

  • API keys for third-party LLM services stored as Kubernetes secrets
  • Database credentials managed through environment variables
  • No hardcoded secrets in codebase
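In this arrangement, application code only ever sees environment variables, with Kubernetes mounting the secret values at runtime. A minimal sketch of that consumption side (the variable-naming convention `<PROVIDER>_API_KEY` is an assumption):

```python
# Sketch of the credential pattern above: keys arrive via environment
# variables populated from Kubernetes secrets, never from source code.
import os

def get_llm_api_key(provider: str) -> str:
    """Read the API key for a provider, failing loudly if unset."""
    key = os.environ.get(f"{provider.upper()}_API_KEY", "")
    if not key:
        raise RuntimeError(f"missing API key for {provider}")
    return key
```

Failing fast on a missing key surfaces misconfigured deployments at startup rather than mid-campaign.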

Network Security:

  • Component isolation through Kubernetes networking
  • External API calls limited to configured LLM providers
  • Web UI exposed through controlled port forwarding

Audit Logging:

  • Comprehensive logging through SigNoz deployment
  • LLM usage tracking via optional LangFuse integration
  • Task execution traces for compliance monitoring

Transport Security:

  • HTTPS for all external LLM API communications
  • Internal component communication over Kubernetes service mesh

When to Use Buttercup CRS

The evidence suggests this project fits well for:

  • Research organizations participating in AI security challenges requiring automated vulnerability discovery and patching capabilities
  • Security teams with budget for LLM API consumption who need to process C/Java codebases at scale
  • Organizations with Kubernetes infrastructure looking to integrate AI-driven security analysis into existing workflows

Consider alternatives when:

  • Working primarily with languages other than C and Java (limited target language support)
  • Operating under strict budget constraints (requires ongoing LLM API costs)
  • Needing immediate deployment without complex setup (high infrastructure requirements with Kubernetes dependency)