Key Concepts

This guide explains the core terminology and concepts used throughout SmartSRE.

Findings

A Finding represents an issue or optimization opportunity detected during a scan.

{
  "id": "finding-abc123",
  "service": "cloudrun",
  "resource_id": "projects/my-project/locations/us-central1/services/api",
  "issue_type": "high_memory_usage",
  "severity": "medium",
  "details": {
    "current_memory": "2Gi",
    "avg_utilization": "15%",
    "recommended_memory": "512Mi"
  }
}

Severity Levels

Severity	Description	Examples
Critical	Immediate action required—security breach or service down	Public bucket with sensitive data, expired SSL certificate
High	Significant impact—performance degradation or cost overrun	Memory OOM crashes, runaway query costs
Medium	Should address soon—suboptimal configuration	Over-provisioned resources, missing lifecycle policies
Low	Nice to fix—minor optimization	Unused but cheap resources
Info	Informational only—no action needed	Successful configurations, compliance confirmations
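
As an illustration of how the severity levels might be used for triage, here is a minimal sketch that orders findings from most to least severe. The numeric ranks and the `sort_by_severity` helper are assumptions made for this example; only the severity names come from the table above.

```python
# Severity names follow the table above; the numeric ranks are an assumption.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}


def sort_by_severity(findings):
    """Return findings ordered from most to least severe (stable sort)."""
    return sorted(findings, key=lambda f: SEVERITY_RANK[f["severity"].lower()])


# Trimmed-down findings; real findings carry more fields (see the example above).
findings = [
    {"id": "finding-1", "severity": "low"},
    {"id": "finding-2", "severity": "critical"},
    {"id": "finding-abc123", "severity": "medium"},
]

ordered = sort_by_severity(findings)
print([f["id"] for f in ordered])
```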

ChangeSets

A ChangeSet is a collection of atomic operations proposed to remediate one or more findings.

{
  "service": "cloudrun",
  "intent": "Reduce memory allocation to match actual usage",
  "steps": [
    {
      "op": "scale_memory",
      "resource_ref": {
        "project_id": "my-project",
        "region": "us-central1",
        "service_name": "api"
      },
      "params": {
        "memory": "512Mi"
      },
      "impact_score": 25,
      "estimated_cost_usd": -15.00
    }
  ]
}

ChangeStep Properties

Property	Description
`op`	Canonical operation name (e.g., `scale_memory`, `set_lifecycle_rule`)
`resource_ref`	Target resource identifier
`params`	Operation-specific parameters
`impact_score`	Score from 0 to 100 indicating potential disruption (higher = riskier)
`estimated_cost_usd`	Expected monthly cost change (negative = savings)
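
The properties above lend themselves to a quick summary of a ChangeSet before review. The sketch below is one possible way to do that, not SmartSRE's own logic: the `summarize_changeset` helper is hypothetical, and taking the maximum `impact_score` across steps is an assumed aggregation rule.

```python
def summarize_changeset(changeset):
    """Summarize a ChangeSet dict shaped like the example above.

    Assumption: the riskiest step (max impact_score) represents the whole set.
    """
    steps = changeset["steps"]
    return {
        "ops": [s["op"] for s in steps],
        "max_impact_score": max((s["impact_score"] for s in steps), default=0),
        # Negative totals mean expected monthly savings.
        "total_cost_usd": sum(s["estimated_cost_usd"] for s in steps),
    }


changeset = {
    "service": "cloudrun",
    "intent": "Reduce memory allocation to match actual usage",
    "steps": [
        {
            "op": "scale_memory",
            "params": {"memory": "512Mi"},
            "impact_score": 25,
            "estimated_cost_usd": -15.00,
        }
    ],
}

summary = summarize_changeset(changeset)
print(summary)
```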

Scopes

A Scope defines which resources SmartSRE will scan and what operations are permitted.

Use Cases

Limit by project: Only scan `prod-project`, not `dev-project`
Limit by service: Only scan Cloud Run services, not BigQuery
Limit by region: Only scan `us-central1` resources
Limit by operation: Allow `scale_memory` but not `delete_service`
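
The first three use cases above can be pictured as a simple membership check. The sketch below is purely illustrative: the `Scope` class, its field names, and the rule that an empty set means "no restriction" are all assumptions, since the actual scope schema is not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Scope:
    """Hypothetical scope model; an empty set is assumed to mean 'no restriction'."""
    projects: set = field(default_factory=set)
    services: set = field(default_factory=set)
    regions: set = field(default_factory=set)

    def covers(self, project_id, service, region):
        """True if a resource with these attributes falls inside the scope."""
        return (
            (not self.projects or project_id in self.projects)
            and (not self.services or service in self.services)
            and (not self.regions or region in self.regions)
        )


scope = Scope(projects={"prod-project"}, services={"cloudrun"}, regions={"us-central1"})
print(scope.covers("prod-project", "cloudrun", "us-central1"))
print(scope.covers("dev-project", "cloudrun", "us-central1"))
```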

Scope Policies

Scopes can include an `allowed_ops` policy that restricts which operations SmartSRE can execute:

{
  "policy": {
    "allowed_ops": ["scale_memory", "scale_cpu", "set_min_instances"],
    "risk_profile": "guarded"
  }
}

If an operation is not in `allowed_ops`, SmartSRE blocks execution even if the finding is valid.
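
To make the blocking rule concrete, here is a minimal sketch that lists which steps of a ChangeSet a policy would block. The `blocked_ops` helper is hypothetical and only mirrors the rule described above; it is not SmartSRE's enforcement code.

```python
def blocked_ops(changeset, policy):
    """Return the ops in the ChangeSet that are not listed in policy['allowed_ops']."""
    allowed = set(policy["allowed_ops"])
    return [step["op"] for step in changeset["steps"] if step["op"] not in allowed]


policy = {
    "allowed_ops": ["scale_memory", "scale_cpu", "set_min_instances"],
    "risk_profile": "guarded",
}

changeset = {
    "steps": [
        {"op": "scale_memory"},
        {"op": "delete_service"},  # not in allowed_ops, so it would be blocked
    ]
}

print(blocked_ops(changeset, policy))
```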

Risk Profiles

A Risk Profile determines how aggressively SmartSRE acts on findings.

Profile	Behavior
Conservative	All changes require approval; low-impact changes still flagged
Balanced	Default; follows standard risk/cost guardrails
Aggressive	Lower thresholds for auto-approval; suited for non-production

Approvals

When a ChangeSet exceeds risk thresholds, SmartSRE creates an Approval Request.
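
The threshold check can be sketched as follows. The actual thresholds and how they vary by risk profile are not specified here, so the `max_auto_impact` parameter and the `requires_approval` helper are assumptions used only to illustrate the idea.

```python
def requires_approval(changeset, max_auto_impact):
    """True when any step's impact_score exceeds the assumed auto-approval limit."""
    return any(step["impact_score"] > max_auto_impact for step in changeset["steps"])


changeset = {"steps": [{"op": "scale_memory", "impact_score": 25}]}

# With a hypothetical limit of 30 the change would proceed without approval;
# with a stricter limit of 20 it would trigger an Approval Request.
print(requires_approval(changeset, max_auto_impact=30))
print(requires_approval(changeset, max_auto_impact=20))
```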

Approval States

Approvals can be delivered via:

Webhook — POST to a configured URL with HMAC signature
Email — Notification to designated approvers
In-App — Visible on the Approvals page
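
For the webhook channel, a receiver would typically verify the HMAC signature before trusting the payload. The sketch below assumes HMAC-SHA256 over the raw request body with a hex-encoded signature; the algorithm, the encoding, and how the signature is transmitted are assumptions, so check the webhook configuration for the actual scheme.

```python
import hashlib
import hmac


def sign(body: bytes, secret: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature (assumed scheme) for a payload."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(body: bytes, secret: bytes, received_signature: str) -> bool:
    """Compare signatures in constant time to avoid timing side channels."""
    return hmac.compare_digest(sign(body, secret), received_signature)


secret = b"example-shared-secret"          # hypothetical shared secret
body = b'{"approval_id": "example"}'       # hypothetical payload
signature = sign(body, secret)             # what the sender would attach

print(verify_signature(body, secret, signature))
print(verify_signature(b'{"approval_id": "tampered"}', secret, signature))
```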

Checkpoints & Rollbacks

Before executing a ChangeSet, SmartSRE creates a Checkpoint capturing the pre-change state.

{
  "checkpoint_id": "cp-789xyz",
  "execution_id": "exec-456",
  "service": "cloudrun",
  "before_state": {
    "memory": "2Gi",
    "cpu": "2",
    "min_instances": 0
  },
  "change_steps_applied": ["scale_memory"],
  "ttl_hours": 72
}

If issues arise post-execution, the Rollback operation uses this checkpoint to restore the original state.
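
Conceptually, a rollback reads the values recorded in `before_state` for each applied step. The sketch below illustrates this under several assumptions: the `OP_TO_STATE_KEY` mapping, the helper names, and the reading of `ttl_hours` as the checkpoint's retention window are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical mapping from an op name to the before_state key it modifies.
OP_TO_STATE_KEY = {
    "scale_memory": "memory",
    "scale_cpu": "cpu",
    "set_min_instances": "min_instances",
}


def rollback_params(checkpoint):
    """Return the original values for every setting the applied steps changed."""
    before = checkpoint["before_state"]
    keys = [OP_TO_STATE_KEY[op] for op in checkpoint["change_steps_applied"]]
    return {key: before[key] for key in keys}


def checkpoint_expired(created_at, ttl_hours, now):
    """Assumes ttl_hours is how long a checkpoint stays usable after creation."""
    return now >= created_at + timedelta(hours=ttl_hours)


checkpoint = {
    "checkpoint_id": "cp-789xyz",
    "before_state": {"memory": "2Gi", "cpu": "2", "min_instances": 0},
    "change_steps_applied": ["scale_memory"],
    "ttl_hours": 72,
}

created = datetime(2024, 1, 1, tzinfo=timezone.utc)  # example creation time
print(rollback_params(checkpoint))
print(checkpoint_expired(created, checkpoint["ttl_hours"], created + timedelta(hours=73)))
```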

Tenants

A Tenant represents an organization using SmartSRE. Each tenant has:

Isolated data (projects, scopes, findings, audit logs)
Separate billing and subscription
Independent RBAC configuration

Users can belong to multiple tenants and switch between them.

Next Steps

Running Scans — Execute and configure scans
Scope Management — Create and manage scopes
Risk Guardrails — Configure safety thresholds
