Introduction to SmartSRE

SmartSRE is an Intelligent Remediation Service that automatically detects and fixes common issues in your Google Cloud Platform (GCP) environment using AI-powered agents.

What SmartSRE Does

Capability	Description
Scan	Discovers resources across 8 GCP services and identifies optimization opportunities, performance issues, and security gaps
Plan	AI agents analyze findings and generate safe, reversible remediation plans with cost/impact estimates
Approve	Risk-based guardrails ensure high-impact changes require human approval before execution
Execute	Applies approved changes to your GCP resources with full audit trails
Rollback	Checkpoint-based rollbacks enable safe recovery if issues arise post-execution
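
The five capabilities above read as an ordered workflow. The sketch below models that ordering as a small state machine. It is an illustration only: the names Stage, ALLOWED_TRANSITIONS, and advance are hypothetical and are not part of any documented SmartSRE API.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical stage names mirroring the capability table."""
    SCANNED = "scanned"
    PLANNED = "planned"
    APPROVED = "approved"
    EXECUTED = "executed"
    ROLLED_BACK = "rolled_back"


# Assumption: each stage may only move to the next one in the table,
# and a rollback is only possible after execution.
ALLOWED_TRANSITIONS = {
    Stage.SCANNED: {Stage.PLANNED},
    Stage.PLANNED: {Stage.APPROVED},
    Stage.APPROVED: {Stage.EXECUTED},
    Stage.EXECUTED: {Stage.ROLLED_BACK},
    Stage.ROLLED_BACK: set(),
}


def advance(current: Stage, target: Stage) -> Stage:
    """Return the target stage, or raise if the move skips a step."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Under these assumptions, advance(Stage.PLANNED, Stage.EXECUTED) raises ValueError, because a plan cannot be executed before it has been approved.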

Supported GCP Services

SmartSRE provides deep integration with:

BigQuery — Slot optimization, query cost analysis, table lifecycle management
Cloud Run — Auto-scaling, memory/CPU right-sizing, cold start mitigation
Cloud SQL — Connection pooling, HA configuration, storage management
Compute Engine (GCE) — Disk cleanup, snapshot management, instance scheduling
Cloud Storage (GCS) — Lifecycle policies, public access controls, archive transitions
Google Kubernetes Engine (GKE) — Node scaling, HPA tuning, resource quotas
Pub/Sub — Backlog monitoring, dead letter policies
Secret Manager — Rotation schedules, version management

Core Principles

Truth in Automation

SmartSRE follows a strict "Truth in Automation" policy:

Only capabilities that are actually implemented are presented to users
Features in development are clearly labeled "Coming Soon"
Visual mockups never show non-existent integrations

Human-in-the-Loop by Default

For safety, SmartSRE requires human approval for all changes by default:

Free, Team, and Pro tiers always require approval before execution
Enterprise tenants may enable "Zero-Touch" policies for low-risk changes
All changes create rollback checkpoints for safe recovery
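
The approval rules above can be sketched as a single predicate. This is a minimal illustration, not SmartSRE's actual policy engine: the Tier and Risk enums, the zero_touch_enabled flag, and the three risk levels are assumptions introduced for the example.

```python
from enum import Enum


class Tier(Enum):
    FREE = "free"
    TEAM = "team"
    PRO = "pro"
    ENTERPRISE = "enterprise"


class Risk(Enum):
    # Assumption: three risk levels; the document only names "low-risk".
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


def requires_approval(tier: Tier, risk: Risk, zero_touch_enabled: bool = False) -> bool:
    """Return True when a human must approve the change before execution."""
    # Only Enterprise tenants that opted in to Zero-Touch may skip approval,
    # and only for low-risk changes.
    if tier is Tier.ENTERPRISE and zero_touch_enabled and risk is Risk.LOW:
        return False
    # Every other combination keeps the human-in-the-loop default.
    return True
```

Note that the default return value is True, so any tier or risk level added later would require approval until it is explicitly exempted.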

Fail-Closed Security

When in doubt, SmartSRE blocks rather than guesses:

Ambiguous permissions result in explicit denial
Unknown operations require approval
Risk guardrails enforce cost and impact limits
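
A fail-closed check along these lines might look like the sketch below. The function name, the operation names, the outcome strings, and the single cost limit are all hypothetical; in particular, routing an over-limit change to approval rather than denying it outright is an assumption.

```python
from typing import Optional

# Hypothetical allow-list; the real set of operations is not given here.
KNOWN_OPERATIONS = {"set_lifecycle_policy", "delete_snapshot"}


def evaluate(
    operation: str,
    permission_granted: Optional[bool],
    estimated_cost_usd: float,
    cost_limit_usd: float,
) -> str:
    """Return "deny", "needs_approval", or "within_guardrails"."""
    # Ambiguous (None) or missing permission results in explicit denial.
    if permission_granted is not True:
        return "deny"
    # Unknown operations are never run without a human decision.
    if operation not in KNOWN_OPERATIONS:
        return "needs_approval"
    # Assumption: exceeding the cost limit escalates to approval.
    if estimated_cost_usd > cost_limit_usd:
        return "needs_approval"
    # Passing the guardrails does not bypass the tier's approval policy.
    return "within_guardrails"
```

The checks are ordered from most to least restrictive, so a request that is both unauthorized and unknown is denied rather than queued for approval.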

Next Steps

Quickstart Guide — Connect your first GCP project in 5 minutes
GCP Onboarding — Detailed guide to the connection process
Key Concepts — Understand Findings, ChangeSets, Scopes, and more