Introduction to SmartSRE
SmartSRE is an Intelligent Remediation Service that uses AI-powered agents to detect and fix common issues in your Google Cloud Platform (GCP) environment. By default, every change requires human approval before it is applied.
What SmartSRE Does
| Capability | Description |
|---|---|
| Scan | Discovers resources across 8 GCP services and identifies optimization opportunities, performance issues, and security gaps |
| Plan | AI agents analyze findings and generate safe, reversible remediation plans with cost/impact estimates |
| Approve | Risk-based guardrails ensure high-impact changes require human approval before execution |
| Execute | Applies approved changes to your GCP resources with full audit trails |
| Rollback | Checkpoint-based rollbacks enable safe recovery if issues arise post-execution |
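
The capabilities in the table read as an ordered workflow. As a rough illustration only, the ordering can be modeled as a small state machine. The `Stage` enum, the `ALLOWED_TRANSITIONS` map, and `can_transition` are hypothetical names invented for this sketch and are not part of any SmartSRE API; the assumption that stages cannot be skipped is also ours.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical stage names mirroring the capability table."""
    SCAN = "scan"
    PLAN = "plan"
    APPROVE = "approve"
    EXECUTE = "execute"
    ROLLBACK = "rollback"


# Assumed ordering: each stage may only move to the next one listed here.
# Rollback is modeled as an optional step that can follow execution.
ALLOWED_TRANSITIONS = {
    Stage.SCAN: {Stage.PLAN},
    Stage.PLAN: {Stage.APPROVE},
    Stage.APPROVE: {Stage.EXECUTE},
    Stage.EXECUTE: {Stage.ROLLBACK},
    Stage.ROLLBACK: set(),
}


def can_transition(current: Stage, target: Stage) -> bool:
    """Return True if moving from `current` to `target` is permitted."""
    return target in ALLOWED_TRANSITIONS[current]


if __name__ == "__main__":
    # A plan cannot be executed without passing through approval first.
    print(can_transition(Stage.PLAN, Stage.EXECUTE))
    print(can_transition(Stage.APPROVE, Stage.EXECUTE))
```

Modeling the stages as explicit transitions makes it hard to skip the approval step by accident, which is the property the table emphasizes.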
Supported GCP Services
SmartSRE integrates with the following eight GCP services:
- BigQuery — Slot optimization, query cost analysis, table lifecycle management
- Cloud Run — Auto-scaling, memory/CPU right-sizing, cold start mitigation
- Cloud SQL — Connection pooling, HA configuration, storage management
- Compute Engine (GCE) — Disk cleanup, snapshot management, instance scheduling
- Cloud Storage (GCS) — Lifecycle policies, public access controls, archive transitions
- Google Kubernetes Engine (GKE) — Node scaling, HPA tuning, resource quotas
- Pub/Sub — Backlog monitoring, dead letter policies
- Secret Manager — Rotation schedules, version management
Core Principles
Truth in Automation
SmartSRE follows a strict "Truth in Automation" policy:
- Only capabilities that are actually implemented are presented to users
- Features in development are clearly labeled "Coming Soon"
- Visual mockups never show integrations that do not exist
Human-in-the-Loop by Default
For safety, SmartSRE requires human approval for all changes by default:
- Free, Team, and Pro tiers always require approval before execution
- Enterprise tenants may enable "Zero-Touch" policies for low-risk changes
- All changes create rollback checkpoints for safe recovery
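The approval rules above can be sketched as a single decision function. This is a minimal sketch, not SmartSRE's implementation: the tier strings, the `risk` labels, the `Change` class, and the `zero_touch_enabled` flag are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical tier identifiers; these tiers always require approval.
TIERS_ALWAYS_REQUIRING_APPROVAL = {"free", "team", "pro"}


@dataclass(frozen=True)
class Change:
    """A proposed change. The risk labels are assumed, not documented."""
    description: str
    risk: str  # assumed values: "low", "medium", "high"


def requires_approval(tier: str, change: Change,
                      zero_touch_enabled: bool = False) -> bool:
    """Return True when a human must approve the change before execution."""
    if tier in TIERS_ALWAYS_REQUIRING_APPROVAL:
        return True
    # Only Enterprise tenants that opted in may skip approval,
    # and only for low-risk changes.
    if tier == "enterprise" and zero_touch_enabled and change.risk == "low":
        return False
    # Default: anything not explicitly exempted needs approval.
    return True


if __name__ == "__main__":
    low = Change("Add lifecycle policy to a bucket", risk="low")
    print(requires_approval("pro", low, zero_touch_enabled=True))
    print(requires_approval("enterprise", low, zero_touch_enabled=True))
```

The function returns `True` on every path except the one explicit exemption, so an unrecognized tier or risk label falls back to requiring approval.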
Fail-Closed Security
When in doubt, SmartSRE blocks rather than guesses:
- Ambiguous permissions result in explicit denial
- Unknown operations require approval
- Risk guardrails enforce cost and impact limits
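A fail-closed check along these lines might look like the sketch below. Everything here is illustrative: the `Decision` enum, the `KNOWN_OPERATIONS` set, the operation names, and the single cost limit are assumptions, and the source does not say what happens when a limit is exceeded (this sketch assumes escalation to approval). `PROCEED` means only that this check raises no objection; the default approval policy still applies.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    PROCEED = "proceed"                    # this check raises no objection
    REQUIRE_APPROVAL = "require_approval"  # escalate to a human
    DENY = "deny"                          # block outright


# Hypothetical allowlist of operations the system recognizes.
KNOWN_OPERATIONS = {"set_lifecycle_policy", "delete_unattached_disk"}


def evaluate(operation: str,
             permission_granted: Optional[bool],
             estimated_cost_usd: float,
             cost_limit_usd: float) -> Decision:
    """Evaluate a proposed operation, choosing the safer outcome on doubt."""
    # None models an ambiguous permission check; only an explicit True
    # passes. Anything else is denied.
    if permission_granted is not True:
        return Decision.DENY
    # Operations outside the allowlist are never run unattended.
    if operation not in KNOWN_OPERATIONS:
        return Decision.REQUIRE_APPROVAL
    # Assumed guardrail behavior: exceeding the limit escalates to a human.
    if estimated_cost_usd > cost_limit_usd:
        return Decision.REQUIRE_APPROVAL
    return Decision.PROCEED


if __name__ == "__main__":
    print(evaluate("set_lifecycle_policy", None, 0.0, 100.0))
    print(evaluate("resize_cluster", True, 0.0, 100.0))
```

The checks run from most to least restrictive, and the permissive outcome is reachable only after every check has passed explicitly.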
Next Steps
- Quickstart Guide — Connect your first GCP project in 5 minutes
- GCP Onboarding — Detailed guide to the connection process
- Key Concepts — Understand Findings, ChangeSets, Scopes, and more