Cloud Run
SmartSRE provides auto-scaling optimization and performance tuning for Google Cloud Run.
What SmartSRE Scans
| Category | Checks |
|---|---|
| Performance | CPU/memory utilization, request latency, error rates |
| Scaling | Min/max instances, cold start frequency |
| Cost | Over-provisioning, idle resources |
| Security | IAM bindings, invoker permissions |
Findings
High-Priority
| Issue Type | Severity | Description |
|---|---|---|
| high_memory_usage | High | Memory > 90% for extended period |
| high_cpu_usage | High | CPU > 85% for extended period |
| high_error_rate | High | Error rate > 5% |
| oom_crashes | Critical | Out-of-memory terminations detected |
Medium-Priority
| Issue Type | Severity | Description |
|---|---|---|
| low_memory_usage | Medium | Memory < 20% utilization |
| cold_start_risk | Medium | Min instances = 0 with high traffic |
| high_request_latency | Medium | P95 latency > 2 seconds |
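
The thresholds in the two tables above can be read as a simple rule set. The following is a minimal sketch of that reading, not SmartSRE's actual implementation: the `ServiceMetrics` type, its field names, and the `HIGH_TRAFFIC_RPS` value are hypothetical, and the "extended period" condition is approximated by assuming the metrics are already averaged over the scan window.

```python
from dataclasses import dataclass


@dataclass
class ServiceMetrics:
    """Hypothetical snapshot of metrics averaged over the scan window."""
    memory_pct: float       # memory utilization, percent
    cpu_pct: float          # CPU utilization, percent
    error_rate_pct: float   # failed requests, percent
    p95_latency_s: float    # P95 request latency, seconds
    oom_kills: int          # out-of-memory terminations observed
    min_instances: int      # configured minimum instances
    requests_per_s: float   # average request rate


# Assumption: the documentation does not define "high traffic";
# 10 requests per second is only a placeholder.
HIGH_TRAFFIC_RPS = 10.0


def classify(m: ServiceMetrics) -> list[tuple[str, str]]:
    """Return (issue_type, severity) pairs using the documented thresholds."""
    findings = []
    if m.oom_kills > 0:
        findings.append(("oom_crashes", "Critical"))
    if m.memory_pct > 90:
        findings.append(("high_memory_usage", "High"))
    if m.cpu_pct > 85:
        findings.append(("high_cpu_usage", "High"))
    if m.error_rate_pct > 5:
        findings.append(("high_error_rate", "High"))
    if m.memory_pct < 20:
        findings.append(("low_memory_usage", "Medium"))
    if m.min_instances == 0 and m.requests_per_s >= HIGH_TRAFFIC_RPS:
        findings.append(("cold_start_risk", "Medium"))
    if m.p95_latency_s > 2:
        findings.append(("high_request_latency", "Medium"))
    return findings


# An over-provisioned service with no warm instances and slow responses.
sample = ServiceMetrics(
    memory_pct=12, cpu_pct=40, error_rate_pct=0.5, p95_latency_s=2.4,
    oom_kills=0, min_instances=0, requests_per_s=50,
)
sample_findings = classify(sample)
```

For this sample, three Medium findings are reported: `low_memory_usage`, `cold_start_risk`, and `high_request_latency`.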
Available Fixes
Memory & CPU Scaling
| Operation | Description | Impact |
|---|---|---|
| scale_memory | Adjust memory allocation | Low |
| scale_cpu | Adjust CPU allocation | Low |
Instance Management
| Operation | Description | Impact |
|---|---|---|
| set_min_instances | Configure minimum warm instances | Low |
| set_max_instances | Set maximum scaling limit | Low |
IAM & Security
| Operation | Description | Impact |
|---|---|---|
| grant_invoker | Add service account as invoker | Medium |
| revoke_public_access | Remove allUsers invoker binding | High |
Required Permissions
For Scanning
- roles/run.viewer
- roles/monitoring.viewer
For Remediation
- roles/run.admin
- roles/iam.serviceAccountUser
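
Before running a scan or a remediation, it can help to check which of the roles above a service account is still missing. The sketch below is illustrative only: `missing_roles` is a hypothetical helper, it compares role names exactly, and it does not account for one role implying the permissions of another.

```python
# Roles taken from the lists above, keyed by the activity they enable.
REQUIRED_ROLES = {
    "scan": {"roles/run.viewer", "roles/monitoring.viewer"},
    "remediate": {"roles/run.admin", "roles/iam.serviceAccountUser"},
}


def missing_roles(granted: set[str], activity: str) -> set[str]:
    """Return the documented roles for `activity` that are not in `granted`."""
    return REQUIRED_ROLES[activity] - granted


# Example: an account that can read Cloud Run but not Cloud Monitoring.
granted = {"roles/run.viewer"}
scan_gaps = missing_roles(granted, "scan")
remediate_gaps = missing_roles(granted, "remediate")
```

Here `scan_gaps` contains only `roles/monitoring.viewer`, while `remediate_gaps` contains both remediation roles.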
Example ChangeSet
    {
      "service": "cloudrun",
      "intent": "Right-size memory allocation based on actual usage",
      "steps": [
        {
          "op": "scale_memory",
          "resource_ref": {
            "project_id": "my-project",
            "region": "us-central1",
            "service_name": "api-service"
          },
          "params": {
            "current_memory": "2Gi",
            "target_memory": "512Mi"
          },
          "estimated_cost_usd": -35.00,
          "impact_score": 15
        },
        {
          "op": "set_min_instances",
          "resource_ref": {
            "project_id": "my-project",
            "region": "us-central1",
            "service_name": "api-service"
          },
          "params": {
            "min_instances": 1
          },
          "estimated_cost_usd": 8.00,
          "impact_score": 10
        }
      ]
    }
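
A ChangeSet like the one above can be summarized before it is approved. The following sketch assumes that a negative `estimated_cost_usd` represents a saving and that step costs can simply be added; the `summarize` function is hypothetical and the embedded JSON is a trimmed copy of the example, keeping only the fields the summary needs.

```python
import json

# Operations listed under Available Fixes.
KNOWN_OPS = {
    "scale_memory", "scale_cpu", "set_min_instances",
    "set_max_instances", "grant_invoker", "revoke_public_access",
}


def summarize(changeset: dict) -> dict:
    """Collect the ops, net estimated cost, and highest impact score."""
    steps = changeset["steps"]
    unknown = [s["op"] for s in steps if s["op"] not in KNOWN_OPS]
    if unknown:
        raise ValueError(f"unknown operations: {unknown}")
    return {
        "ops": [s["op"] for s in steps],
        # Assumption: negative values are savings, so a plain sum is the net.
        "net_cost_usd": sum(s["estimated_cost_usd"] for s in steps),
        "max_impact_score": max((s["impact_score"] for s in steps), default=0),
    }


example = json.loads("""
{
  "service": "cloudrun",
  "steps": [
    {"op": "scale_memory", "estimated_cost_usd": -35.00, "impact_score": 15},
    {"op": "set_min_instances", "estimated_cost_usd": 8.00, "impact_score": 10}
  ]
}
""")
summary = summarize(example)
```

Under these assumptions the example nets out to -27.00 USD (a 35.00 saving from memory right-sizing, offset by 8.00 for one warm instance), with a highest impact score of 15.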
Configuration Options
Configure Cloud Run thresholds in Settings → Risk Policy:
| Setting | Default | Description |
|---|---|---|
| max_memory_gi | 8 | Maximum memory allocation (GiB) |
| max_cpu_m | 4000 | Maximum CPU (millicores) |
| cpu_high_threshold | 85 | CPU % above which the high_cpu_usage finding is triggered |
| memory_high_threshold | 90 | Memory % above which the high_memory_usage finding is triggered |
| memory_low_threshold | 20 | Memory % below which the low_memory_usage (over-provisioning) finding is triggered |
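
One way to read the `max_memory_gi` and `max_cpu_m` settings is as upper bounds on what a fix may request. The sketch below checks a proposed allocation against the defaults in the table; how SmartSRE actually enforces these limits is not stated here, so `within_limits` and `memory_to_gib` are hypothetical helpers, and only the `Mi` and `Gi` suffixes used in the ChangeSet example are handled.

```python
# Defaults copied from the settings table above.
DEFAULT_POLICY = {"max_memory_gi": 8, "max_cpu_m": 4000}


def memory_to_gib(value: str) -> float:
    """Convert a memory string such as '512Mi' or '2Gi' to GiB."""
    if value.endswith("Gi"):
        return float(value[:-2])
    if value.endswith("Mi"):
        return float(value[:-2]) / 1024
    raise ValueError(f"unsupported memory unit in {value!r}")


def within_limits(target_memory: str, target_cpu_m: int,
                  policy: dict = DEFAULT_POLICY) -> bool:
    """Return True if the requested allocation stays under both maximums."""
    return (memory_to_gib(target_memory) <= policy["max_memory_gi"]
            and target_cpu_m <= policy["max_cpu_m"])


ok = within_limits("512Mi", 1000)       # 0.5 GiB and 1000m are under the caps
too_big = within_limits("16Gi", 1000)   # 16 GiB exceeds max_memory_gi = 8
```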
Rollback Capability
The following Cloud Run scaling operations support full rollback. The IAM operations (grant_invoker, revoke_public_access) are not listed here:
| Operation | Rollback Method |
|---|---|
| scale_memory | Restore previous memory allocation |
| scale_cpu | Restore previous CPU allocation |
| set_min_instances | Restore previous min instances |
| set_max_instances | Restore previous max instances |
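
Every rollback method in the table restores a previous value, which implies that value has to be captured before the change is applied. A minimal sketch of that pattern follows; the `apply_op` helper, the `OP_FIELDS` mapping, and the config field names are hypothetical and only illustrate the idea.

```python
# Assumption: each operation changes exactly one field of the service config.
OP_FIELDS = {
    "scale_memory": "memory",
    "scale_cpu": "cpu",
    "set_min_instances": "min_instances",
    "set_max_instances": "max_instances",
}


def apply_op(config: dict, op: str, value) -> tuple[dict, dict]:
    """Apply `op` to a copy of `config` and return (updated, rollback_record)."""
    field = OP_FIELDS[op]
    # Record the previous value before overwriting it.
    rollback = {"op": op, "value": config[field]}
    updated = {**config, field: value}
    return updated, rollback


original = {"memory": "2Gi", "cpu": "1000m",
            "min_instances": 0, "max_instances": 10}

# Apply a change, then undo it by replaying the rollback record.
changed, record = apply_op(original, "scale_memory", "512Mi")
restored, _ = apply_op(changed, record["op"], record["value"])
```

Because the rollback record has the same shape as a forward operation, undoing a change is the same code path as applying one.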
Best Practices
- Start with monitoring: run scans before applying fixes to establish a baseline
- Right-size memory first: memory affects billing and CPU allocation
- Use min instances for latency-sensitive services: warm instances prevent cold starts
- Set max instances for cost control: an upper limit prevents runaway scaling