Google Kubernetes Engine (GKE)

SmartSRE scans Google Kubernetes Engine (GKE) clusters for scaling, resource, health, and cost issues, and offers fixes for the problems it finds.

What SmartSRE Scans

| Category | Checks |
| --- | --- |
| Scaling | Node pool utilization, HPA configuration |
| Resources | Pod resource requests/limits |
| Health | Node conditions, pod restarts |
| Cost | Over-provisioned nodes, spot opportunities |

Findings

| Issue Type | Severity | Description |
| --- | --- | --- |
| node_pool_high_utilization | High | CPU/memory > 85% on node pool |
| pod_memory_limits_missing | Medium | Pods without memory limits |
| node_not_ready | Critical | Node in NotReady state |
| high_pod_restart_rate | High | Pods restarting frequently |
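
To make the thresholds in the table concrete, here is a minimal Python sketch of how such findings could be derived from raw metrics. It is not SmartSRE's implementation: the `Finding` class, the function names, and the input shapes are hypothetical, and `RESTART_THRESHOLD` is an assumed value, since the table only says "frequently". The issue types, severities, and the 85% cutoff come from the table.

```python
from dataclasses import dataclass

# Issue types and severities copied from the Findings table.
SEVERITY = {
    "node_pool_high_utilization": "High",
    "pod_memory_limits_missing": "Medium",
    "node_not_ready": "Critical",
    "high_pod_restart_rate": "High",
}

UTILIZATION_THRESHOLD = 0.85  # from the table: CPU/memory > 85%
RESTART_THRESHOLD = 5  # assumption: the table gives no number for "frequently"


@dataclass
class Finding:
    """Hypothetical record for one detected issue."""
    issue_type: str
    severity: str
    resource: str


def _finding(issue_type, resource):
    return Finding(issue_type, SEVERITY[issue_type], resource)


def evaluate_node_pool(name, cpu_util, mem_util):
    """Flag a node pool whose CPU or memory utilization is strictly above 85%."""
    if max(cpu_util, mem_util) > UTILIZATION_THRESHOLD:
        return [_finding("node_pool_high_utilization", name)]
    return []


def evaluate_node(name, ready):
    """Flag a node that is in the NotReady state."""
    return [] if ready else [_finding("node_not_ready", name)]


def evaluate_pod_restarts(name, restart_count):
    """Flag a pod whose restart count exceeds the assumed threshold."""
    if restart_count > RESTART_THRESHOLD:
        return [_finding("high_pod_restart_rate", name)]
    return []
```

For example, `evaluate_node_pool("default-pool", 0.91, 0.40)` returns a single `High` finding, while a pool sitting exactly at 85% returns nothing because the comparison is strict.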

Available Fixes

| Operation | Description | Impact |
| --- | --- | --- |
| scale_node_pool | Add/remove nodes | Medium |
| enable_node_autoscaling | Enable cluster autoscaler | Low |
| set_pod_resources | Update pod resource limits | Medium |
| drain_node | Safely drain node | High |

Required Permissions

For Scanning

roles/container.clusterViewer
roles/monitoring.viewer

For Remediation

roles/container.clusterAdmin
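
One way to grant these roles is with `gcloud projects add-iam-policy-binding`. The Python sketch below only builds the command strings so they can be reviewed before running. It assumes SmartSRE authenticates as a service account; the project ID and service account address in the usage note are placeholders, and `grant_commands` is a hypothetical helper, not part of SmartSRE.

```python
import shlex

# Roles listed in the Required Permissions section.
SCAN_ROLES = ["roles/container.clusterViewer", "roles/monitoring.viewer"]
REMEDIATION_ROLES = ["roles/container.clusterAdmin"]


def grant_commands(project_id, service_account, include_remediation=False):
    """Build one gcloud command per role for the given service account.

    The commands are returned as strings and are not executed.
    """
    roles = SCAN_ROLES + (REMEDIATION_ROLES if include_remediation else [])
    member = f"serviceAccount:{service_account}"
    return [
        shlex.join([
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member={member}",
            f"--role={role}",
        ])
        for role in roles
    ]
```

Calling `grant_commands("my-project", "smartsre@my-project.iam.gserviceaccount.com")` yields two commands covering scanning only. Passing `include_remediation=True` adds a third for `roles/container.clusterAdmin`, which keeps the broader role opt-in.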

Best Practices

  1. Enable cluster autoscaler: node pools scale automatically with demand.
  2. Set resource requests: the scheduler needs them to place pods accurately.
  3. Use node taints: isolate workloads on dedicated nodes.
  4. Monitor pod restarts: frequent restarts are an early warning of problems.
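
The resource-request practice and the `pod_memory_limits_missing` finding both come down to inspecting each container's `resources` block. As a rough sketch, the check could look like the following. It operates on a pod manifest already loaded into a Python dict with the standard Kubernetes layout (`spec.containers[].resources`); the function name and the example pod are hypothetical.

```python
def containers_missing(pod, section, resource):
    """Return names of containers that lack resources.<section>.<resource>.

    section is "requests" or "limits"; resource is e.g. "cpu" or "memory".
    """
    missing = []
    for container in pod.get("spec", {}).get("containers", []):
        # "resources" or the section may be absent or explicitly null.
        values = (container.get("resources") or {}).get(section) or {}
        if resource not in values:
            missing.append(container["name"])
    return missing


# Hypothetical pod manifest used for illustration only.
example_pod = {
    "metadata": {"name": "web"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "resources": {
                    "requests": {"cpu": "250m", "memory": "256Mi"},
                    "limits": {"memory": "512Mi"},
                },
            },
            {"name": "sidecar", "resources": {"requests": {"cpu": "50m"}}},
        ]
    },
}
```

Here `containers_missing(example_pod, "limits", "memory")` reports only `sidecar`, and the same call with `"requests"` also reports `sidecar`, since it requests CPU but not memory.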