Six Months In — What GKE Actually Looks Like in Production
I’ve been running production workloads on Google Kubernetes Engine for about six months now, and the experience has been a mix of pleasant surprises and “why didn’t anyone warn me about this” moments. Before touching anything, I spent too long reading docs that skipped the practical tradeoffs. So this writeup is the article I wish I had.
The core question when you start with GKE is simple: Autopilot or Standard? The answer isn’t obvious, and choosing wrong means either overpaying or losing flexibility at the worst time.
Approach Comparison: GKE Autopilot vs Standard Mode
GKE comes in two fundamentally different operating models, and they’re not just feature tiers — they represent different philosophies about who manages what.
Standard Mode
You provision and manage node pools yourself. You decide machine types, autoscaling parameters, disk sizes, and node counts. Kubernetes runs exactly as you configure it. This is familiar territory if you’ve run self-managed clusters before.
The tradeoff: you pay for reserved node capacity whether pods are scheduled on it or not. A 3-node pool of e2-standard-4 machines (4 vCPUs and 16 GB RAM each) means paying for 12 vCPUs and 48 GB RAM, even if your actual workload needs 6 vCPUs at 2 AM.
Autopilot Mode
Google manages the underlying nodes entirely. You only define pods: no node pools, no machine type decisions. Billing is based on each pod’s CPU and memory requests, metered per second. Idle node capacity doesn’t cost you anything because there are no “nodes” you own.
The tradeoff: you lose control over node configuration. Some DaemonSets don’t work, privileged containers are restricted, and certain workloads (GPU-heavy ML inference, for instance) need special handling.
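Before committing to Autopilot, it can help to scan existing manifests for the settings it restricts. The sketch below assumes the manifests are plain YAML files in a local directory (the `manifests` default and the function name are hypothetical) and only looks for `privileged: true`. It is a rough text search, not a full policy audit.

```shell
# Rough pre-migration check: list manifest lines that request privileged mode.
# Assumption: manifests are plain YAML files under a directory (default: ./manifests).
find_privileged() {
  local dir="${1:-manifests}"
  # -r: recurse, -n: show line numbers, -E: extended regex.
  # grep exits non-zero when nothing matches, which callers can treat as "clean".
  grep -rnE 'privileged:[[:space:]]*true' "$dir"
}

# Example call (commented out so the snippet has no side effects):
#   find_privileged ./manifests || echo "no privileged containers found"
```

Anything this turns up (often a monitoring or logging agent) is worth checking against the Autopilot restrictions before you move.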
My Actual Comparison Numbers
Running a typical web backend (3 microservices, moderate traffic, staging + production environments), here’s what the monthly bill looked like:
Standard Mode (3 × e2-standard-4, always-on):
Node cost: ~$180/month (3 nodes × $60)
Actual usage: ~35-40% average CPU utilization
Effective cost per used vCPU-hour: ~2.8× the list price (you pay for 100% of capacity and use about 35–40% of it)
Autopilot Mode (same workloads):
Pod requests: ~$110/month (paying for what we asked)
Savings: ~39% ($70 of $180)
For teams that don’t run 24/7 high-utilization workloads, Autopilot wins on cost. The moment you have sustained 70%+ node utilization, Standard starts making sense again.
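The arithmetic behind those figures can be reproduced in a few lines. This is a sketch using the numbers from my bill above; the function names are made up for illustration, and the multiplier simply assumes you pay for all node capacity while using only a fraction of it.

```shell
# Effective cost multiplier on Standard: paid capacity / used capacity = 1 / utilization.
effective_multiplier() {
  awk -v u="$1" 'BEGIN { printf "%.2f\n", 1 / u }'
}

# Percentage saved when moving from one monthly bill to another.
savings_percent() {
  awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (old - new) / old * 100 }'
}

effective_multiplier 0.35   # 35% average utilization -> 2.86
effective_multiplier 0.40   # 40% average utilization -> 2.50
savings_percent 180 110     # Standard vs Autopilot monthly bill -> 38.9
```

Plugging in your own utilization and bills is a quick way to see which side of the tradeoff you are on.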
Pros and Cons
GKE Autopilot
- Pro: No node management overhead — no patching, no capacity planning, no “why is this node NotReady at 3 AM”
- Pro: Pay-per-pod billing eliminates idle capacity waste
- Pro: Built-in security hardening (workload identity, shielded nodes, binary authorization)
- Con: Pod startup can be slower (30–60s) if Autopilot needs to provision underlying infrastructure
- Con: No privileged containers — breaks some monitoring agents and legacy apps
- Con: Minimum resource requests enforced (0.25 vCPU, 0.5GB RAM per container) — tiny cronjobs get padded
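To see how the minimums pad a small job, the sketch below takes a CPU request in millicores and returns what would be billed, assuming the 250m floor quoted in the list above. The function name is hypothetical, and the floor is an assumption taken from that bullet: the enforced minimums depend on cluster version and compute class, so check the current values for your cluster.

```shell
# Billed CPU (millicores) = max(requested, floor).
# The floor defaults to the 250m figure quoted above (an assumption, not a constant).
billed_cpu_millicores() {
  local requested="$1" floor="${2:-250}"
  if [ "$requested" -lt "$floor" ]; then
    echo "$floor"
  else
    echo "$requested"
  fi
}

billed_cpu_millicores 50    # a tiny cronjob asking for 50m -> 250
billed_cpu_millicores 400   # already above the floor -> 400
```

Under that assumption the 50m job is billed for five times what it asked for, while the 400m request is unchanged.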
GKE Standard
- Pro: Full control over node configuration — GPUs, custom machine types, spot VMs
- Pro: Faster pod scheduling (nodes already warm)
- Pro: Works with any workload, including privileged containers and custom DaemonSets
- Con: You pay for unused node capacity
- Con: Node upgrades, pool management, and capacity planning are your responsibility
- Con: Easier to misconfigure and end up with a security footgun
Recommended Setup for Most Teams
My recommendation after six months: start with Autopilot, and migrate to Standard only when you hit a concrete limitation. Most teams never hit that limitation.
Here’s the setup I’d recommend for a typical web application team:
- Use Autopilot for staging and production web workloads
- Use Standard with Spot VMs if you have batch/ML jobs with flexible timing
- Set resource requests conservatively but accurately — Autopilot bills on requests, not actual usage
- Use Workload Identity from day one. It is always on in Autopilot clusters; on Standard you have to enable it yourself, and retrofitting it later is much harder
Implementation Guide
Step 1: Create a GKE Autopilot Cluster
# Install and configure gcloud
gcloud init
gcloud auth application-default login
# Enable required APIs
gcloud services enable container.googleapis.com
# Create Autopilot cluster
gcloud container clusters create-auto my-app-cluster \
  --location=asia-northeast1 \
  --release-channel=regular
# Get credentials
gcloud container clusters get-credentials my-app-cluster \
  --location=asia-northeast1
The --release-channel=regular flag gives you stable Kubernetes versions without manual upgrade management. Use rapid if you want latest features; stable if you’re risk-averse.
Step 2: Deploy Your Application
Create a basic deployment manifest. The key for Autopilot is always setting resource requests:
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: gcr.io/PROJECT_ID/my-app:latest
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "250m"
              memory: "512Mi"
            limits:
              cpu: "500m"
              memory: "1Gi"
---
apiVersion: v1
kind: Service
metadata:
  name: my-app-service
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
  type: LoadBalancer
kubectl apply -f deployment.yaml
kubectl get pods -w
Step 3: Set Up CI/CD with Cloud Build
Cloud Build integrates naturally with GKE and requires minimal IAM configuration. Create a cloudbuild.yaml at your repo root:
# cloudbuild.yaml
steps:
  # Build container image
  - name: 'gcr.io/cloud-builders/docker'
    args:
      - 'build'
      - '-t'
      - 'gcr.io/$PROJECT_ID/my-app:$COMMIT_SHA'
      - '.'
  # Push to Container Registry
  - name: 'gcr.io/cloud-builders/docker'
    args:
      - 'push'
      - 'gcr.io/$PROJECT_ID/my-app:$COMMIT_SHA'
  # Deploy to GKE
  - name: 'gcr.io/cloud-builders/kubectl'
    args:
      - 'set'
      - 'image'
      - 'deployment/my-app'
      - 'my-app=gcr.io/$PROJECT_ID/my-app:$COMMIT_SHA'
    env:
      - 'CLOUDSDK_COMPUTE_REGION=asia-northeast1'
      - 'CLOUDSDK_CONTAINER_CLUSTER=my-app-cluster'
images:
  - 'gcr.io/$PROJECT_ID/my-app:$COMMIT_SHA'
Connect this to your GitHub repository via Cloud Build Triggers:
# Grant Cloud Build permission to deploy to GKE
PROJECT_NUMBER=$(gcloud projects describe "$PROJECT_ID" \
  --format='value(projectNumber)')
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:${PROJECT_NUMBER}@cloudbuild.gserviceaccount.com" \
  --role="roles/container.developer"
# Create a trigger (or do this through the Console UI)
gcloud builds triggers create github \
  --repo-name=my-app \
  --repo-owner=my-github-username \
  --branch-pattern='^main$' \
  --build-config=cloudbuild.yaml
Every push to main now rebuilds and redeploys automatically. The whole pipeline — build, push, rollout — typically takes under 3 minutes for a small Go or Node.js service.
Step 4: Cost Optimization — The Numbers That Matter
After the cluster is running, these three adjustments cut my bill noticeably without touching application behavior:
Right-size your resource requests. Run your app under load for a week, then check actual usage:
# Check actual CPU/memory usage vs requests
kubectl top pods --sort-by=cpu
# Get recommendations from the Vertical Pod Autoscaler (requires a VPA object named my-app to exist)
kubectl describe vpa my-app
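`kubectl describe vpa my-app` only returns something if a VerticalPodAutoscaler object with that name exists. A minimal sketch of one in recommendation-only mode follows. It assumes the Deployment is called `my-app` as in Step 2 and that vertical Pod autoscaling is available on the cluster; the file name `vpa.yaml` is arbitrary.

```shell
# Write a recommendation-only VPA manifest for the my-app Deployment.
# updateMode "Off" means the VPA computes recommendations but never changes pods itself.
cat > vpa.yaml <<'EOF'
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"
EOF

# Then apply it and, once it has collected some data, read the recommendations:
#   kubectl apply -f vpa.yaml
#   kubectl describe vpa my-app
```

Keeping the mode at "Off" lets you compare the suggested requests with what you set by hand before changing anything.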
Enable Horizontal Pod Autoscaler. Scale down during off-hours automatically:
kubectl autoscale deployment my-app \
  --cpu-percent=60 \
  --min=1 \
  --max=10
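The core scaling rule in the Kubernetes documentation is desired = ceil(current replicas × current utilization / target utilization), with utilization measured against each pod's CPU request. The sketch below reproduces that rule with the min and max bounds from the command above. It ignores the controller's tolerance band and stabilization windows, and the function name is hypothetical.

```shell
# desired = ceil(current_replicas * current_util / target_util), clamped to [min, max].
desired_replicas() {
  awk -v cur="$1" -v util="$2" -v target="$3" -v min="$4" -v max="$5" 'BEGIN {
    raw = cur * util / target
    n = int(raw)
    if (n < raw) n++          # ceil for positive values
    if (n < min) n = min
    if (n > max) n = max
    print n
  }'
}

desired_replicas 2 90 60 1 10   # busy: 2 pods at 90% CPU -> 3
desired_replicas 4 15 60 1 10   # quiet: 4 pods at 15% CPU -> 1
```

Because the percentage is relative to requests, an inflated CPU request makes pods look idle and delays scale-up, which is one more reason to right-size requests first.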
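Use committed use discounts. If you’ve been running for a month and have stable baseline usage, a 1-year committed use discount takes a sizable cut off on-demand pricing for Autopilot pod compute (37% in my case; rates differ by commitment type and change over time, so confirm the figure on the current pricing page). This alone paid for itself in my case within two months.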
One Thing I’d Do Differently
Set up Artifact Registry instead of the older Container Registry (gcr.io) from the start. Container Registry is deprecated, and Artifact Registry has better IAM controls, regional storage, and vulnerability scanning built in. Migrating container image URLs later means updating every deployment manifest and CI config: not hard, but annoying.
# Create an Artifact Registry repository
gcloud artifacts repositories create my-app-repo \
  --repository-format=docker \
  --location=asia-northeast1
# Image URL format
# asia-northeast1-docker.pkg.dev/PROJECT_ID/my-app-repo/my-app:tag
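If you do end up migrating, the image URL change is mechanical. Here is a sketch of the rewrite, assuming the region and the `my-app-repo` repository from the command above; the function name and the `my-project` project ID are placeholders.

```shell
# Rewrite gcr.io/PROJECT/IMAGE:TAG to REGION-docker.pkg.dev/PROJECT/REPO/IMAGE:TAG.
to_artifact_registry() {
  local image="$1" region="${2:-asia-northeast1}" repo="${3:-my-app-repo}"
  printf '%s\n' "$image" |
    sed -E "s#^gcr\.io/([^/]+)/#${region}-docker.pkg.dev/\1/${repo}/#"
}

to_artifact_registry gcr.io/my-project/my-app:abc123
# -> asia-northeast1-docker.pkg.dev/my-project/my-app-repo/my-app:abc123
```

The same substitution can be run over manifests and cloudbuild.yaml, though the images themselves still have to be pushed or copied to the new repository.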
Where to Go From Here
The setup above gets you a working, cost-optimized GKE deployment with automated CI/CD in an afternoon. From this foundation, the natural next steps are adding a proper ingress controller (the GKE-managed one works well for Autopilot), setting up namespace-based environment isolation, and wiring in Cloud Monitoring for alerting.
Autopilot in particular keeps surprising me with how much operational overhead it eliminates — six months in, I’ve had exactly zero node-level incidents. That alone is worth the slightly higher per-vCPU cost compared to self-managed nodes.

