Stop the Kubectl Snowflake: Automating Kubernetes Infrastructure with FluxCD

The 2 AM Drift Disaster

My pager screamed at 2:14 AM. A critical production service was throwing 503 errors. After digging into the cluster, I found the culprit: the deployment configuration didn’t match our repository. A well-meaning engineer, under pressure to fix a routing bug, had used kubectl edit to bypass the CI/CD pipeline. They solved the immediate issue but forgot to commit the change back to the source code.

Your cluster is a ‘snowflake’ the moment it drifts from your code. In a manual or push-based world, your repository is merely a suggestion, not the source of truth. Moving our infrastructure to GitOps changed everything. It turned my team’s focus from firefighting manual errors to peer-reviewed stability, ensuring we actually sleep through the night.

Why FluxCD Wins the ‘Push vs. Pull’ Debate

Jenkins and GitHub Actions typically use a ‘Push’ model. They run a script, execute kubectl apply, and hope the network doesn’t drop. If the cluster is unreachable for ten seconds, the push fails, and your environment stays out of sync. FluxCD flips this logic. It lives inside your cluster as a set of controllers that ‘pull’ configurations from Git.

Flux polls your repository on a configurable interval (one minute by default for the repository created at bootstrap). If it detects a discrepancy, such as that manual 2 AM edit, it overwrites the cluster state to match the code on its next reconciliation. It doesn’t ask for permission; it enforces the truth.
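
To make the pull model concrete, here is a toy sketch of a single reconciliation pass in plain shell. Two local files stand in for the manifest in Git and the live cluster state. The function name reconcile_once and both file arguments are hypothetical, and this is only an illustration of the idea, not how Flux is implemented.

```shell
# Toy illustration of pull-based reconciliation (not Flux's real code).
# $1 stands in for the manifest in Git, $2 for the live cluster state.
reconcile_once() {
  desired_file="$1"
  actual_file="$2"
  if cmp -s "$desired_file" "$actual_file"; then
    echo "in sync"
  else
    # Drift detected: overwrite the live state with what Git says.
    cp "$desired_file" "$actual_file"
    echo "drift corrected"
  fi
}
```

Calling this in a loop with a sleep between passes approximates what a reconciliation interval does: a manual edit to the "actual" file survives only until the next pass.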

Setting Up the Flux Control Plane

To start, you’ll need the Flux CLI and a Kubernetes cluster. Whether you are running on GKE, EKS, or a local Kind cluster, the process is identical. The Flux CLI is the best tool here because it handles the complex bootstrap process, including SSH key generation and controller deployment, in a single pass.

1. Get the Flux CLI

For Linux or macOS, use this quick script to fetch the binary:

curl -s https://fluxcd.io/install.sh | sudo bash
# Confirm the binary is in your path
flux --version

2. The Pre-flight Check

Not every cluster is ready for GitOps out of the box. Run this check to verify that your kubectl client and cluster meet Flux’s prerequisites, starting with the Kubernetes version:

flux check --pre
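
If you script this step, it can help to confirm the required binaries are on the PATH before calling Flux at all. The sketch below assumes a helper named require_tools, which is a hypothetical name and not part of the Flux CLI.

```shell
# Fail fast if the tools this walkthrough relies on are missing.
# require_tools is a hypothetical helper name, not part of the Flux CLI.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Typical use before bootstrapping:
#   require_tools flux kubectl git && flux check --pre
```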

Bootstrapping: Giving the Cluster its Brain

The bootstrap command is the foundation of your GitOps setup. It creates a private repository (if it doesn’t exist), configures a deploy key, and installs the Flux controllers into the flux-system namespace. It also commits Flux’s own manifests to that repository, which is what makes Flux self-managing: upgrading Flux later is a change in Git (typically made by re-running bootstrap with a newer CLI) rather than a manual edit on the cluster.

Export Your Credentials

Generate a Personal Access Token (PAT) with repo permissions on GitHub and export it:

export GITHUB_TOKEN=your_token_here
export GITHUB_USER=your_username
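
A small guard can catch a missing variable before the bootstrap starts. This is a minimal sketch; check_github_env is a hypothetical helper name, and it only checks that the two variables above are non-empty, not that the token is valid.

```shell
# Report any missing credential variable and return non-zero if one is unset.
# check_github_env is a hypothetical helper name used for illustration.
check_github_env() {
  status=0
  for name in GITHUB_USER GITHUB_TOKEN; do
    # Indirect lookup via eval keeps this POSIX-compatible.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set" >&2
      status=1
    fi
  done
  return "$status"
}
```

Running check_github_env && flux bootstrap github ... then refuses to start when either value is missing.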

Execute the Bootstrap

flux bootstrap github \
  --owner=$GITHUB_USER \
  --repository=fleet-infra \
  --branch=main \
  --path=./clusters/my-cluster \
  --personal

After about 90 seconds, you’ll see a new fleet-infra repo in your account. This repository is now the brain of your cluster.
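
Before moving on, it is worth confirming that the controllers actually came up. The sketch below wraps three standard commands in a function; the wrapper name verify_bootstrap is hypothetical, while flux check, kubectl get pods, and flux get kustomizations are regular CLI invocations.

```shell
# Post-bootstrap sanity check. verify_bootstrap is a hypothetical wrapper;
# the commands inside it are standard flux and kubectl invocations.
verify_bootstrap() {
  # Re-run the checks, this time including the installed controllers.
  flux check || return 1
  # The controllers should be running in the flux-system namespace.
  kubectl get pods -n flux-system || return 1
  # The flux-system Kustomization is what syncs the cluster from fleet-infra.
  flux get kustomizations -n flux-system
}
```

In the fleet-infra repository you should also find a clusters/my-cluster/flux-system directory containing the manifests that bootstrap committed.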

Defining Your Source of Truth

With the engine running, we need to point it at our application code. Flux uses two main resources: a GitRepository (where the code lives) and a Kustomization (how to apply it).

1. Connect Your App Repository

Tell Flux to check your webapp-config repo for changes every 30 seconds:

flux create source git webapp-repo \
  --url=https://github.com/my-org/webapp-config \
  --branch=main \
  --interval=30s \
  --export > ./clusters/my-cluster/webapp-source.yaml
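
For reference, the exported file should look roughly like the manifest below, written here by hand with a heredoc. This is a sketch under one assumption: the apiVersion shown (v1) applies to Flux v2.0 and later, and older releases emit a beta version instead. A private repository would additionally need credentials referenced through spec.secretRef.

```shell
# Hand-written equivalent of the exported source manifest.
# apiVersion v1 assumes Flux v2.0 or later; older releases used a beta version.
mkdir -p ./clusters/my-cluster
cat > ./clusters/my-cluster/webapp-source.yaml <<'EOF'
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: webapp-repo
  namespace: flux-system
spec:
  interval: 30s
  ref:
    branch: main
  url: https://github.com/my-org/webapp-config
EOF
```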

2. Set the Deployment Strategy

The Kustomization resource is the actual instruction set. Use the --prune=true flag religiously. It ensures that if you delete a YAML file in Git, Flux removes the corresponding resource from the cluster on its next reconciliation (within about 5 minutes at the interval used below). Without pruning, your cluster becomes a graveyard of old, orphaned services.

flux create kustomization webapp-deploy \
  --target-namespace=production \
  --source=webapp-repo \
  --path="./deploy/prod" \
  --prune=true \
  --wait=true \
  --interval=5m \
  --export > ./clusters/my-cluster/webapp-kustomization.yaml
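
Each flag above maps onto a field of the Kustomization spec. The heredoc below is a hand-written approximation of the exported file, again assuming the v1 API of Flux v2.0 or later; the exact output of your CLI version may differ slightly in field order or defaults.

```shell
# Hand-written equivalent of the exported Kustomization manifest.
# apiVersion v1 assumes Flux v2.0 or later.
mkdir -p ./clusters/my-cluster
cat > ./clusters/my-cluster/webapp-kustomization.yaml <<'EOF'
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: webapp-deploy
  namespace: flux-system
spec:
  interval: 5m0s
  path: ./deploy/prod
  prune: true
  sourceRef:
    kind: GitRepository
    name: webapp-repo
  targetNamespace: production
  wait: true
EOF
```

The sourceRef is what ties this resource to the webapp-repo source: Flux applies whatever it finds under ./deploy/prod in that repository.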

Commit these files to fleet-infra and push. Flux picks them up on its next poll of the repository (roughly a minute with the bootstrap defaults) and starts the deployment.
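
The commit step can be sketched as follows, assuming you are in the root of a local fleet-infra clone and the branch is main. The helper name publish_manifests and the commit message are hypothetical; the final command asks Flux to fetch and apply the new commit immediately instead of waiting for its next poll.

```shell
# publish_manifests is a hypothetical helper; run it from the root of your
# local fleet-infra clone.
publish_manifests() {
  git add clusters/my-cluster/webapp-source.yaml \
          clusters/my-cluster/webapp-kustomization.yaml || return 1
  git commit -m "Add webapp source and kustomization" || return 1
  git push origin main || return 1
  # Optional: pull the new commit now rather than waiting for the interval.
  flux reconcile kustomization flux-system --with-source
}
```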

Testing the Enforcement

Trust but verify. I always run a ‘Chaos Test’ to prove the system works. Manually scale your deployment to 10 replicas using the CLI:

kubectl scale deployment my-webapp --replicas=10 -n production

Now, watch the Flux logs or wait for the reconciliation interval. You will see Flux detect that the cluster state (10 replicas) violates the Git state (3 replicas). It will automatically scale the pods back down. That is the moment you realize you can finally trust your infrastructure.
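
If you would rather script the check than stare at logs, a polling loop is enough. This sketch assumes the Deployment is named my-webapp in the production namespace and that the manifest in Git sets replicas to 3, as in the example above; wait_for_replicas is a hypothetical helper name.

```shell
# wait_for_replicas is a hypothetical helper. It polls the Deployment's
# spec.replicas until it equals the expected value or the attempts run out.
# Arguments: expected replicas, max attempts (default 30), delay in seconds
# between attempts (default 10).
wait_for_replicas() {
  expected="$1"
  attempts="${2:-30}"
  delay="${3:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    current=$(kubectl get deployment my-webapp -n production \
      -o jsonpath='{.spec.replicas}')
    if [ "$current" = "$expected" ]; then
      echo "replicas back to $expected"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "still at ${current:-unknown} replicas" >&2
  return 1
}
```

Run wait_for_replicas 3 right after the kubectl scale command. With a 5m Kustomization interval, the default of 30 attempts at 10 seconds each covers one full reconciliation cycle.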

Alerting: Catching Errors Early

Production GitOps requires visibility. You shouldn’t have to poll the CLI to see if a sync failed. Flux integrates with Slack or Microsoft Teams to alert you the moment a developer pushes a broken YAML file.

flux create alert-provider slack-notifier \
  --type=slack \
  --channel=ops-alerts \
  --address=https://hooks.slack.com/services/XXXXX \
  --export > ./clusters/my-cluster/slack-provider.yaml
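
A provider only defines where messages go. To actually send anything, Flux also needs an Alert resource that references the provider and selects which events to forward. A minimal sketch, in which the alert name webapp-alert and the wrapper function export_slack_alert are hypothetical, and the event sources are the two resources created earlier:

```shell
# The Provider defines the destination; the Alert decides which events are
# sent there. export_slack_alert is a hypothetical wrapper name.
export_slack_alert() {
  flux create alert webapp-alert \
    --provider-ref=slack-notifier \
    --event-severity=error \
    --event-source=Kustomization/webapp-deploy \
    --event-source=GitRepository/webapp-repo \
    --export
}

# Usage:
#   export_slack_alert > ./clusters/my-cluster/slack-alert.yaml
```

Because a Slack webhook URL is effectively a credential, consider storing it in a Kubernetes Secret and passing --secret-ref to the provider instead of committing the address to Git.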

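We catch ‘Broken Git States’ during business hours now. Once an Alert resource routes reconciliation errors to this provider, an invalid manifest triggers a Slack notification as soon as Flux fails to reconcile it, usually within one polling interval of the merge and long before it becomes a 2 AM emergency.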

The Bottom Line

GitOps with FluxCD turns your cluster into a mirror of your code. It removes the ‘human element’ from the deployment phase, forcing every change through a documented, peer-reviewed Pull Request. It might feel rigid at first. However, the 99.9% configuration consistency and the ability to audit every single infrastructure change make it an indispensable standard for modern DevOps teams.