Scaling Kubernetes Traffic: Deploying Kong for JWT, Rate Limiting, and Observability

Choosing the Right Gateway Strategy for Kubernetes

Scaling from five microservices to fifty is where the real pain starts. While managing a few services is a breeze, baking authentication and traffic logic into every single app creates a massive maintenance debt. I’ve watched teams burn weeks trying to sync rate-limiting policies across different languages, only to realize they need a centralized way to handle the edge.

When mapping out your Kubernetes traffic strategy, you usually run into three distinct options:

Standard Ingress Controllers (like Nginx): These are your workhorses for basic L7 routing and SSL. They fall short when you need advanced logic like consumer-tiering or complex API keys.
Service Mesh (like Istio): This is the heavy artillery. It’s perfect for internal service-to-service security. However, the operational cost and sidecar overhead are often too high for teams just starting out.
API Gateways (like Kong): This is the sweet spot. Kong manages North-South traffic (users to services) with a massive library of over 100 plugins ready to go.

In practice, Kong gains an edge because it balances raw performance with a huge ecosystem. I’ve deployed this in production environments where sub-millisecond latency was non-negotiable. The Kubernetes Ingress Controller (KIC) model makes this setup feel like a native part of the cluster rather than a bolted-on extra.

The Trade-offs: Why Kong?

Why it works

Plugin Ecosystem: You can toggle JWT, OAuth2, and CORS via YAML. No more writing repetitive boilerplate code in your Go or Java services.
High Throughput: Built on Nginx and OpenResty, Kong handles thousands of concurrent requests with minimal CPU jitter.
GitOps Ready: Configuration lives in Custom Resource Definitions (CRDs). This means your gateway settings are versioned right alongside your application code.

The Reality Check

Overkill for Simple Apps: If you only have two services and no need for auth at the edge, a basic Nginx controller is simpler.
State Management: While Kong traditionally needs Postgres, the DB-less mode is the way to go for K8s, though it requires a different mindset for configuration updates.

The Recommended Setup: DB-less and CRD-driven

For modern Kubernetes environments, I always recommend DB-less mode. Instead of maintaining a separate Postgres instance, we treat Kubernetes as the source of truth: your gateway settings are stored as Kubernetes resources in etcd, and the Kong Ingress Controller watches for changes and pushes the resulting configuration directly into Kong’s memory.

This architecture is lean and resilient. We’ll focus on the three pillars of a solid API: Security (JWT), Traffic Control (Rate Limiting), and Visibility (Prometheus/Grafana).
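To make the watch-and-push idea concrete, here is a deliberately simplified Python sketch of the translation step: it turns a flattened view of Ingress rules into a single Kong declarative config document, the kind of payload that DB-less Kong loads in one shot through its Admin API /config endpoint. This is not the controller’s actual code. The function name, the input dictionaries, and the in-cluster service URL format are assumptions made for illustration, and the real controller resolves backends differently.

```python
import json


def to_declarative_config(routes, plugins):
    """Build one Kong declarative config from simplified Ingress rules.

    `routes` and `plugins` are hypothetical, hand-flattened inputs; the real
    controller reads Ingress, Service, and KongPlugin objects from the API server.
    """
    services = {}
    for rule in routes:
        name = f'{rule["namespace"]}.{rule["service"]}.{rule["port"]}'
        service = services.setdefault(name, {
            "name": name,
            # Assumed in-cluster DNS form, used here only for illustration
            "url": f'http://{rule["service"]}.{rule["namespace"]}.svc:{rule["port"]}',
            "routes": [],
        })
        service["routes"].append({
            "name": f'{rule["namespace"]}.{rule["ingress"]}',
            "paths": [rule["path"]],
            # Plugins named in the Ingress annotation are nested under the route
            "plugins": [plugins[p] for p in rule.get("plugins", [])],
        })
    # The whole desired state is emitted as a single document
    return {"_format_version": "3.0", "services": list(services.values())}


if __name__ == "__main__":
    demo_routes = [{
        "namespace": "default", "ingress": "echo-ingress",
        "service": "echo-service", "port": 80, "path": "/echo",
        "plugins": ["global-rate-limit"],
    }]
    demo_plugins = {
        "global-rate-limit": {"name": "rate-limiting",
                              "config": {"minute": 5, "policy": "local"}},
    }
    print(json.dumps(to_declarative_config(demo_routes, demo_plugins), indent=2))
```

The point of the sketch is the shape of the workflow: every change to a resource produces a complete new configuration document rather than an incremental database write, which is the mindset shift DB-less mode asks for.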

Implementation Guide: Setting Up Kong

1. Installation via Helm

Helm is the standard here because it manages the messy RBAC and service definitions for us. Let’s get the gateway running in a dedicated namespace.

helm repo add kong https://charts.konghq.com
helm repo update

# Create a namespace for API management
kubectl create namespace kong

# Install Kong in DB-less mode
helm install kong kong/kong -n kong \
  --set ingressController.enabled=true \
  --set env.database=off

2. Deploying a Test Service

We need a target. This simple echo service will act as our internal microservice to verify that traffic is flowing correctly.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: echo
  template:
    metadata:
      labels:
        app: echo
    spec:
      containers:
      - name: echo
        image: ealen/echo-server
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: echo-service
spec:
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: echo

3. Protecting the Backend with Rate Limiting

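A quick win for production stability is blunting brute-force attacks and runaway scripts. We’ll define a KongPlugin that caps requests at 5 per minute for this demo; in production you would size the limit to your backend’s capacity, for example 1,000 per second. Two details matter here. First, a KongPlugin is namespaced and must live in the same namespace as the Ingress that references it. Second, with policy: local each Kong pod keeps its own counters, so the effective limit grows with the number of gateway replicas.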

apiVersion: configuration.konghq.com/v1
kind: KongPlugin
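metadata:
  # Created in the same namespace as the Ingress below (default here);
  # a KongPlugin in another namespace is not found by the annotation.
  name: global-rate-limit
config:
  minute: 5
  policy: local # counters are kept per Kong pod
plugin: rate-limiting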
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: echo-ingress
  annotations:
    konghq.com/plugins: global-rate-limit
spec:
  ingressClassName: kong
  rules:
  - http:
      paths:
      - path: /echo
        pathType: ImplementationSpecific
        backend:
          service:
            name: echo-service
            port:
              number: 80
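It helps to see what a per-pod counter means in practice. The Python sketch below is a toy model, not Kong’s implementation: it assumes a simple fixed one-minute window per client and round-robin load balancing across gateway pods. The class and function names and the client address are hypothetical.

```python
from collections import defaultdict


class LocalFixedWindowLimiter:
    """Toy model of one gateway pod counting requests per client per minute."""

    def __init__(self, limit_per_minute):
        self.limit = limit_per_minute
        # (client, minute window index) -> requests seen in that window
        self.counts = defaultdict(int)

    def allow(self, client, now_seconds):
        key = (client, int(now_seconds // 60))
        if self.counts[key] >= self.limit:
            return False  # the gateway would answer 429 here
        self.counts[key] += 1
        return True


def simulate(pods, requests, limit=5):
    """Spread requests from one client round-robin over `pods` within one
    minute and return how many are allowed in total."""
    limiters = [LocalFixedWindowLimiter(limit) for _ in range(pods)]
    return sum(
        limiters[i % pods].allow("10.0.0.1", now_seconds=i * 0.1)
        for i in range(requests)
    )


if __name__ == "__main__":
    print(simulate(pods=1, requests=30))  # 5
    print(simulate(pods=3, requests=30))  # 15
```

With one pod the client gets exactly 5 requests through; with three pods the same client gets 15, because no pod knows about the others’ counters. If you need a strict cluster-wide limit in DB-less mode, a shared store such as the plugin’s redis policy is the usual answer.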

4. Offloading JWT Authentication

Moving auth logic to the gateway keeps your services clean. First, enable the JWT plugin. Then create a Consumer, the entity Kong identifies as the caller of your API, and attach a JWT credential to it through a Secret.

apiVersion: configuration.konghq.com/v1
kind: KongPlugin
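metadata:
  # Same namespace as the Ingress that will reference it (default here)
  name: jwt-auth
plugin: jwt
---
apiVersion: configuration.konghq.com/v1
kind: KongConsumer
metadata:
  name: mobile-app-user
  namespace: kong
  annotations:
    # Needed so the Kong Ingress Controller picks up this consumer
    kubernetes.io/ingress.class: kong
username: mobile-app-user
credentials:
- mobile-app-jwt-secret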
---
apiVersion: v1
kind: Secret
metadata:
  name: mobile-app-jwt-secret
  namespace: kong
type: Opaque
stringData:
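  kongCredType: jwt # Newer controller releases prefer the label konghq.com/credential: jwt
  key: "issuer-77" # Matches the 'iss' claim in your JWT
  algorithm: HS256 # Kong's default for JWT credentials, stated explicitly
  secret: "keep-this-very-secret" # Demo value only; use a long random secret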

After updating your Ingress annotation to list both plugins (konghq.com/plugins: global-rate-limit,jwt-auth), Kong rejects any request lacking a valid token with a 401. Your backend never even sees the unauthorized traffic.
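To test the protected route you need a token that Kong will accept. The Python sketch below mints an HS256 JWT using only the standard library, with the iss claim set to the credential’s key (issuer-77) and the signature computed from the demo secret above. The function names and the five-minute expiry are assumptions for illustration; in a real system your identity provider issues the tokens.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_hs256_jwt(issuer: str, secret: str, ttl_seconds: int = 300) -> str:
    """Return a compact JWT signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    # 'iss' must equal the credential's key so Kong can look up the consumer
    payload = {"iss": issuer, "exp": int(time.time()) + ttl_seconds}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{b64url(signature)}"


if __name__ == "__main__":
    token = make_hs256_jwt("issuer-77", "keep-this-very-secret")
    # Send this header with a request to the /echo route on the Kong proxy
    print(f"Authorization: Bearer {token}")
```

A request carrying this header should pass the jwt plugin, while the same request without it, or with a token signed by a different secret, should be rejected at the gateway.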

5. Visibility with Prometheus

Observability isn’t optional for production traffic. I always link Kong to Prometheus to track latencies and 4xx/5xx error rates. Kong provides a native plugin that exposes these metrics instantly.

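apiVersion: configuration.konghq.com/v1
kind: KongClusterPlugin
metadata:
  name: prometheus-metrics
  annotations:
    kubernetes.io/ingress.class: kong
  labels:
    global: "true" # apply to all traffic instead of a single Ingress
plugin: prometheus
config:
  per_consumer: true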

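Once this is applied, Kong records per-request metrics and serves them in Prometheus format at /metrics. Keep in mind that the plugin only records traffic it is attached to, so declare it globally or reference it from your Ingress annotations. With the Helm chart’s defaults the endpoint lives on each Kong pod’s status listener (port 8100); port 8001 is the Admin API, which is not normally exposed in this setup. Point your Prometheus scraper at the status port. For a shortcut to professional monitoring, import Grafana Dashboard ID 7424 to see your traffic volume and health in near real time.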
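Before wiring up dashboards, it can be useful to sanity-check what the endpoint returns. The Python sketch below parses Prometheus text output and computes the share of 5xx responses. The sample text is invented for illustration, and the metric name kong_http_requests_total with a code label is an assumption based on recent Kong 3.x releases; older versions expose different names, so check your own /metrics output first.

```python
import re
from collections import Counter

# Invented sample in Prometheus text format; real output has more labels and series
SAMPLE = """\
# TYPE kong_http_requests_total counter
kong_http_requests_total{service="echo",code="200"} 90
kong_http_requests_total{service="echo",code="429"} 8
kong_http_requests_total{service="echo",code="502"} 2
"""

SERIES = re.compile(r'^kong_http_requests_total\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def status_class_totals(text):
    """Sum request counters by status class (2xx, 4xx, 5xx, ...)."""
    totals = Counter()
    for line in text.splitlines():
        match = SERIES.match(line.strip())
        if not match:
            continue  # comments, blank lines, and other metrics are skipped
        labels = dict(LABEL.findall(match.group("labels")))
        code = labels.get("code", "")
        if code:
            totals[code[0] + "xx"] += float(match.group("value"))
    return totals


def error_ratio(totals, status_class="5xx"):
    """Fraction of all counted requests that fall in `status_class`."""
    total = sum(totals.values())
    return totals[status_class] / total if total else 0.0


if __name__ == "__main__":
    totals = status_class_totals(SAMPLE)
    print(dict(totals))
    print(f"5xx ratio: {error_ratio(totals):.2%}")
```

In a real setup Prometheus does this aggregation for you with a query over the same counter; the script is only a quick way to confirm that the labels you plan to alert on are actually present.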

Final Thoughts on Scaling

I used to worry about adding another hop in the network path. However, the benefits of centralizing security and traffic control far outweigh the tiny latency hit. By going DB-less, you remove the risk of database connection bottlenecks and keep your gateway truly ephemeral.

If you eventually need OIDC or enterprise-grade support, the jump from this setup is simple. For most engineering teams, this CRD-based approach offers the perfect balance of control without the operational nightmare of a full service mesh.