keda
Overview
- Namespace: keda
- Purpose: Kubernetes Event-Driven Autoscaling - PRODUCTION
- Age: ~213 days (since May 2024)
- Status: Active - Event-driven autoscaling platform
- Workloads: 3 deployments (all active)
- Environment: PRODUCTION - Enables queue-based scaling
Architecture
KEDA (Kubernetes Event-Driven Autoscaling) for event-driven workload scaling:
- KEDA Operator: Core operator for ScaledObject management (1 replica)
- Admission Webhooks: Validation for ScaledObject creation (1 replica)
- Metrics API Server: Exposes external metrics to the Kubernetes HPA for scaling (1 replica)
Auto-Scaling Configuration
Not Applicable:
- KEDA provides autoscaling for other workloads; its own components are not auto-scaled
- Each component runs at a fixed 1 replica
- Core infrastructure component
Workload Categories
KEDA Operator (1 deployment)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| keda-operator | 1/1 | Running | Event-driven autoscaling operator |
Admission Webhooks (1 deployment)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| keda-admission-webhooks | 1/1 | Running | Webhook validation for ScaledObjects |
Metrics API Server (1 deployment)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| keda-operator-metrics-apiserver | 1/1 | Running | External metrics provider |
Services
| Name | Type | Cluster IP | Ports | Purpose |
|---|---|---|---|---|
| keda-operator | ClusterIP | 10.8.30.55 | 9666 | Operator metrics |
| keda-admission-webhooks | ClusterIP | 10.8.19.108 | 443 | Webhook service |
| keda-operator-metrics-apiserver | ClusterIP | 10.8.27.114 | 443, 8080 | External metrics API |
Access & Management
View all resources:
kubectl get all -n keda
kubectl get scaledobjects --all-namespaces | head -20
Check operator status:
# View KEDA deployments
kubectl get deployments -n keda
# View KEDA logs
kubectl logs -f deployment/keda-operator -n keda
kubectl logs -f deployment/keda-admission-webhooks -n keda
# Check for errors
kubectl logs deployment/keda-operator -n keda --tail=100 | grep -i "error"
Monitor scaling:
# List all ScaledObjects
kubectl get scaledobjects --all-namespaces
# Watch ScaledObject activity
kubectl get scaledobjects -A -w
# Check ScaledObject details
kubectl describe scaledobject <name> -n <namespace>
Restart services:
# Restart KEDA operator
kubectl rollout restart deployment/keda-operator -n keda
# Restart all KEDA components
kubectl rollout restart deployment --all -n keda
Monitoring
Operator metrics:
kubectl top pods -n keda
# Forward the operator service port for local inspection (blocks until interrupted)
kubectl port-forward -n keda service/keda-operator 9666:9666
Metrics server:
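# Forward the metrics API server port (blocks until interrupted; use a separate terminal)
kubectl port-forward -n keda service/keda-operator-metrics-apiserver 8080:8080
# List the external metrics API resources served by KEDA
kubectl get --raw /apis/external.metrics.k8s.io/v1beta1 | jq .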
Events:
kubectl get events -n keda --sort-by='.lastTimestamp' | head -20
Scaling Sources
KEDA supports scaling based on:
- Message Queues: RabbitMQ, Kafka, Azure Service Bus
- Databases: PostgreSQL, MySQL, Redis
- Cloud Services: AWS SQS, Azure Queue Storage
- Metrics: Prometheus, Datadog, New Relic
- HTTP: Custom webhooks and HTTP endpoints
- Schedules: Cron-based time windows
- External: Custom scalers
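To illustrate how one of these sources is attached to a workload, the sketch below prints a minimal ScaledObject that uses the RabbitMQ scaler. The function name keda_example_scaledobject, the Deployment orders-consumer, the queue orders, and the TriggerAuthentication rabbitmq-auth are hypothetical placeholders and are not taken from this cluster.

```shell
# Print a minimal ScaledObject manifest for a RabbitMQ queue trigger.
# All names below (orders-consumer, orders, rabbitmq-auth) are hypothetical.
keda_example_scaledobject() {
  cat <<'EOF'
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: orders-consumer
  namespace: default
spec:
  scaleTargetRef:
    name: orders-consumer        # Deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: rabbitmq
      metadata:
        queueName: orders
        mode: QueueLength        # scale on queue depth
        value: "20"              # target messages per replica
      authenticationRef:
        name: rabbitmq-auth      # TriggerAuthentication holding the host
EOF
}
```

One way to try it without changing the cluster is a server-side dry run, which also passes through the admission webhook: keda_example_scaledobject | kubectl apply --dry-run=server -f -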
Data Flow
Event Source (Queue, Metrics, etc.)
↓
KEDA Operator (Monitoring)
↓
ScaledObject (Scaling configuration)
↓
Metrics API Server (Provides metrics)
↓
Kubernetes HPA (Scales deployment)
↓
Pods scale up/down based on events
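The last step of the flow above can be approximated with simple arithmetic. For a trigger whose target is an average value per replica, the HPA requests roughly ceil(metric / target) replicas, bounded by the configured minimum and maximum. The helper below is a simplified sketch: the function name is hypothetical, and it ignores the HPA tolerance band, stabilization windows, and KEDA's own activation step between 0 and 1 replicas.

```shell
# Approximate the replica count the HPA would request for one
# average-value trigger: ceil(metric / target), clamped to [min, max].
# Simplification: ignores HPA tolerance, stabilization windows, and
# KEDA's 0 <-> 1 activation step.
keda_desired_replicas() {
  metric=$1
  target=$2
  min=$3
  max=$4
  desired=$(( (metric + target - 1) / target ))   # integer ceiling
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}
```

For example, keda_desired_replicas 95 20 1 10 models a queue of 95 messages with a target of 20 messages per replica, bounded between 1 and 10 replicas.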
KEDA Workflow
1. Operator
- 1 replica (core component)
- Watches ScaledObjects across cluster
- Monitors event sources
- Creates/manages HPA resources
- Handles scaling logic
2. Admission Webhooks
- Validates ScaledObject YAML
- Prevents invalid configurations
- Webhook validation rules
- Ensures correct trigger syntax
3. Metrics API Server
- Exposes external metrics (external.metrics.k8s.io)
- Integrates with the Kubernetes metrics API aggregation layer
- Provides scaling metrics
- Enables native HPA integration
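The HPA can only read KEDA's metrics while the metrics server is registered and reachable through the Kubernetes API aggregation layer. KEDA registers under the external metrics API group, so a quick availability check might look like the sketch below. The function name is a hypothetical helper, not part of KEDA.

```shell
# Succeeds when the external metrics APIService reports Available=True.
keda_metrics_api_available() {
  status=$(kubectl get apiservice v1beta1.external.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}')
  [ "$status" = "True" ]
}
```

It can be used in a script as: keda_metrics_api_available || echo "KEDA metrics API is not available"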
Production Considerations
High Availability
Single Point of Failure:
- Operator: 1 replica (no HA)
- Webhooks: 1 replica (no HA)
- Metrics API: 1 replica (no HA)
- Failure of any single component can cause scaling delays
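To confirm which components currently lack redundancy, the desired replica count of each deployment can be read directly. The helper below is a sketch with a hypothetical function name; it assumes the deployments live in the keda namespace as listed in this document.

```shell
# Print the names of KEDA deployments whose desired replica count is below 2.
keda_single_replica_deployments() {
  kubectl get deployments -n keda \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.replicas}{"\n"}{end}' |
    awk '$2 < 2 { print $1 }'
}
```

Note that additional operator replicas typically act as standbys behind leader election, so they improve failover time rather than throughput.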
Recommendations
- Operator Resilience:
  - Current: 1 replica (acceptable for non-critical scaling)
  - Consider 2+ replicas if scaling criticality requires HA
  - Pod restart = brief scaling delay (not permanent)
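- Webhook Redundancy (Optional):
  - Current: 1 replica (acceptable for validation)
  - Running workloads are not affected if the webhook is briefly unavailable
  - While it is down, ScaledObject creates and updates are either rejected or admitted without validation, depending on the webhook's failurePolicy; they are not queued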
- Monitor ScaledObjects:
  - Verify ScaledObjects are active
  - Check scaling triggers are working
  - Monitor metrics collection
  - Alert on operator failures
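- Error Handling:
  - Configure the ScaledObject fallback section (failureThreshold, replicas) so workloads hold a safe replica count when a scaler fails; avoid a separate HPA on the same target, since it would conflict with the HPA that KEDA manages
  - Document scaling trigger failures
  - Have manual scaling procedures
  - Regular testing of scaling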
- Performance Tuning:
  - Adjust reconcile intervals
  - Configure concurrent scaling limits
  - Tune metrics polling frequency
  - Monitor API latency
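Polling frequency and failure behavior are set per ScaledObject through the pollingInterval, cooldownPeriod, and fallback fields. The sketch below builds a JSON merge patch for those fields. The function name is hypothetical and the numbers are illustrative only (30 and 300 seconds mirror KEDA's documented defaults), not tuned recommendations for this cluster. The fallback section applies only to triggers that report an average-value metric.

```shell
# Print a JSON merge patch that sets polling, cooldown, and fallback behavior.
# The numbers are illustrative, not recommendations for this cluster.
keda_tuning_patch() {
  polling=${1:-30}      # seconds between trigger checks
  cooldown=${2:-300}    # seconds to wait before scaling back to zero
  fallback=${3:-2}      # replicas to hold when the scaler keeps failing
  printf '{"spec":{"pollingInterval":%d,"cooldownPeriod":%d,"fallback":{"failureThreshold":3,"replicas":%d}}}\n' \
    "$polling" "$cooldown" "$fallback"
}
```

The output can be applied with: kubectl patch scaledobject <name> -n <namespace> --type merge -p "$(keda_tuning_patch 15 120 2)"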
Troubleshooting
Operator issues:
# Check operator logs
kubectl logs -f deployment/keda-operator -n keda
# Check operator status
kubectl get deployments -n keda -o wide
# Check for errors
kubectl logs deployment/keda-operator -n keda --tail=50 | grep -i "error\|fail\|warn"
# Restart operator
kubectl rollout restart deployment/keda-operator -n keda
ScaledObject issues:
# List all ScaledObjects
kubectl get scaledobjects -A
# Check specific ScaledObject
kubectl describe scaledobject <name> -n <namespace>
# Check ScaledObject events
kubectl describe scaledobject <name> -n <namespace> | grep -A 20 "Events:"
# Check generated HPA
kubectl get hpa -n <namespace> | grep keda
Metrics collection issues:
# Check if metrics are available (KEDA serves the external metrics API)
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1" | jq '.resources[] | .name'
# Verify metrics endpoint (port-forward blocks, so run curl from a second terminal)
kubectl port-forward -n keda service/keda-operator-metrics-apiserver 8080:8080
curl http://localhost:8080/metrics
# Check scaler logs
kubectl logs deployment/keda-operator -n keda | grep -i "scaler\|trigger\|metric"
Webhook issues:
# Check webhook service
kubectl get svc -n keda keda-admission-webhooks
# Check webhook logs
kubectl logs deployment/keda-admission-webhooks -n keda --tail=50
# Verify webhook is active
kubectl get validatingwebhookconfigurations | grep keda
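What happens to ScaledObject changes while the webhook pod is down depends on the failurePolicy of its webhook configuration (Ignore admits requests unvalidated, Fail rejects them). The sketch below lists that setting for any configuration whose name contains keda. The function name is hypothetical, and matching on the name is an assumption about how the configuration is labeled in this cluster.

```shell
# List KEDA validating webhook configurations with their failurePolicy values.
keda_webhook_failure_policy() {
  kubectl get validatingwebhookconfigurations \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.webhooks[*].failurePolicy}{"\n"}{end}' |
    grep -i keda
}
```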
Performance Metrics
Current Scale
- Operator: 1 replica (core component)
- Webhooks: 1 replica (validation)
- Metrics API: 1 replica (metrics provider)
- Total Active Pods: 3 pods
Stability
- KEDA Age: ~213 days (mature)
- Deployment Status: All healthy
- Pod Restarts: Check for recent restarts
- ScaledObjects: Monitor active count
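The restart check above can be sketched as a small helper that reports only pods with a non-zero restart count. The function name is hypothetical, and the sketch assumes each KEDA pod runs a single container, so only the first container status is read.

```shell
# Print pods in the keda namespace that have restarted at least once.
# Assumes one container per pod (reads containerStatuses[0] only).
keda_restarted_pods() {
  kubectl get pods -n keda \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.containerStatuses[0].restartCount}{"\n"}{end}' |
    awk '$2 > 0 { print $1 ": " $2 " restarts" }'
}
```

An empty result means no pod in the namespace has restarted since it was created.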
Architecture Notes
- Event-Driven: Enables queue-based scaling (Kafka, RabbitMQ, etc.)
- External Metrics: Extends the Kubernetes metrics system through the external metrics API
- Extensible: Pluggable scalers for different event sources
- Standard: Uses native Kubernetes HPA under the hood
- Critical Role: Powers all queue-based autoscaling