etcd
Overview
- Namespace: etcd
- Purpose: Key-Value Store for APISIX Configuration - PRODUCTION
- Age: ~567 days (running since November 2023 or earlier)
- Status: Active - Distributed key-value store
- Workloads: 1 StatefulSet (1 replica)
- Environment: PRODUCTION - Configuration storage for APISIX
Architecture
etcd is the distributed key-value store that holds APISIX configuration and service discovery data:
- StatefulSet: Single etcd member (1 replica) - No clustering
- Persistent Storage: etcd state is written to a persistent volume
- Service: Both ClusterIP and headless service for discovery
Auto-Scaling Configuration
Not Applicable:
- No HPA is configured (an HPA can target a StatefulSet, but etcd membership should not be autoscaled)
- Single member (not clustered)
- Fixed 1 replica
Workload Categories
Distributed Key-Value Store (1 StatefulSet)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| apisix-etcd | 1/1 | Running | etcd key-value store (Single member) |
Services
| Name | Type | Cluster IP | Ports | Purpose |
|---|---|---|---|---|
| apisix-etcd | ClusterIP | 10.8.20.89 | 2379, 2380 | Client and peer communication |
| apisix-etcd-headless | ClusterIP | None (Headless) | 2379, 2380 | DNS discovery for StatefulSet |
Access & Management
View all resources:
kubectl get all -n etcd
kubectl get statefulset -n etcd
kubectl get pvc -n etcd
Check etcd pod:
# View etcd pod
kubectl get pods -n etcd
# View etcd logs
kubectl logs -f statefulset/apisix-etcd -n etcd
# Check etcd health
kubectl exec -it statefulset/apisix-etcd -n etcd -- etcdctl endpoint health
Access etcd CLI:
# Get shell access
kubectl exec -it statefulset/apisix-etcd -n etcd -- sh
# Check keys (from within pod)
etcdctl --endpoints=localhost:2379 get --prefix "/"
# Get APISIX routes
etcdctl --endpoints=localhost:2379 get --prefix "/apisix"
Monitor storage:
# Check persistent volume usage
kubectl get pvc -n etcd
# Check volume details
kubectl describe pvc -n etcd
Restart/Maintenance:
# Caution: restarting the only member makes etcd unavailable until the pod is Ready again; data is lost if no persistent volume is attached
kubectl rollout restart statefulset/apisix-etcd -n etcd
# Scale operations (not recommended for single-member)
# kubectl scale statefulset apisix-etcd --replicas=1 -n etcd
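Because a restart of the only member carries some risk, it can help to take a snapshot first. The following is a minimal sketch, not the deployment's established procedure. The function name `snapshot_then_restart`, the in-pod path `/tmp/snapshot.db`, and the 120s timeout are assumptions made for illustration. It also assumes that `etcdctl` needs no extra credentials inside the pod and that the image ships `tar`, which `kubectl cp` requires.

```shell
# Namespace and StatefulSet names as used in this document; override if needed.
NAMESPACE="${NAMESPACE:-etcd}"
STS="${STS:-apisix-etcd}"

# Hypothetical helper: snapshot etcd, copy the snapshot out, then restart.
# $1: local destination file for the snapshot
snapshot_then_restart() {
    dest="$1"
    pod="${STS}-0"   # StatefulSet pods are named <statefulset>-<ordinal>

    # Save a point-in-time snapshot inside the pod (path is an assumption)
    kubectl exec "$pod" -n "$NAMESPACE" -- etcdctl snapshot save /tmp/snapshot.db || return 1

    # Copy the snapshot to the local machine before touching the pod
    kubectl cp "${NAMESPACE}/${pod}:/tmp/snapshot.db" "$dest" || return 1

    # Restart only after the snapshot is safely copied, then wait for readiness
    kubectl rollout restart "statefulset/${STS}" -n "$NAMESPACE" || return 1
    kubectl rollout status "statefulset/${STS}" -n "$NAMESPACE" --timeout=120s
}
```

Usage: `snapshot_then_restart ./apisix-etcd-snapshot.db`. The restart is skipped if either the snapshot or the copy fails.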
Monitoring
Pod metrics:
kubectl top pods -n etcd
# Check pod resource requests/limits
kubectl describe pod -n etcd | grep -A 5 "Requests\|Limits"
Storage metrics:
# Check persistent volume usage
kubectl get pvc -n etcd -o wide
# Forward the client port, then read metrics from http://localhost:2379/metrics
kubectl port-forward -n etcd statefulset/apisix-etcd 2379:2379
Events:
kubectl get events -n etcd --sort-by='.lastTimestamp' | tail -20
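With the port-forward on 2379 running, individual values can be pulled out of the Prometheus text that etcd serves. The sketch below assumes the metrics endpoint is reachable over plain HTTP without authentication. `metric_value` is a hypothetical helper name, and it only handles metrics without labels.

```shell
# Print the value of an unlabelled Prometheus metric read from stdin.
# $1: exact metric name, for example etcd_server_has_leader
metric_value() {
    # Comment lines start with "#", so their first field never equals the name
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# Example against a live pod (assumes curl is installed locally):
#   curl -s http://localhost:2379/metrics | metric_value etcd_server_has_leader
#   curl -s http://localhost:2379/metrics | metric_value etcd_mvcc_db_total_size_in_bytes
```

A value of `1` for `etcd_server_has_leader` means the member currently sees a leader. The database size metric is useful to track alongside PVC capacity.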
Data Flow
APISIX Gateway
↓
etcd client API (port 2379)
↓
apisix-etcd pod (StatefulSet)
↓
Persistent Volume (data storage)
↓
Stored key-value data
etcd Workflow
1. Key-Value Storage
- Single member (no clustering)
- Configuration persistence
- Atomic transactions
- Watch mechanism for changes
- APISIX configuration storage
2. Service Discovery
- DNS discovery via headless service
- Peer communication on port 2380
- Client requests on port 2379
- Health checks and status
3. Data Persistence
- Persistent volume for state
- Automatic state recovery
- Data consistency guarantees
- Backup considerations
Production Considerations
High Availability
CRITICAL ISSUE - NO CLUSTERING:
- Single member (1 replica) - NO REDUNDANCY
- Single point of failure for APISIX configuration
- Pod restart = temporary configuration unavailability
- No quorum redundancy: the single member is always the leader, so there is no failover
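The lack of redundancy follows from how etcd's quorum works: a cluster of N members needs a majority (N/2 + 1, rounded down) to accept writes, so it tolerates N minus that many failures. The short sketch below works through the arithmetic. The function names are illustrative only.

```shell
# Majority needed for an etcd cluster of $1 members (integer division).
quorum() { echo $(( $1 / 2 + 1 )); }

# Number of members that can fail while a majority remains.
fault_tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 3 5; do
    echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
# members=1 quorum=1 tolerated_failures=0
# members=3 quorum=2 tolerated_failures=1
# members=5 quorum=3 tolerated_failures=2
```

A single member therefore tolerates zero failures, while three members tolerate one.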
Data Safety
PRODUCTION RISK:
- Single member means no fault tolerance
- Pod failure = configuration loss (if no persistent storage)
- No replication across nodes
- Backup/recovery dependent on persistent volume
Recommendations
- URGENT: Cluster etcd:
  - Current: Single member (not production-ready)
  - Recommended: 3-member cluster (good HA)
  - Provides fault tolerance and leader election
  - Prevents split-brain scenarios
- Persistent Storage:
  - Verify persistent volume is healthy
  - Monitor disk space usage
  - Check volume backup strategy
  - Ensure volume snapshots are configured
- Monitoring:
  - Monitor pod restart count
  - Check disk space availability
  - Monitor etcd commit latency
  - Alert on health check failures
- Backup Strategy:
  - Regular etcd snapshots
  - Test restore procedures
  - Store backups in safe location
  - Document recovery procedures
- Disaster Recovery:
  - Document single-member etcd limitations
  - Plan cluster upgrade path
  - Prepare restore procedures
  - Have backup keys/configuration
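As one possible starting point for the restart-count monitoring recommended above, the sketch below sums container restarts across the namespace. `total_restarts` is a hypothetical helper, not an existing tool, and a proper alerting setup would normally live in the monitoring stack rather than in a script.

```shell
# Sum the restart counts of all containers in a namespace (default: etcd).
total_restarts() {
    kubectl get pods -n "${1:-etcd}" \
        -o jsonpath='{range .items[*].status.containerStatuses[*]}{.restartCount}{"\n"}{end}' |
        awk '{ sum += $1 } END { print sum + 0 }'   # "+ 0" prints 0 when there are no pods
}

# Example: warn when any restart has been recorded
#   [ "$(total_restarts etcd)" -gt 0 ] && echo "etcd pod has restarted"
```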
Troubleshooting
Health checks:
# Check etcd health (from pod)
kubectl exec -it statefulset/apisix-etcd -n etcd -- etcdctl endpoint health
# Check cluster status (single-member)
kubectl exec -it statefulset/apisix-etcd -n etcd -- etcdctl member list
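# Check endpoint status (etcdctl has no "metrics" subcommand; Prometheus metrics are served at /metrics on the client port)
kubectl exec -it statefulset/apisix-etcd -n etcd -- etcdctl endpoint status --write-out=table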
Storage issues:
# Check persistent volume
kubectl get pvc -n etcd
kubectl describe pvc -n etcd
# Check disk usage (from pod)
kubectl exec -it statefulset/apisix-etcd -n etcd -- df -h
# Check data directory
kubectl exec -it statefulset/apisix-etcd -n etcd -- ls -la /bitnami/etcd/data/
Data integrity:
# Defragment etcd to reclaim space freed by compaction (the member is blocked while this runs)
kubectl exec -it statefulset/apisix-etcd -n etcd -- etcdctl defrag
# Check database size
kubectl exec -it statefulset/apisix-etcd -n etcd -- du -sh /bitnami/etcd/data/
# List the first 20 keys (keys only, no values)
kubectl exec statefulset/apisix-etcd -n etcd -- etcdctl get --prefix "/" --keys-only | head -20
Connection issues:
# Test connectivity from APISIX (the v2 /v2/keys/ API is disabled by default in etcd v3.4+, so query /version)
kubectl exec -it deployment/apisix -n apisix -- curl -v http://apisix-etcd.etcd:2379/version
# Test from external pod
kubectl exec -it <pod> -- curl -v http://apisix-etcd.etcd.svc.cluster.local:2379/health
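The `/health` response can also be checked in a script instead of being read by eye. This is a rough sketch that assumes the endpoint returns a JSON body containing `"health":"true"` for a healthy member, as etcd v3 does. `is_healthy` is an illustrative name.

```shell
# Succeeds when an etcd /health response body (read from stdin) reports healthy.
is_healthy() {
    grep -q '"health" *: *"true"'
}

# Example from a pod that can reach the service (assumes curl is available):
#   curl -s http://apisix-etcd.etcd.svc.cluster.local:2379/health | is_healthy \
#       && echo "etcd healthy" || echo "etcd unhealthy or unreachable"
```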
Performance Metrics
Current Scale
- etcd Members: 1 (single member, NOT CLUSTERED)
- Storage: Persistent volume (check PVC for size)
- Replicas: 1 (StatefulSet)
- Age: ~567 days
Stability
- StatefulSet Age: ~567 days (long-running; confirm stability from the restart count)
- Pod Restarts: Check recent restart count
- Data Persistence: Depends on PVC
- Critical Role: Stores all APISIX configuration
Architecture Notes
- Single Member: Current deployment has no redundancy
- StatefulSet: Maintains stable identity and persistent storage
- Headless Service: DNS discovery for peer communication
- etcd v3: Modern version with strong consistency guarantees
- PRODUCTION ISSUE: Single member not suitable for production HA