ecg
Overview
- Namespace: ecg
- Purpose: ECG (Electrocardiogram) Services - PRODUCTION
- Age: ~4 years 139 days
- Status: Active - ECG data processing and storage
- Workloads: 3 deployments (3 active) + 1 CronJob
- Environment: PRODUCTION - Medical ECG data management
Architecture
ECG data processing platform:
- API: REST API service (1 replica)
- Worker: Celery background worker (1 replica)
- Flower: Celery monitoring dashboard (1 replica)
- CronJob: Daily drivewatch task (1:00 AM daily)
Auto-Scaling Configuration
Not Applicable:
- All deployments at fixed 1 replica
- No HPA configured
- Manual scaling only
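Since scaling is manual, a change of replica count comes down to a single kubectl scale call. The wrapper below is a minimal sketch; the function name scale_ecg is hypothetical, and it only adds an input check before calling kubectl.

```shell
# Scale one deployment in the ecg namespace to a fixed replica count.
# Usage: scale_ecg <deployment> <replicas>
# scale_ecg is a hypothetical helper name, not part of the cluster tooling.
scale_ecg() {
  deployment=$1
  replicas=$2
  # Accept only non-negative integers so a typo never reaches the cluster.
  case $replicas in
    ''|*[!0-9]*)
      echo "replicas must be a non-negative integer: '$replicas'" >&2
      return 1
      ;;
  esac
  kubectl scale "deployment/$deployment" -n ecg --replicas="$replicas"
}
```

For example, scale_ecg api 2 would run kubectl scale deployment/api -n ecg --replicas=2.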
Workload Categories
API Service (1 deployment)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| api | 1/1 | Running | ECG API service |
Background Processing (2 deployments)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| worker | 1/1 | Running | Celery background worker |
| flower | 1/1 | Running | Celery task monitoring |
Scheduled Tasks (1 CronJob)
| Name | Schedule | Status | Purpose |
|---|---|---|---|
| drivewatch | 1 0 * * * | Active | Daily drive monitoring (as written, this schedule fires at 00:01; 1:00 AM would be 0 1 * * *) |
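Standard five-field cron order is minute, hour, day of month, month, day of week, so the schedule column and the "1:00 AM" description are worth cross-checking. The sketch below decodes a simple daily schedule into HH:MM. The function name cron_daily_time is hypothetical, and it handles only plain numeric minute and hour fields without leading zeros.

```shell
# Print the daily fire time (HH:MM) of a five-field cron expression.
# Usage: cron_daily_time '<minute> <hour> * * *'
# Hypothetical helper: rejects anything that is not a plain daily schedule.
cron_daily_time() {
  # read splits on whitespace without glob-expanding the "*" fields.
  read -r minute hour dom month dow <<EOF
$1
EOF
  for field in "$minute" "$hour"; do
    case $field in
      ''|*[!0-9]*)
        echo "unsupported schedule: $1" >&2
        return 1
        ;;
    esac
  done
  if [ "$dom $month $dow" != "* * *" ]; then
    echo "not a daily schedule: $1" >&2
    return 1
  fi
  printf '%02d:%02d\n' "$hour" "$minute"
}
```

cron_daily_time '1 0 * * *' prints 00:01, while cron_daily_time '0 1 * * *' prints 01:00. The time is interpreted in the kube-controller-manager's time zone unless the CronJob sets spec.timeZone.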
Services
| Name | Type | Cluster IP | Ports | Purpose |
|---|---|---|---|---|
| api | NodePort | 10.8.19.205 | 80:31658 | ECG API |
| flower | NodePort | 10.8.21.173 | 80:32745 | Celery monitoring dashboard |
Access & Management
View all resources:
kubectl get all -n ecg
kubectl get cronjobs -n ecg
Check API:
# View API pod
kubectl get pods -n ecg | grep "^api"
# View logs
kubectl logs -f deployment/api -n ecg
# Test API
kubectl port-forward -n ecg service/api 8080:80
# Access: http://localhost:8080
Check worker:
# View worker pod
kubectl get pods -n ecg | grep worker
# View logs
kubectl logs -f deployment/worker -n ecg
# Check Celery tasks via Flower
kubectl port-forward -n ecg service/flower 5555:80
# Access: http://localhost:5555
Check CronJob:
# View CronJob schedule
kubectl get cronjobs -n ecg
# View recent job runs
kubectl get jobs -n ecg --sort-by=.metadata.creationTimestamp | tail -5
# View job logs
kubectl logs job/drivewatch-<timestamp> -n ecg
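To avoid looking up the job suffix by hand, the most recent drivewatch job can be selected by creation time. This is a sketch with hypothetical function names; it assumes jobs created by the CronJob are named drivewatch-<suffix>, as the placeholder above suggests.

```shell
# Print the name of the newest job created by the drivewatch CronJob.
# Assumes scheduled jobs are named drivewatch-<suffix>; manually created
# jobs with other prefixes (e.g. manual-drivewatch-test) are skipped.
latest_drivewatch_job() {
  kubectl get jobs -n ecg --sort-by=.metadata.creationTimestamp -o name |
    grep '/drivewatch-' |
    tail -n 1
}

# Show the logs of that job, or fail if no job exists yet.
drivewatch_logs() {
  job=$(latest_drivewatch_job)
  if [ -z "$job" ]; then
    echo "no drivewatch job found in namespace ecg" >&2
    return 1
  fi
  kubectl logs "$job" -n ecg
}
```

Calling drivewatch_logs then prints the logs of the latest scheduled run.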
Restart services:
# Restart API
kubectl rollout restart deployment/api -n ecg
# Restart worker
kubectl rollout restart deployment/worker -n ecg
# Restart Flower
kubectl rollout restart deployment/flower -n ecg
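A restart returns as soon as the rollout is requested, not when the new pod is ready. The sketch below pairs the restart with kubectl rollout status so the caller knows when the service is back. The function name restart_and_wait and the 120s default timeout are assumptions.

```shell
# Restart a deployment in the ecg namespace and wait for the rollout.
# Usage: restart_and_wait <deployment> [timeout]
# restart_and_wait and the 120s default are hypothetical choices.
restart_and_wait() {
  deployment=$1
  timeout=${2:-120s}
  # Only wait for the rollout if the restart request itself succeeded.
  kubectl rollout restart "deployment/$deployment" -n ecg &&
    kubectl rollout status "deployment/$deployment" -n ecg --timeout="$timeout"
}
```

For example, restart_and_wait worker 60s returns non-zero if the worker is not ready within 60 seconds.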
Monitoring
Pod metrics:
kubectl top pods -n ecg
# Sort by resource usage
kubectl top pods -n ecg --sort-by=memory
CronJob monitoring:
# Check recent job executions
kubectl get jobs -n ecg
# Check for jobs with no successful completions (failed or still running)
kubectl get jobs -n ecg --field-selector status.successful=0
# View job events
kubectl get events -n ecg --sort-by='.lastTimestamp' | grep -i "job"
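For alerting, the job list can be reduced to the names of jobs whose completions are short of the target. The filter below is a sketch that reads the table output of kubectl get jobs from standard input. The function name incomplete_jobs is hypothetical, and it locates the COMPLETIONS column by header because the column layout differs between kubectl versions.

```shell
# Print the names of jobs whose COMPLETIONS value (e.g. 0/1) is not full.
# Reads `kubectl get jobs` table output on stdin.
# incomplete_jobs is a hypothetical helper name.
incomplete_jobs() {
  awk '
    # Header row: remember which column holds COMPLETIONS.
    NR == 1 {
      for (i = 1; i <= NF; i++) if ($i == "COMPLETIONS") col = i
      next
    }
    # Data rows: compare succeeded count with the desired count.
    col {
      split($col, c, "/")
      if (c[1] != c[2]) print $1
    }
  '
}
```

Usage: kubectl get jobs -n ecg | incomplete_jobs. Jobs that are still running are listed as well as failed ones.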
Events:
# Events are sorted oldest first, so the most recent ones are at the end
kubectl get events -n ecg --sort-by='.lastTimestamp' | tail -20
Data Flow
External ECG Devices/Systems
↓
API (NodePort 31658)
↓
Worker (Celery tasks)
↓
ECG Data Processing
↓
Storage/Database
↓
CronJob (drivewatch - daily at 1 AM)
ECG Workflow
1. API Service
- api (1 replica)
- Receives ECG data
- REST endpoints
- NodePort access (31658)
- Data validation and storage
2. Background Processing
- worker (1 replica)
- Celery async task processing
- ECG data analysis
- Report generation
- File processing
3. Task Monitoring
- flower (1 replica)
- Celery task dashboard
- Task queue monitoring
- Worker health checks
- NodePort access (32745)
4. Scheduled Maintenance
- drivewatch CronJob
- Runs daily at 1:00 AM
- Storage monitoring
- Data cleanup/archival
- Last run: 12h ago (successful)
Production Considerations
High Availability
Single Point of Failure:
- All deployments: 1 replica (No HA)
- API restart = temporary unavailability
- Worker restart = task queue delay
- No redundancy
Recommendations
- High Availability:
  - Current: All at 1 replica (acceptable for medical data processing)
  - Consider 2+ replicas for API if uptime is critical
  - Worker can remain at 1 if task queue is not time-sensitive
  - Flower at 1 is acceptable (monitoring only)
- CronJob Monitoring:
  - drivewatch runs daily at 1 AM
  - Monitor for failed executions
  - Alert on job failures
  - Review job logs regularly
- Data Backup:
  - ECG data is medical data (sensitive)
  - Implement regular backups
  - Test restore procedures
  - Comply with medical data regulations
- Worker Scaling:
  - Consider HPA for worker based on queue depth
  - Monitor Celery queue length
  - Scale workers during peak processing
  - Use KEDA for queue-based scaling
- API Performance:
  - Monitor API response times
  - Check for bottlenecks
  - Consider caching
  - Optimize database queries
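For the Worker Scaling recommendation, queue-based scaling boils down to dividing the queue depth by a per-worker target and clamping the result. The sketch below shows only that arithmetic. The function name, the per-worker target, and the bounds are hypothetical, and how the queue depth is read depends on the Celery broker, which is not stated here.

```shell
# Compute a worker replica count from the Celery queue depth.
# Usage: desired_workers <queue_depth> <tasks_per_worker> <min> <max>
# All four inputs are illustrative; none come from the ecg configuration.
desired_workers() {
  queue_depth=$1
  per_worker=$2
  min=$3
  max=$4
  # Ceiling division: round up so a partial batch still gets a worker.
  want=$(( (queue_depth + per_worker - 1) / per_worker ))
  if [ "$want" -lt "$min" ]; then want=$min; fi
  if [ "$want" -gt "$max" ]; then want=$max; fi
  echo "$want"
}
```

With a target of 20 tasks per worker and bounds of 1 to 5, desired_workers 45 20 1 5 prints 3, an empty queue gives 1, and a queue of 500 is capped at 5.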
Troubleshooting
API issues:
# Check API pod
kubectl get pods -n ecg | grep "^api"
# Check logs
kubectl logs -f deployment/api -n ecg
# Check for errors
kubectl logs deployment/api -n ecg --tail=100 | grep -i "error\|fail"
# Test API (port-forward blocks, so run curl from a second terminal)
kubectl port-forward -n ecg service/api 8080:80
curl http://localhost:8080/health
Worker not processing tasks:
# Check worker pod
kubectl get pods -n ecg | grep worker
# Check worker logs
kubectl logs -f deployment/worker -n ecg
# Check Celery via Flower
kubectl port-forward -n ecg service/flower 5555:80
# Access: http://localhost:5555
# Restart worker
kubectl rollout restart deployment/worker -n ecg
CronJob failures:
# Check recent jobs
kubectl get jobs -n ecg
# List jobs that are not fully complete (failed or still running); the header row is also printed
kubectl get jobs -n ecg | grep -v "1/1"
# View job logs
kubectl logs job/drivewatch-<timestamp> -n ecg
# Check CronJob configuration
kubectl describe cronjob drivewatch -n ecg
# Manually trigger job (for testing)
kubectl create job --from=cronjob/drivewatch manual-drivewatch-test -n ecg
Flower dashboard not accessible:
# Check Flower pod
kubectl get pods -n ecg | grep flower
# Check logs
kubectl logs -f deployment/flower -n ecg
# Port forward
kubectl port-forward -n ecg service/flower 5555:80
# Restart Flower
kubectl rollout restart deployment/flower -n ecg
Performance Metrics
Current Scale
- API: 1 replica (85 days old)
- Worker: 1 replica (82 days old, 2 restarts)
- Flower: 1 replica (313 days old - very stable)
- CronJob: Daily at 1:00 AM (last run 12h ago)
Stability
- API Age: 85 days (stable)
- Worker Age: 82 days (stable, 2 restarts)
- Flower Age: 313 days (very stable)
- CronJob: Running successfully (last successful: 3y179d)
Architecture Notes
- Medical Data: ECG processing (sensitive data)
- Celery-Based: Async task processing with Flower monitoring
- NodePort Access: API (31658) and Flower (32745)
- Scheduled Tasks: Daily drivewatch at 1 AM
- No HA: All at 1 replica
- Stable: Long-running pods with minimal restarts