ecg

Overview

  • Namespace: ecg
  • Purpose: ECG (Electrocardiogram) Services - PRODUCTION
  • Age: ~4 years 139 days
  • Status: Active - ECG data processing and storage
  • Workloads: 3 deployments (3 active) + 1 CronJob
  • Environment: PRODUCTION - Medical ECG data management

Architecture

ECG data processing platform:

  • API: REST API service (1 replica)
  • Worker: Celery background worker (1 replica)
  • Flower: Celery monitoring dashboard (1 replica)
  • CronJob: Daily drivewatch task (schedule 1 0 * * *, i.e. 00:01 daily)

Auto-Scaling Configuration

Not Applicable:

  • All deployments at fixed 1 replica
  • No HPA configured
  • Manual scaling only
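Because no HPA is configured, any replica change has to be made by hand. A minimal sketch of such a manual scale operation follows; `scale_ecg` is a hypothetical helper name, not something that exists in the namespace.

```shell
# Hypothetical helper: manually scale one deployment in the ecg namespace.
# Usage: scale_ecg <deployment> <replicas>
scale_ecg() {
  deployment="$1"
  replicas="$2"
  # Reject anything that is not a non-negative integer before calling kubectl.
  case "$replicas" in
    ''|*[!0-9]*)
      echo "replicas must be a non-negative integer" >&2
      return 1
      ;;
  esac
  kubectl scale "deployment/$deployment" --replicas="$replicas" -n ecg
}
```

For example, `scale_ecg api 2` would run `kubectl scale deployment/api --replicas=2 -n ecg`.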

Workload Categories

API Service (1 deployment)

| Name | Replicas | Status | Purpose |
| --- | --- | --- | --- |
| api | 1/1 | Running | ECG API service |

Background Processing (2 deployments)

| Name | Replicas | Status | Purpose |
| --- | --- | --- | --- |
| worker | 1/1 | Running | Celery background worker |
| flower | 1/1 | Running | Celery task monitoring |

Scheduled Tasks (1 CronJob)

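| Name | Schedule | Status | Purpose |
| --- | --- | --- | --- |
| drivewatch | 1 0 * * * | Active | Daily drive monitoring (fires at 00:01) |

Note: cron fields are ordered minute, then hour, so 1 0 * * * fires at 00:01 each day, not at 1:00 AM (which would be 0 1 * * *).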
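Kubernetes CronJob schedules use the standard five-field cron layout. As a reading aid, the sketch below splits an expression into labeled fields; `describe_cron` is a hypothetical helper written for this page and does no validation beyond counting fields.

```shell
# Split a standard five-field cron expression into labeled fields.
# Usage: describe_cron "<minute> <hour> <day-of-month> <month> <day-of-week>"
describe_cron() {
  # Disable globbing so "*" fields are not expanded to file names.
  set -f
  set -- $1
  set +f
  if [ "$#" -ne 5 ]; then
    echo "expected 5 fields, got $#" >&2
    return 1
  fi
  echo "minute=$1 hour=$2 day-of-month=$3 month=$4 day-of-week=$5"
}
```

Calling `describe_cron "1 0 * * *"` prints `minute=1 hour=0 day-of-month=* month=* day-of-week=*`. Unless the CronJob sets a time zone, the schedule is interpreted in the time zone of the cluster's controller manager.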

Services

| Name | Type | Cluster IP | Ports | Purpose |
| --- | --- | --- | --- | --- |
| api | NodePort | 10.8.19.205 | 80:31658 | ECG API |
| flower | NodePort | 10.8.21.173 | 80:32745 | Celery monitoring dashboard |
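Both services are exposed as NodePort, so they should be reachable on any node's IP at the listed node port. The sketch below builds such a URL. It assumes the first node's InternalIP is routable from where the command runs and that the service speaks plain HTTP; `nodeport_url` is a hypothetical helper name.

```shell
# Hypothetical helper: print the NodePort URL for a service in the ecg namespace.
# Assumes the first node's InternalIP is reachable and the service uses plain HTTP.
nodeport_url() {
  service="$1"
  node_ip=$(kubectl get nodes \
    -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}')
  node_port=$(kubectl get service "$service" -n ecg \
    -o jsonpath='{.spec.ports[0].nodePort}')
  if [ -z "$node_ip" ] || [ -z "$node_port" ]; then
    echo "could not resolve node IP or NodePort for $service" >&2
    return 1
  fi
  echo "http://$node_ip:$node_port"
}
```

Typical use would be `curl "$(nodeport_url api)"` from a host that can reach the cluster nodes.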

Access & Management

View all resources:

kubectl get all -n ecg
kubectl get cronjobs -n ecg

Check API:

# View API pod
kubectl get pods -n ecg | grep "^api"

# View logs
kubectl logs -f deployment/api -n ecg

# Test API
kubectl port-forward -n ecg service/api 8080:80
# Access: http://localhost:8080

Check worker:

# View worker pod
kubectl get pods -n ecg | grep worker

# View logs
kubectl logs -f deployment/worker -n ecg

# Check Celery tasks via Flower
kubectl port-forward -n ecg service/flower 5555:80
# Access: http://localhost:5555

Check CronJob:

# View CronJob schedule
kubectl get cronjobs -n ecg

# View recent job runs
kubectl get jobs -n ecg --sort-by=.metadata.creationTimestamp | tail -5

# View job logs
kubectl logs job/drivewatch-<timestamp> -n ecg

Restart services:

# Restart API
kubectl rollout restart deployment/api -n ecg

# Restart worker
kubectl rollout restart deployment/worker -n ecg

# Restart Flower
kubectl rollout restart deployment/flower -n ecg
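With a single replica per deployment, it helps to confirm that the replacement pod is actually ready before moving on. A minimal sketch that chains the restart with `kubectl rollout status` is shown here; the helper name and the 120s default timeout are illustrative assumptions.

```shell
# Hypothetical helper: restart a deployment and block until the rollout finishes.
# Usage: restart_and_wait <deployment> [timeout]
restart_and_wait() {
  deployment="$1"
  timeout="${2:-120s}"
  # rollout status exits non-zero if the rollout does not finish within the timeout.
  kubectl rollout restart "deployment/$deployment" -n ecg &&
    kubectl rollout status "deployment/$deployment" -n ecg --timeout="$timeout"
}
```

For example, `restart_and_wait worker` restarts the Celery worker and returns once the new pod is reported ready, or fails after two minutes.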

Monitoring

Pod metrics:

kubectl top pods -n ecg

# Sort by resource usage
kubectl top pods -n ecg --sort-by=memory

CronJob monitoring:

# Check recent job executions
kubectl get jobs -n ecg

# Check for jobs with no successful pods (also matches jobs that are still running)
kubectl get jobs -n ecg --field-selector status.successful=0

# View job events
kubectl get events -n ecg --sort-by='.lastTimestamp' | grep -i "job"
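To single out jobs that actually recorded failed pods, one option is to read `.status.failed` directly instead of parsing the table output, whose columns vary between kubectl versions. The sketch below is one way to do that; both function names are hypothetical. Note that a job can show failed pods and still succeed on a retry.

```shell
# Print "name<TAB>failed-pod-count" for every job in the namespace.
# .status.failed is absent (printed as empty) when no pod has failed.
list_job_failures() {
  kubectl get jobs -n ecg \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.failed}{"\n"}{end}'
}

# Keep only the jobs whose failed-pod count is greater than zero.
only_failed() {
  awk -F '\t' '$2 + 0 > 0 { print $1 }'
}
```

Usage: `list_job_failures | only_failed` prints one job name per line, or nothing if no job has failed pods.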

Events:

# Sorted oldest-first, so tail shows the most recent events
kubectl get events -n ecg --sort-by='.lastTimestamp' | tail -20
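When the event list is noisy, narrowing it to warnings can help. The following sketch uses the `type=Warning` field selector on events and keeps only the newest entries; `recent_warnings` and its default of 20 lines are illustrative assumptions.

```shell
# Hypothetical helper: show the most recent Warning events in the ecg namespace.
# Usage: recent_warnings [count]
recent_warnings() {
  count="${1:-20}"
  # Events are sorted oldest-first, so the newest ones are at the end.
  kubectl get events -n ecg --field-selector type=Warning \
    --sort-by=.lastTimestamp | tail -n "$count"
}
```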

Data Flow

External ECG Devices/Systems
        ↓
API (NodePort 31658)
        ↓
Worker (Celery tasks)
        ↓
ECG Data Processing
        ↓
Storage/Database
        ↓
CronJob (drivewatch - daily at 00:01)

ECG Workflow

1. API Service

  • api (1 replica)
  • Receives ECG data
  • REST endpoints
  • NodePort access (31658)
  • Data validation and storage

2. Background Processing

  • worker (1 replica)
  • Celery async task processing
  • ECG data analysis
  • Report generation
  • File processing

3. Task Monitoring

  • flower (1 replica)
  • Celery task dashboard
  • Task queue monitoring
  • Worker health checks
  • NodePort access (32745)

4. Scheduled Maintenance

  • drivewatch CronJob
  • Runs daily at 00:01 (schedule 1 0 * * *)
  • Storage monitoring
  • Data cleanup/archival
  • Last run: 12h ago (successful)

Production Considerations

High Availability

Single Point of Failure:

  • All deployments: 1 replica (No HA)
  • API restart = temporary unavailability
  • Worker restart = task queue delay
  • No redundancy

Recommendations

  1. High Availability:

    • Current: All at 1 replica (acceptable only while brief interruptions during restarts are tolerable)
    • Consider 2+ replicas for API if uptime is critical
    • Worker can remain at 1 if task queue is not time-sensitive
    • Flower at 1 is acceptable (monitoring only)
  2. CronJob Monitoring:

    • drivewatch runs daily at 00:01 (schedule 1 0 * * *)
    • Monitor for failed executions
    • Alert on job failures
    • Review job logs regularly
  3. Data Backup:

    • ECG data is medical data (sensitive)
    • Implement regular backups
    • Test restore procedures
    • Comply with medical data regulations
  4. Worker Scaling:

    • Consider autoscaling the worker on queue depth (a plain HPA needs an external metrics source for this)
    • Monitor Celery queue length
    • Scale workers during peak processing
    • KEDA is one option for queue-based scaling
  5. API Performance:

    • Monitor API response times
    • Check for bottlenecks
    • Consider caching
    • Optimize database queries
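For recommendation 4, queue-based autoscalers size a deployment roughly as the queue length divided by a per-worker target, rounded up and clamped to a minimum and maximum. The sketch below shows that arithmetic only. The target of 10 tasks per worker and the 1 to 5 replica bounds are illustrative assumptions, and how the queue length is obtained from the Celery broker is left out.

```shell
# Sketch of queue-based sizing: one worker per <tasks_per_worker> queued tasks,
# clamped to [min, max]. All defaults are illustrative assumptions.
# Usage: desired_workers <queue_len> [tasks_per_worker] [min] [max]
desired_workers() {
  queue_len="$1"
  tasks_per_worker="${2:-10}"
  min="${3:-1}"
  max="${4:-5}"
  if [ "$tasks_per_worker" -le 0 ]; then
    echo "tasks_per_worker must be positive" >&2
    return 1
  fi
  # Ceiling division using integer arithmetic.
  want=$(( ($queue_len + $tasks_per_worker - 1) / $tasks_per_worker ))
  if [ "$want" -lt "$min" ]; then want="$min"; fi
  if [ "$want" -gt "$max" ]; then want="$max"; fi
  echo "$want"
}
```

With the defaults, `desired_workers 25` prints `3`, an empty queue yields the minimum of `1`, and a backlog of 1000 tasks is capped at `5`. The result could then be fed to `kubectl scale deployment/worker --replicas=<n> -n ecg`.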

Troubleshooting

API issues:

# Check API pod
kubectl get pods -n ecg | grep "^api"

# Check logs
kubectl logs -f deployment/api -n ecg

# Check for errors
kubectl logs deployment/api -n ecg --tail=100 | grep -i "error\|fail"

# Test API (port-forward blocks, so run curl from a second terminal)
kubectl port-forward -n ecg service/api 8080:80
curl http://localhost:8080/health

Worker not processing tasks:

# Check worker pod
kubectl get pods -n ecg | grep worker

# Check worker logs
kubectl logs -f deployment/worker -n ecg

# Check Celery via Flower
kubectl port-forward -n ecg service/flower 5555:80
# Access: http://localhost:5555

# Restart worker
kubectl rollout restart deployment/worker -n ecg

CronJob failures:

# Check recent jobs
kubectl get jobs -n ecg

# Check failed or still-running jobs (anything not showing 1/1 completions)
kubectl get jobs -n ecg | grep -v "1/1"

# View job logs
kubectl logs job/drivewatch-<timestamp> -n ecg

# Check CronJob configuration
kubectl describe cronjob drivewatch -n ecg

# Manually trigger job (for testing)
kubectl create job --from=cronjob/drivewatch manual-drivewatch-test -n ecg
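A manually created job is not cleaned up by the CronJob's history limits, so it is worth waiting for it, reading its logs, and deleting it afterwards. A minimal sketch of that sequence follows; the helper name and the 600s default timeout are assumptions. Since drivewatch is described above as handling storage monitoring and cleanup/archival, a manual run presumably acts on production data.

```shell
# Hypothetical helper: run drivewatch once by hand, wait, show logs, clean up.
# Usage: run_drivewatch_once [job-name] [timeout]
run_drivewatch_once() {
  job_name="${1:-manual-drivewatch-test}"
  timeout="${2:-600s}"
  status=0
  kubectl create job --from=cronjob/drivewatch "$job_name" -n ecg || return 1
  # Blocks until the Job reports the Complete condition; if the job fails
  # instead, this waits for the full timeout and then returns non-zero.
  kubectl wait --for=condition=complete "job/$job_name" -n ecg \
    --timeout="$timeout" || status=$?
  kubectl logs "job/$job_name" -n ecg
  kubectl delete job "$job_name" -n ecg
  return "$status"
}
```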

Flower dashboard not accessible:

# Check Flower pod
kubectl get pods -n ecg | grep flower

# Check logs
kubectl logs -f deployment/flower -n ecg

# Port forward
kubectl port-forward -n ecg service/flower 5555:80

# Restart Flower
kubectl rollout restart deployment/flower -n ecg

Performance Metrics

Current Scale

  • API: 1 replica (85 days old)
  • Worker: 1 replica (82 days old, 2 restarts)
  • Flower: 1 replica (313 days old - very stable)
  • CronJob: Daily at 00:01 (last run 12h ago)

Stability

  • API Age: 85 days (stable)
  • Worker Age: 82 days (stable, 2 restarts)
  • Flower Age: 313 days (very stable)
  • CronJob: Running successfully (last run 12h ago; the 3y179d figure appears to be the CronJob's age)
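The ages and restart counts above are a snapshot. To re-check restart counts, one approach is to read them from pod status, as sketched below. It only looks at the first container of each pod, which is an assumption about how these pods are built, and both function names are hypothetical.

```shell
# Print "pod-name<TAB>restart-count" for the first container of every pod.
list_restarts() {
  kubectl get pods -n ecg \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[0].restartCount}{"\n"}{end}'
}

# Keep only pods whose restart count exceeds the threshold (default 0).
# Usage: list_restarts | restarted_pods [threshold]
restarted_pods() {
  threshold="${1:-0}"
  awk -F '\t' -v t="$threshold" '$2 + 0 > t + 0 { print $1 ": " $2 " restarts" }'
}
```

Usage: `list_restarts | restarted_pods` lists every pod that has restarted at least once, and `list_restarts | restarted_pods 5` lists only pods with more than five restarts.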

Architecture Notes

  • Medical Data: ECG processing (sensitive data)
  • Celery-Based: Async task processing with Flower monitoring
  • NodePort Access: API (31658) and Flower (32745)
  • Scheduled Tasks: Daily drivewatch at 00:01
  • No HA: All at 1 replica
  • Stable: Long-running pods with minimal restarts