pat--reminder-booking--be

Overview

  • Namespace: pat--reminder-booking--be
  • Purpose: Patient Reminder & Booking Backend - PRODUCTION
  • Age: ~2 years 189 days (since April 2023)
  • Status: Active - Patient reminder and booking recommendation system
  • Workloads: 6 deployments (all active)
  • Environment: PRODUCTION - Patient engagement and test recommendations

Architecture

Patient reminder and booking system handling test result recommendations and data synchronization:

  • Main Application: REST API backend (1 replica)
  • Event Consumers: Test result recommendations, test master data sync (2 deployments, 4 total pods)
  • Workers: Background job processing (2 deployments)
  • Scheduler: Cron jobs for scheduled tasks

Auto-Scaling Configuration

No Auto-Scaling Configured:

  • No HorizontalPodAutoscalers (HPAs)
  • No KEDA scaled objects
  • Fixed replica counts

Workload Categories

Main Application (1 deployment)

Name | Replicas | Status | Purpose
--- | --- | --- | ---
pat--reminder-booking--be--app--prod | 1/1 | Running | Main reminder & booking API

Event Consumers (2 deployments - 4 total pods)

Name | Replicas | Status | Purpose
--- | --- | --- | ---
consumer-test-result-recommend | 3/3 | Running | Test result recommendations (high volume)
consumer-sync-test-master-data | 1/1 | Running (145 restarts) | Test master data synchronization

Workers (2 deployments)

Name | Replicas | Status | Purpose
--- | --- | --- | ---
wrk--default | 1/1 | Running | Default worker queue
wrk--notifications | 1/1 | Running | Notification processing

Scheduler (1 deployment)

Name | Replicas | Status | Purpose
--- | --- | --- | ---
cron--prod | 1/1 | Running | Scheduled cron jobs

Services

Name | Type | Cluster IP | Ports | NodePort | Purpose
--- | --- | --- | --- | --- | ---
pat--reminder-booking--be--app--prod | NodePort | 10.8.24.15 | 80 | 32371 | Main reminder & booking API

Access & Management

View all resources:

kubectl get all -n pat--reminder-booking--be

Check main application:

kubectl get pods -n pat--reminder-booking--be | grep "app--prod"
kubectl logs -f deployment/pat--reminder-booking--be--app--prod -n pat--reminder-booking--be

Check high-volume consumer:

# Test result recommendations (3 replicas)
kubectl get pods -n pat--reminder-booking--be | grep test-result-recommend
kubectl logs -f deployment/pat--reminder-booking--be--consumer-test-result-recommend--prod -n pat--reminder-booking--be

Check consumers:

# All consumers
kubectl get pods -n pat--reminder-booking--be | grep consumer

# Test master data sync (check for restarts)
kubectl logs -f deployment/pat--reminder-booking--be--consumer-sync-test-master-data--prod -n pat--reminder-booking--be

Restart services:

# Restart main app
kubectl rollout restart deployment/pat--reminder-booking--be--app--prod -n pat--reminder-booking--be

# Restart test result recommendation consumer
kubectl rollout restart deployment/pat--reminder-booking--be--consumer-test-result-recommend--prod -n pat--reminder-booking--be

# Restart all consumers
kubectl get deployments -n pat--reminder-booking--be | grep consumer | awk '{print $1}' | xargs -I {} kubectl rollout restart deployment/{} -n pat--reminder-booking--be
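A rollout restart only triggers the replacement; it does not confirm that the new pods became ready. A minimal sketch of a helper that restarts a deployment and then waits for the rollout to finish, assuming the deployment names used above. The function name `restart_and_wait` and the 120-second default timeout are illustrative choices, not part of any existing tooling:

```shell
NS="pat--reminder-booking--be"

# Restart one deployment and block until the rollout finishes or times out.
# restart_and_wait is a hypothetical helper; the default timeout is an assumption.
restart_and_wait() {
  deploy="$1"
  timeout="${2:-120s}"
  kubectl rollout restart "deployment/${deploy}" -n "$NS" &&
    kubectl rollout status "deployment/${deploy}" -n "$NS" --timeout="$timeout"
}

# Usage:
#   restart_and_wait pat--reminder-booking--be--app--prod
```

A non-zero exit status from `kubectl rollout status` means the rollout did not complete within the timeout, which is worth checking before moving on to the next deployment.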

Monitoring

Resource usage:

kubectl top pods -n pat--reminder-booking--be --sort-by=memory
kubectl top pods -n pat--reminder-booking--be --sort-by=cpu

Check high-restart consumer:

# Check test master data sync consumer (145 restarts)
kubectl get pods -n pat--reminder-booking--be | grep sync-test-master-data

# Check restart reason
kubectl describe pod -n pat--reminder-booking--be -l app=pat--reminder-booking--be--consumer-sync-test-master-data--prod | grep -A 10 "Last State"
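If the `app=` label used above does not match the pods in this cluster, the restart count and last termination reason can also be read directly from the pod status, with no selector. A minimal sketch; `restart_summary` is a hypothetical helper name, and the optional argument is a plain substring filter on the pod name:

```shell
NS="pat--reminder-booking--be"

# Print pod name, restart count, and last termination reason per pod.
# Uses standard pod status fields, so it does not depend on any label.
restart_summary() {
  kubectl get pods -n "$NS" \
    -o custom-columns='NAME:.metadata.name,RESTARTS:.status.containerStatuses[*].restartCount,LAST_REASON:.status.containerStatuses[*].lastState.terminated.reason' |
    grep -- "${1:-}"
}

# Usage:
#   restart_summary sync-test-master-data
```

A `LAST_REASON` of `OOMKilled` points at memory limits, while `Error` with a non-zero exit code points at an application crash that should show up in the logs.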

Events:

# Most recent 20 events (the sort is oldest first, so take the tail)
kubectl get events -n pat--reminder-booking--be --sort-by='.lastTimestamp' | tail -20

Data Flow

Patient Test Result Event
    ↓
pat--reminder-booking--be--app--prod (NodePort 32371)
    ↓
Main Reminder & Booking API
    ↓
Message Queue (Kafka/Redpanda)
    ↓
Consumers Process Events
├─ Test Recommendations → consumer-test-result-recommend (3 replicas - high volume)
└─ Test Master Data → consumer-sync-test-master-data (145 restarts!)
    ↓
Workers Process Background Jobs
    ↓
Booking reminders, test recommendations to patients

Reminder & Booking Workflow

1. Test Result Recommendations (High Volume)

  • 3 replicas dedicated to test result recommendations
  • Processes test results and generates booking recommendations
  • Suggests related or follow-up tests to patients
  • High-volume consumer for personalized recommendations

2. Test Master Data Synchronization

  • Syncs test master data from upstream systems
  • 145 restarts in 178 days - stability issue
  • Updates test catalog and pricing
  • Critical for accurate recommendations

3. Background Workers

  • wrk--notifications: Processes reminder notifications
  • wrk--default: General background processing
  • Booking reminder delivery

4. Scheduled Tasks

  • Cron jobs for periodic reminders
  • Scheduled booking recommendations
  • Data cleanup

Production Considerations

High Availability

Well Configured:

  • consumer-test-result-recommend: 3 replicas (high volume handling)

Single Points of Failure:

  • Main API: 1 replica (no HA)
  • consumer-sync-test-master-data: 1 replica (no HA)
  • All workers: 1 replica each
  • Cron job: 1 replica
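For the main API, the single point of failure can be removed by running a second replica. A minimal sketch, assuming the API keeps no in-process state that would break with two pods and that the deployment is not managed by Helm or a GitOps tool (which would revert a manual change). `scale_and_wait` is a hypothetical helper name:

```shell
NS="pat--reminder-booking--be"

# Scale a deployment to a fixed replica count and wait until the pods are ready.
# scale_and_wait is a hypothetical helper; the 120s timeout is an assumption.
scale_and_wait() {
  kubectl scale "deployment/$1" -n "$NS" --replicas="$2" &&
    kubectl rollout status "deployment/$1" -n "$NS" --timeout=120s
}

# Usage:
#   scale_and_wait pat--reminder-booking--be--app--prod 2
```

The same helper should probably not be applied to the cron deployment without checking first, since a second scheduler replica may run every scheduled task twice.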

Stability Issues:

  • consumer-sync-test-master-data: 145 restarts in 178 days (~0.8 restarts/day)
  • Investigate memory leaks or connection issues
  • Monitor for OOMKilled or CrashLoopBackOff
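The restart rate quoted above follows directly from the two figures in this document (145 restarts over 178 days). A short worked calculation, which also expresses the rate as an average interval between restarts; `restart_rate` is just an illustrative function name:

```shell
# Compute restarts per day and the mean interval between restarts.
# Arguments: total restarts, observation window in days.
restart_rate() {
  awk -v r="$1" -v d="$2" 'BEGIN {
    printf "%.2f restarts/day, one restart every %.1f hours\n", r / d, d * 24 / r
  }'
}

restart_rate 145 178
# prints: 0.81 restarts/day, one restart every 29.5 hours
```

An average of one restart roughly every 29.5 hours is only an average: the actual restarts may be clustered, so pod events and termination timestamps are the better source for the real pattern.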

Recommendations

  1. Investigate High Restart Count:

    • consumer-sync-test-master-data: 145 restarts (critical issue)
    • Check logs for errors: OOM, connection failures, crashes
    • Review resource limits and requests
    • Fix underlying stability issue
  2. Main API Resilience:

    • Currently 1 replica (single point of failure)
    • Increase to 2+ replicas or add HPA
    • Critical for reminder and booking API
  3. Add Auto-Scaling:

    • Consider KEDA for consumers based on queue depth:
      • consumer-test-result-recommend (high volume - scale 3-10)
      • consumer-sync-test-master-data (after fixing restart issue)
  4. Consumer Resilience:

    • consumer-test-result-recommend: 3 replicas (good)
    • consumer-sync-test-master-data: 1 replica (consider 2 after fixing restarts)
  5. Monitoring Priorities:

    • consumer-sync-test-master-data restart count (highest priority)
    • API response times
    • Test recommendation processing lag
    • Notification delivery success rates
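The KEDA recommendation in item 3 can be sketched as a `ScaledObject` that scales the recommendation consumer between 3 and 10 replicas on consumer lag. This is a sketch under stated assumptions, not the cluster's actual configuration: it assumes KEDA is installed and that the queue speaks the Kafka protocol, and the broker address, consumer group, topic, and lag threshold are all placeholders that must be replaced with real values. `render_scaledobject` is a hypothetical helper that only prints the manifest:

```shell
# Print a KEDA ScaledObject manifest for the recommendation consumer.
# KAFKA_BROKERS, CONSUMER_GROUP, TOPIC and LAG_THRESHOLD defaults are placeholders (assumptions).
render_scaledobject() {
  cat <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: consumer-test-result-recommend
  namespace: pat--reminder-booking--be
spec:
  scaleTargetRef:
    name: pat--reminder-booking--be--consumer-test-result-recommend--prod
  minReplicaCount: 3
  maxReplicaCount: 10
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: ${KAFKA_BROKERS:-kafka.example.internal:9092}
        consumerGroup: ${CONSUMER_GROUP:-test-result-recommend}
        topic: ${TOPIC:-test-results}
        lagThreshold: "${LAG_THRESHOLD:-100}"
EOF
}

# Usage (review the output first, then apply):
#   render_scaledobject | kubectl apply -f -
```

Keeping `minReplicaCount` at 3 preserves the current capacity as a floor, so the change only adds headroom during bursts.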

Troubleshooting

High restart investigation (PRIORITY):

# Check consumer pod status and restarts
kubectl get pods -n pat--reminder-booking--be | grep sync-test-master-data

# Check recent logs for errors
kubectl logs deployment/pat--reminder-booking--be--consumer-sync-test-master-data--prod -n pat--reminder-booking--be --tail=200 | grep -i "error\|exception\|fatal\|oom"

# Check resource usage
kubectl top pods -n pat--reminder-booking--be | grep sync-test-master-data

# Describe pod for restart reasons
POD_NAME=$(kubectl get pods -n pat--reminder-booking--be | grep sync-test-master-data | awk '{print $1}' | head -n 1)
kubectl describe pod "$POD_NAME" -n pat--reminder-booking--be | grep -A 20 "Last State"

# Check resource limits
kubectl get deployment pat--reminder-booking--be--consumer-sync-test-master-data--prod -n pat--reminder-booking--be -o yaml | grep -A 5 "resources:"
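The log command above reads the currently running container, which may not contain the error that caused the last restart. The logs of the previous (terminated) container instance are usually more useful for that. A minimal sketch; `previous_logs` is a hypothetical helper name, and it simply takes the first pod whose name contains the given substring:

```shell
NS="pat--reminder-booking--be"

# Show the tail of the logs from the previous container instance of a pod.
# Arguments: pod name substring, number of lines (default 200).
previous_logs() {
  pod="$(kubectl get pods -n "$NS" -o name | grep "$1" | head -n 1)"
  [ -n "$pod" ] || { echo "no pod matching $1" >&2; return 1; }
  kubectl logs "$pod" -n "$NS" --previous --tail="${2:-200}"
}

# Usage:
#   previous_logs sync-test-master-data | grep -i "error\|exception\|fatal\|oom"
```

If the pod has not restarted since it was created, `kubectl logs --previous` returns an error because there is no earlier container to read from.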

Main API issues:

# Check API pod
kubectl get pods -n pat--reminder-booking--be | grep "app--prod"

# Check logs
kubectl logs -f deployment/pat--reminder-booking--be--app--prod -n pat--reminder-booking--be --tail=100

# Test API endpoint
kubectl port-forward -n pat--reminder-booking--be service/pat--reminder-booking--be--app--prod 8080:80
# Access http://localhost:8080
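With the port-forward running in another terminal, the API can be probed from the command line instead of a browser. A minimal sketch; `api_status` is a hypothetical helper name, and the path defaults to `/` because the API's health endpoint is not documented here (an assumption to replace with the real path if one exists):

```shell
# Print the HTTP status code returned by the API through the port-forward.
# Arguments: local port (default 8080), request path (default "/", an assumption).
api_status() {
  curl -s -o /dev/null -w '%{http_code}\n' --max-time 5 \
    "http://localhost:${1:-8080}${2:-/}"
}

# Usage:
#   api_status            # probes http://localhost:8080/
#   api_status 8080 /some/path
```

A status of `000` means curl could not connect at all, which usually indicates that the port-forward is not running.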

Test recommendation consumer issues:

# Check all 3 replicas
kubectl get pods -n pat--reminder-booking--be | grep test-result-recommend

# Check logs from all replicas (kubectl logs deployment/... reads only one pod)
kubectl get pods -n pat--reminder-booking--be -o name | grep test-result-recommend | xargs -I {} kubectl logs {} -n pat--reminder-booking--be --all-containers=true --prefix --tail=100

# Check resource usage
kubectl top pods -n pat--reminder-booking--be | grep test-result-recommend

# Restart consumer (all 3 replicas)
kubectl rollout restart deployment/pat--reminder-booking--be--consumer-test-result-recommend--prod -n pat--reminder-booking--be

Test master data sync issues:

# Check sync consumer
kubectl logs -f deployment/pat--reminder-booking--be--consumer-sync-test-master-data--prod -n pat--reminder-booking--be

# Check for sync errors
kubectl logs deployment/pat--reminder-booking--be--consumer-sync-test-master-data--prod -n pat--reminder-booking--be --tail=100 | grep -i "sync\|error\|fail"

# Restart consumer
kubectl rollout restart deployment/pat--reminder-booking--be--consumer-sync-test-master-data--prod -n pat--reminder-booking--be

Worker issues:

# Check notification worker
kubectl logs -f deployment/pat--reminder-booking--be--wrk--notifications--prod -n pat--reminder-booking--be

# Check default worker
kubectl logs -f deployment/pat--reminder-booking--be--wrk--default--prod -n pat--reminder-booking--be

# Restart workers
kubectl rollout restart deployment/pat--reminder-booking--be--wrk--notifications--prod -n pat--reminder-booking--be
kubectl rollout restart deployment/pat--reminder-booking--be--wrk--default--prod -n pat--reminder-booking--be

Cron job failures:

# Check cron pod
kubectl get pods -n pat--reminder-booking--be | grep cron

# Check cron logs
kubectl logs -f deployment/pat--reminder-booking--be--cron--prod -n pat--reminder-booking--be

# Restart cron
kubectl rollout restart deployment/pat--reminder-booking--be--cron--prod -n pat--reminder-booking--be

Performance Metrics

Current Scale

  • Main API: 1 replica (no HA)
  • High-Volume Consumer: 3 replicas (test result recommendations)
  • Standard Consumer: 1 replica (test master data sync - 145 restarts!)
  • Workers: 2 workers at 1 replica each
  • Scheduler: 1 replica (cron--prod)
  • Total Active Pods: ~8 pods
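The pod total can be cross-checked against the cluster by summing desired and ready replicas across all deployments in the namespace. A minimal sketch; `replica_totals` is a hypothetical helper name:

```shell
NS="pat--reminder-booking--be"

# Sum desired and ready replicas over every deployment in the namespace.
# A missing readyReplicas field is printed as <none>, which awk counts as 0.
replica_totals() {
  kubectl get deployments -n "$NS" --no-headers \
    -o custom-columns='NAME:.metadata.name,DESIRED:.spec.replicas,READY:.status.readyReplicas' |
    awk '{ desired += $2; ready += $3 } END { printf "desired=%d ready=%d\n", desired, ready }'
}

# Usage:
#   replica_totals
```

With the replica counts listed in this document (1 + 3 + 1 + 1 + 1 + 1), both numbers should come out as 8 when everything is healthy; a lower `ready` value means at least one pod is not serving.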

Stability

  • Namespace Age: ~2 years 189 days (mature)
  • Recent Updates: 178 days ago (stable deployment)
  • Critical Issue: consumer-sync-test-master-data with 145 restarts (~0.8/day)