pat--reminder-booking--be
Overview
- Namespace: pat--reminder-booking--be
- Purpose: Patient Reminder & Booking Backend - PRODUCTION
- Age: ~2 years 189 days (since April 2023)
- Status: Active - Patient reminder and booking recommendation system
- Workloads: 6 deployments (all active)
- Environment: PRODUCTION - Patient engagement and test recommendations
Architecture
Patient reminder and booking system handling test result recommendations and data synchronization:
- Main Application: REST API backend (1 replica)
- Event Consumers: Test result recommendations, test master data sync (2 deployments, 4 total pods)
- Workers: Background job processing (2 deployments)
- Scheduler: Cron jobs for scheduled tasks
Auto-Scaling Configuration
No Auto-Scaling Configured:
- No HorizontalPodAutoscalers (HPAs)
- No KEDA scaled objects
- Fixed replica counts
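The absence of autoscalers can be confirmed against the cluster. A minimal sketch: `check_autoscaling` is a hypothetical helper name, and the `scaledobjects` query assumes the KEDA CRDs are installed (without them kubectl reports an unknown resource type instead of an empty list).

```shell
# Hypothetical helper: list autoscaling objects in a namespace.
# Empty results for both queries match the fixed-replica state described above.
check_autoscaling() {
  ns="${1:-pat--reminder-booking--be}"
  echo "HPAs in ${ns}:"
  kubectl get hpa -n "$ns" 2>&1
  echo "KEDA ScaledObjects in ${ns}:"
  # Assumes the KEDA CRDs exist; otherwise kubectl prints an error here.
  kubectl get scaledobjects -n "$ns" 2>&1
  return 0
}
```

Run `check_autoscaling` with no arguments to inspect this namespace.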
Workload Categories
Main Application (1 deployment)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| pat--reminder-booking--be--app--prod | 1/1 | Running | Main reminder & booking API |
Event Consumers (2 deployments - 4 total pods)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| consumer-test-result-recommend | 3/3 | Running | Test result recommendations (high volume) |
| consumer-sync-test-master-data | 1/1 | Running (145 restarts) | Test master data synchronization |
Workers (2 deployments)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| wrk--default | 1/1 | Running | Default worker queue |
| wrk--notifications | 1/1 | Running | Notification processing |
Scheduler (1 deployment)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| cron--prod | 1/1 | Running | Scheduled cron jobs |
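The replica counts in the tables above can be cross-checked in one pass. A minimal sketch, assuming the default column layout of `kubectl get deployments` (NAME, READY, ...); `replica_summary` is a hypothetical helper name.

```shell
# Hypothetical helper: sum ready and desired replicas across all deployments.
# Parses the READY column ("ready/desired") of `kubectl get deployments`.
replica_summary() {
  ns="${1:-pat--reminder-booking--be}"
  kubectl get deployments -n "$ns" --no-headers |
    awk '{
      split($2, r, "/")                 # r[1] = ready, r[2] = desired
      ready += r[1]; desired += r[2]
      if (r[1] != r[2]) print "NOT READY: " $1 " (" $2 ")"
    }
    END { printf "deployments=%d ready=%d desired=%d\n", NR, ready, desired }'
}
```

If the cluster still matches the tables, the summary line should report 6 deployments with 8 ready out of 8 desired pods.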
Services
| Name | Type | Cluster IP | Ports | NodePort | Purpose |
|---|---|---|---|---|---|
| pat--reminder-booking--be--app--prod | NodePort | 10.8.24.15 | 80 | 32371 | Main reminder & booking API |
Access & Management
View all resources:
kubectl get all -n pat--reminder-booking--be
Check main application:
kubectl get pods -n pat--reminder-booking--be | grep "app--prod"
kubectl logs -f deployment/pat--reminder-booking--be--app--prod -n pat--reminder-booking--be
Check high-volume consumer:
# Test result recommendations (3 replicas)
kubectl get pods -n pat--reminder-booking--be | grep test-result-recommend
kubectl logs -f deployment/pat--reminder-booking--be--consumer-test-result-recommend--prod -n pat--reminder-booking--be
Check consumers:
# All consumers
kubectl get pods -n pat--reminder-booking--be | grep consumer
# Test master data sync (check for restarts)
kubectl logs -f deployment/pat--reminder-booking--be--consumer-sync-test-master-data--prod -n pat--reminder-booking--be
Restart services:
# Restart main app
kubectl rollout restart deployment/pat--reminder-booking--be--app--prod -n pat--reminder-booking--be
# Restart test result recommendation consumer
kubectl rollout restart deployment/pat--reminder-booking--be--consumer-test-result-recommend--prod -n pat--reminder-booking--be
# Restart all consumers
kubectl get deployments -n pat--reminder-booking--be | grep consumer | awk '{print $1}' | xargs -I {} kubectl rollout restart deployment/{} -n pat--reminder-booking--be
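`kubectl rollout restart` returns immediately, before the new pods are ready. A minimal sketch that also waits for the rollout to finish; `restart_and_wait` is a hypothetical helper name and the 180s timeout is an arbitrary choice.

```shell
# Hypothetical helper: restart a deployment and block until the rollout finishes.
# Exits non-zero if the rollout has not completed within the timeout.
restart_and_wait() {
  deploy="$1"
  ns="${2:-pat--reminder-booking--be}"
  kubectl rollout restart "deployment/${deploy}" -n "$ns" &&
    kubectl rollout status "deployment/${deploy}" -n "$ns" --timeout=180s
}
```

Example: `restart_and_wait pat--reminder-booking--be--app--prod`. With the main API at 1 replica, how long it is unavailable during the restart depends on the deployment's rollout strategy, which is not documented here.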
Monitoring
Resource usage:
kubectl top pods -n pat--reminder-booking--be --sort-by=memory
kubectl top pods -n pat--reminder-booking--be --sort-by=cpu
Check high-restart consumer:
# Check test master data sync consumer (145 restarts)
kubectl get pods -n pat--reminder-booking--be | grep sync-test-master-data
# Check restart reason
kubectl describe pod -n pat--reminder-booking--be -l app=pat--reminder-booking--be--consumer-sync-test-master-data--prod | grep -A 10 "Last State"
Events:
kubectl get events -n pat--reminder-booking--be --sort-by='.lastTimestamp' | tail -20
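When chasing restarts, Normal events are mostly noise. A minimal sketch that keeps only Warning events and shows the newest ones; `recent_warnings` is a hypothetical helper name.

```shell
# Hypothetical helper: show only the most recent Warning events.
# Events are sorted oldest-first, so the newest entries are at the end.
recent_warnings() {
  ns="${1:-pat--reminder-booking--be}"
  count="${2:-20}"
  kubectl get events -n "$ns" \
    --field-selector type=Warning \
    --sort-by='.lastTimestamp' | tail -n "$count"
}
```

Kubernetes keeps events for a limited time only (commonly about an hour by default), so older restarts may no longer appear.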
Data Flow
Patient Test Result Event
↓
pat--reminder-booking--be--app--prod (NodePort 32371)
↓
Main Reminder & Booking API
↓
Message Queue (Kafka/Redpanda)
↓
Consumers Process Events
├─ Test Recommendations → consumer-test-result-recommend (3 replicas - high volume)
└─ Test Master Data → consumer-sync-test-master-data (145 restarts!)
↓
Workers Process Background Jobs
↓
Booking reminders, test recommendations to patients
Reminder & Booking Workflow
1. Test Result Recommendations (High Volume)
- 3 replicas dedicated to test result recommendations
- Processes test results and generates booking recommendations
- Suggests related or follow-up tests to patients
- High-volume consumer for personalized recommendations
2. Test Master Data Synchronization
- Syncs test master data from upstream systems
- 145 restarts in 178 days (stability issue)
- Updates test catalog and pricing
- Critical for accurate recommendations
3. Background Workers
- wrk--notifications: Processes reminder notifications
- wrk--default: General background processing
- Booking reminder delivery
4. Scheduled Tasks
- Cron jobs for periodic reminders
- Scheduled booking recommendations
- Data cleanup
Production Considerations
High Availability
Well Configured:
- consumer-test-result-recommend: 3 replicas (high volume handling)
Single Points of Failure:
- Main API: 1 replica (no HA)
- consumer-sync-test-master-data: 1 replica (no HA)
- All workers: 1 replica each
- Cron job: 1 replica
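A single-replica deployment can be raised to a minimum count by hand. A minimal sketch: `ensure_min_replicas` is a hypothetical helper name, and it assumes the workload is safe to run in parallel. If the deployments are managed by Helm or GitOps, a manual scale may be reverted on the next sync, so the change belongs in the source manifests as well.

```shell
# Hypothetical helper: raise a deployment to a minimum replica count.
# Only scales up; it never reduces an already larger replica count.
ensure_min_replicas() {
  deploy="$1"
  min="$2"
  ns="${3:-pat--reminder-booking--be}"
  current="$(kubectl get deployment "$deploy" -n "$ns" -o jsonpath='{.spec.replicas}')"
  if [ "${current:-0}" -lt "$min" ]; then
    kubectl scale "deployment/${deploy}" -n "$ns" --replicas="$min"
  else
    echo "${deploy} already has ${current} replicas"
  fi
}
```

Example: `ensure_min_replicas pat--reminder-booking--be--app--prod 2`. This document does not state whether the scheduler can run more than one replica without firing jobs twice, so verify that before scaling it.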
Stability Issues:
- consumer-sync-test-master-data: 145 restarts in 178 days (~0.8 restarts/day)
- Investigate memory leaks or connection issues
- Monitor for OOMKilled or CrashLoopBackOff
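The restart rate quoted above and the reason for the most recent restart can both be checked quickly. A minimal sketch; `restart_rate` and `last_termination` are hypothetical helper names, and the pod name must be supplied by the caller.

```shell
# Restarts per day, rounded to two decimals.
restart_rate() {
  awk -v restarts="$1" -v days="$2" 'BEGIN { printf "%.2f\n", restarts / days }'
}

# Hypothetical helper: for each container in a pod, print its name, restart
# count, last termination reason (e.g. OOMKilled, Error) and exit code.
last_termination() {
  pod="$1"
  ns="${2:-pat--reminder-booking--be}"
  kubectl get pod "$pod" -n "$ns" -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.restartCount}{"\t"}{.lastState.terminated.reason}{"\t"}{.lastState.terminated.exitCode}{"\n"}{end}'
}
```

`restart_rate 145 178` prints `0.81`, which is roughly one restart every 29 hours. A reason of `OOMKilled` from `last_termination` points at memory limits; `Error` with a non-zero exit code points at an application crash.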
Recommendations
1. Investigate High Restart Count:
- consumer-sync-test-master-data: 145 restarts (critical issue)
- Check logs for errors: OOM, connection failures, crashes
- Review resource limits and requests
- Fix underlying stability issue
2. Main API Resilience:
- Currently 1 replica (single point of failure)
- Increase to 2+ replicas or add HPA
- Critical for reminder and booking API
3. Add Auto-Scaling:
- Consider KEDA for consumers based on queue depth:
  - consumer-test-result-recommend (high volume - scale 3-10)
  - consumer-sync-test-master-data (after fixing restart issue)
4. Consumer Resilience:
- consumer-test-result-recommend: 3 replicas (good)
- consumer-sync-test-master-data: 1 replica (consider 2 after fixing restarts)
5. Monitoring Priorities:
- consumer-sync-test-master-data restart count (highest priority)
- API response times
- Test recommendation processing lag
- Notification delivery success rates
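The KEDA recommendation could start from a manifest like the one generated below. A minimal sketch, assuming KEDA is installed and the queue speaks the Kafka protocol. `keda_scaledobject` is a hypothetical helper name, and the broker address, consumer group, topic and lag threshold are placeholders, since none of them are documented here.

```shell
# Hypothetical helper: print a KEDA ScaledObject manifest for a Kafka-driven consumer.
# All trigger values (brokers, consumer group, topic, lag threshold) are placeholders.
keda_scaledobject() {
  deploy="$1"; min="$2"; max="$3"
  cat <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: ${deploy}
  namespace: pat--reminder-booking--be
spec:
  scaleTargetRef:
    name: ${deploy}
  minReplicaCount: ${min}
  maxReplicaCount: ${max}
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: ${KAFKA_BROKERS:-kafka.example.internal:9092}
        consumerGroup: ${CONSUMER_GROUP:-example-consumer-group}
        topic: ${TOPIC:-example-topic}
        lagThreshold: "${LAG_THRESHOLD:-100}"
EOF
}
```

Example: `keda_scaledobject pat--reminder-booking--be--consumer-test-result-recommend--prod 3 10` prints a manifest for the 3-10 replica range suggested above. Review the output and replace the placeholders before piping it to `kubectl apply -f -`.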
Troubleshooting
High restart investigation (PRIORITY):
# Check consumer pod status and restarts
kubectl get pods -n pat--reminder-booking--be | grep sync-test-master-data
# Check recent logs for errors
kubectl logs deployment/pat--reminder-booking--be--consumer-sync-test-master-data--prod -n pat--reminder-booking--be --tail=200 | grep -i "error\|exception\|fatal\|oom"
# Check resource usage
kubectl top pods -n pat--reminder-booking--be | grep sync-test-master-data
# Describe pod for restart reasons
POD_NAME=$(kubectl get pods -n pat--reminder-booking--be | grep sync-test-master-data | awk '{print $1}')
kubectl describe pod $POD_NAME -n pat--reminder-booking--be | grep -A 20 "Last State"
# Check resource limits
kubectl get deployment pat--reminder-booking--be--consumer-sync-test-master-data--prod -n pat--reminder-booking--be -o yaml | grep -A 5 "resources:"
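After a restart, the current container's log usually starts clean, and the crash itself is in the previous container instance. A minimal sketch that fetches those logs for every pod matching a name pattern; `previous_logs` is a hypothetical helper name.

```shell
# Hypothetical helper: print logs from the previous (terminated) container
# instance of every pod whose name matches a pattern.
previous_logs() {
  pattern="$1"
  ns="${2:-pat--reminder-booking--be}"
  kubectl get pods -n "$ns" -o name | grep -- "$pattern" |
    while read -r pod; do
      echo "=== ${pod} (previous instance) ==="
      # --previous fails if the container has never restarted; keep going regardless.
      kubectl logs "$pod" -n "$ns" --previous --tail=200 || true
    done
}
```

Example: `previous_logs sync-test-master-data`. The last lines before termination are usually the most informative.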
Main API issues:
# Check API pod
kubectl get pods -n pat--reminder-booking--be | grep "app--prod"
# Check logs
kubectl logs -f deployment/pat--reminder-booking--be--app--prod -n pat--reminder-booking--be --tail=100
# Test API endpoint
kubectl port-forward -n pat--reminder-booking--be service/pat--reminder-booking--be--app--prod 8080:80
# Access http://localhost:8080
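With the port-forward running in another terminal, the API can be probed from the command line. A minimal sketch: `api_reachable` is a hypothetical helper name, and the path `/` is a placeholder because no health endpoint is documented here.

```shell
# Hypothetical helper: succeed if the forwarded API answers with a non-5xx status.
# curl reports status 000 when the connection itself fails.
api_reachable() {
  url="${1:-http://localhost:8080/}"
  code="$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$url")"
  echo "HTTP ${code} from ${url}"
  [ "$code" -ge 200 ] && [ "$code" -lt 500 ]
}
```

A 4xx response still counts as reachable here, since it shows the application is answering. Tighten the check once the real health path is known.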
Test recommendation consumer issues:
# Check all 3 replicas
kubectl get pods -n pat--reminder-booking--be | grep test-result-recommend
# Check logs (deployment/<name> resolves to a single pod; --all-containers covers that pod's containers, not all 3 replicas)
kubectl logs deployment/pat--reminder-booking--be--consumer-test-result-recommend--prod -n pat--reminder-booking--be --all-containers=true --tail=100
# Check resource usage
kubectl top pods -n pat--reminder-booking--be | grep test-result-recommend
# Restart consumer (all 3 replicas)
kubectl rollout restart deployment/pat--reminder-booking--be--consumer-test-result-recommend--prod -n pat--reminder-booking--be
Test master data sync issues:
# Check sync consumer
kubectl logs -f deployment/pat--reminder-booking--be--consumer-sync-test-master-data--prod -n pat--reminder-booking--be
# Check for sync errors
kubectl logs deployment/pat--reminder-booking--be--consumer-sync-test-master-data--prod -n pat--reminder-booking--be --tail=100 | grep -i "sync\|error\|fail"
# Restart consumer
kubectl rollout restart deployment/pat--reminder-booking--be--consumer-sync-test-master-data--prod -n pat--reminder-booking--be
Worker issues:
# Check notification worker
kubectl logs -f deployment/pat--reminder-booking--be--wrk--notifications--prod -n pat--reminder-booking--be
# Check default worker
kubectl logs -f deployment/pat--reminder-booking--be--wrk--default--prod -n pat--reminder-booking--be
# Restart workers
kubectl rollout restart deployment/pat--reminder-booking--be--wrk--notifications--prod -n pat--reminder-booking--be
kubectl rollout restart deployment/pat--reminder-booking--be--wrk--default--prod -n pat--reminder-booking--be
Cron job failures:
# Check cron pod
kubectl get pods -n pat--reminder-booking--be | grep cron
# Check cron logs
kubectl logs -f deployment/pat--reminder-booking--be--cron--prod -n pat--reminder-booking--be
# Restart cron
kubectl rollout restart deployment/pat--reminder-booking--be--cron--prod -n pat--reminder-booking--be
Performance Metrics
Current Scale
- Main API: 1 replica (no HA)
- High-Volume Consumer: 3 replicas (test result recommendations)
- Standard Consumer: 1 replica (test master data sync - 145 restarts!)
- Workers: 2 workers at 1 replica each
- Total Active Pods: ~8 pods
Stability
- Namespace Age: ~2 years (mature)
- Last Update: 178 days ago (no recent deployment changes)
- Critical Issue: consumer-sync-test-master-data with 145 restarts (~0.8/day)