pat--booking--be
Overview
- Namespace: pat--booking--be
- Purpose: Patient Portal Booking Backend - PRODUCTION
- Age: ~3 years (since October 2022)
- Status: Active - Patient-facing booking and appointment system
- Workloads: 14 deployments (all active) + 1 cron job
- Environment: PRODUCTION - Critical patient appointment booking
Architecture
The patient booking backend handles appointment booking, test recommendations, and integration with lab results:
- Main Application: REST API backend (1 replica with HPA)
- Event Consumers: Process bookings, appointments, and test results (9 deployments)
- Workers: Background job processing (3 deployments)
- Scheduler: Cron jobs for scheduled tasks
- Observability: OpenTelemetry collector for tracing
Auto-Scaling Configuration
HorizontalPodAutoscalers (2 HPAs)
| HPA Name | Target | Min | Max | Current | Metrics | Type |
|---|---|---|---|---|---|---|
| pat--booking--be--app--prod-hpa | Main app | 1 | 5 | 1 | CPU: 5%/100% | Standard HPA |
| keda-hpa-kafka-pat--booking--be--consumer-pat-test-result--prod | Test result consumer | 1 | 10 | 1 | Queue: 0/20 | KEDA |
Scaling Summary:
- Main API auto-scales based on CPU load
- 1 consumer uses KEDA for Kafka queue-based autoscaling
- System currently at low load (all at minimum replicas)
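The replica counts above follow the standard HPA rule: desired replicas = ceil(current replicas × current metric / target metric), clamped to the min/max bounds. The following is a minimal sketch of that arithmetic using the main API's values from the table. The function name is illustrative, and the real controller also applies a tolerance band and stabilization windows that are ignored here.

```shell
# Sketch of the HPA replica calculation (ignores tolerance and stabilization).
# Usage: hpa_desired_replicas CURRENT_REPLICAS CURRENT_PCT TARGET_PCT MIN MAX
hpa_desired_replicas() {
  current=$1; metric=$2; target=$3; min=$4; max=$5
  # Integer ceiling of current * metric / target
  desired=$(( (current * metric + target - 1) / target ))
  [ "$desired" -lt "$min" ] && desired=$min
  [ "$desired" -gt "$max" ] && desired=$max
  echo "$desired"
}

hpa_desired_replicas 1 5 100 1 5     # main API today (5% CPU, 100% target): prints 1
hpa_desired_replicas 2 180 100 1 5   # hypothetical spike (2 replicas at 180%): prints 4
```

With a 100% target, the API only scales out once average CPU usage exceeds what the pods request, so the first replica has to be saturated before a second one appears.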
Workload Categories
Main Application (1 deployment)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| pat--booking--be--app--prod | 1/1 | Running + HPA | Main booking API (auto-scales 1-5) |
The main application handles:
- Patient appointment booking
- Test package recommendations
- Integration with clinic schedules
- RESTful API for patient portal
- Appointment management
Event Consumers (9 deployments)
Process booking and medical data events from message queues:
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| consumer-appointment-update | 1/1 | Running | Appointment update processing |
| consumer-booking | 1/1 | Running | Booking event processing |
| consumer-customer | 1/1 | Running | Customer data synchronization |
| consumer-lis-testresult | 2/2 | Running | LIS test results (2 replicas for reliability) |
| consumer-pat-booking-recommend-test | 1/1 | Running | Test package recommendations |
| consumer-pat-test-result | 1/1 | Running + HPA | Patient test results (scales 1-10) |
| consumer-pat-test-result-svip1 | 1/1 | Running | SVIP tier 1 test results |
| consumer-pat-test-result-svip2 | 1/1 | Running | SVIP tier 2 test results |
| consumer-sync-test-master-data | 1/1 | Running | Test master data synchronization |
Workers (3 deployments)
Background job processing:
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| wrk--batch-publisher | 1/1 | Running | Batch job publisher |
| wrk--default | 1/1 | Running | Default worker queue |
| wrk--notifications | 1/1 | Running | Notification processing |
Scheduler & Supporting (1 CronJob + 1 deployment)
| Name | Type | Status | Purpose |
|---|---|---|---|
| pat--booking--be--cron--prod | CronJob | Running | Scheduled tasks (every minute) |
| pat--booking--be-otel-collector | Deployment | Running | OpenTelemetry trace collector |
Services
| Name | Type | Cluster IP | Ports | NodePort | Purpose |
|---|---|---|---|---|---|
| pat--booking--be--app--prod | NodePort | 10.8.27.159 | 80 | 30978 | Main booking API |
| pat--booking--be-otel-collector | ClusterIP | 10.8.28.105 | 4317, 4318 | - | OTLP trace collection |
Access & Management
View all resources:
kubectl get all -n pat--booking--be
Check main application:
# View app pods
kubectl get pods -n pat--booking--be | grep "app--prod"
# Check HPA status
kubectl describe hpa pat--booking--be--app--prod-hpa -n pat--booking--be
# View logs
kubectl logs -f deployment/pat--booking--be--app--prod -n pat--booking--be
Check consumers:
# All consumers
kubectl get pods -n pat--booking--be | grep consumer
# KEDA-scaled test result consumer
kubectl get hpa -n pat--booking--be | grep keda
# Consumer logs
kubectl logs -f deployment/pat--booking--be--consumer-pat-test-result--prod -n pat--booking--be
Check workers:
# All workers
kubectl get pods -n pat--booking--be | grep "wrk--"
# Worker logs
kubectl logs -f deployment/pat--booking--be--wrk--notifications--prod -n pat--booking--be
Check cron jobs:
# View cron jobs
kubectl get cronjobs -n pat--booking--be
# View cron job history
kubectl get jobs -n pat--booking--be --sort-by=.metadata.creationTimestamp
Scaling:
# View all HPAs
kubectl get hpa -n pat--booking--be
# Check KEDA scaled objects
kubectl get scaledobjects -n pat--booking--be
Restart services:
# Restart main app
kubectl rollout restart deployment/pat--booking--be--app--prod -n pat--booking--be
# Restart specific consumer
kubectl rollout restart deployment/pat--booking--be--consumer-booking--prod -n pat--booking--be
# Restart all consumers
kubectl get deployments -n pat--booking--be | grep consumer | awk '{print $1}' | xargs -I {} kubectl rollout restart deployment/{} -n pat--booking--be
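Restarting every consumer at once briefly stops all event processing. A more cautious variant, sketched below, restarts the consumers one at a time and waits for each rollout to finish before moving on. The function name and the default timeout are illustrative assumptions, not part of the existing tooling.

```shell
# Sketch: restart consumer deployments one at a time and wait for each rollout.
# Usage: restart_consumers NAMESPACE [TIMEOUT]
restart_consumers() {
  ns=$1
  timeout=${2:-120s}
  # -o name prints one "deployment.apps/<name>" per line, so no table parsing is needed
  for deploy in $(kubectl get deployments -n "$ns" -o name | grep consumer); do
    kubectl rollout restart "$deploy" -n "$ns" || return 1
    # Block until the new pods are ready, or fail once the timeout expires
    kubectl rollout status "$deploy" -n "$ns" --timeout="$timeout" || return 1
  done
}

# Example (not run here):
# restart_consumers pat--booking--be 180s
```

The function stops at the first failed rollout, which leaves the remaining consumers untouched while the failure is investigated.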
Monitoring
Resource usage:
kubectl top pods -n pat--booking--be --sort-by=memory
kubectl top pods -n pat--booking--be --sort-by=cpu
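For a namespace-wide total rather than a per-pod listing, the output of kubectl top can be summed with awk. This is a small sketch that assumes CPU is reported in millicores ("5m") and memory in mebibytes ("120Mi"), which is the usual format. The function name is illustrative.

```shell
# Sketch: total CPU (millicores) and memory (Mi) from "kubectl top pods --no-headers".
# Assumes columns: NAME CPU MEMORY, with CPU as "<n>m" and memory as "<n>Mi".
sum_pod_usage() {
  awk '{ sub(/m$/, "", $2); sub(/Mi$/, "", $3); cpu += $2; mem += $3 }
       END { printf "pods=%d cpu=%dm memory=%dMi\n", NR, cpu, mem }'
}

# Example (not run here):
# kubectl top pods -n pat--booking--be --no-headers | sum_pod_usage
```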
HPA metrics:
# All HPAs
kubectl get hpa -n pat--booking--be
# Detailed HPA status
kubectl describe hpa pat--booking--be--app--prod-hpa -n pat--booking--be
kubectl describe hpa keda-hpa-kafka-pat--booking--be--consumer-pat-test-result--prod -n pat--booking--be
Deployment status:
kubectl get deployments -n pat--booking--be
Cron job monitoring:
# Recent jobs
kubectl get jobs -n pat--booking--be --sort-by=.metadata.creationTimestamp | tail -10
# Jobs with no successful completions (also matches jobs that are still running)
kubectl get jobs -n pat--booking--be --field-selector status.successful=0
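The status.successful=0 selector cannot tell a failed job from one that simply has not finished yet. One way to narrow the list, sketched here, is to print each job's .status.failed count (the number of failed pods) and keep only the jobs where it is above zero. The function name is illustrative.

```shell
# Sketch: print names of Jobs that report at least one failed pod.
# Expects two columns on stdin: job name and .status.failed ("<none>" when unset).
failed_jobs() {
  awk 'NR > 1 && $2 != "<none>" && $2 + 0 > 0 { print $1 }'
}

# Example (not run here):
# kubectl get jobs -n pat--booking--be \
#   -o custom-columns=NAME:.metadata.name,FAILED:.status.failed | failed_jobs
```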
Events:
# Sorted oldest-first, so the most recent events are at the end
kubectl get events -n pat--booking--be --sort-by='.lastTimestamp' | tail -20
Traces (OpenTelemetry):
# Check OTEL collector
kubectl logs -f deployment/pat--booking--be-otel-collector -n pat--booking--be
# Port forward to access OTEL endpoints
kubectl port-forward -n pat--booking--be deployment/pat--booking--be-otel-collector 4317:4317
Data Flow
Patient Portal
↓
pat--booking--be--app--prod (NodePort 30978)
↓
Main Booking API (1-5 replicas via HPA)
↓
Database (external)
↓
Events Published to Kafka
↓
Consumers Process Events
↓
├─ Booking Processing
├─ Appointment Updates
├─ Test Results Integration (KEDA scaled)
├─ Recommendations
└─ Notifications
↓
Workers Process Background Jobs
↓
Patient notifications and updates
OpenTelemetry Tracing
Application → OTEL Collector (4317/4318) → Backend (Grafana/Jaeger)
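As a sketch of the first hop in that chain, a workload in the cluster would normally reach the collector through its Service DNS name. The snippet below builds that address and exports it through the standard OpenTelemetry SDK variable. Whether the application is actually configured this way is an assumption, as are the helper name and the default cluster.local domain.

```shell
# Sketch: build the in-cluster OTLP endpoint for the collector Service.
# Assumes the default cluster domain (cluster.local).
otlp_endpoint() {
  service=$1; namespace=$2; port=${3:-4317}
  echo "http://${service}.${namespace}.svc.cluster.local:${port}"
}

# OTEL_EXPORTER_OTLP_ENDPOINT is the standard OpenTelemetry SDK variable;
# 4317 is OTLP over gRPC, 4318 is OTLP over HTTP.
export OTEL_EXPORTER_OTLP_ENDPOINT="$(otlp_endpoint pat--booking--be-otel-collector pat--booking--be 4317)"
echo "$OTEL_EXPORTER_OTLP_ENDPOINT"
```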
Patient Booking Workflow
1. Appointment Booking
- Patient creates booking via portal
- consumer-booking processes booking events
- consumer-appointment-update handles appointment changes
- Validation and confirmation
2. Test Package Recommendations
- AI/rule-based recommendations
- consumer-pat-booking-recommend-test processes recommendations
- Test master data sync via consumer-sync-test-master-data
3. Customer Data Sync
- Patient profile updates
- consumer-customer synchronizes customer data
- Integration with main customer database
4. Test Results Integration
- High Volume: consumer-lis-testresult (2 replicas)
- Patient-facing: consumer-pat-test-result (KEDA-scaled 1-10)
- VIP Tiers: Separate consumers for SVIP1 and SVIP2
- Results displayed in patient portal
5. Notifications
- Booking confirmations
- Appointment reminders
- Test result availability notifications
- wrk--notifications processes notification jobs
6. Scheduled Tasks
- Cron job runs every minute
- Appointment reminders
- Status updates
- Data cleanup
Production Considerations
High Availability
Well Configured:
- LIS test result consumer: 2 replicas
- KEDA autoscaling for patient-facing test results
- Separate VIP tier consumers
Single Points of Failure:
- Main API: 1 replica at minimum (the HPA adds replicas under load but does not cover the failure of a single pod)
- Most consumers: 1 replica
- All workers: 1 replica
- OTEL collector: 1 replica
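The single-replica workloads listed above can be re-checked at any time by filtering deployments on their desired replica count. This is a minimal sketch with an illustrative function name. It reads the two-column output of a custom-columns query.

```shell
# Sketch: list deployments whose desired replica count is below 2.
# Expects two columns on stdin: deployment name and .spec.replicas.
single_replica_deployments() {
  awk 'NR > 1 && $2 + 0 < 2 { print $1 }'
}

# Example (not run here):
# kubectl get deployments -n pat--booking--be \
#   -o custom-columns=NAME:.metadata.name,REPLICAS:.spec.replicas | single_replica_deployments
```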
Auto-Scaling Configuration
| Workload | Type | Current | Min | Max | Scaling Metric |
|---|---|---|---|---|---|
| Main API | Standard HPA | 1 | 1 | 5 | CPU 100% |
| Test Result Consumer | KEDA | 1 | 1 | 10 | Queue depth 20 |
Current State: Very low load (API at 5% CPU, all consumers at minimum)
Recommendations
- Main API Scaling:
  - Currently at 1 replica (minimum)
  - CPU at 5% - very low load
  - Consider baseline of 2 replicas for HA
  - Max of 5 may be insufficient during peak booking times
- Consumer Resilience:
  - Critical consumers at 1 replica
  - Recommend 2 replicas for:
    - consumer-booking (critical path)
    - consumer-appointment-update (critical path)
    - consumer-pat-test-result (patient-facing)
- Test Result Processing:
  - LIS consumer: 2 replicas (good)
  - Patient test result: KEDA-scaled (good)
  - KEDA threshold at 20 messages - review if appropriate
  - VIP consumers at 1 replica - monitor if scaling needed
- Worker Scaling:
  - All workers at 1 replica
  - Consider 2 replicas for wrk--notifications (critical)
  - Monitor job queue depths
- Observability:
  - OTEL collector at 1 replica
  - Consider 2+ replicas for reliability
  - Monitor trace collection lag
- Cron Job Monitoring:
  - Runs every minute - very frequent
  - Monitor for failures
  - Consider if every-minute frequency is necessary
  - Review job history retention
- KEDA Expansion:
  - Only 1 consumer uses KEDA currently
  - Consider KEDA for:
    - consumer-booking
    - consumer-appointment-update
  - Lag-based scaling tracks consumer backlog directly, which CPU-based scaling does not
- Monitoring Priorities:
  - API response times
  - Booking success rates
  - Appointment confirmation latency
  - Test result delivery time
  - Notification delivery success
  - Cron job success rate
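The first recommendation above (a baseline of 2 API replicas) could be trialled with a merge patch on the HPA. The helper below only builds the patch body, and its name is illustrative. If the HPA is managed through Helm or a GitOps pipeline, which this document does not state, the change belongs in the source manifests instead, because a live patch would be overwritten on the next sync.

```shell
# Sketch: build a merge patch that sets the HPA replica bounds.
# Usage: hpa_bounds_patch MIN MAX
hpa_bounds_patch() {
  printf '{"spec":{"minReplicas":%d,"maxReplicas":%d}}\n' "$1" "$2"
}

hpa_bounds_patch 2 5   # prints {"spec":{"minReplicas":2,"maxReplicas":5}}

# Example (not run here): raise the main API baseline to 2 replicas
# kubectl patch hpa pat--booking--be--app--prod-hpa -n pat--booking--be \
#   --type merge -p "$(hpa_bounds_patch 2 5)"
```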
Troubleshooting
Main API issues:
# Check API pods
kubectl get pods -n pat--booking--be | grep "app--prod"
# Check HPA status
kubectl describe hpa pat--booking--be--app--prod-hpa -n pat--booking--be
# Check logs
kubectl logs -f deployment/pat--booking--be--app--prod -n pat--booking--be --tail=100
# Test API endpoint
kubectl port-forward -n pat--booking--be service/pat--booking--be--app--prod 8080:80
# Access http://localhost:8080
Consumer not processing:
# Check consumer status
kubectl get pods -n pat--booking--be | grep consumer-booking
# Check logs
kubectl logs -f deployment/pat--booking--be--consumer-booking--prod -n pat--booking--be
# Check for errors
kubectl logs deployment/pat--booking--be--consumer-booking--prod -n pat--booking--be --tail=100 | grep -i error
# Restart consumer
kubectl rollout restart deployment/pat--booking--be--consumer-booking--prod -n pat--booking--be
HPA not scaling:
# Check HPA events
kubectl describe hpa pat--booking--be--app--prod-hpa -n pat--booking--be
# Check metrics server
kubectl top nodes
kubectl top pods -n pat--booking--be
# Check KEDA operator (for test result consumer)
kubectl get pods -n keda
kubectl logs -n keda deployment/keda-operator
Test result delays:
# Check all test result consumers
kubectl get pods -n pat--booking--be | grep "test-result"
# Check KEDA scaled consumer
kubectl describe hpa keda-hpa-kafka-pat--booking--be--consumer-pat-test-result--prod -n pat--booking--be
# Check consumer logs
for consumer in consumer-lis-testresult consumer-pat-test-result consumer-pat-test-result-svip1 consumer-pat-test-result-svip2; do
  echo "=== ${consumer} ==="
  kubectl logs "deployment/pat--booking--be--${consumer}--prod" -n pat--booking--be --tail=20
done
Booking issues:
# Check booking consumer
kubectl logs -f deployment/pat--booking--be--consumer-booking--prod -n pat--booking--be
# Check appointment updates
kubectl logs -f deployment/pat--booking--be--consumer-appointment-update--prod -n pat--booking--be
# Check batch publisher
kubectl logs -f deployment/pat--booking--be--wrk--batch-publisher--prod -n pat--booking--be
# Restart booking workflow
kubectl rollout restart deployment/pat--booking--be--consumer-booking--prod -n pat--booking--be
kubectl rollout restart deployment/pat--booking--be--consumer-appointment-update--prod -n pat--booking--be
Notification delays:
# Check notification worker
kubectl logs -f deployment/pat--booking--be--wrk--notifications--prod -n pat--booking--be
# Check worker resource usage (a rough signal only; it does not show queued or stuck jobs)
kubectl top pods -n pat--booking--be | grep notifications
# Restart notification worker
kubectl rollout restart deployment/pat--booking--be--wrk--notifications--prod -n pat--booking--be
Cron job failures:
# View recent jobs
kubectl get jobs -n pat--booking--be --sort-by=.metadata.creationTimestamp | tail -20
# Check failed jobs
kubectl get jobs -n pat--booking--be --field-selector status.successful=0
# View cron job details
kubectl describe cronjob pat--booking--be--cron--prod -n pat--booking--be
# View logs of recent job
RECENT_JOB=$(kubectl get jobs -n pat--booking--be --sort-by=.metadata.creationTimestamp | tail -1 | awk '{print $1}')
kubectl logs job/$RECENT_JOB -n pat--booking--be
KEDA scaling issues:
# Check KEDA scaled object
kubectl describe scaledobject kafka-pat--booking--be--consumer-pat-test-result--prod -n pat--booking--be
# Check KEDA HPA
kubectl describe hpa keda-hpa-kafka-pat--booking--be--consumer-pat-test-result--prod -n pat--booking--be
# Check Kafka consumer lag (via KEDA)
kubectl get scaledobject kafka-pat--booking--be--consumer-pat-test-result--prod -n pat--booking--be -o yaml
# Check KEDA operator logs
kubectl logs -n keda deployment/keda-operator | grep pat--booking--be
Tracing issues:
# Check OTEL collector
kubectl logs -f deployment/pat--booking--be-otel-collector -n pat--booking--be
# Check collector metrics
kubectl port-forward -n pat--booking--be deployment/pat--booking--be-otel-collector 8888:8888
# Access http://localhost:8888/metrics
Consumer pod restarting or down:
# Find restart count
kubectl get pods -n pat--booking--be | grep consumer-sync-test-master-data
# Check pod events (uses the first matching pod)
POD_NAME=$(kubectl get pods -n pat--booking--be | grep consumer-sync-test-master-data | awk '{print $1}' | head -1)
kubectl describe pod "$POD_NAME" -n pat--booking--be | tail -20
# Check logs for errors
kubectl logs $POD_NAME -n pat--booking--be --previous
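To spot restarting pods across the whole namespace rather than one consumer at a time, restart counts can be pulled with custom-columns and filtered. This is a sketch with an illustrative function name. It sums the per-container counts, which kubectl prints comma-separated for pods with more than one container.

```shell
# Sketch: print "<pod> <total restarts>" for pods that have restarted at least once.
# Expects two columns on stdin: pod name and comma-separated container restart counts.
restarted_pods() {
  awk 'NR > 1 {
         n = split($2, counts, ","); total = 0
         for (i = 1; i <= n; i++) total += counts[i]
         if (total > 0) print $1, total
       }'
}

# Example (not run here):
# kubectl get pods -n pat--booking--be \
#   -o 'custom-columns=NAME:.metadata.name,RESTARTS:.status.containerStatuses[*].restartCount' | restarted_pods
```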
Performance Metrics
Current Scale (Production Load)
- Main API: 1 replica (very low load at 5% CPU, can scale to 5)
- Consumers:
- High reliability: 2 replicas (lis-testresult)
- KEDA-scaled: 1 replica (pat-test-result, scales to 10)
- Standard: 1 replica
- Workers: 1 replica each
- Total Active Pods: ~16-18 pods in namespace
Scaling Behavior
- Main API HPA: Scales on average CPU utilization (target: 100% of requested CPU)
- KEDA HPA: Scales on Kafka consumer lag (target: 20 messages per replica)
- Cron Job: Runs every minute (very frequent)
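The lag-based rule for the test result consumer can be approximated as one replica per 20 messages of total lag, clamped to the 1-10 range. The sketch below also caps the result at the topic's partition count, which is the Kafka scaler's default behaviour. The partition count of 12 is a made-up value for illustration, since the document does not state it, and the function name is likewise illustrative.

```shell
# Sketch of lag-based scaling: one replica per THRESHOLD messages of total lag,
# capped at the partition count and clamped to MIN..MAX.
# Usage: keda_desired_replicas TOTAL_LAG THRESHOLD MIN MAX PARTITIONS
keda_desired_replicas() {
  lag=$1; threshold=$2; min=$3; max=$4; partitions=$5
  desired=$(( (lag + threshold - 1) / threshold ))   # integer ceiling
  [ "$desired" -gt "$partitions" ] && desired=$partitions
  [ "$desired" -gt "$max" ] && desired=$max
  [ "$desired" -lt "$min" ] && desired=$min
  echo "$desired"
}

keda_desired_replicas 0 20 1 10 12     # idle queue, as observed today: prints 1
keda_desired_replicas 130 20 1 10 12   # hypothetical backlog of 130 messages: prints 7
```

A topic with fewer partitions than the max of 10 would cap scaling earlier, which is worth checking when reviewing the threshold of 20.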
Integration Points
External Systems
- Patient Portal Frontend:
  - Web and mobile applications
  - Real-time booking interface
  - Test result display
- Laboratory Information System (LIS):
  - Test result integration (2 replicas for reliability)
  - Test master data synchronization
  - Patient test data
- Clinic Management:
  - Appointment scheduling
  - Resource availability
  - Doctor schedules
- Recommendation Engine:
  - AI-based test recommendations
  - Package suggestions
  - Health screening recommendations
- Notification Systems:
  - SMS notifications
  - Email notifications
  - Push notifications
  - In-app notifications
Important Notes
PRODUCTION ENVIRONMENT:
- This is a CRITICAL PATIENT-FACING system
- Downtime directly impacts patient experience and appointment bookings
- Changes must be tested in staging first
- Coordinate with patient services team
- Monitor booking success rates carefully
- Have immediate rollback plan ready
Performance Observations:
- Very low load: API at 5% CPU, consumers at minimum replicas
- This could indicate:
- Off-peak hours
- Low patient portal usage
- Potential over-provisioning
- Review load patterns and adjust resource allocations accordingly
Compliance: Patient data is subject to strict regulatory requirements (HIPAA, and GDPR for international patients)
System Purpose
The patient booking backend serves as:
- Booking Engine: Appointment and service booking for patients
- Test Integration: Real-time test result delivery to patients
- Recommendations: AI-powered health screening suggestions
- Notification Hub: Multi-channel patient communication
- VIP Services: Dedicated processing for VIP/SVIP patients
Key Role: Central patient-facing booking and appointment management platform with real-time test result integration and intelligent recommendations.