pat--dtc--be
Overview
- Namespace: pat--dtc--be
- Purpose: Patient Direct-to-Consumer Backend - PRODUCTION
- Age: ~3 years 19 days (since October 2022)
- Status: Active - Direct consumer services
- Workloads: 5 deployments + 1 CronJob (all active)
- Environment: PRODUCTION - Patient direct services
Architecture
Direct-to-consumer system handling customer interactions, homekit registration, and test results:
- Main Application: REST API backend (4 replicas) - Excellent HA
- Event Consumers: Customer events, homekit registration, LIS test results (3 deployments, 4 total pods)
- Worker: Background job processing (1 deployment)
- CronJob: Scheduled task (runs every minute)
Auto-Scaling Configuration
No Auto-Scaling Configured:
- No HorizontalPodAutoscalers (HPAs)
- No KEDA scaled objects
- Fixed replica counts (Main app: 4, LIS consumer: 2, others: 1)
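With fixed replica counts, the numbers above can be compared directly against what the cluster reports. The sketch below is one way to do that. `under_replicated` is a hypothetical helper name, not something shipped with this namespace, and it relies only on `kubectl get deployments` with custom columns.

```shell
# Hypothetical helper: list deployments whose ready replica count is below
# the desired count. A missing readyReplicas field (printed as <none>) is
# treated as 0 ready pods.
under_replicated() {
  local ns="$1"
  kubectl get deployments -n "$ns" --no-headers \
    -o custom-columns=NAME:.metadata.name,DESIRED:.spec.replicas,READY:.status.readyReplicas \
    | awk '{ ready = ($3 ~ /^[0-9]+$/) ? $3 : 0
             if (ready + 0 < $2 + 0) print $1 ": " ready "/" $2 " ready" }'
}
```

Running `under_replicated pat--dtc--be` prints nothing when every deployment has its desired number of ready pods. `kubectl get hpa -n pat--dtc--be` can be used to re-confirm that no HPA has been added since this was written.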
Workload Categories
Main Application (1 deployment)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| pat--dtc--be--app--prod | 4/4 | Running | Main DTC API (excellent HA) |
Event Consumers (3 deployments - 4 total pods)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| consumer-lis-testresult | 2/2 | Running | LIS test result processing (HA) |
| consumer-customer | 1/1 | Running | Customer event processing |
| consumer-homekit-register | 1/1 | Running | Homekit device registration |
Workers (1 deployment)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| wrk--default | 1/1 | Running | Default worker queue |
Scheduler (1 CronJob)
| Name | Schedule | Status | Purpose |
|---|---|---|---|
| cron--prod | * * * * * (every minute) | Active | Scheduled tasks (high frequency) |
Services
| Name | Type | Cluster IP | Ports | NodePort | Purpose |
|---|---|---|---|---|---|
| pat--dtc--be--app--prod | NodePort | 10.8.22.90 | 80 | 32058 | Main DTC API |
Access & Management
View all resources:
kubectl get all -n pat--dtc--be
Check main application:
# View app pods (4 replicas)
kubectl get pods -n pat--dtc--be | grep "app--prod"
# Follow logs (kubectl picks a single pod of the deployment, not all replicas)
kubectl logs -f deployment/pat--dtc--be--app--prod -n pat--dtc--be
# Same, including every container in that pod
kubectl logs -f deployment/pat--dtc--be--app--prod -n pat--dtc--be --all-containers=true
Check consumers:
# All consumers
kubectl get pods -n pat--dtc--be | grep consumer
# LIS test result consumer (2 replicas)
kubectl logs -f deployment/pat--dtc--be--consumer-lis-testresult--prod -n pat--dtc--be
# Customer consumer
kubectl logs -f deployment/pat--dtc--be--consumer-customer--prod -n pat--dtc--be
# Homekit registration consumer
kubectl logs -f deployment/pat--dtc--be--consumer-homekit-register--prod -n pat--dtc--be
Check CronJob:
# View CronJob
kubectl get cronjob -n pat--dtc--be
# View recent job runs
kubectl get jobs -n pat--dtc--be --sort-by=.status.startTime
# View CronJob pods
kubectl get pods -n pat--dtc--be | grep cron
Restart services:
# Restart main app (all 4 replicas)
kubectl rollout restart deployment/pat--dtc--be--app--prod -n pat--dtc--be
# Restart LIS consumer (all 2 replicas)
kubectl rollout restart deployment/pat--dtc--be--consumer-lis-testresult--prod -n pat--dtc--be
# Restart all consumers
kubectl get deployments -n pat--dtc--be | grep consumer | awk '{print $1}' | xargs -I {} kubectl rollout restart deployment/{} -n pat--dtc--be
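`kubectl rollout restart` returns as soon as the restart has been triggered, so it does not confirm that the new pods became ready. A minimal sketch that also waits for the rollout is shown here. The helper name `restart_and_wait` and the default timeout of 180s are assumptions chosen for illustration.

```shell
# Hypothetical helper: restart a deployment, then block until the rollout
# completes or the timeout expires. Returns non-zero if either step fails.
restart_and_wait() {
  local ns="$1" deploy="$2" timeout="${3:-180s}"
  kubectl rollout restart "deployment/$deploy" -n "$ns" \
    && kubectl rollout status "deployment/$deploy" -n "$ns" --timeout="$timeout"
}
```

For example: `restart_and_wait pat--dtc--be pat--dtc--be--app--prod 300s`.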
Monitoring
Resource usage:
kubectl top pods -n pat--dtc--be --sort-by=memory
kubectl top pods -n pat--dtc--be --sort-by=cpu
Check CronJob executions:
# Recent jobs
kubectl get jobs -n pat--dtc--be --sort-by=.status.startTime | tail -10
# Jobs without a successful completion (failed or still running)
kubectl get jobs -n pat--dtc--be --field-selector status.successful=0
# Job logs
kubectl logs -n pat--dtc--be job/<job-name>
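The `status.successful=0` selector also matches jobs that are still running. To narrow the list to jobs that have recorded at least one failed pod, the Job's `.status.failed` counter can be read instead. The following is a sketch under that assumption, and `failed_jobs` is a hypothetical helper name.

```shell
# Hypothetical helper: print the names of jobs with at least one failed pod.
# Jobs without a .status.failed value show up as <none> and are skipped.
failed_jobs() {
  local ns="$1"
  kubectl get jobs -n "$ns" --no-headers \
    -o custom-columns=NAME:.metadata.name,FAILED:.status.failed \
    | awk '$2 ~ /^[0-9]+$/ && $2 > 0 { print $1 }'
}
```

A job listed here may still have succeeded on a retry, so the result is a starting point for `kubectl logs job/<job-name>` rather than a definitive failure list.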
Events:
kubectl get events -n pat--dtc--be --sort-by='.lastTimestamp' | tail -20
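When the namespace is noisy, it can help to look only at warning events. The sketch below filters on the event `type` field and keeps the newest entries. The helper name `recent_warnings` and the default of 20 lines are illustrative assumptions.

```shell
# Hypothetical helper: show the most recent Warning events in a namespace.
# Events are sorted oldest-first, so the newest ones are at the end.
recent_warnings() {
  local ns="$1" count="${2:-20}"
  kubectl get events -n "$ns" --field-selector type=Warning \
    --sort-by=.lastTimestamp | tail -n "$count"
}
```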
Data Flow
Direct-to-Consumer Request
↓
pat--dtc--be--app--prod (NodePort 32058)
↓
Main DTC API (4 replicas - Excellent HA)
↓
Database (external)
↓
Events Published to Message Queue
↓
Consumers Process Events
├─ LIS Test Results → consumer-lis-testresult (2 replicas)
├─ Customer Events → consumer-customer
└─ Homekit Registration → consumer-homekit-register
↓
Worker Processes Background Jobs
↓
CronJob → Scheduled Tasks (every minute)
↓
Direct consumer services, homekit integration
DTC Workflow
1. DTC API (Excellent High Availability)
- 4 replicas for high redundancy
- Direct consumer interactions
- Self-service patient portal
- Homekit device management
- Test result access
- Order placement
- Account management
2. LIS Test Result Processing (High Availability)
- 2 replicas for reliability
- consumer-lis-testresult processes test results from LIS
- Updates consumer-facing test results
- Notification triggers for new results
- Critical for patient test access
3. Customer Event Processing
- consumer-customer handles customer events
- Account updates
- Profile changes
- Preference management
4. Homekit Device Registration
- consumer-homekit-register processes homekit events
- Device pairing and registration
- Device authorization
- Integration with Apple HomeKit
5. Background Worker
- Async job processing
- Email/SMS sending
- Data synchronization
6. Scheduled Tasks (High Frequency)
- CronJob runs every minute (very high frequency)
- Regular health checks
- Data synchronization
- Session cleanup
Production Considerations
High Availability
Excellent Configuration:
- Main API: 4 replicas (excellent HA)
- LIS consumer: 2 replicas (good HA)
- Very mature namespace (~3 years)
Single Points of Failure:
- consumer-customer: 1 replica
- consumer-homekit-register: 1 replica
- Worker: 1 replica
CronJob Frequency:
- Runs every minute (very high frequency)
- Monitor for resource impact
- Review if all runs are necessary
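Whether a one-minute schedule causes resource pressure depends partly on how the CronJob handles overlapping runs and how many finished jobs it keeps. The sketch below prints those settings from the CronJob spec. `cronjob_settings` is a hypothetical helper, and the CronJob name is passed in as an argument because its full resource name is not confirmed in this document.

```shell
# Hypothetical helper: print the schedule, concurrency policy and job
# history limits of a CronJob, one "key=value" per line.
cronjob_settings() {
  local ns="$1" name="$2" tpl
  tpl='schedule={.spec.schedule}{"\n"}'
  tpl="$tpl"'concurrencyPolicy={.spec.concurrencyPolicy}{"\n"}'
  tpl="$tpl"'successfulJobsHistoryLimit={.spec.successfulJobsHistoryLimit}{"\n"}'
  tpl="$tpl"'failedJobsHistoryLimit={.spec.failedJobsHistoryLimit}{"\n"}'
  kubectl get cronjob "$name" -n "$ns" -o jsonpath="$tpl"
}
```

With `concurrencyPolicy` set to `Allow` (the Kubernetes default), a run that takes longer than a minute overlaps with the next one, which is worth checking first when reviewing resource impact.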
Recommendations
Auto-Scaling (Optional):
- Currently fixed at 4 replicas (excellent baseline)
- Consider HPA to scale during peak consumer traffic
- Target: 4-12 replicas based on load
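One way to try the 4-12 replica range is a CPU-based HPA created with `kubectl autoscale`. This is a sketch, not the configuration used in this namespace: the 70% CPU target is an illustrative assumption, `create_api_hpa` is a hypothetical helper name, and CPU utilization targets only work if the API pods declare CPU requests.

```shell
# Hypothetical helper: create an HPA for the main API that scales between
# 4 and 12 replicas. The CPU utilization target defaults to an assumed 70%.
create_api_hpa() {
  local cpu_target="${1:-70}"
  kubectl autoscale deployment pat--dtc--be--app--prod -n pat--dtc--be \
    --min=4 --max=12 --cpu-percent="$cpu_target"
}
```

After creating it, `kubectl get hpa -n pat--dtc--be` shows the current utilization against the target, which helps judge whether the assumed threshold is reasonable.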
Consumer Resilience:
- consumer-customer: 1 replica (consider 2)
- consumer-homekit-register: 1 replica (consider 2)
- Both handle important consumer-facing operations
Worker Resilience:
- Currently 1 replica
- Consider 2 replicas for HA
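The consumer and worker resilience suggestions above amount to raising three deployments from 1 to 2 replicas. A minimal sketch follows, using the deployment names that appear in the commands elsewhere in this document. `scale_single_replica_deployments` is a hypothetical helper, and it assumes these workloads are safe to run as concurrent instances, which is not confirmed here.

```shell
# Hypothetical helper: scale the three single-replica deployments to the
# given replica count (default 2).
scale_single_replica_deployments() {
  local replicas="${1:-2}" d
  for d in \
    pat--dtc--be--consumer-customer--prod \
    pat--dtc--be--consumer-homekit-register--prod \
    pat--dtc--be--wrk--default--prod
  do
    kubectl scale "deployment/$d" -n pat--dtc--be --replicas="$replicas"
  done
}
```

If the deployments are managed by Helm or a GitOps tool, a manual `kubectl scale` is likely to be reverted on the next sync, so the replica count would need to change in the source manifests as well.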
CronJob Review:
- Runs every minute (very frequent)
- Review if frequency can be reduced
- Monitor resource usage
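To make the frequency review concrete, the sketch below prints the cron expression and the number of runs per day for a candidate interval. `cron_for_interval` is a hypothetical helper, and it only accepts intervals from 1 to 30 minutes that divide 60 evenly, so that every run lands on a regular minute boundary.

```shell
# Hypothetical helper: for an interval in minutes (1-30, dividing 60 evenly),
# print the matching cron expression and the resulting runs per day.
cron_for_interval() {
  local minutes="$1"
  if [ "$minutes" -lt 1 ] || [ "$minutes" -gt 30 ] || [ $((60 % minutes)) -ne 0 ]; then
    echo "interval must be 1-30 minutes and divide 60 evenly" >&2
    return 1
  fi
  if [ "$minutes" -eq 1 ]; then
    echo "* * * * * -> 1440 runs/day"
  else
    echo "*/$minutes * * * * -> $((1440 / minutes)) runs/day"
  fi
}
```

The current schedule produces 1440 jobs per day, while `cron_for_interval 5` shows that a five-minute schedule would produce 288.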
Monitoring Priorities:
- API response times (4 replicas handling load)
- Consumer request success rates
- LIS test result delivery (2 consumer replicas)
- Homekit registration success
- CronJob execution success
Troubleshooting
Main API issues:
# Check all 4 API pods
kubectl get pods -n pat--dtc--be | grep "app--prod"
# Check recent logs (kubectl picks a single pod; --all-containers covers every container in it)
kubectl logs deployment/pat--dtc--be--app--prod -n pat--dtc--be --all-containers=true --tail=100
# Check specific pod
POD_NAME=$(kubectl get pods -n pat--dtc--be | grep "app--prod" | head -1 | awk '{print $1}')
kubectl logs $POD_NAME -n pat--dtc--be --tail=100
# Test API endpoint
kubectl port-forward -n pat--dtc--be service/pat--dtc--be--app--prod 8080:80
# Access http://localhost:8080
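The port-forward check can also be scripted so that it opens the tunnel, makes one request, and cleans up. This is a sketch under assumptions: `api_smoke_test` is a hypothetical helper, the default path `/` is a placeholder because the API's routes are not documented here, and local port 8080 must be free.

```shell
# Hypothetical helper: open a temporary port-forward to the API service,
# request one path, print the HTTP status code, then close the tunnel.
api_smoke_test() {
  local path="${1:-/}" pf_pid code
  kubectl port-forward -n pat--dtc--be service/pat--dtc--be--app--prod 8080:80 \
    >/dev/null 2>&1 &
  pf_pid=$!
  sleep 2  # give the tunnel a moment to come up
  # curl prints 000 as the status code when the connection fails
  code=$(curl -s -o /dev/null -w '%{http_code}' "http://localhost:8080$path") || true
  kill "$pf_pid" 2>/dev/null || true
  echo "$code"
}
```

A printed `000` means the request never reached the service, which points at the port-forward or the pods rather than at the application.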
LIS test result issues:
# Check both consumer pods
kubectl get pods -n pat--dtc--be | grep lis-testresult
# Check recent logs from one pod of the deployment (all containers)
kubectl logs deployment/pat--dtc--be--consumer-lis-testresult--prod -n pat--dtc--be --all-containers=true --tail=100
# Check each replica
for pod in $(kubectl get pods -n pat--dtc--be | grep lis-testresult | awk '{print $1}'); do
  echo "=== $pod ==="
  kubectl logs "$pod" -n pat--dtc--be --tail=50 | grep -i "error\|test\|result"
done
# Restart consumer (both replicas)
kubectl rollout restart deployment/pat--dtc--be--consumer-lis-testresult--prod -n pat--dtc--be
Customer event issues:
# Check customer consumer
kubectl logs -f deployment/pat--dtc--be--consumer-customer--prod -n pat--dtc--be
# Check for errors
kubectl logs deployment/pat--dtc--be--consumer-customer--prod -n pat--dtc--be --tail=100 | grep -i "error\|customer\|fail"
# Restart consumer
kubectl rollout restart deployment/pat--dtc--be--consumer-customer--prod -n pat--dtc--be
Homekit registration issues:
# Check homekit consumer
kubectl logs -f deployment/pat--dtc--be--consumer-homekit-register--prod -n pat--dtc--be
# Check for registration errors
kubectl logs deployment/pat--dtc--be--consumer-homekit-register--prod -n pat--dtc--be --tail=100 | grep -i "error\|homekit\|register\|fail"
# Restart consumer
kubectl rollout restart deployment/pat--dtc--be--consumer-homekit-register--prod -n pat--dtc--be
Worker issues:
# Check worker
kubectl logs -f deployment/pat--dtc--be--wrk--default--prod -n pat--dtc--be
# Check for errors
kubectl logs deployment/pat--dtc--be--wrk--default--prod -n pat--dtc--be --tail=100 | grep -i "error\|fail"
# Restart worker
kubectl rollout restart deployment/pat--dtc--be--wrk--default--prod -n pat--dtc--be
CronJob issues:
# Check CronJob status
kubectl get cronjob -n pat--dtc--be
# Check recent jobs
kubectl get jobs -n pat--dtc--be --sort-by=.status.startTime | tail -20
# Check jobs without a successful completion (failed or still running)
kubectl get jobs -n pat--dtc--be --field-selector status.successful=0
# Check specific job logs
kubectl logs -n pat--dtc--be job/<job-name>
# Delete old completed jobs
kubectl delete jobs -n pat--dtc--be --field-selector status.successful=1
Load distribution issues:
# Check resource usage across API replicas
kubectl top pods -n pat--dtc--be | grep app--prod
# Check resource usage across LIS consumer replicas
kubectl top pods -n pat--dtc--be | grep lis-testresult
# Restart all to redistribute load
kubectl rollout restart deployment/pat--dtc--be--app--prod -n pat--dtc--be
kubectl rollout restart deployment/pat--dtc--be--consumer-lis-testresult--prod -n pat--dtc--be
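Before restarting anything, it can help to quantify how uneven the load is. The sketch below summarizes CPU usage across the pods of one workload from `kubectl top pods` output. `cpu_spread` is a hypothetical helper, and it assumes CPU is reported in millicores (for example `12m`), which is the usual `kubectl top` format.

```shell
# Hypothetical helper: summarize CPU usage (millicores) across all pods
# whose name contains the given pattern.
cpu_spread() {
  local ns="$1" pattern="$2"
  kubectl top pods -n "$ns" --no-headers \
    | grep -- "$pattern" \
    | awk '
        { cpu = $2 + 0                        # "12m" becomes 12
          if (NR == 1 || cpu < min) min = cpu
          if (NR == 1 || cpu > max) max = cpu
          sum += cpu }
        END { if (NR) printf "pods=%d min=%dm max=%dm avg=%.0fm\n", NR, min, max, sum / NR }'
}
```

For example, `cpu_spread pat--dtc--be app--prod` prints one summary line for the API replicas. A large gap between `min` and `max` suggests uneven distribution, while similar values suggest a restart is unlikely to change much.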
Performance Metrics
Current Scale
- Main API: 4 replicas (excellent HA)
- LIS Consumer: 2 replicas (good HA)
- Other Consumers: 2 consumers at 1 replica each
- Worker: 1 replica
- CronJob: Runs every minute (high frequency)
- Total Active Pods: ~9 pods + CronJob pods
Stability
- Namespace Age: ~3 years (very mature, stable)
- Last Update: 199 days ago (no recent changes)
- HA Configuration: 4+2 replicas (excellent)