rfd--test-library--be
Overview
- Namespace: rfd--test-library--be
- Purpose: Referring Doctor Test Library Backend - PRODUCTION
- Age: ~2 years 151 days (since May 2023)
- Status: Active - Test catalog and library management
- Workloads: 6 deployments (5 active, 1 scaled to 0)
- Environment: PRODUCTION - Test information and catalog
Architecture
Test library system handling test catalog, master data sync, and publishing:
- Main Application: REST API backend (4 replicas) - Excellent HA with HPA
- Event Consumer: Master data synchronization (1 deployment)
- Workers: Default worker, Kafka publisher (2 active, 1 scaled to 0)
- Scheduler: Cron jobs for scheduled tasks
Auto-Scaling Configuration
HPA Configured:
- 1 Standard HPA: Main API (4-10 replicas, CPU + Memory based)
- Currently: 4/10 replicas (CPU: 13%, Memory: 60MB/150MB)
- HPA age: 209 days (proven setup)
Workload Categories
Main Application (1 deployment with HPA)
| Name | Replicas | HPA Status | Purpose |
|---|---|---|---|
| rfd--test-library--be--app--prod | 4/4 | HPA Active (4-10, CPU+Memory) | Main test library API |
Event Consumer (1 deployment)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| consumer-rfd-sync-master-data | 1/1 | Running | RFD master data sync |
Workers (3 deployments - 2 active, 1 scaled to 0)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| wrk--default | 1/1 | Running | Default worker queue |
| wrk--kafka-publisher | 1/1 | Running | Kafka event publisher |
| wrk--notifications | 0/0 | Scaled to 0 | Notification worker (INACTIVE) |
Scheduler (1 deployment)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| cron--prod | 1/1 | Running | Scheduled cron jobs |
HPA Details
Standard HPA - Main API
Name: rfd--test-library--be--app--prod-hpa
Current: 4 replicas
Min: 4 replicas
Max: 10 replicas
Metrics:
- CPU: 13% / 80% target
- Memory: 60MB / 150MB target
Status: Healthy, below target thresholds
Age: 209 days (proven configuration)
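To show why the HPA is holding at its minimum, the sketch below reproduces the documented Kubernetes HPA arithmetic: per metric, desired replicas = ceil(currentReplicas * currentValue / targetValue), clamped to the min/max bounds. The function name `hpa_desired_replicas` is illustrative and not part of any tool. It handles one metric at a time and ignores the controller's tolerance band and stabilization window, so treat it as an approximation.

```shell
#!/usr/bin/env bash
# Approximate the HPA replica calculation for a single metric.
# Usage: hpa_desired_replicas CURRENT_REPLICAS CURRENT_VALUE TARGET_VALUE MIN MAX
# Values must be integers in the same unit (percent, or MB).
hpa_desired_replicas() {
  local replicas=$1 current=$2 target=$3 min=$4 max=$5
  # Integer ceiling of replicas * current / target
  local desired=$(( (replicas * current + target - 1) / target ))
  # Clamp to the HPA's min/max bounds
  if (( desired < min )); then desired=$min; fi
  if (( desired > max )); then desired=$max; fi
  echo "$desired"
}

# CPU: 13% observed against an 80% target, 4 replicas, bounds 4-10
hpa_desired_replicas 4 13 80 4 10
# Memory: 60MB observed against a 150MB target
hpa_desired_replicas 4 60 150 4 10
```

Both calls print 4: the raw results (1 and 2) fall below the minimum, so the HPA stays at 4 replicas. The real controller takes the largest result across all configured metrics.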
Services
| Name | Type | Cluster IP | Ports | NodePort | Purpose |
|---|---|---|---|---|---|
| rfd--test-library--be--app--prod | NodePort | 10.8.21.192 | 80 | 30299 | Main test library API |
Access & Management
View all resources:
kubectl get all -n rfd--test-library--be
Check HPA:
# Check HPA status
kubectl get hpa -n rfd--test-library--be
# Detailed HPA info
kubectl describe hpa rfd--test-library--be--app--prod-hpa -n rfd--test-library--be
# Watch HPA scaling
kubectl get hpa -n rfd--test-library--be -w
Check main application:
# View app pods (4 replicas with HPA)
kubectl get pods -n rfd--test-library--be | grep "app--prod"
# Follow app logs (for a deployment target, kubectl streams from one pod it selects, not all replicas)
kubectl logs -f deployment/rfd--test-library--be--app--prod -n rfd--test-library--be
Check consumer and workers:
# Master data sync consumer
kubectl logs -f deployment/rfd--test-library--be--consumer-rfd-sync-master-data--prod -n rfd--test-library--be
# Kafka publisher worker
kubectl logs -f deployment/rfd--test-library--be--wrk--kafka-publisher--prod -n rfd--test-library--be
# Default worker
kubectl logs -f deployment/rfd--test-library--be--wrk--default--prod -n rfd--test-library--be
Restart services:
# Restart main app (HPA will maintain replicas)
kubectl rollout restart deployment/rfd--test-library--be--app--prod -n rfd--test-library--be
# Restart consumer
kubectl rollout restart deployment/rfd--test-library--be--consumer-rfd-sync-master-data--prod -n rfd--test-library--be
# Restart all workers (active only)
kubectl get deployments -n rfd--test-library--be | grep wrk | grep -v "0/0" | awk '{print $1}' | xargs -I {} kubectl rollout restart deployment/{} -n rfd--test-library--be
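The grep pipeline above matches text anywhere in each line. A slightly stricter alternative is sketched below: it selects worker deployments by name and checks the READY column explicitly. It assumes the default `kubectl get deployments` column layout (NAME, READY, ...) and the `--wrk--` naming pattern used in the commands above; the function name `active_workers` is illustrative.

```shell
#!/usr/bin/env bash
# Print names of worker deployments whose READY column is not 0/0.
# Reads `kubectl get deployments` output on stdin.
active_workers() {
  awk 'NR > 1 && $1 ~ /--wrk--/ && $2 != "0/0" { print $1 }'
}

# Typical use (requires cluster access), shown as a comment:
#   kubectl get deployments -n rfd--test-library--be | active_workers \
#     | xargs -I {} kubectl rollout restart deployment/{} -n rfd--test-library--be
```

Checking the column rather than the whole line avoids accidental matches, for example an AGE value that happens to contain the filtered text.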
Monitoring
Resource usage:
kubectl top pods -n rfd--test-library--be --sort-by=memory
kubectl top pods -n rfd--test-library--be --sort-by=cpu
HPA monitoring:
# Current HPA status
kubectl get hpa -n rfd--test-library--be
# HPA events
kubectl describe hpa rfd--test-library--be--app--prod-hpa -n rfd--test-library--be | grep -A 20 "Events:"
Events:
# Most recent 20 events (output is sorted oldest first, so take the tail)
kubectl get events -n rfd--test-library--be --sort-by='.lastTimestamp' | tail -20
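To track how often the HPA actually changes the replica count, the events output can be filtered for the `SuccessfulRescale` reason that the HPA controller records on each scale change. The sketch below assumes the default `kubectl get events` column order (LAST SEEN, TYPE, REASON, OBJECT, MESSAGE); the function name `count_rescales` is illustrative.

```shell
#!/usr/bin/env bash
# Count HPA rescale events in `kubectl get events` output read from stdin.
# Assumes the default column order, where the third field of a data row is REASON.
count_rescales() {
  awk '$3 == "SuccessfulRescale" { n++ } END { print n + 0 }'
}

# Typical use (requires cluster access), shown as a comment:
#   kubectl get events -n rfd--test-library--be | count_rescales
```

Events are retained only for a limited time by the cluster, so a count of 0 means no recent rescale, not that the HPA has never scaled.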
Data Flow
Doctor Test Library Request
↓
rfd--test-library--be--app--prod (NodePort 30299)
↓
Main Test Library API (4 replicas - HPA scaling 4-10)
↓
Database (external)
↓
Events Published to Kafka
├─ Master Data Sync → consumer-rfd-sync-master-data
└─ Kafka Publisher → wrk--kafka-publisher
↓
Workers Process Background Jobs
↓
Cron Jobs → Scheduled Tasks
↓
Test catalog, doctor test information
Test Library Workflow
1. Test Library API (Excellent Scaling)
- 4 replicas currently active (HPA managing)
- HPA scales 4-10 based on CPU + Memory
- Test catalog queries
- Test information lookup
- Test pricing information
- Test requirements and instructions
- Package information
2. Master Data Synchronization
consumer-rfd-sync-master-data syncs test master data:
- Test catalog updates
- Pricing synchronization
- Test information updates
- Integration with central test database
3. Kafka Event Publisher
wrk--kafka-publisher publishes events to Kafka:
- Test catalog change notifications
- Event-driven architecture
- Integration with other services
4. Background Worker
wrk--default handles general background tasks:
- Data processing
- Batch operations
5. Notification Worker (Scaled to 0)
wrk--notifications is scaled to 0:
- Previously handled notification tasks
- May be deprecated or moved elsewhere
6. Scheduled Tasks
- Cron jobs for periodic test library updates
- Cache refreshes
- Data validation
Production Considerations
High Availability
Excellent Configuration:
- Main API: 4 replicas with HPA (4-10) - excellent HA and scaling
- HPA proven setup (209 days)
- Current usage well below HPA targets (CPU: 13% against an 80% target, Memory: 60MB against a 150MB target)
- Very mature namespace (~2 years)
Single Points of Failure:
- Consumer: 1 replica (master data sync)
- Workers: 1 replica each
- Cron job: 1 replica
Scaled to 0:
- wrk--notifications: Scaled to 0 (review if needed)
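The single-replica workloads listed above can be cross-checked against the live cluster. The sketch below reads `kubectl get deployments` output and prints every deployment whose desired replica count is exactly 1. It assumes the default column layout, where READY is shown as ready/desired; the function name `single_replica_deployments` is illustrative.

```shell
#!/usr/bin/env bash
# List deployments with exactly one desired replica (potential single points of failure).
# Reads `kubectl get deployments` output on stdin; READY is "ready/desired".
single_replica_deployments() {
  awk 'NR > 1 { split($2, r, "/"); if (r[2] == 1) print $1 }'
}

# Typical use (requires cluster access), shown as a comment:
#   kubectl get deployments -n rfd--test-library--be | single_replica_deployments
```

Deployments scaled to 0 are deliberately excluded, since they are inactive rather than under-replicated.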
Recommendations
- Maintain Current HPA Setup (Already Excellent):
  - HPA for main API: Working well (4/10 replicas)
  - Healthy metrics (CPU 13%, Memory 40%)
  - Good headroom for scaling
  - Min: 4 replicas provides good HA baseline
- Consumer Resilience:
  - consumer-rfd-sync-master-data: 1 replica
  - Consider 2 replicas for master data sync reliability
  - Critical for test catalog accuracy
- Worker Resilience:
  - wrk--kafka-publisher: 1 replica (consider 2 for event reliability)
  - wrk--default: 1 replica (consider 2 for HA)
- Review Scaled-to-0 Worker:
  - wrk--notifications scaled to 0
  - Confirm if permanently unused
  - Remove if deprecated
- Recent Activity:
  - Very active: Pods updated 7 days ago
  - Frequent updates (multiple in past month)
  - Monitor stability after updates
- Monitoring Priorities:
  - HPA scaling events
  - Master data sync status
  - Kafka publishing success
  - Consumer lag
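If the replica recommendations are adopted, the change could be made with `kubectl scale`. The sketch below only prints the commands for review and does not run them. It assumes the deployment names used elsewhere on this page, and it does not establish that the consumer and workers are safe to run in parallel; that depends on how they handle concurrent processing, which is not documented here. If the deployments are managed by a chart or GitOps tooling, a manual scale may also be reverted on the next sync.

```shell
#!/usr/bin/env bash
# Print (do not run) scale commands for the single-replica deployments
# named in the recommendations.
NAMESPACE="rfd--test-library--be"

print_scale_commands() {
  local replicas=$1; shift
  local name
  for name in "$@"; do
    printf 'kubectl scale deployment/%s --replicas=%s -n %s\n' "$name" "$replicas" "$NAMESPACE"
  done
}

print_scale_commands 2 \
  rfd--test-library--be--consumer-rfd-sync-master-data--prod \
  rfd--test-library--be--wrk--kafka-publisher--prod \
  rfd--test-library--be--wrk--default--prod
```

Printing first keeps a production change reviewable; the output can be pasted into a change request before anything is applied.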
Troubleshooting
HPA issues:
# Check HPA status
kubectl get hpa -n rfd--test-library--be
# Detailed HPA metrics
kubectl describe hpa rfd--test-library--be--app--prod-hpa -n rfd--test-library--be
# Watch for scaling events
kubectl get hpa -n rfd--test-library--be -w
Main API issues:
# Check all 4 API pods
kubectl get pods -n rfd--test-library--be | grep "app--prod"
# Check recent logs from one API pod, all containers (kubectl selects a single pod for a deployment target)
kubectl logs deployment/rfd--test-library--be--app--prod -n rfd--test-library--be --all-containers=true --tail=100
# Forward local port 8080 to the service, then test the API at http://localhost:8080 from another terminal
kubectl port-forward -n rfd--test-library--be service/rfd--test-library--be--app--prod 8080:80
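With the port-forward above running, a quick smoke test could look like the sketch below. The API's routes are not documented on this page, so the path argument is a placeholder and defaults to `/`. The helper names `api_status` and `is_healthy_status` are illustrative.

```shell
#!/usr/bin/env bash
# Return success when an HTTP status code is in the 2xx or 3xx range.
is_healthy_status() {
  case "$1" in
    2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Fetch only the HTTP status code for a path on the forwarded port.
# The path is a placeholder; substitute a route the API actually serves.
api_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "http://localhost:8080${1:-/}"
}

# With the port-forward running in another terminal:
#   code="$(api_status /)"
#   if is_healthy_status "$code"; then echo "API reachable ($code)"; else echo "API check failed ($code)"; fi
```

curl reports a status of 000 when it cannot connect, which the check treats as a failure.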
Master data sync issues:
# Check consumer
kubectl logs -f deployment/rfd--test-library--be--consumer-rfd-sync-master-data--prod -n rfd--test-library--be
# Check for sync errors
kubectl logs deployment/rfd--test-library--be--consumer-rfd-sync-master-data--prod -n rfd--test-library--be --tail=100 | grep -i "error\|sync\|fail"
# Restart consumer
kubectl rollout restart deployment/rfd--test-library--be--consumer-rfd-sync-master-data--prod -n rfd--test-library--be
Kafka publisher issues:
# Check Kafka publisher
kubectl logs -f deployment/rfd--test-library--be--wrk--kafka-publisher--prod -n rfd--test-library--be
# Check for publishing errors
kubectl logs deployment/rfd--test-library--be--wrk--kafka-publisher--prod -n rfd--test-library--be --tail=100 | grep -i "error\|kafka\|publish"
# Restart publisher
kubectl rollout restart deployment/rfd--test-library--be--wrk--kafka-publisher--prod -n rfd--test-library--be
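The grep filters above mix several keywords in one pattern, so a busy log can return mostly routine lines (for example, every line mentioning "kafka"). One option is to count matches per keyword first and then drill into whichever stands out. The sketch below reads log text from stdin; the function name `keyword_counts` is illustrative.

```shell
#!/usr/bin/env bash
# Print a case-insensitive count of matching lines for each keyword argument.
# Reads log text on stdin.
keyword_counts() {
  local logs kw
  logs="$(cat)"
  for kw in "$@"; do
    # grep -c prints 0 (and exits non-zero) when nothing matches; only the count is used
    printf '%s=%s\n' "$kw" "$(printf '%s\n' "$logs" | grep -ic -- "$kw")"
  done
}

# Typical use (requires cluster access), shown as a comment:
#   kubectl logs deployment/rfd--test-library--be--wrk--kafka-publisher--prod \
#     -n rfd--test-library--be --tail=100 | keyword_counts error publish
```

Counts are per line, so a line containing a keyword twice is counted once.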
Cron job issues:
# Check cron
kubectl logs -f deployment/rfd--test-library--be--cron--prod -n rfd--test-library--be
# Restart cron
kubectl rollout restart deployment/rfd--test-library--be--cron--prod -n rfd--test-library--be
Performance Metrics
Current Scale
- Main API: 4/10 replicas (HPA active, CPU+Memory based)
- CPU: 13% (target: 80%)
- Memory: 60MB (target: 150MB)
- Healthy headroom for scaling
- Consumer: 1 replica (master data sync)
- Workers: 2 active workers at 1 replica each
- Cron: 1 replica
- Scaled to 0: 1 worker (notifications)
- Total Active Pods: 8 (4 API + 1 consumer + 2 workers + 1 cron)
Stability
- Namespace Age: ~2 years (very mature)
- HPA Age: 209 days (proven scaling setup)
- Recent Updates: 7 days ago (very active development)
- Auto-Scaling: Excellent (Standard HPA with good metrics)