rfd--notification
Overview
- Namespace: rfd--notification
- Purpose: Referring Doctor (RFD) Notification Service - PRODUCTION
- Age: ~2 years 195 days (since April 2023)
- Status: Active - Doctor notification and communication system
- Workloads: 8 deployments (all active)
- Environment: PRODUCTION - Critical doctor communication platform
Architecture
Referring doctor notification system handling event-driven notifications for doctors, checkup events, order history, and user management:
- Main Application: REST API backend (1 replica)
- Event Consumers: Process doctor notifications, checkup events, orders, user management (5 deployments)
- Worker: Batch job publisher (1 deployment)
- Scheduler: Cron jobs for scheduled tasks (1 deployment)
Auto-Scaling Configuration
No Auto-Scaling Configured:
- No HorizontalPodAutoscalers (HPAs)
- No KEDA scaled objects
- Fixed replica counts for all deployments
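The absence of autoscalers can be re-checked at any time with a small helper along these lines. This is a minimal sketch: the function name `count_autoscalers` is illustrative, and the `scaledobjects.keda.sh` resource only resolves if the KEDA CRDs are installed in the cluster, which is an assumption here (when they are absent the error is discarded and the count stays at 0).

```shell
# Count HPAs and KEDA ScaledObjects in a namespace.
# Sketch only: the helper name is hypothetical, not part of any existing tooling.
count_autoscalers() {
  ns="$1"
  # --no-headers leaves one line per object; "No resources found" goes to stderr.
  hpa=$(kubectl get hpa -n "$ns" --no-headers 2>/dev/null | wc -l)
  keda=$(kubectl get scaledobjects.keda.sh -n "$ns" --no-headers 2>/dev/null | wc -l)
  # Arithmetic expansion strips any padding that wc adds on some platforms.
  echo "hpa=$((hpa)) keda=$((keda))"
}
```

Running `count_autoscalers rfd--notification` against the configuration described above would be expected to print `hpa=0 keda=0`.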
Workload Categories
Main Application (1 deployment)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| rfd--notification--be--app--prod | 1/1 | Running | Main RFD notification API |
The main application handles:
- Referring doctor communications
- Notification management
- Event orchestration
- RESTful API for frontend and integrations
Event Consumers (5 deployments)
Process referring doctor and checkup-related events:
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| consumer-rfd-order-history-events | 1/1 | Running | RFD order history events |
| consumer-sapoche-checkup-event | 3/3 | Running | Checkup events (3 replicas for high volume) |
| consumer-send-notification | 1/1 | Running | Send notifications (newer - 21 days) |
| consumer-user-management-create-user | 1/1 | Running | User creation events |
| consumer-user-management-internal-login-event | 1/1 | Running | Internal login tracking |
Workers & Scheduler (2 deployments)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| wrk--batch-publisher | 1/1 | Running | Batch job publisher (newer - 21 days) |
| cron--prod | 1/1 | Running | Scheduled cron jobs |
Services
| Name | Type | Cluster IP | Ports | NodePort | Purpose |
|---|---|---|---|---|---|
| rfd--notification--be--app--prod | NodePort | 10.8.22.245 | 80 | 31898 | Main RFD notification API |
Access & Management
View all resources:
kubectl get all -n rfd--notification
Check main application:
# View app pod
kubectl get pods -n rfd--notification | grep "app--prod"
# View logs
kubectl logs -f deployment/rfd--notification--be--app--prod -n rfd--notification
Check consumers:
# All consumers
kubectl get pods -n rfd--notification | grep consumer
# High-volume checkup consumer (3 replicas)
kubectl logs -f deployment/rfd--notification--be--consumer-sapoche-checkup-event---prod -n rfd--notification
# Send notification consumer
kubectl logs -f deployment/rfd--notification--be--consumer-send-notification---prod -n rfd--notification
Check cron jobs:
# View cron pod
kubectl get pods -n rfd--notification | grep cron
# Cron logs
kubectl logs -f deployment/rfd--notification--be--cron--prod -n rfd--notification
Restart services:
# Restart main app
kubectl rollout restart deployment/rfd--notification--be--app--prod -n rfd--notification
# Restart specific consumer
kubectl rollout restart deployment/rfd--notification--be--consumer-send-notification---prod -n rfd--notification
# Restart all consumers
kubectl get deployments -n rfd--notification | grep consumer | awk '{print $1}' | xargs -I {} kubectl rollout restart deployment/{} -n rfd--notification
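Because most workloads here run a single replica, it may be worth confirming that a restart actually completed rather than just issuing it. The sketch below wraps the restart commands above with `kubectl rollout status`; the helper name `restart_and_wait` and the 120s default timeout are assumptions, not existing conventions of this namespace.

```shell
# Restart one deployment and block until the rollout finishes or times out.
# Sketch only: helper name and default timeout are assumptions.
restart_and_wait() {
  ns="$1"
  deploy="$2"
  timeout="${3:-120s}"
  # rollout status exits non-zero if the rollout does not complete in time.
  kubectl rollout restart "deployment/$deploy" -n "$ns" &&
    kubectl rollout status "deployment/$deploy" -n "$ns" --timeout="$timeout"
}
```

For example, `restart_and_wait rfd--notification rfd--notification--be--consumer-send-notification---prod` returns a non-zero exit code if the new pod does not become ready within the timeout.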
Monitoring
Resource usage:
kubectl top pods -n rfd--notification --sort-by=memory
kubectl top pods -n rfd--notification --sort-by=cpu
Deployment status:
kubectl get deployments -n rfd--notification
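To spot degraded workloads quickly, the READY column of that output can be filtered for deployments with fewer ready pods than desired. A minimal sketch, assuming the default table layout of `kubectl get deployments` (NAME, READY, ...); the helper name `unready_deployments` is hypothetical.

```shell
# List deployments whose READY column (ready/desired) shows fewer ready pods than desired.
# Sketch only: the helper name is hypothetical; it parses the default table output.
unready_deployments() {
  ns="$1"
  kubectl get deployments -n "$ns" --no-headers |
    awk '{ split($2, r, "/"); if (r[1] + 0 < r[2] + 0) print $1, $2 }'
}
```

`unready_deployments rfd--notification` prints nothing when every deployment is fully ready.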
Events:
kubectl get events -n rfd--notification --sort-by='.lastTimestamp' | tail -20
Data Flow
Referring Doctor Events
↓
rfd--notification--be--app--prod (NodePort 31898)
↓
Main RFD Notification API
↓
Message Queue (Kafka/Redpanda)
↓
Consumers Process Events
├─ Checkup Events → consumer-sapoche-checkup-event (3 replicas - high volume)
├─ Order History → consumer-rfd-order-history-events
├─ Send Notifications → consumer-send-notification
├─ User Creation → consumer-user-management-create-user
└─ Login Events → consumer-user-management-internal-login-event
↓
Batch Publisher Creates Background Jobs
↓
Doctor notifications, order updates, user alerts
RFD Notification Workflow
1. Checkup Event Processing (High Volume)
- Patient checkup events from Sapoche
- 3 replicas to handle high event volume
- consumer-sapoche-checkup-event processes checkup notifications
- Notifies referring doctors of checkup completion
- Updates doctor dashboard
2. Order History Events
- RFD order history tracking
- consumer-rfd-order-history-events processes order events
- Updates referring doctor order views
- Historical order data synchronization
3. Send Notifications
- Newer consumer (21 days old)
- consumer-send-notification handles notification delivery
- Multi-channel notification support
- SMS, email, push notifications to doctors
4. User Management Events
- User Creation: consumer-user-management-create-user
  - New referring doctor registration
  - User account setup
  - Welcome notifications
- Login Tracking: consumer-user-management-internal-login-event
  - Internal login event tracking
  - Security monitoring
  - Access auditing
5. Scheduled Tasks
- Cron job for periodic tasks
- Scheduled notifications
- Report generation
- Data cleanup
Production Considerations
High Availability
Well Configured:
- consumer-sapoche-checkup-event: 3 replicas (high volume handling)
Single Points of Failure:
- Main API: 1 replica (no HA)
- All other consumers: 1 replica each
- Batch publisher: 1 replica
- Cron job: 1 replica
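The single points of failure listed above can be derived from the cluster rather than maintained by hand. A minimal sketch, assuming only standard `kubectl` output options; the helper name `single_replica_deployments` is hypothetical.

```shell
# Print deployments whose desired replica count is below 2.
# Sketch only: the helper name is hypothetical.
single_replica_deployments() {
  ns="$1"
  kubectl get deployments -n "$ns" --no-headers \
    -o custom-columns=NAME:.metadata.name,REPLICAS:.spec.replicas |
    awk '$2 < 2 { print $1 }'
}
```

With the replica counts documented here, `single_replica_deployments rfd--notification` should list every deployment except the checkup consumer.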
No Auto-Scaling
- No HPAs or KEDA configured
- Fixed replica counts
- Cannot automatically scale based on load
- May struggle during peak checkup times
Recommendations
- Main API Resilience:
  - Currently 1 replica (single point of failure)
  - Add HPA or increase to 2 replicas minimum
  - Critical for doctor communications
- Add Auto-Scaling:
  - Consider KEDA for consumers based on queue depth:
    - consumer-sapoche-checkup-event (high volume - scale 3-10)
    - consumer-send-notification (notification delivery)
    - consumer-rfd-order-history-events
- Consumer Resilience:
  - consumer-sapoche-checkup-event: 3 replicas (good)
  - Consider 2 replicas for critical consumers:
    - consumer-send-notification (critical for notification delivery)
    - consumer-rfd-order-history-events
- Recent Deployments:
  - consumer-send-notification: 21 days old (newer)
  - wrk--batch-publisher: 21 days old (newer)
  - Monitor stability and performance
- Monitoring Priorities:
  - API response times
  - Checkup event processing lag (3 replicas handling load)
  - Notification delivery success rates
  - User management event processing
  - Order history synchronization
- Capacity Planning:
  - consumer-sapoche-checkup-event at 3 replicas
  - Review if sufficient for peak checkup volumes
  - Consider scaling to 5-10 replicas during peak hours
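As a starting point for the KEDA recommendation, the sketch below renders a ScaledObject manifest for the checkup consumer using KEDA's Kafka scaler (Redpanda exposes the Kafka API, so the same trigger type is assumed to apply). Everything specific is an assumption: the broker address, consumer group, topic and the `lagThreshold` of 100 are placeholders not taken from this document, and the helper name `render_scaledobject` is hypothetical. Only the 3-10 replica range comes from the recommendation above.

```shell
# Render a KEDA ScaledObject manifest for a Kafka-backed consumer (sketch only).
# Broker address, consumer group, topic and lag threshold are NOT known from this
# document; every value passed in is a placeholder to be replaced with real ones.
render_scaledobject() {
  deploy="$1" brokers="$2" group="$3" topic="$4"
  cat <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: ${deploy}
  namespace: rfd--notification
spec:
  scaleTargetRef:
    name: ${deploy}
  minReplicaCount: 3
  maxReplicaCount: 10
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: ${brokers}
        consumerGroup: ${group}
        topic: ${topic}
        lagThreshold: "100"
EOF
}
```

The function only prints the manifest, so it can be reviewed before being piped to `kubectl apply -f -`. Note that in a Kafka consumer group, replicas beyond the topic's partition count would sit idle, so the upper bound of 10 is only useful if the topic has at least that many partitions.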
Troubleshooting
Main API issues:
# Check API pod
kubectl get pods -n rfd--notification | grep "app--prod"
# Check logs
kubectl logs -f deployment/rfd--notification--be--app--prod -n rfd--notification --tail=100
# Test API endpoint
kubectl port-forward -n rfd--notification service/rfd--notification--be--app--prod 8080:80
# Access http://localhost:8080
Consumer not processing:
# Check consumer status
kubectl get pods -n rfd--notification | grep consumer
# Check checkup consumer (3 replicas)
kubectl logs -f deployment/rfd--notification--be--consumer-sapoche-checkup-event---prod -n rfd--notification
# Check all replicas
for pod in $(kubectl get pods -n rfd--notification | grep "consumer-sapoche-checkup" | awk '{print $1}'); do
  echo "=== $pod ==="
  kubectl logs "$pod" -n rfd--notification --tail=20
done
# Restart consumer
kubectl rollout restart deployment/rfd--notification--be--consumer-sapoche-checkup-event---prod -n rfd--notification
Checkup event delays:
# Check checkup consumer load (3 replicas)
kubectl top pods -n rfd--notification | grep checkup
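# Check logs for lag (kubectl logs deployment/... reads from one pod only; use the per-pod loop above to cover all 3 replicas)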
kubectl logs deployment/rfd--notification--be--consumer-sapoche-checkup-event---prod -n rfd--notification --tail=100 | grep -i "lag\|delay\|queue"
# Check pod distribution
kubectl get pods -n rfd--notification -o wide | grep checkup
# Restart all checkup consumer replicas
kubectl rollout restart deployment/rfd--notification--be--consumer-sapoche-checkup-event---prod -n rfd--notification
Notification delivery failures:
# Check send notification consumer
kubectl logs -f deployment/rfd--notification--be--consumer-send-notification---prod -n rfd--notification
# Check for errors
kubectl logs deployment/rfd--notification--be--consumer-send-notification---prod -n rfd--notification --tail=100 | grep -i "error\|fail\|exception"
# Restart consumer
kubectl rollout restart deployment/rfd--notification--be--consumer-send-notification---prod -n rfd--notification
Order history sync issues:
# Check order history consumer
kubectl logs -f deployment/rfd--notification--be--consumer-rfd-order-history-events---prod -n rfd--notification
# Check for sync errors
kubectl logs deployment/rfd--notification--be--consumer-rfd-order-history-events---prod -n rfd--notification --tail=100 | grep -i "sync\|order\|history"
# Restart consumer
kubectl rollout restart deployment/rfd--notification--be--consumer-rfd-order-history-events---prod -n rfd--notification
User management issues:
# Check user creation consumer
kubectl logs -f deployment/rfd--notification--be--consumer-user-management-create-user---prod -n rfd--notification
# Check login event consumer
kubectl logs -f deployment/rfd--notification--be--consumer-user-management-internal-login-event---prod -n rfd--notification
# Restart both
kubectl rollout restart deployment/rfd--notification--be--consumer-user-management-create-user---prod -n rfd--notification
kubectl rollout restart deployment/rfd--notification--be--consumer-user-management-internal-login-event---prod -n rfd--notification
Cron job failures:
# Check cron pod
kubectl get pods -n rfd--notification | grep cron
# Check cron logs
kubectl logs -f deployment/rfd--notification--be--cron--prod -n rfd--notification
# Restart cron
kubectl rollout restart deployment/rfd--notification--be--cron--prod -n rfd--notification
Performance Metrics
Current Scale
- Main API: 1 replica (no HA)
- High-Volume Consumer: 3 replicas (checkup events)
- Standard Consumers: 4 consumers at 1 replica each
- Workers: 1 batch publisher, 1 cron
- Total Active Pods: 10 (1 API + 3 checkup consumer + 4 standard consumers + 2 workers)
Scaling Pattern
- consumer-sapoche-checkup-event: 3 replicas (highest volume)
- All other consumers: 1 replica (standard volume)
- No auto-scaling, fixed capacity
Stability
- Namespace Age: ~2 years 195 days (mature, stable)
- Recent Updates: 20 days ago (regular maintenance)
- New Components: consumer-send-notification and batch-publisher (21 days)
Integration Points
External Systems
- Sapoche Platform:
  - Checkup events (high volume)
  - Patient examination completions
  - Test result notifications
- Referring Doctor Portal:
  - Doctor dashboard
  - Order history view
  - Notification preferences
- Notification Channels:
  - SMS notifications
  - Email notifications
  - Push notifications
  - In-app notifications
- User Management System:
  - Doctor registration
  - User account creation
  - Login tracking
Internal Systems
- Order Management: RFD order history
- Checkup Service: Checkup event source
- User Service: User management events
- Notification Gateway: Multi-channel delivery
Important Notes
PRODUCTION ENVIRONMENT:
- This is a CRITICAL DOCTOR COMMUNICATION system
- Downtime impacts referring doctor notifications and updates
- Changes must be tested in staging first
- Monitor notification delivery rates carefully
- Have immediate rollback plan ready
No Auto-Scaling:
- No HPAs or KEDA configured
- Fixed replica counts cannot adapt to load
- consumer-sapoche-checkup-event at 3 replicas (may need more during peaks)
- Consider adding KEDA for queue-based scaling
Single Replica API:
- Main API at 1 replica (single point of failure)
- No redundancy for doctor communication API
- Increase to 2+ replicas or add HPA
High Volume Consumer:
- consumer-sapoche-checkup-event: 3 replicas
- Handles high volume of checkup events
- Monitor for capacity issues during peak times
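For the immediate rollback plan called for under PRODUCTION ENVIRONMENT, one possible shape is a helper that shows the recorded revisions, reverts to the previous one and waits for the result. This is a sketch rather than an established procedure for this namespace: the helper name `rollback_deployment` and the 120s timeout are assumptions, and it only reverts the Deployment spec, not any data or queue state.

```shell
# Roll a deployment back to its previous revision and wait for it to settle.
# Sketch only: the helper name and the 120s timeout are assumptions.
rollback_deployment() {
  ns="$1"
  deploy="$2"
  # Show recorded revisions first so the operator can see what will be restored.
  kubectl rollout history "deployment/$deploy" -n "$ns" &&
    kubectl rollout undo "deployment/$deploy" -n "$ns" &&
    kubectl rollout status "deployment/$deploy" -n "$ns" --timeout=120s
}
```

For example, `rollback_deployment rfd--notification rfd--notification--be--app--prod` would revert the main API to the revision that was running before the latest change.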
Compliance: Doctor communication data is subject to regulatory requirements (HIPAA, confidentiality obligations)
System Purpose
The RFD (Referring Doctor) notification system provides:
- Checkup Notifications: High-volume checkup event processing (3 replicas)
- Order History: RFD order tracking and history
- Notification Delivery: Multi-channel doctor notifications
- User Management: Doctor registration and login tracking
- Event Processing: Event-driven doctor communication workflow
Key Role: Central notification and communication platform for referring doctors, processing high volumes of checkup events and delivering notifications across multiple channels.