rfd--notification

Overview

Namespace: rfd--notification
Purpose: Referring Doctor (RFD) Notification Service - PRODUCTION
Age: ~2 years 195 days (since April 2023)
Status: Active - Doctor notification and communication system
Workloads: 8 deployments (all active)
Environment: PRODUCTION - Critical doctor communication platform

Architecture

An event-driven notification system for referring doctors. It handles doctor notifications, checkup events, order history, and user management through four workload types:

Main Application: REST API backend (1 replica)
Event Consumers: Process doctor notifications, checkup events, orders, user management (5 deployments)
Worker: Batch job publisher (1 deployment)
Scheduler: Cron jobs for scheduled tasks (1 deployment)

Auto-Scaling Configuration

No Auto-Scaling Configured:

No HorizontalPodAutoscalers (HPAs)
No KEDA scaled objects
Fixed replica counts for all deployments
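To confirm that this is still the case, both resource types can be queried directly. The following is a minimal sketch; the helper name check_autoscalers is illustrative, and it assumes that KEDA, if installed, registers its ScaledObject resource under the keda.sh API group.

```shell
# Hypothetical helper: list any HPAs or KEDA ScaledObjects in a namespace.
check_autoscalers() {
  ns="${1:-rfd--notification}"
  echo "--- HorizontalPodAutoscalers in $ns ---"
  kubectl get hpa -n "$ns"
  echo "--- KEDA ScaledObjects in $ns ---"
  # This call fails when the KEDA CRDs are not installed; report that
  # instead of printing a raw error.
  kubectl get scaledobjects.keda.sh -n "$ns" 2>/dev/null \
    || echo "KEDA ScaledObject CRD not found or not accessible"
}
```

Calling check_autoscalers with no argument inspects rfd--notification. No resources listed under either heading means replica counts are still fixed.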

Workload Categories

Main Application (1 deployment)

Name	Replicas	Status	Purpose
rfd--notification--be--app--prod	1/1	Running	Main RFD notification API

The main application handles:

Referring doctor communications
Notification management
Event orchestration
RESTful API for frontend and integrations

Event Consumers (5 deployments)

Process referring doctor and checkup-related events:

Name	Replicas	Status	Purpose
consumer-rfd-order-history-events	1/1	Running	RFD order history events
consumer-sapoche-checkup-event	3/3	Running	Checkup events (3 replicas for high volume)
consumer-send-notification	1/1	Running	Send notifications (newer - 21 days)
consumer-user-management-create-user	1/1	Running	User creation events
consumer-user-management-internal-login-event	1/1	Running	Internal login tracking

Workers & Scheduler (2 deployments)

Name	Replicas	Status	Purpose
wrk--batch-publisher	1/1	Running	Batch job publisher (newer - 21 days)
cron--prod	1/1	Running	Scheduled cron jobs

Services

Name	Type	Cluster IP	Ports	NodePort	Purpose
rfd--notification--be--app--prod	NodePort	10.8.22.245	80	31898	Main RFD notification API

Access & Management

View all resources:

kubectl get all -n rfd--notification

Check main application:

# View app pod
kubectl get pods -n rfd--notification | grep "app--prod"

# View logs
kubectl logs -f deployment/rfd--notification--be--app--prod -n rfd--notification

Check consumers:

# All consumers
kubectl get pods -n rfd--notification | grep consumer

# High-volume checkup consumer (3 replicas; kubectl streams logs from only one pod of the deployment)
kubectl logs -f deployment/rfd--notification--be--consumer-sapoche-checkup-event---prod -n rfd--notification

# Send notification consumer
kubectl logs -f deployment/rfd--notification--be--consumer-send-notification---prod -n rfd--notification

Check cron jobs:

# View cron pod
kubectl get pods -n rfd--notification | grep cron

# Cron logs
kubectl logs -f deployment/rfd--notification--be--cron--prod -n rfd--notification

Restart services:

# Restart main app
kubectl rollout restart deployment/rfd--notification--be--app--prod -n rfd--notification

# Restart specific consumer
kubectl rollout restart deployment/rfd--notification--be--consumer-send-notification---prod -n rfd--notification

# Restart all consumers
kubectl get deployments -n rfd--notification | grep consumer | awk '{print $1}' | xargs -I {} kubectl rollout restart deployment/{} -n rfd--notification
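
A restart is only complete once the new pods are ready. As a sketch, the one-liner above can be extended to wait for each rollout; the function name and the 120-second timeout are assumptions, not values taken from this deployment.

```shell
# Hypothetical helper: restart every consumer deployment, then wait for each rollout.
restart_consumers() {
  ns="${1:-rfd--notification}"
  # -o name prints "deployment.apps/<name>", which rollout commands accept directly.
  kubectl get deployments -n "$ns" -o name | grep consumer | while read -r dep; do
    kubectl rollout restart "$dep" -n "$ns"
    # Block until the new pods are ready, or give up after the timeout.
    kubectl rollout status "$dep" -n "$ns" --timeout=120s
  done
}
```

Because the loop waits on each deployment in turn, consumers are restarted one after another rather than all at once.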

Monitoring

Resource usage:

kubectl top pods -n rfd--notification --sort-by=memory
kubectl top pods -n rfd--notification --sort-by=cpu

Deployment status:

kubectl get deployments -n rfd--notification

Events:

kubectl get events -n rfd--notification --sort-by='.lastTimestamp' | tail -20
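
When triaging an incident, warnings are usually more useful than the full event stream. A small sketch, with recent_warnings as a hypothetical helper name, that keeps only Warning events and prints the most recent ones:

```shell
# Hypothetical helper: show the newest Warning events in a namespace.
recent_warnings() {
  ns="${1:-rfd--notification}"
  count="${2:-20}"
  # Events are sorted oldest-first, so the newest entries are at the end.
  kubectl get events -n "$ns" --field-selector type=Warning \
    --sort-by='.lastTimestamp' | tail -n "$count"
}
```

For example, recent_warnings rfd--notification 5 prints the five latest warnings.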

Data Flow

Referring Doctor Events
    ↓
rfd--notification--be--app--prod (NodePort 31898)
    ↓
Main RFD Notification API
    ↓
Message Queue (Kafka/Redpanda)
    ↓
Consumers Process Events
    ├─ Checkup Events → consumer-sapoche-checkup-event (3 replicas - high volume)
    ├─ Order History → consumer-rfd-order-history-events
    ├─ Send Notifications → consumer-send-notification
    ├─ User Creation → consumer-user-management-create-user
    └─ Login Events → consumer-user-management-internal-login-event
    ↓
Batch Publisher Creates Background Jobs
    ↓
Doctor notifications, order updates, user alerts

RFD Notification Workflow

1. Checkup Event Processing (High Volume)

Patient checkup events from Sapoche
3 replicas to handle high event volume
consumer-sapoche-checkup-event processes checkup notifications
Notifies referring doctors of checkup completion
Updates doctor dashboard

2. Order History Events

RFD order history tracking
consumer-rfd-order-history-events processes order events
Updates referring doctor order views
Historical order data synchronization

3. Send Notifications

Newer consumer (21 days old)
consumer-send-notification handles notification delivery
Multi-channel notification support
SMS, email, push notifications to doctors

4. User Management Events

User Creation: consumer-user-management-create-user
- New referring doctor registration
- User account setup
- Welcome notifications
Login Tracking: consumer-user-management-internal-login-event
- Internal login event tracking
- Security monitoring
- Access auditing

5. Scheduled Tasks

Cron job for periodic tasks
Scheduled notifications
Report generation
Data cleanup

Production Considerations

High Availability

Well Configured:

consumer-sapoche-checkup-event: 3 replicas (high volume handling)

Single Points of Failure:

Main API: 1 replica (no HA)
All other consumers: 1 replica each
Batch publisher: 1 replica
Cron job: 1 replica
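
The list above can be regenerated from the cluster rather than maintained by hand. A minimal sketch (single_replica_deployments is a hypothetical helper name) that prints every deployment whose desired replica count is below 2:

```shell
# Hypothetical helper: list deployments that run with fewer than 2 replicas.
single_replica_deployments() {
  ns="${1:-rfd--notification}"
  kubectl get deployments -n "$ns" --no-headers \
    -o custom-columns=NAME:.metadata.name,REPLICAS:.spec.replicas \
    | awk '$2 < 2 { print $1 }'
}
```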

No Auto-Scaling

No HPAs or KEDA configured
Fixed replica counts
Cannot automatically scale based on load
May struggle during peak checkup times

Recommendations

Main API Resilience:
- Currently 1 replica (single point of failure)
- Add HPA or increase to 2 replicas minimum
- Critical for doctor communications
Add Auto-Scaling:
- Consider KEDA for consumers based on queue depth:
  - consumer-sapoche-checkup-event (high volume - scale 3-10)
  - consumer-send-notification (notification delivery)
  - consumer-rfd-order-history-events
Consumer Resilience:
- consumer-sapoche-checkup-event: 3 replicas (good)
- Consider 2 replicas for critical consumers:
  - consumer-send-notification (critical for notification delivery)
  - consumer-rfd-order-history-events
Recent Deployments:
- consumer-send-notification: 21 days old (newer)
- wrk--batch-publisher: 21 days old (newer)
- Monitor stability and performance
Monitoring Priorities:
- API response times
- Checkup event processing lag (3 replicas handling load)
- Notification delivery success rates
- User management event processing
- Order history synchronization
Capacity Planning:
- consumer-sapoche-checkup-event at 3 replicas
- Review if sufficient for peak checkup volumes
- Consider scaling to 5-10 replicas during peak hours
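
The Main API Resilience recommendation can be sketched with standard kubectl commands. The helper names, the maximum of 4 replicas, and the 70% CPU target below are illustrative assumptions, not tuned values. An HPA on CPU also assumes that the API container declares CPU requests.

```shell
API_DEPLOYMENT="rfd--notification--be--app--prod"
NS="rfd--notification"

# Option A: raise the fixed replica count to 2 for basic redundancy.
scale_api_to_two() {
  kubectl scale "deployment/$API_DEPLOYMENT" --replicas=2 -n "$NS"
}

# Option B: let an HPA keep between 2 and 4 replicas based on CPU utilisation.
# (max=4 and cpu-percent=70 are placeholder values.)
add_api_hpa() {
  kubectl autoscale deployment "$API_DEPLOYMENT" -n "$NS" \
    --min=2 --max=4 --cpu-percent=70
}
```

Only one of the two options should be applied: once an HPA manages the deployment, it overrides a manually set replica count.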

Troubleshooting

Main API issues:

# Check API pod
kubectl get pods -n rfd--notification | grep "app--prod"

# Check logs
kubectl logs -f deployment/rfd--notification--be--app--prod -n rfd--notification --tail=100

# Test API endpoint
kubectl port-forward -n rfd--notification service/rfd--notification--be--app--prod 8080:80
# Access http://localhost:8080

Consumer not processing:

# Check consumer status
kubectl get pods -n rfd--notification | grep consumer

# Check checkup consumer (3 replicas)
kubectl logs -f deployment/rfd--notification--be--consumer-sapoche-checkup-event---prod -n rfd--notification

# Check all replicas
for pod in $(kubectl get pods -n rfd--notification | grep "consumer-sapoche-checkup" | awk '{print $1}'); do
  echo "=== $pod ==="
  kubectl logs "$pod" -n rfd--notification --tail=20
done

# Restart consumer
kubectl rollout restart deployment/rfd--notification--be--consumer-sapoche-checkup-event---prod -n rfd--notification

Checkup event delays:

# Check checkup consumer load (3 replicas)
kubectl top pods -n rfd--notification | grep checkup

# Check logs for lag
kubectl logs deployment/rfd--notification--be--consumer-sapoche-checkup-event---prod -n rfd--notification --tail=100 | grep -i "lag\|delay\|queue"

# Check pod distribution
kubectl get pods -n rfd--notification -o wide | grep checkup

# Restart all checkup consumer replicas
kubectl rollout restart deployment/rfd--notification--be--consumer-sapoche-checkup-event---prod -n rfd--notification

Notification delivery failures:

# Check send notification consumer
kubectl logs -f deployment/rfd--notification--be--consumer-send-notification---prod -n rfd--notification

# Check for errors
kubectl logs deployment/rfd--notification--be--consumer-send-notification---prod -n rfd--notification --tail=100 | grep -i "error\|fail\|exception"

# Restart consumer
kubectl rollout restart deployment/rfd--notification--be--consumer-send-notification---prod -n rfd--notification

Order history sync issues:

# Check order history consumer
kubectl logs -f deployment/rfd--notification--be--consumer-rfd-order-history-events---prod -n rfd--notification

# Check for sync errors
kubectl logs deployment/rfd--notification--be--consumer-rfd-order-history-events---prod -n rfd--notification --tail=100 | grep -i "error\|fail\|sync"

# Restart consumer
kubectl rollout restart deployment/rfd--notification--be--consumer-rfd-order-history-events---prod -n rfd--notification

User management issues:

# Check user creation consumer
kubectl logs -f deployment/rfd--notification--be--consumer-user-management-create-user---prod -n rfd--notification

# Check login event consumer
kubectl logs -f deployment/rfd--notification--be--consumer-user-management-internal-login-event---prod -n rfd--notification

# Restart both
kubectl rollout restart deployment/rfd--notification--be--consumer-user-management-create-user---prod -n rfd--notification
kubectl rollout restart deployment/rfd--notification--be--consumer-user-management-internal-login-event---prod -n rfd--notification

Cron job failures:

# Check cron pod
kubectl get pods -n rfd--notification | grep cron

# Check cron logs
kubectl logs -f deployment/rfd--notification--be--cron--prod -n rfd--notification

# Restart cron
kubectl rollout restart deployment/rfd--notification--be--cron--prod -n rfd--notification

Performance Metrics

Current Scale

Main API: 1 replica (no HA)
High-Volume Consumer: 3 replicas (checkup events)
Standard Consumers: 4 consumers at 1 replica each
Workers: 1 batch publisher, 1 cron
Total Active Pods: ~10 pods
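
The pod total follows from the replica counts: 1 API + 3 checkup consumers + 4 single-replica consumers + 1 batch publisher + 1 cron = 10. A quick sketch for comparing that figure against the live namespace (count_running_pods is a hypothetical helper name):

```shell
# Hypothetical helper: count pods currently in the Running phase.
count_running_pods() {
  ns="${1:-rfd--notification}"
  kubectl get pods -n "$ns" --no-headers \
    --field-selector=status.phase=Running | wc -l | tr -d ' '
}
```

A result other than 10 suggests a pod is pending, crashed, or mid-rollout.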

Scaling Pattern

consumer-sapoche-checkup-event: 3 replicas (highest volume)
All other consumers: 1 replica (standard volume)
No auto-scaling, fixed capacity

Stability

Namespace Age: ~2 years (mature, stable)
Recent Updates: 20 days ago (regular maintenance)
New Components: consumer-send-notification and batch-publisher (21 days)

Integration Points

External Systems

Sapoche Platform:
- Checkup events (high volume)
- Patient examination completions
- Test result notifications
Referring Doctor Portal:
- Doctor dashboard
- Order history view
- Notification preferences
Notification Channels:
- SMS notifications
- Email notifications
- Push notifications
- In-app notifications
User Management System:
- Doctor registration
- User account creation
- Login tracking

Internal Systems

Order Management: RFD order history
Checkup Service: Checkup event source
User Service: User management events
Notification Gateway: Multi-channel delivery

Important Notes

PRODUCTION ENVIRONMENT:

This is a CRITICAL DOCTOR COMMUNICATION system
Downtime impacts referring doctor notifications and updates
Changes must be tested in staging first
Monitor notification delivery rates carefully
Have immediate rollback plan ready
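
For deployments managed directly with kubectl, a rollback can be sketched as below. This assumes the previous revision is still retained and that no other tool (for example a GitOps controller) re-applies the newer version. The helper name and the 120-second timeout are illustrative.

```shell
# Hypothetical helper: revert a deployment to its previous revision.
rollback_deployment() {
  if [ $# -lt 1 ]; then
    echo "usage: rollback_deployment <deployment> [namespace]" >&2
    return 1
  fi
  dep="$1"
  ns="${2:-rfd--notification}"
  # Show recorded revisions so the operator can see what will be restored.
  kubectl rollout history "deployment/$dep" -n "$ns" || return 1
  # Revert to the previous revision.
  kubectl rollout undo "deployment/$dep" -n "$ns" || return 1
  # Wait until the reverted pods are ready.
  kubectl rollout status "deployment/$dep" -n "$ns" --timeout=120s
}
```

For example: rollback_deployment rfd--notification--be--app--prod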

No Auto-Scaling:

No HPAs or KEDA configured
Fixed replica counts cannot adapt to load
consumer-sapoche-checkup-event at 3 replicas (may need more during peaks)
Consider adding KEDA for queue-based scaling
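
As an illustration of queue-based scaling, the sketch below prints a KEDA ScaledObject that scales the checkup consumer between 3 and 10 replicas on Kafka consumer lag. It assumes KEDA is installed and that the message queue speaks the Kafka protocol. The broker address, consumer group, and topic are not documented here, so they are left as arguments; the helper name, the ScaledObject name, and lagThreshold of 100 are placeholders.

```shell
# Hypothetical helper: print a KEDA ScaledObject manifest for the checkup consumer.
# Broker address, consumer group, and topic must be supplied by the caller.
render_checkup_scaledobject() {
  if [ $# -ne 3 ]; then
    echo "usage: render_checkup_scaledobject <brokers> <consumer-group> <topic>" >&2
    return 1
  fi
  cat <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: consumer-sapoche-checkup-event
  namespace: rfd--notification
spec:
  scaleTargetRef:
    name: rfd--notification--be--consumer-sapoche-checkup-event---prod
  minReplicaCount: 3
  maxReplicaCount: 10
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: "$1"
        consumerGroup: "$2"
        topic: "$3"
        lagThreshold: "100"
EOF
}
```

The output can be checked without changing the cluster by piping it to kubectl apply --dry-run=server -f -, which only succeeds when the KEDA CRDs are present.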

Single Replica API:

Main API at 1 replica (single point of failure)
No redundancy for doctor communication API
Increase to 2+ replicas or add HPA

High Volume Consumer:

consumer-sapoche-checkup-event: 3 replicas
Handles high volume of checkup events
Monitor for capacity issues during peak times

Compliance: Doctor communication data is subject to regulatory requirements (HIPAA and other confidentiality requirements).

System Purpose

The RFD (Referring Doctor) notification system provides:

Checkup Notifications: High-volume checkup event processing (3 replicas)
Order History: RFD order tracking and history
Notification Delivery: Multi-channel doctor notifications
User Management: Doctor registration and login tracking
Event Processing: Event-driven doctor communication workflow

Key Role: Central notification and communication platform for referring doctors, processing high volumes of checkup events and delivering notifications across multiple channels.
