rfd--notification

Overview

  • Namespace: rfd--notification
  • Purpose: Referring Doctor (RFD) Notification Service - PRODUCTION
  • Age: ~2 years 195 days (since April 2023)
  • Status: Active - Doctor notification and communication system
  • Workloads: 8 deployments (all active)
  • Environment: PRODUCTION - Critical doctor communication platform

Architecture

The referring doctor notification system handles event-driven notifications for doctors, covering checkup events, order history, and user management. It consists of:

  • Main Application: REST API backend (1 replica)
  • Event Consumers: Process doctor notifications, checkup events, orders, user management (5 deployments)
  • Worker: Batch job publisher (1 deployment)
  • Scheduler: Cron jobs for scheduled tasks (1 deployment)

Auto-Scaling Configuration

No Auto-Scaling Configured:

  • No HorizontalPodAutoscalers (HPAs)
  • No KEDA scaled objects
  • Fixed replica counts for all deployments
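The absence of autoscalers can be re-checked at any time. The sketch below is one way to do it; the helper name count_autoscalers is illustrative, and the KEDA lookup assumes the KEDA CRDs are installed (when they are not, the failed lookup is counted as zero).

```shell
NS="rfd--notification"

# count_autoscalers: print how many HPAs and KEDA ScaledObjects exist in $NS.
count_autoscalers() {
  # HPA is a built-in resource type.
  hpa_count=$(kubectl get hpa -n "$NS" --no-headers 2>/dev/null | awk 'END { print NR }')
  # scaledobjects.keda.sh only resolves when the KEDA CRDs are installed;
  # a failed lookup produces no output and is therefore counted as zero.
  keda_count=$(kubectl get scaledobjects.keda.sh -n "$NS" --no-headers 2>/dev/null | awk 'END { print NR }')
  echo "hpa=$hpa_count keda=$keda_count"
}

# Usage against the live cluster:
#   count_autoscalers
```

For the configuration described above, both counts should be zero.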

Workload Categories

Main Application (1 deployment)

| Name | Replicas | Status | Purpose |
|------|----------|--------|---------|
| rfd--notification--be--app--prod | 1/1 | Running | Main RFD notification API |

The main application handles:

  • Referring doctor communications
  • Notification management
  • Event orchestration
  • RESTful API for frontend and integrations

Event Consumers (5 deployments)

Process referring doctor and checkup-related events:

| Name | Replicas | Status | Purpose |
|------|----------|--------|---------|
| consumer-rfd-order-history-events | 1/1 | Running | RFD order history events |
| consumer-sapoche-checkup-event | 3/3 | Running | Checkup events (3 replicas for high volume) |
| consumer-send-notification | 1/1 | Running | Send notifications (newer, 21 days old) |
| consumer-user-management-create-user | 1/1 | Running | User creation events |
| consumer-user-management-internal-login-event | 1/1 | Running | Internal login tracking |

Workers & Scheduler (2 deployments)

| Name | Replicas | Status | Purpose |
|------|----------|--------|---------|
| wrk--batch-publisher | 1/1 | Running | Batch job publisher (newer, 21 days old) |
| cron--prod | 1/1 | Running | Scheduled cron jobs |

Services

| Name | Type | Cluster IP | Ports | NodePort | Purpose |
|------|------|------------|-------|----------|---------|
| rfd--notification--be--app--prod | NodePort | 10.8.22.245 | 80 | 31898 | Main RFD notification API |

Access & Management

View all resources:

kubectl get all -n rfd--notification

Check main application:

# View app pod
kubectl get pods -n rfd--notification | grep "app--prod"

# View logs
kubectl logs -f deployment/rfd--notification--be--app--prod -n rfd--notification

Check consumers:

# All consumers
kubectl get pods -n rfd--notification | grep consumer

# High-volume checkup consumer (3 replicas; kubectl logs on a deployment follows only one pod)
kubectl logs -f deployment/rfd--notification--be--consumer-sapoche-checkup-event---prod -n rfd--notification

# Send notification consumer
kubectl logs -f deployment/rfd--notification--be--consumer-send-notification---prod -n rfd--notification

Check cron jobs:

# View cron pod
kubectl get pods -n rfd--notification | grep cron

# Cron logs
kubectl logs -f deployment/rfd--notification--be--cron--prod -n rfd--notification

Restart services:

# Restart main app
kubectl rollout restart deployment/rfd--notification--be--app--prod -n rfd--notification

# Restart specific consumer
kubectl rollout restart deployment/rfd--notification--be--consumer-send-notification---prod -n rfd--notification

# Restart all consumers
kubectl get deployments -n rfd--notification | grep consumer | awk '{print $1}' | xargs -I {} kubectl rollout restart deployment/{} -n rfd--notification
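Restarting every consumer in one pass briefly reduces capacity on all of them at once. A more cautious variant, sketched below, restarts one deployment at a time and waits for each rollout to finish. The function name restart_consumers and the 120s timeout are illustrative choices, not part of the existing tooling.

```shell
NS="rfd--notification"

# restart_consumers: restart each consumer deployment in turn and wait for
# its rollout to complete before moving on. Stops at the first failure.
restart_consumers() {
  # -o name prints "deployment.apps/<name>", which rollout accepts directly.
  for d in $(kubectl get deployments -n "$NS" -o name | grep consumer); do
    kubectl rollout restart "$d" -n "$NS" || return 1
    kubectl rollout status "$d" -n "$NS" --timeout=120s || return 1
  done
}

# Usage against the live cluster:
#   restart_consumers
```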

Monitoring

Resource usage:

kubectl top pods -n rfd--notification --sort-by=memory
kubectl top pods -n rfd--notification --sort-by=cpu

Deployment status:

kubectl get deployments -n rfd--notification

Events:

# Sorted oldest to newest, so tail shows the most recent events
kubectl get events -n rfd--notification --sort-by='.lastTimestamp' | tail -20

Data Flow

Referring Doctor Events
        ↓
rfd--notification--be--app--prod (NodePort 31898)
Main RFD Notification API
        ↓
Message Queue (Kafka/Redpanda)
        ↓
Consumers Process Events
├─ Checkup Events → consumer-sapoche-checkup-event (3 replicas - high volume)
├─ Order History → consumer-rfd-order-history-events
├─ Send Notifications → consumer-send-notification
├─ User Creation → consumer-user-management-create-user
└─ Login Events → consumer-user-management-internal-login-event
        ↓
Batch Publisher Creates Background Jobs
        ↓
Doctor notifications, order updates, user alerts

RFD Notification Workflow

1. Checkup Event Processing (High Volume)

  • Patient checkup events from Sapoche
  • 3 replicas to handle high event volume
  • consumer-sapoche-checkup-event processes checkup notifications
  • Notifies referring doctors of checkup completion
  • Updates doctor dashboard

2. Order History Events

  • RFD order history tracking
  • consumer-rfd-order-history-events processes order events
  • Updates referring doctor order views
  • Historical order data synchronization

3. Send Notifications

  • Newer consumer (21 days old)
  • consumer-send-notification handles notification delivery
  • Multi-channel notification support
  • SMS, email, push notifications to doctors

4. User Management Events

  • User Creation: consumer-user-management-create-user
    • New referring doctor registration
    • User account setup
    • Welcome notifications
  • Login Tracking: consumer-user-management-internal-login-event
    • Internal login event tracking
    • Security monitoring
    • Access auditing

5. Scheduled Tasks

  • Cron job for periodic tasks
  • Scheduled notifications
  • Report generation
  • Data cleanup

Production Considerations

High Availability

Well Configured:

  • consumer-sapoche-checkup-event: 3 replicas (high volume handling)

Single Points of Failure:

  • Main API: 1 replica (no HA)
  • All other consumers: 1 replica each
  • Batch publisher: 1 replica
  • Cron job: 1 replica
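The single-replica list above can be regenerated from the cluster rather than maintained by hand. A minimal sketch, assuming the desired replica count in .spec.replicas is the value of interest; the helper name single_replica_deployments is illustrative:

```shell
NS="rfd--notification"

# single_replica_deployments: print every deployment in $NS whose desired
# replica count is 1 or less.
single_replica_deployments() {
  kubectl get deployments -n "$NS" --no-headers \
    -o custom-columns=NAME:.metadata.name,REPLICAS:.spec.replicas |
    awk '$2 <= 1 { print $1 }'
}

# Usage against the live cluster:
#   single_replica_deployments
```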

No Auto-Scaling

  • No HPAs or KEDA configured
  • Fixed replica counts
  • Cannot automatically scale based on load
  • May struggle during peak checkup times

Recommendations

  1. Main API Resilience:

    • Currently 1 replica (single point of failure)
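    • Increase to at least 2 replicas, or add an HPA with a minimum of 2 (an HPA that can scale down to 1 does not remove the single point of failure)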
    • Critical for doctor communications
  2. Add Auto-Scaling:

    • Consider KEDA for consumers based on queue depth:
      • consumer-sapoche-checkup-event (high volume - scale 3-10)
      • consumer-send-notification (notification delivery)
      • consumer-rfd-order-history-events
  3. Consumer Resilience:

    • consumer-sapoche-checkup-event: 3 replicas (good)
    • Consider 2 replicas for critical consumers:
      • consumer-send-notification (critical for notification delivery)
      • consumer-rfd-order-history-events
  4. Recent Deployments:

    • consumer-send-notification: 21 days old (newer)
    • wrk--batch-publisher: 21 days old (newer)
    • Monitor stability and performance
  5. Monitoring Priorities:

    • API response times
    • Checkup event processing lag (3 replicas handling load)
    • Notification delivery success rates
    • User management event processing
    • Order history synchronization
  6. Capacity Planning:

    • consumer-sapoche-checkup-event at 3 replicas
    • Review if sufficient for peak checkup volumes
    • Consider scaling to 5-10 replicas during peak hours
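For the KEDA recommendation, the sketch below renders a ScaledObject manifest with a Kafka lag trigger for one consumer. It only prints YAML and does not apply anything. The broker address, consumer group, and topic are not documented here, so they appear as REPLACE_ME placeholders, and the lagThreshold of 50 is an arbitrary starting value rather than a tuned one. The function name render_scaledobject is illustrative.

```shell
# render_scaledobject: print a KEDA ScaledObject manifest for one consumer.
# Args: deployment name, min replicas, max replicas.
# BROKERS, GROUP and TOPIC are placeholders, not values taken from this cluster.
render_scaledobject() {
  cat <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: $1
  namespace: rfd--notification
spec:
  scaleTargetRef:
    name: $1
  minReplicaCount: $2
  maxReplicaCount: $3
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: ${BROKERS:-REPLACE_ME:9092}
        consumerGroup: ${GROUP:-REPLACE_ME}
        topic: ${TOPIC:-REPLACE_ME}
        lagThreshold: "50"
EOF
}

# Usage (review the output, then validate in staging first):
#   render_scaledobject rfd--notification--be--consumer-sapoche-checkup-event---prod 3 10
```

The 3 and 10 in the usage line mirror the "scale 3-10" range suggested above for the checkup consumer. KEDA must be installed in the cluster before such a manifest can be applied.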

Troubleshooting

Main API issues:

# Check API pod
kubectl get pods -n rfd--notification | grep "app--prod"

# Check logs
kubectl logs -f deployment/rfd--notification--be--app--prod -n rfd--notification --tail=100

# Test API endpoint
kubectl port-forward -n rfd--notification service/rfd--notification--be--app--prod 8080:80
# Access http://localhost:8080

Consumer not processing:

# Check consumer status
kubectl get pods -n rfd--notification | grep consumer

# Follow checkup consumer logs (kubectl picks one of the 3 replicas)
kubectl logs -f deployment/rfd--notification--be--consumer-sapoche-checkup-event---prod -n rfd--notification

# Check all replicas
for pod in $(kubectl get pods -n rfd--notification | grep "consumer-sapoche-checkup" | awk '{print $1}'); do
  echo "=== $pod ==="
  kubectl logs "$pod" -n rfd--notification --tail=20
done

# Restart consumer
kubectl rollout restart deployment/rfd--notification--be--consumer-sapoche-checkup-event---prod -n rfd--notification

Checkup event delays:

# Check checkup consumer load (3 replicas)
kubectl top pods -n rfd--notification | grep checkup

# Check logs for lag (reads one replica only; loop over pods to cover all 3)
kubectl logs deployment/rfd--notification--be--consumer-sapoche-checkup-event---prod -n rfd--notification --tail=100 | grep -i "lag\|delay\|queue"

# Check pod distribution
kubectl get pods -n rfd--notification -o wide | grep checkup

# Restart all checkup consumer replicas
kubectl rollout restart deployment/rfd--notification--be--consumer-sapoche-checkup-event---prod -n rfd--notification
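The pod distribution check above can be summarized as a count per node, which makes it easier to see whether all checkup replicas have landed on the same node. A minimal sketch; the helper name pods_per_node is illustrative, and it matches pods by a substring of their name:

```shell
NS="rfd--notification"

# pods_per_node: count pods whose name contains $1, grouped by node,
# with the busiest node listed first.
pods_per_node() {
  kubectl get pods -n "$NS" --no-headers \
    -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName |
    awk -v pat="$1" 'index($1, pat) { count[$2]++ }
                     END { for (n in count) print count[n], n }' |
    sort -rn
}

# Usage against the live cluster:
#   pods_per_node checkup
```

If all three replicas report the same node, a single node failure would stop checkup event processing until the pods are rescheduled.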

Notification delivery failures:

# Check send notification consumer
kubectl logs -f deployment/rfd--notification--be--consumer-send-notification---prod -n rfd--notification

# Check for errors
kubectl logs deployment/rfd--notification--be--consumer-send-notification---prod -n rfd--notification --tail=100 | grep -i "error\|fail\|exception"

# Restart consumer
kubectl rollout restart deployment/rfd--notification--be--consumer-send-notification---prod -n rfd--notification

Order history sync issues:

# Check order history consumer
kubectl logs -f deployment/rfd--notification--be--consumer-rfd-order-history-events---prod -n rfd--notification

# Check for sync errors
kubectl logs deployment/rfd--notification--be--consumer-rfd-order-history-events---prod -n rfd--notification --tail=100 | grep -i "sync\|order\|history"

# Restart consumer
kubectl rollout restart deployment/rfd--notification--be--consumer-rfd-order-history-events---prod -n rfd--notification

User management issues:

# Check user creation consumer
kubectl logs -f deployment/rfd--notification--be--consumer-user-management-create-user---prod -n rfd--notification

# Check login event consumer
kubectl logs -f deployment/rfd--notification--be--consumer-user-management-internal-login-event---prod -n rfd--notification

# Restart both
kubectl rollout restart deployment/rfd--notification--be--consumer-user-management-create-user---prod -n rfd--notification
kubectl rollout restart deployment/rfd--notification--be--consumer-user-management-internal-login-event---prod -n rfd--notification

Cron job failures:

# Check cron pod
kubectl get pods -n rfd--notification | grep cron

# Check cron logs
kubectl logs -f deployment/rfd--notification--be--cron--prod -n rfd--notification

# Restart cron
kubectl rollout restart deployment/rfd--notification--be--cron--prod -n rfd--notification

Performance Metrics

Current Scale

  • Main API: 1 replica (no HA)
  • High-Volume Consumer: 3 replicas (checkup events)
  • Standard Consumers: 4 consumers at 1 replica each
  • Workers: 1 batch publisher, 1 cron
  • Total Active Pods: 10 (1 API + 3 checkup consumer + 4 other consumers + 2 worker/scheduler)

Scaling Pattern

  • consumer-sapoche-checkup-event: 3 replicas (highest volume)
  • All other consumers: 1 replica (standard volume)
  • No auto-scaling, fixed capacity

Stability

  • Namespace Age: ~2.5 years (mature, stable)
  • Recent Updates: 20 days ago (regular maintenance)
  • New Components: consumer-send-notification and batch-publisher (21 days)

Integration Points

External Systems

  1. Sapoche Platform:

    • Checkup events (high volume)
    • Patient examination completions
    • Test result notifications
  2. Referring Doctor Portal:

    • Doctor dashboard
    • Order history view
    • Notification preferences
  3. Notification Channels:

    • SMS notifications
    • Email notifications
    • Push notifications
    • In-app notifications
  4. User Management System:

    • Doctor registration
    • User account creation
    • Login tracking

Internal Systems

  • Order Management: RFD order history
  • Checkup Service: Checkup event source
  • User Service: User management events
  • Notification Gateway: Multi-channel delivery

Important Notes

PRODUCTION ENVIRONMENT:

  • This is a CRITICAL DOCTOR COMMUNICATION system
  • Downtime impacts referring doctor notifications and updates
  • Changes must be tested in staging first
  • Monitor notification delivery rates carefully
  • Have immediate rollback plan ready
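One possible shape for the rollback plan is sketched below, assuming the previous deployment revision is the known-good one. The function name rollback_deployment and the 120s timeout are illustrative.

```shell
NS="rfd--notification"

# rollback_deployment: show the revision history of deployment $1, revert it
# to the previous revision, and wait until the rollback has rolled out.
rollback_deployment() {
  kubectl rollout history "deployment/$1" -n "$NS" &&
    kubectl rollout undo "deployment/$1" -n "$NS" &&
    kubectl rollout status "deployment/$1" -n "$NS" --timeout=120s
}

# Usage against the live cluster:
#   rollback_deployment rfd--notification--be--app--prod
```

Note that kubectl rollout undo reverts only the pod template of the deployment. Changes to ConfigMaps, Secrets, or upstream systems need their own rollback steps.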

No Auto-Scaling:

  • No HPAs or KEDA configured
  • Fixed replica counts cannot adapt to load
  • consumer-sapoche-checkup-event at 3 replicas (may need more during peaks)
  • Consider adding KEDA for queue-based scaling

Single Replica API:

  • Main API at 1 replica (single point of failure)
  • No redundancy for doctor communication API
  • Increase to 2+ replicas, or add an HPA with a minimum of at least 2 replicas
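As a stopgap for the single-replica API, the sketch below raises a deployment to a minimum replica count only when it is currently below it. The helper name ensure_min_replicas is illustrative. If the deployments are managed through Helm or a GitOps pipeline (not stated in this document), a manual scale is likely to be overwritten on the next deploy, and the replica count should be changed in the manifest instead.

```shell
NS="rfd--notification"

# ensure_min_replicas: scale deployment $1 up to $2 replicas if its desired
# replica count is currently lower; otherwise leave it untouched.
ensure_min_replicas() {
  current=$(kubectl get deployment "$1" -n "$NS" -o jsonpath='{.spec.replicas}')
  if [ "${current:-0}" -lt "$2" ]; then
    kubectl scale "deployment/$1" -n "$NS" --replicas="$2"
  else
    echo "$1 already at $current replicas"
  fi
}

# Usage against the live cluster:
#   ensure_min_replicas rfd--notification--be--app--prod 2
```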

High Volume Consumer:

  • consumer-sapoche-checkup-event: 3 replicas
  • Handles high volume of checkup events
  • Monitor for capacity issues during peak times

Compliance: Doctor communication data is subject to regulatory requirements (HIPAA, confidentiality requirements).

System Purpose

The RFD (Referring Doctor) notification system provides:

  1. Checkup Notifications: High-volume checkup event processing (3 replicas)
  2. Order History: RFD order tracking and history
  3. Notification Delivery: Multi-channel doctor notifications
  4. User Management: Doctor registration and login tracking
  5. Event Processing: Event-driven doctor communication workflow

Key Role: Central notification and communication platform for referring doctors, processing high volumes of checkup events and delivering notifications across multiple channels.