rfd--user-mgt

Overview

  • Namespace: rfd--user-mgt
  • Purpose: Referring Doctor User Management Backend - PRODUCTION
  • Age: ~3 years 80 days (since August 2022)
  • Status: Active - User registration, authentication, and profile management
  • Workloads: 7 deployments (all active)
  • Environment: PRODUCTION - Doctor user management and authentication

Architecture

User management system handling doctor accounts, authentication, and notifications:

  • Main Application: REST API backend (3 replicas) - Good HA
  • Event Consumers: Customer, email notifications, login events, OTP (4 deployments)
  • Worker: Default background worker (1 deployment)
  • Observability: OpenTelemetry collector

Auto-Scaling Configuration

No Auto-Scaling Configured:

  • No HorizontalPodAutoscalers (HPAs)
  • No KEDA scaled objects
  • Fixed replica counts
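
The absence of autoscalers can be confirmed against the cluster. This is a minimal sketch: the helper name `count_resources` is hypothetical, and the KEDA check assumes the usual `scaledobjects.keda.sh` resource name, which only resolves when the KEDA CRDs are installed.

```shell
# Hypothetical helper: print how many objects of a given kind exist in a namespace.
# With --no-headers, kubectl writes nothing to stdout when there are no resources
# (the "No resources found" message goes to stderr), so an empty result counts as 0.
# Caveat: any kubectl error (missing CRD, no cluster access) is also reported as 0.
count_resources() {
  kubectl get "$1" -n "$2" --no-headers 2>/dev/null | wc -l | tr -d ' '
}

# Example invocations (both should print 0 if no autoscaling is configured):
#   count_resources hpa rfd--user-mgt
#   count_resources scaledobjects.keda.sh rfd--user-mgt
```

Because errors are also counted as 0, run the plain `kubectl get hpa -n rfd--user-mgt` once by hand to confirm cluster access before trusting the result.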

Workload Categories

Main Application (1 deployment with HA)

Name | Replicas | Status | Purpose
rfd--user-mgt--be--app--prod | 3/3 | Running | Main user management API (Good HA)

Event Consumers (4 deployments)

Name | Replicas | Status | Purpose
consumer-customer | 1/1 | Running | Customer account events
consumer-email-notification | 1/1 | Running | Email notification processing
consumer-internal-login-event | 1/1 | Running | Internal login tracking
consumer-send-otp | 1/1 | Running | OTP sending and verification

Workers (1 deployment)

Name | Replicas | Status | Purpose
wrk--default | 1/1 | Running | Default worker queue

Observability (1 deployment)

Name | Replicas | Status | Purpose
otel-collector | 1/1 | Running | OpenTelemetry metrics/traces

Services

Name | Type | Cluster IP | Ports | NodePort | Purpose
rfd--user-mgt--be--app--prod | NodePort | 10.8.25.140 | 80 | 31057 | Main user management API
rfd--user-mgt--be-otel-collector | ClusterIP | 10.8.29.216 | 4317, 4318 | - | OpenTelemetry collector
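
The NodePort listed in the table can be read back from the live Service rather than copied by hand. This is a small sketch with a hypothetical helper name, `service_node_port`; it assumes the port of interest is the first entry in the Service's port list.

```shell
# Hypothetical helper: print the nodePort of the first port of a Service.
# Arguments: <service-name> <namespace>
service_node_port() {
  kubectl get service "$1" -n "$2" -o jsonpath='{.spec.ports[0].nodePort}'
}

# Example invocation:
#   service_node_port rfd--user-mgt--be--app--prod rfd--user-mgt
```

For a ClusterIP Service such as the OpenTelemetry collector, the field is unset and the command prints nothing.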

Access & Management

View all resources:

kubectl get all -n rfd--user-mgt

Check main application:

# View app pods (3 replicas)
kubectl get pods -n rfd--user-mgt | grep "app--prod"

# Stream logs (kubectl follows a single pod of the deployment, not all 3 replicas)
kubectl logs -f deployment/rfd--user-mgt--be--app--prod -n rfd--user-mgt

Check consumers:

# Customer events
kubectl logs -f deployment/rfd--user-mgt--be--consumer-customer--prod -n rfd--user-mgt

# Email notifications
kubectl logs -f deployment/rfd--user-mgt--be--consumer-email-notification--prod -n rfd--user-mgt

# Login events
kubectl logs -f deployment/rfd--user-mgt--be--consumer-internal-login-event--prod -n rfd--user-mgt

# OTP sending
kubectl logs -f deployment/rfd--user-mgt--be--consumer-send-otp--prod -n rfd--user-mgt

Restart services:

# Restart main app
kubectl rollout restart deployment/rfd--user-mgt--be--app--prod -n rfd--user-mgt

# Restart all consumers
kubectl get deployments -n rfd--user-mgt | grep consumer | awk '{print $1}' | xargs -I {} kubectl rollout restart deployment/{} -n rfd--user-mgt

# Restart worker
kubectl rollout restart deployment/rfd--user-mgt--be--wrk--default--prod -n rfd--user-mgt
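
A restart only triggers the rollout; it does not confirm that the new pods became ready. The sketch below pairs the restart with `kubectl rollout status`. The helper name `restart_and_wait` and the 120s default timeout are assumptions, not values taken from this namespace.

```shell
# Hypothetical helper: restart a deployment and block until the rollout finishes.
# `kubectl rollout status` exits non-zero if the rollout does not complete within
# the timeout, so the function's exit status reflects success or failure.
# Arguments: <deployment> <namespace> [timeout]
restart_and_wait() {
  wait_for="${3:-120s}"   # assumed default; tune to the application's startup time
  kubectl rollout restart "deployment/$1" -n "$2" &&
    kubectl rollout status "deployment/$1" -n "$2" --timeout="$wait_for"
}

# Example invocation:
#   restart_and_wait rfd--user-mgt--be--app--prod rfd--user-mgt 180s
```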

Monitoring

Resource usage:

kubectl top pods -n rfd--user-mgt --sort-by=memory
kubectl top pods -n rfd--user-mgt --sort-by=cpu

Events:

kubectl get events -n rfd--user-mgt --sort-by='.lastTimestamp' | tail -20
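
In a busy namespace, Normal events can drown out the ones that matter. The sketch below narrows the listing to Warning events using a field selector. The helper name `recent_warnings` and the default limit of 20 are assumptions.

```shell
# Hypothetical helper: show the most recent Warning events in a namespace.
# --sort-by orders events oldest-first, so `tail` keeps the newest ones.
# Arguments: <namespace> [limit]
recent_warnings() {
  kubectl get events -n "$1" --no-headers \
    --field-selector type=Warning \
    --sort-by='.lastTimestamp' | tail -n "${2:-20}"
}

# Example invocation:
#   recent_warnings rfd--user-mgt 10
```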

Data Flow

Doctor User Management Request
  ↓
rfd--user-mgt--be--app--prod (NodePort 31057)
  ↓
Main User Management API (3 replicas)
  ↓
Database (external)
  ↓
Events Published to Message Queue
├─ Customer Events → consumer-customer
├─ Email Notifications → consumer-email-notification
├─ Login Tracking → consumer-internal-login-event
└─ OTP Requests → consumer-send-otp
  ↓
Worker Processes Background Jobs
  ↓
OpenTelemetry → Observability

User Management Workflow

1. User Management API (Good HA)

  • 3 replicas provide good redundancy
  • Doctor registration and profile management
  • Account authentication
  • Role and permission management
  • Profile updates

2. Customer Event Consumer

  • consumer-customer processes customer account events
  • Account creation and updates
  • Profile synchronization
  • Account status changes

3. Email Notification Consumer

  • consumer-email-notification handles email sending
  • Welcome emails
  • Verification emails
  • Password reset notifications
  • Account update notifications

4. Login Event Consumer

  • consumer-internal-login-event tracks login activity
  • Login analytics
  • Session management
  • Security monitoring
  • Audit logging

5. OTP Consumer

  • consumer-send-otp handles OTP delivery
  • One-time password generation
  • OTP sending via SMS/email
  • Verification code management
  • Time-based expiration

6. Background Worker

  • wrk--default handles general background tasks
  • Data processing
  • Cleanup operations
  • Scheduled maintenance

7. OpenTelemetry Collector

  • Metrics collection
  • Distributed tracing
  • Performance monitoring
  • Error tracking

Production Considerations

High Availability

Good Main API Configuration:

  • Main API: 3 replicas (good HA)
  • Very mature namespace (~3 years)
  • Very active development (updated 23h ago)

Single Points of Failure:

  • All 4 consumers: 1 replica each
  • Worker: 1 replica
  • OTP consumer is particularly critical: with a single replica, OTP delivery stops whenever that pod is down
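
The single-replica workloads listed above can be picked out of the deployment listing automatically. This is a minimal sketch; the helper name `single_replica_deployments` is hypothetical, and it relies on the READY column of `kubectl get deployments` having the form `ready/desired`.

```shell
# Hypothetical helper: read `kubectl get deployments --no-headers` output on stdin
# and print the names of deployments whose desired replica count is 1.
# Column 2 (READY) has the form "ready/desired", e.g. "1/1" or "3/3".
single_replica_deployments() {
  awk '{ split($2, ready, "/"); if (ready[2] == 1) print $1 }'
}

# Example invocation:
#   kubectl get deployments -n rfd--user-mgt --no-headers | single_replica_deployments
```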

Recommendations

  1. Maintain Current HA:

    • Main API: 3 replicas provide good redundancy
    • Well-established (3 years in production)
    • Very active updates (23h ago)
  2. Consumer Resilience:

    • consumer-send-otp: 1 replica (consider 2+ for OTP reliability)
    • consumer-email-notification: 1 replica (consider 2+ for email delivery)
    • consumer-internal-login-event: 1 replica (consider 2 for audit reliability)
    • consumer-customer: 1 replica (consider 2 for account sync)
  3. Consider Auto-Scaling:

    • Main API: Add HPA (3-10 replicas based on traffic)
    • OTP consumer: Consider KEDA for queue-based scaling
    • Email consumer: Consider KEDA for email queue scaling
  4. Recent Activity:

    • Very active: Updated 23h ago
    • Frequent updates (multiple in past month)
    • Stable operation despite frequent changes
  5. Monitoring Priorities:

    • OTP delivery success rate
    • Email notification delivery
    • Login event processing lag
    • Customer sync status
    • API response times
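
The HPA suggested for the main API could look roughly like the manifest below. It is a sketch under assumptions: the 3-10 replica range comes from the recommendation, but the 70% CPU utilization target and the HPA name are illustrative. A utilization target also requires CPU requests to be set on the API pods.

```shell
# Hypothetical helper: print an autoscaling/v2 HPA manifest for the main API.
# minReplicas/maxReplicas follow the 3-10 range from the recommendation; the 70%
# CPU utilization target is an assumption, not a measured value.
main_api_hpa_manifest() {
  cat <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: rfd--user-mgt--be--app--prod
  namespace: rfd--user-mgt
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: rfd--user-mgt--be--app--prod
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
EOF
}

# Validate with a server-side dry run before applying for real:
#   main_api_hpa_manifest | kubectl apply --dry-run=server -f -
```

Printing the manifest from a function keeps it reviewable and lets the same text be piped into either a dry run or a real `kubectl apply -f -`.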

Troubleshooting

Main API issues:

# Check all 3 API pods
kubectl get pods -n rfd--user-mgt | grep "app--prod"

# Check recent logs (all containers of one pod; kubectl picks a single pod of the deployment)
kubectl logs deployment/rfd--user-mgt--be--app--prod -n rfd--user-mgt --all-containers=true --tail=100

# Forward local port 8080 to service port 80, then send test requests to http://localhost:8080
kubectl port-forward -n rfd--user-mgt service/rfd--user-mgt--be--app--prod 8080:80

OTP delivery issues:

# Check OTP consumer
kubectl logs -f deployment/rfd--user-mgt--be--consumer-send-otp--prod -n rfd--user-mgt

# Check for OTP errors
kubectl logs deployment/rfd--user-mgt--be--consumer-send-otp--prod -n rfd--user-mgt --tail=100 | grep -i "error\|otp\|fail"

# Restart OTP consumer
kubectl rollout restart deployment/rfd--user-mgt--be--consumer-send-otp--prod -n rfd--user-mgt

Email notification issues:

# Check email consumer
kubectl logs -f deployment/rfd--user-mgt--be--consumer-email-notification--prod -n rfd--user-mgt

# Check for email errors
kubectl logs deployment/rfd--user-mgt--be--consumer-email-notification--prod -n rfd--user-mgt --tail=100 | grep -i "error\|email\|fail"

# Restart email consumer
kubectl rollout restart deployment/rfd--user-mgt--be--consumer-email-notification--prod -n rfd--user-mgt

Login tracking issues:

# Check login consumer
kubectl logs -f deployment/rfd--user-mgt--be--consumer-internal-login-event--prod -n rfd--user-mgt

# Check for login errors
kubectl logs deployment/rfd--user-mgt--be--consumer-internal-login-event--prod -n rfd--user-mgt --tail=100 | grep -i "error\|login\|fail"

Customer sync issues:

# Check customer consumer
kubectl logs -f deployment/rfd--user-mgt--be--consumer-customer--prod -n rfd--user-mgt

# Check for sync errors
kubectl logs deployment/rfd--user-mgt--be--consumer-customer--prod -n rfd--user-mgt --tail=100 | grep -i "error\|sync\|fail"
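
Before reading logs consumer by consumer, it can help to see which pods have been restarting at all. The sketch below filters the pod listing by restart count. The helper name `restarted_pods` is hypothetical, and it assumes the default column layout of `kubectl get pods`.

```shell
# Hypothetical helper: read `kubectl get pods --no-headers` output on stdin and
# print each pod that has restarted at least once, followed by its restart count.
# Columns: NAME READY STATUS RESTARTS AGE. RESTARTS may look like "2 (5m ago)",
# in which case field 4 is still the bare count.
restarted_pods() {
  awk '$4 + 0 > 0 { print $1, $4 }'
}

# Example invocation:
#   kubectl get pods -n rfd--user-mgt --no-headers | restarted_pods
```

For a pod that shows up here, `kubectl logs <pod> -n rfd--user-mgt --previous` prints the logs of the container instance that crashed.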

Performance Metrics

Current Scale

  • Main API: 3 replicas (good HA)
  • Consumers: 4 consumers at 1 replica each
  • Worker: 1 replica
  • Observability: 1 OTel collector
  • Total Active Pods: ~9 pods

Stability

  • Namespace Age: ~3 years (very mature)
  • Recent Update: 23h ago (very active development)
  • Main API: Fixed 3 replicas (good HA)
  • Critical Services: OTP and email consumers at 1 replica (consider increasing)
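
The replica increase suggested for the OTP and email consumers could be applied as sketched below. The helper name `scale_critical_consumers` and the default of 2 replicas are assumptions. Whether these consumers can safely run in parallel (for example, without sending duplicate OTPs or emails) depends on the message queue setup, which this page does not describe, so verify that first. A manual scale may also be reverted by whatever tooling manages the deployment manifests.

```shell
# Hypothetical helper: scale the OTP and email consumers to a given replica count.
# Stops at the first failure so a partial change is noticed immediately.
# Arguments: [replicas]  (defaults to 2, an assumed value)
scale_critical_consumers() {
  replicas="${1:-2}"
  for name in consumer-send-otp consumer-email-notification; do
    kubectl scale "deployment/rfd--user-mgt--be--${name}--prod" \
      -n rfd--user-mgt --replicas="$replicas" || return 1
  done
}

# Example invocation:
#   scale_critical_consumers 2
```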