rfd--user-mgt
Overview
- Namespace: rfd--user-mgt
- Purpose: Referring Doctor User Management Backend - PRODUCTION
- Age: ~3 years 80 days (since August 2022)
- Status: Active - User registration, authentication, and profile management
- Workloads: 7 deployments (all active)
- Environment: PRODUCTION - Doctor user management and authentication
Architecture
User management system handling doctor accounts, authentication, and notifications:
- Main Application: REST API backend (3 replicas) - Good HA
- Event Consumers: Customer, email notifications, login events, OTP (4 deployments)
- Worker: Default background worker (1 deployment)
- Observability: OpenTelemetry collector
Auto-Scaling Configuration
No Auto-Scaling Configured:
- No HorizontalPodAutoscalers (HPAs)
- No KEDA scaled objects
- Fixed replica counts
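The absence of autoscalers can be confirmed directly against the cluster. The following is a minimal sketch, assuming kubectl access to the namespace. The function name `check_autoscaling` is illustrative, and the `scaledobjects` resource type only resolves on clusters where the KEDA CRDs are installed.

```shell
# Hypothetical helper: list HPAs and KEDA ScaledObjects in a namespace.
check_autoscaling() {
  ns="${1:-rfd--user-mgt}"
  # Prints "No resources found" when the namespace has no HPAs.
  kubectl get hpa -n "$ns"
  # Errors out if the KEDA CRDs are not installed on the cluster.
  kubectl get scaledobjects -n "$ns" ||
    echo "ScaledObject query failed (KEDA may not be installed)"
}
```

Calling `check_autoscaling` with no argument inspects rfd--user-mgt; both queries are expected to come back empty for the configuration described above.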
Workload Categories
Main Application (1 deployment with HA)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| rfd--user-mgt--be--app--prod | 3/3 | Running | Main user management API (Good HA) |
Event Consumers (4 deployments)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| consumer-customer | 1/1 | Running | Customer account events |
| consumer-email-notification | 1/1 | Running | Email notification processing |
| consumer-internal-login-event | 1/1 | Running | Internal login tracking |
| consumer-send-otp | 1/1 | Running | OTP sending and verification |
Workers (1 deployment)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| wrk--default | 1/1 | Running | Default worker queue |
Observability (1 deployment)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| otel-collector | 1/1 | Running | OpenTelemetry metrics/traces |
Services
| Name | Type | Cluster IP | Ports | NodePort | Purpose |
|---|---|---|---|---|---|
| rfd--user-mgt--be--app--prod | NodePort | 10.8.25.140 | 80 | 31057 | Main user management API |
| rfd--user-mgt--be-otel-collector | ClusterIP | 10.8.29.216 | 4317, 4318 | - | OpenTelemetry collector |
Access & Management
View all resources:
kubectl get all -n rfd--user-mgt
Check main application:
# View app pods (3 replicas)
kubectl get pods -n rfd--user-mgt | grep "app--prod"
# Follow logs (kubectl logs on a deployment streams from one pod it selects, not all 3 replicas)
kubectl logs -f deployment/rfd--user-mgt--be--app--prod -n rfd--user-mgt
Check consumers:
# Customer events
kubectl logs -f deployment/rfd--user-mgt--be--consumer-customer--prod -n rfd--user-mgt
# Email notifications
kubectl logs -f deployment/rfd--user-mgt--be--consumer-email-notification--prod -n rfd--user-mgt
# Login events
kubectl logs -f deployment/rfd--user-mgt--be--consumer-internal-login-event--prod -n rfd--user-mgt
# OTP sending
kubectl logs -f deployment/rfd--user-mgt--be--consumer-send-otp--prod -n rfd--user-mgt
Restart services:
# Restart main app
kubectl rollout restart deployment/rfd--user-mgt--be--app--prod -n rfd--user-mgt
# Restart all consumers
kubectl get deployments -n rfd--user-mgt | grep consumer | awk '{print $1}' | xargs -I {} kubectl rollout restart deployment/{} -n rfd--user-mgt
# Restart worker
kubectl rollout restart deployment/rfd--user-mgt--be--wrk--default--prod -n rfd--user-mgt
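A restart only triggers the rollout; it does not confirm that the new pods became ready. The sketch below pairs the restart commands above with `kubectl rollout status`. The function name `restart_and_wait` and the 180-second timeout are assumptions chosen for illustration.

```shell
# Hypothetical helper: restart one deployment and block until the rollout completes.
# Returns non-zero if the restart fails or the rollout exceeds the timeout.
restart_and_wait() {
  deploy="$1"
  ns="${2:-rfd--user-mgt}"
  kubectl rollout restart "deployment/$deploy" -n "$ns" &&
    kubectl rollout status "deployment/$deploy" -n "$ns" --timeout=180s
}
```

For example, `restart_and_wait rfd--user-mgt--be--app--prod` restarts the main API and waits for all replicas to roll over.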
Monitoring
Resource usage:
kubectl top pods -n rfd--user-mgt --sort-by=memory
kubectl top pods -n rfd--user-mgt --sort-by=cpu
Events:
kubectl get events -n rfd--user-mgt --sort-by='.lastTimestamp' | tail -20
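When the event stream is noisy, filtering to warnings can help. This is a minimal sketch using the `type=Warning` field selector; the function name `recent_warnings` and the default of 20 lines are illustrative choices.

```shell
# Hypothetical helper: show the most recent Warning events in a namespace.
recent_warnings() {
  ns="${1:-rfd--user-mgt}"
  count="${2:-20}"
  # Events are sorted oldest-first, so the newest entries are at the end.
  kubectl get events -n "$ns" --field-selector type=Warning \
    --sort-by='.lastTimestamp' | tail -n "$count"
}
```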
Data Flow
Doctor User Management Request
↓
rfd--user-mgt--be--app--prod (NodePort 31057)
↓
Main User Management API (3 replicas)
↓
Database (external)
↓
Events Published to Message Queue
├─ Customer Events → consumer-customer
├─ Email Notifications → consumer-email-notification
├─ Login Tracking → consumer-internal-login-event
└─ OTP Requests → consumer-send-otp
↓
Worker Processes Background Jobs
↓
OpenTelemetry → Observability
User Management Workflow
1. User Management API (Good HA)
- 3 replicas provide good redundancy
- Doctor registration and profile management
- Account authentication
- Role and permission management
- Profile updates
2. Customer Event Consumer
consumer-customer processes customer account events:
- Account creation and updates
- Profile synchronization
- Account status changes
3. Email Notification Consumer
consumer-email-notification handles email sending:
- Welcome emails
- Verification emails
- Password reset notifications
- Account update notifications
4. Login Event Consumer
consumer-internal-login-event tracks login activity:
- Login analytics
- Session management
- Security monitoring
- Audit logging
5. OTP Consumer
consumer-send-otp handles OTP delivery:
- One-time password generation
- OTP sending via SMS/email
- Verification code management
- Time-based expiration
6. Background Worker
wrk--default handles general background tasks:
- Data processing
- Cleanup operations
- Scheduled maintenance
7. OpenTelemetry Collector
- Metrics collection
- Distributed tracing
- Performance monitoring
- Error tracking
Production Considerations
High Availability
Good Main API Configuration:
- Main API: 3 replicas (good HA)
- Very mature namespace (~3 years)
- Very active development (updated 23h ago)
Single Points of Failure:
- All 4 consumers: 1 replica each
- Worker: 1 replica
- OTP consumer particularly critical (single replica)
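The single-replica deployments listed above can also be found programmatically. The sketch below is one possible approach, assuming kubectl access; the function names `single_replica_filter` and `list_single_replica` are hypothetical.

```shell
# Hypothetical filter: reads "NAME REPLICAS" rows on stdin and
# prints the names whose desired replica count is below 2.
single_replica_filter() {
  awk '$2 < 2 { print $1 }'
}

# Hypothetical wrapper: list deployments in a namespace that run a single replica.
list_single_replica() {
  ns="${1:-rfd--user-mgt}"
  kubectl get deployments -n "$ns" --no-headers \
    -o custom-columns=NAME:.metadata.name,REPLICAS:.spec.replicas |
    single_replica_filter
}
```

With the replica counts documented here, `list_single_replica` would be expected to report the four consumers, the worker, and the OTel collector.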
Recommendations
Maintain Current HA:
- Main API: 3 replicas provide good redundancy
- Well-established (3 years in production)
- Very active updates (23h ago)
Consumer Resilience:
- consumer-send-otp: 1 replica (consider 2+ for OTP reliability)
- consumer-email-notification: 1 replica (consider 2+ for email delivery)
- consumer-internal-login-event: 1 replica (consider 2 for audit reliability)
- consumer-customer: 1 replica (consider 2 for account sync)
Consider Auto-Scaling:
- Main API: Add HPA (3-10 replicas based on traffic)
- OTP consumer: Consider KEDA for queue-based scaling
- Email consumer: Consider KEDA for email queue scaling
Recent Activity:
- Very active: Updated 23h ago
- Frequent updates (multiple in past month)
- Stable operation despite frequent changes
Monitoring Priorities:
- OTP delivery success rate
- Email notification delivery
- Login event processing lag
- Customer sync status
- API response times
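The consumer-resilience and auto-scaling recommendations above could be applied with commands along these lines. This is a sketch, not a vetted change plan: the function names are hypothetical, the 70% CPU target is an assumed value not taken from this document, and a CPU-based HPA only works if the pods declare CPU requests and a metrics source such as metrics-server is available. Flag names for `kubectl autoscale` can vary between kubectl releases, so check `kubectl autoscale --help` first.

```shell
# Hypothetical helper: set a fixed replica count on one consumer deployment.
# $1 is the consumer suffix, e.g. send-otp or email-notification.
scale_consumer() {
  consumer="$1"
  replicas="${2:-2}"
  kubectl scale "deployment/rfd--user-mgt--be--consumer-$consumer--prod" \
    -n rfd--user-mgt --replicas="$replicas"
}

# Hypothetical helper: create a CPU-based HPA (3-10 replicas) for the main API.
# The 70% CPU target is an assumption for illustration.
autoscale_api() {
  kubectl autoscale deployment rfd--user-mgt--be--app--prod \
    -n rfd--user-mgt --min=3 --max=10 --cpu-percent=70
}
```

For example, `scale_consumer send-otp 2` raises the OTP consumer to two replicas. Running more than one consumer replica is only safe if the message queue and the consumer logic tolerate concurrent consumers, which this document does not state.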
Troubleshooting
Main API issues:
# Check all 3 API pods
kubectl get pods -n rfd--user-mgt | grep "app--prod"
# Check recent logs (all containers of the one pod kubectl selects for the deployment, not all replicas)
kubectl logs deployment/rfd--user-mgt--be--app--prod -n rfd--user-mgt --all-containers=true --tail=100
# Test API endpoint
kubectl port-forward -n rfd--user-mgt service/rfd--user-mgt--be--app--prod 8080:80
OTP delivery issues:
# Check OTP consumer
kubectl logs -f deployment/rfd--user-mgt--be--consumer-send-otp--prod -n rfd--user-mgt
# Check for OTP errors
kubectl logs deployment/rfd--user-mgt--be--consumer-send-otp--prod -n rfd--user-mgt --tail=100 | grep -i "error\|otp\|fail"
# Restart OTP consumer
kubectl rollout restart deployment/rfd--user-mgt--be--consumer-send-otp--prod -n rfd--user-mgt
Email notification issues:
# Check email consumer
kubectl logs -f deployment/rfd--user-mgt--be--consumer-email-notification--prod -n rfd--user-mgt
# Check for email errors
kubectl logs deployment/rfd--user-mgt--be--consumer-email-notification--prod -n rfd--user-mgt --tail=100 | grep -i "error\|email\|fail"
# Restart email consumer
kubectl rollout restart deployment/rfd--user-mgt--be--consumer-email-notification--prod -n rfd--user-mgt
Login tracking issues:
# Check login consumer
kubectl logs -f deployment/rfd--user-mgt--be--consumer-internal-login-event--prod -n rfd--user-mgt
# Check for login errors
kubectl logs deployment/rfd--user-mgt--be--consumer-internal-login-event--prod -n rfd--user-mgt --tail=100 | grep -i "error\|login\|fail"
Customer sync issues:
# Check customer consumer
kubectl logs -f deployment/rfd--user-mgt--be--consumer-customer--prod -n rfd--user-mgt
# Check for sync errors
kubectl logs deployment/rfd--user-mgt--be--consumer-customer--prod -n rfd--user-mgt --tail=100 | grep -i "error\|sync\|fail"
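The per-consumer checks above can be combined into one pass over all four consumers. The sketch below assumes the deployment naming pattern used throughout this document; the function name `scan_consumer_errors` is hypothetical.

```shell
# Hypothetical helper: scan recent logs of every consumer for error or failure lines.
scan_consumer_errors() {
  tail_lines="${1:-100}"
  for c in customer email-notification internal-login-event send-otp; do
    echo "== consumer-$c =="
    # Prints "(no matches)" when grep finds nothing, including when
    # the log command itself produced no output.
    kubectl logs "deployment/rfd--user-mgt--be--consumer-$c--prod" \
      -n rfd--user-mgt --tail="$tail_lines" |
      grep -i -E "error|fail" || echo "(no matches)"
  done
}
```

Running `scan_consumer_errors 200` widens the window to the last 200 log lines per consumer.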
Performance Metrics
Current Scale
- Main API: 3 replicas (good HA)
- Consumers: 4 consumers at 1 replica each
- Worker: 1 replica
- Observability: 1 OTel collector
- Total Active Pods: ~9 pods
Stability
- Namespace Age: ~3 years (very mature)
- Recent Update: 23h ago (very active development)
- Main API: Fixed 3 replicas (good HA)
- Critical Services: OTP and email consumers at 1 replica (consider increasing)