imp

Overview

  • Namespace: imp
  • Purpose: IMP (Integrated Medical Platform) - PRODUCTION
  • Age: ~4 years
  • Status: Active - Integration and messaging platform
  • Workloads: 26 deployments (14 active, 12 scaled to 0)
  • Environment: PRODUCTION - Medical data integration and processing

Architecture

IMP is an integration and messaging platform that connects various medical systems:

  • API: REST API backend (2 replicas with HPA)
  • Event Consumers: Process events from Kafka topics (20 deployments, 13 active)
  • Workers: Background job processors (4 deployments, all scaled to 0)
  • Monitoring: Flower dashboard for Celery task monitoring (scaled to 0)

Auto-Scaling Configuration

HorizontalPodAutoscalers (7 HPAs)

| HPA Name | Target | Min | Max | Current | Metrics | Type |
| --- | --- | --- | --- | --- | --- | --- |
| api | API deployment | 2 | 5 | 2 | CPU: 2%/100%, Mem: 255MB/1000Mi | Standard HPA |
| keda-hpa-kafka-imp-consumer-worker | consumer-worker | 1 | 15 | 2 | Queue: 0/15 | KEDA |
| keda-hpa-kafka-imp-lis-test-result | consumer-lis-test-result | 1 | 3 | 1 | Queue: 0/20 | KEDA |
| keda-hpa-kafka-imp-lis-test-result-b2b | consumer-lis-test-result-b2b | 1 | 3 | 1 | Queue: 3/20 | KEDA |
| keda-hpa-kafka-imp-lis-test-result-svip1 | consumer-lis-test-result-svip1 | 1 | 1 | 1 | Queue: 3/20 | KEDA |
| keda-hpa-kafka-imp-lis-test-result-svip2 | consumer-lis-test-result-svip2 | 1 | 1 | 1 | Queue: 3/20 | KEDA |
| keda-hpa-kafka-imp-lis-vid-status | consumer-lis-test-status | 1 | 3 | 1 | Queue: 0/20 | KEDA |

Scaling Summary:

  • API auto-scales based on CPU/memory load (currently at minimum)
  • 6 consumers use KEDA for Kafka queue-based autoscaling
  • Most consumers at minimum replicas (low current load)
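
In the HPA table, the svip1 and svip2 scalers list Min 1 and Max 1, so those HPAs cannot add replicas however deep the queue gets. A minimal sketch for spotting such pinned HPAs, assuming the usual `kubectl get hpa` column order (NAME, REFERENCE, TARGETS, MINPODS, MAXPODS, REPLICAS, AGE). The function name `flag_pinned_hpas` is made up for illustration:

```shell
# Flag HPAs whose MINPODS equals MAXPODS (they can never scale).
# Reads `kubectl get hpa --no-headers` output on stdin.
# Columns are counted from the right because TARGETS may contain spaces.
flag_pinned_hpas() {
  awk 'NF >= 6 {
    min = $(NF-3)   # MINPODS
    max = $(NF-2)   # MAXPODS
    if (min == max) print $1 " pinned at " min
  }'
}
```

Usage: `kubectl get hpa -n imp --no-headers | flag_pinned_hpas`. Whether pinning svip1/svip2 at one replica is intentional is not stated in this page, so treat any hit as something to confirm rather than as a defect.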

Workload Categories

Main Application (1 deployment)

| Name | Replicas | Status | Purpose |
| --- | --- | --- | --- |
| api | 2/2 | Running + HPA | Main IMP API (auto-scales 2-5) |

Active Event Consumers (13 deployments)

Process events from Kafka topics:

| Name | Replicas | Status | Purpose |
| --- | --- | --- | --- |
| consumer-customer | 1/1 | Running | Customer data sync |
| consumer-employee | 1/1 | Running | Employee data sync |
| consumer-lis-orders | 3/3 | Running | LIS order processing (3 replicas) |
| consumer-lis-test-master | 1/1 | Running | Test master data sync |
| consumer-lis-test-result | 1/1 | Running + HPA | LIS test results (scales 1-3) |
| consumer-lis-test-result-b2b | 1/1 | Running + HPA | B2B test results (scales 1-3) |
| consumer-lis-test-result-corp | 1/1 | Running | Corporate test results |
| consumer-lis-test-result-svip-repush | 1/1 | Running | SVIP result re-push |
| consumer-lis-test-result-svip1 | 1/1 | Running + HPA | SVIP tier 1 results |
| consumer-lis-test-result-svip2 | 1/1 | Running + HPA | SVIP tier 2 results |
| consumer-lis-test-status | 1/1 | Running + HPA | LIS test status (scales 1-3) |
| consumer-worker | 2/2 | Running + HPA | General worker (scales 1-15) |
| consumer-worker-order | 2/2 | Running | Order worker (2 replicas) |

Inactive Deployments (12 scaled to 0)

Scaled to zero; review whether each is still needed:

Consumers:

  • consumer-group
  • consumer-investigation
  • consumer-iris-test-result
  • consumer-orders
  • consumer-promotion
  • consumer-status-attune
  • consumer-test-result

Workers:

  • worker-create-order
  • worker-default
  • worker-notifications
  • worker-pdf

Monitoring:

  • flower (Celery task monitoring dashboard)
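
The active and scaled-to-zero counts can be recomputed from the cluster instead of maintained by hand. A rough sketch, assuming the default `kubectl get deployments` layout where the second column is READY in `ready/desired` form. The function name `summarize_deployments` is hypothetical:

```shell
# Summarise `kubectl get deployments --no-headers` output read from stdin.
# A deployment counts as inactive when its desired replica count is 0.
summarize_deployments() {
  awk '{
    split($2, r, "/")                 # r[1] = ready, r[2] = desired
    if (r[2] + 0 == 0) inactive++; else active++
  }
  END {
    total = active + inactive
    pct = (total > 0) ? int(100 * inactive / total + 0.5) : 0
    printf "active=%d inactive=%d total=%d inactive_pct=%d\n", active, inactive, total, pct
  }'
}
```

Usage: `kubectl get deployments -n imp --no-headers | summarize_deployments`.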

Services

| Name | Type | Cluster IP | Ports | NodePort | Purpose |
| --- | --- | --- | --- | --- | --- |
| api | NodePort | 10.8.22.200 | 80 | 31985 | IMP API |
| flower | NodePort | 10.8.20.103 | 80 | 30944 | Celery monitoring (scaled to 0) |

Access & Management

View all resources:

kubectl get all -n imp

Check API:

# View API pods
kubectl get pods -n imp | grep "^api"

# Check HPA status
kubectl describe hpa api -n imp

# View logs
kubectl logs -f deployment/api -n imp

# Test API
kubectl port-forward -n imp service/api 8080:80
# Access http://localhost:8080

Check consumers:

# All active consumers
kubectl get pods -n imp | grep consumer | grep Running

# KEDA-scaled consumers
kubectl get hpa -n imp | grep keda

# Consumer logs
kubectl logs -f deployment/consumer-lis-test-result -n imp

View HPAs:

# All HPAs
kubectl get hpa -n imp

# Detailed HPA status
kubectl describe hpa api -n imp
kubectl describe hpa keda-hpa-kafka-imp-consumer-worker -n imp

Scaling:

# View current scaling
kubectl get hpa -n imp -w

# Check KEDA scaled objects
kubectl get scaledobjects -n imp

Restart services:

# Restart API
kubectl rollout restart deployment/api -n imp

# Restart specific consumer
kubectl rollout restart deployment/consumer-lis-test-result -n imp

# Restart all active consumers
kubectl get deployments -n imp | grep consumer | grep -v "0/0" | awk '{print $1}' | xargs -I {} kubectl rollout restart deployment/{} -n imp
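
Restarting every active consumer at once is a broad action in a production namespace, so it can help to print the commands first and review them before anything runs. A small sketch under that assumption; `consumer_restart_cmds` is an illustrative name, and it only matches deployments whose name starts with `consumer-`:

```shell
# Print (do not run) a restart command for each consumer deployment
# that is not scaled to 0.
# Reads `kubectl get deployments --no-headers` output on stdin.
# Optional first argument: namespace (defaults to imp).
consumer_restart_cmds() {
  ns="${1:-imp}"
  awk -v ns="$ns" '$1 ~ /^consumer-/ && $2 !~ /\/0$/ {
    printf "kubectl rollout restart deployment/%s -n %s\n", $1, ns
  }'
}
```

Usage: run `kubectl get deployments -n imp --no-headers | consumer_restart_cmds` to inspect the list, then append `| sh` once it looks right.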

Monitoring

Resource usage:

kubectl top pods -n imp --sort-by=memory
kubectl top pods -n imp --sort-by=cpu

HPA metrics:

# All HPAs
kubectl get hpa -n imp

# Detailed status
kubectl describe hpa api -n imp

Deployment status:

# Active deployments
kubectl get deployments -n imp | grep -v "0/0"

# Scaled to 0
kubectl get deployments -n imp | grep "0/0"

Events:

# Most recent 20 events (the sort is oldest first, so take the tail)
kubectl get events -n imp --sort-by='.lastTimestamp' | tail -20

Kafka consumer lag (via KEDA):

# Check KEDA scaled objects
kubectl describe scaledobject -n imp

Data Flow

External Systems

IMP API (NodePort 31985)
  ↓
API (2-5 replicas via HPA)
  ↓
Publish to Kafka Topics
  ↓
Kafka Brokers (external)
  ↓
Consumers Process Events
  ↓
Integrate with Target Systems
- LIS (Laboratory Information System)
- Customer systems
- Employee systems
- B2B partners

Integration Points

Laboratory Information System (LIS)

Multiple specialized consumers for LIS integration:

  1. Test Results:

    • consumer-lis-test-result (general results)
    • consumer-lis-test-result-b2b (B2B partners)
    • consumer-lis-test-result-corp (corporate clients)
    • consumer-lis-test-result-svip1/svip2 (SVIP tiers 1 and 2)
    • consumer-lis-test-result-svip-repush (retry logic)
  2. Test Orders:

    • consumer-lis-orders (3 replicas for high volume)
    • consumer-lis-test-master (test catalog sync)
  3. Status Updates:

    • consumer-lis-test-status (test status changes)

Other Integrations

  • Customer Data: consumer-customer
  • Employee Data: consumer-employee
  • General Workers: consumer-worker (handles misc async tasks)
  • Order Processing: consumer-worker-order

Production Considerations

High Availability

Well Configured:

  • API: 2 replicas with HPA (scales to 5)
  • LIS orders: 3 replicas (high volume)
  • Worker consumers: 2 replicas each
  • KEDA autoscaling for queue-based consumers

Single Points of Failure:

  • Most LIS consumers: 1 replica
  • Customer/Employee consumers: 1 replica
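
Single-replica deployments can be listed directly rather than read off the tables. A minimal sketch, again assuming READY is the second column of `kubectl get deployments` output. `list_single_replica` is a hypothetical helper name:

```shell
# Print the name of every deployment whose desired replica count is exactly 1.
# Reads `kubectl get deployments --no-headers` output on stdin.
list_single_replica() {
  awk '{
    split($2, r, "/")        # r[2] = desired replicas
    if (r[2] + 0 == 1) print $1
  }'
}
```

Usage: `kubectl get deployments -n imp --no-headers | list_single_replica`. A deployment listed here may still be able to scale out through its HPA, so cross-check against `kubectl get hpa -n imp`.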

Auto-Scaling Performance

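| Workload | Type | Current | Min | Max | Load | Status |
| --- | --- | --- | --- | --- | --- | --- |
| API | Standard HPA | 2 | 2 | 5 | CPU 2%, Mem 255MB | Low load |
| consumer-worker | KEDA | 2 | 1 | 15 | Queue: 0 | At baseline |
| LIS test result consumers | KEDA | 1 each | 1 | 1-3 | Queue: 0-3 | Low lag |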

Current State: The system is running at low load (API CPU at 2%, queue depth 0-3) with ample room to scale.

Recommendations

  1. Resource Cleanup:

    • 12 deployments scaled to 0 - significant cleanup opportunity
    • Review each scaled-to-0 deployment:
      • If permanently unused → delete
      • If seasonal/occasional → keep but document
    • This will simplify namespace and reduce confusion
  2. Consumer Resilience:

    • Most LIS consumers at 1 replica
    • Consider baseline of 2 replicas for critical consumers:
      • consumer-lis-test-result
      • consumer-lis-test-result-b2b
      • consumer-lis-test-status
  3. API Capacity:

    • Currently at 2 replicas with capacity to 5
    • CPU at 2% - significant headroom
    • Review whether a max of 5 is sufficient for peak loads
  4. KEDA Thresholds:

    • Current queue thresholds: 15-20 messages per replica
    • Review if appropriate for your SLAs
    • Adjust based on message processing times
  5. Monitoring:

    • Flower dashboard scaled to 0 (Celery monitoring)
    • If using Celery workers, consider enabling Flower
    • Otherwise, remove the deployment
  6. High-Priority Consumers:

    • consumer-lis-orders: Already at 3 replicas (good)
    • consumer-worker: Scales 1-15 via KEDA (good)
    • consumer-worker-order: At 2 replicas (good)

Troubleshooting

API issues:

# Check API pods
kubectl get pods -n imp | grep "^api"

# Check HPA status
kubectl describe hpa api -n imp

# Check logs
kubectl logs -f deployment/api -n imp --tail=100

# Test API endpoint
kubectl port-forward -n imp service/api 8080:80
# Access http://localhost:8080/health

Consumer not processing:

# Check consumer status
kubectl get pods -n imp | grep consumer-lis-test-result

# Check KEDA scaler (ScaledObject name assumed from the HPA name, which KEDA builds as keda-hpa-<scaledobject>)
kubectl describe scaledobject kafka-imp-lis-test-result -n imp

# Check Kafka consumer lag
kubectl describe hpa keda-hpa-kafka-imp-lis-test-result -n imp

# Check logs
kubectl logs -f deployment/consumer-lis-test-result -n imp

# Restart consumer
kubectl rollout restart deployment/consumer-lis-test-result -n imp

High consumer lag:

# Check all KEDA HPAs
kubectl get hpa -n imp | grep keda

# Check specific consumer lag
kubectl describe hpa keda-hpa-kafka-imp-consumer-worker -n imp

# Manually scale if needed (KEDA will override based on queue)
kubectl scale deployment consumer-worker -n imp --replicas=5

# Check KEDA operator
kubectl get pods -n keda
kubectl logs -n keda deployment/keda-operator

LIS integration issues:

# Check all LIS consumers
kubectl get pods -n imp | grep consumer-lis

# Check LIS order consumer (highest volume)
kubectl logs -f deployment/consumer-lis-orders -n imp

# Check test result consumers
for consumer in test-result test-result-b2b test-result-corp; do
  echo "=== consumer-lis-$consumer ==="
  kubectl logs "deployment/consumer-lis-$consumer" -n imp --tail=20
done

Scaled-to-0 deployment investigation:

# List all scaled-to-0 deployments
kubectl get deployments -n imp | grep "0/0"

# Check last configuration
kubectl get deployment consumer-group -n imp -o yaml

# Check history
kubectl rollout history deployment/consumer-group -n imp

# Scale up if needed
kubectl scale deployment consumer-group -n imp --replicas=1

Performance Metrics

Current Scale

  • API: 2 replicas (low load, can scale to 5)
  • Active Consumers: 13 deployments, 17 total pods
    • High volume: 3 replicas (lis-orders)
    • Standard: 1-2 replicas
  • Inactive: 12 deployments scaled to 0
  • Total Active Pods: 19 (17 consumer + 2 API)

Kafka Consumer Groups

Each consumer belongs to a Kafka consumer group:

  • Processes messages from specific topics
  • KEDA monitors consumer lag
  • Scales based on lag threshold
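
As a rough illustration of lag-based scaling, the replica count an HPA aims for with an average-value target is approximately the total lag divided by the per-replica threshold, rounded up and clamped to the configured minimum and maximum. The sketch below assumes that simplified rule and ignores HPA tolerance, stabilization windows, and the KEDA Kafka scaler's default cap at the topic's partition count. The function name `estimate_replicas` is illustrative only:

```shell
# Estimate target replicas: ceil(lag / threshold), clamped to [min, max].
# Arguments: total lag, per-replica lag threshold, min replicas, max replicas.
estimate_replicas() {
  lag=$1; threshold=$2; min=$3; max=$4
  want=$(( (lag + threshold - 1) / threshold ))   # integer ceiling division
  if [ "$want" -lt "$min" ]; then want=$min; fi
  if [ "$want" -gt "$max" ]; then want=$max; fi
  echo "$want"
}
```

With the thresholds listed on this page, `estimate_replicas 45 20 1 3` gives 3 for a 1-3 consumer at a lag of 45, and `estimate_replicas 100 15 1 15` gives 7 for consumer-worker at a lag of 100.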

Cleanup Opportunity:

  • 12 deployments (46% of total) scaled to 0
  • Review and clean up to simplify namespace
  • Document any that need to be kept for occasional use