sapoche

Overview

  • Namespace: sapoche
  • Purpose: Main Sapoche application and microservices (PRODUCTION)
  • Age: ~4 years
  • Status: Active - Largest application in production1 cluster
  • Workloads: 39 deployments (36 active, 3 scaled to 0)
  • Environment: PRODUCTION - Critical medical platform

Architecture

Sapoche is the core medical platform with multiple components handling patient care, medical imaging, and clinic operations:

  • Consumers: Event consumers for various integrations (14 deployments)
  • Workers: Background job processors (17 deployments)
  • Backend Services: PHP-based application servers (8 replicas with HPA)
  • Supporting Services: Redis, DICOM viewers, design system, frontend

Auto-Scaling Configuration

HorizontalPodAutoscalers (5 HPAs)

| HPA Name | Target | Min | Max | Current | Metrics |
|---|---|---|---|---|---|
| sp-bnkd-php | sp-bnkd-php deployment | 1 | 20 | 4-8 | CPU: 46%/85%, Mem: 109MB/160Mi |
| keda-hpa-consumer-lis-testresult | consumer-lis-testresult | 1 | 4 | 1 | Queue depth: 0/20 |
| keda-hpa-consumer-spc-checkup-orders | consumer-spc-checkup-orders | 1 | 8 | 1 | Queue depth: 0/20 |
| keda-hpa-consumer-spc-pdf-generate | consumer-spc-pdf-generate | 1 | 15 | 2 | Queue depth: 0/20 |
| keda-hpa-consumer-spc-pdf-generate-examination | consumer-spc-pdf-generate-examination | 1 | 3 | 1 | Queue depth: 0/20 |

KEDA Integration: 4 of the 5 HPAs are managed by KEDA (Kubernetes Event-driven Autoscaling) and scale on queue depth

Workload Categories

Event Consumers (14 deployments)

Consume events from message queues for various integrations:

| Name | Replicas | Status | Purpose |
|---|---|---|---|
| consumer-dicom | 1/1 | Running | DICOM medical imaging |
| consumer-ecg | 1/1 | Running | ECG data processing |
| consumer-employee | 1/1 | Running | Employee data sync |
| consumer-lis-testresult | 1/1 | Running + HPA | Lab test results (auto-scales 1-4) |
| consumer-pat-dtc-homekit-register | 0/0 | Scaled to 0 | Patient homekit registration |
| consumer-pat-test-result-update | 1/1 | Running | Patient test result updates |
| consumer-revenue | 0/0 | Scaled to 0 | Revenue processing |
| consumer-rfd-push-notification | 1/1 | Running | RFD notifications |
| consumer-spc-aborted-booking | 1/1 | Running | Aborted booking handling |
| consumer-spc-checkup-orders | 1/1 | Running + HPA | Checkup orders (auto-scales 1-8) |
| consumer-spc-pdf-generate | 2/2 | Running + HPA | PDF generation (auto-scales 1-15) |
| consumer-spc-pdf-generate-examination | 1/1 | Running + HPA | Examination PDFs (auto-scales 1-3) |
| consumer-spc-pos-orders | 3/3 | Running | POS order processing |
| consumer-spc-sync-test-master-data | 1/1 | Running | Test master data sync |

Background Workers (17 deployments)

Process async jobs, generate PDFs, send notifications:

| Name | Replicas | Status | Purpose |
|---|---|---|---|
| worker-booking | 4/4 | Running | Booking operations |
| worker-sapoche-batch-publisher | 1/1 | Running | Batch job publisher |
| worker-sapoche-checkup-audit-history | 1/1 | Running | Checkup audit logging |
| worker-sapoche-default | 10/10 | Running | Default worker queue (10 replicas!) |
| worker-sapoche-event-tracking | 1/1 | Running | Event tracking |
| worker-sapoche-examination-audit-history | 1/1 | Running | Examination audit logging |
| worker-sapoche-export-csv | 1/1 | Running | CSV export generation |
| worker-sapoche-imaging-report | 3/3 | Running | Medical imaging reports |
| worker-sapoche-notifications | 10/10 | Running | General notifications (10 replicas!) |
| worker-sapoche-notifications-refdoc | 1/1 | Running | Reference doctor notifications |
| worker-sapoche-pdf | 2/2 | Running | PDF generation (general) |
| worker-sapoche-pdf-en | 1/1 | Running | PDF generation (English) |
| worker-sapoche-pdf-svip-regen | 1/1 | Running | SVIP PDF regeneration |
| worker-sapoche-pdf-svip1 | 1/1 | Running | SVIP tier 1 PDFs |
| worker-sapoche-pdf-svip2 | 1/1 | Running | SVIP tier 2 PDFs |
| worker-sapoche-pdf-vip1 | 1/1 | Running | VIP tier 1 PDFs |
| ds-sapoche-customer-consumer | 3/3 | Running | Data streaming customer consumer |

Backend Application Services (5 deployments)

Main application servers:

| Name | Replicas | Status | Purpose |
|---|---|---|---|
| sp-bnkd-php | 8/8 | Running + HPA | PHP application backend (auto-scales 1-20) |
| sp-bknd-nginx | 3/3 | Running | Nginx frontend for PHP |
| redis-master | 1/1 | Running | Redis cache for application |
| sapoche-be--scheduler--prod | 1/1 | Running | Production task scheduler |
| orthanc-read-api | 4/4 | Running | Orthanc DICOM API (with sidecar) |

Frontend Services (3 deployments)

| Name | Replicas | Status | Purpose |
|---|---|---|---|
| sapoche-frontend-web | 2/2 | Running | Main frontend web application |
| diag-design-system-nginx | 0/0 | Scaled to 0 | Design system/component library |
| diag-dicom-viewer-nginx | 1/1 | Running | DICOM medical image viewer |

Services

| Name | Type | Cluster IP | Ports | NodePort | Purpose |
|---|---|---|---|---|---|
| sapoche-bknd-php | NodePort | 10.8.27.48 | 9000, 80 | 32766, 30178 | PHP application |
| bknd-services-nginx | NodePort | 10.8.21.221 | 80 | 32702 | Nginx proxy |
| redis-sp | ClusterIP | 10.8.29.128 | 6379 | - | Redis cache |
| sapoche-frontend-web | NodePort | 10.8.23.96 | 80 | 31620 | Frontend web app |
| diag-design-system-nginx | NodePort | 10.8.20.168 | 80 | 32262 | Design system |
| diag-dicom-viewer-nginx | NodePort | 10.8.29.67 | 80 | 30103 | DICOM viewer |
| orthanc-read-api-service | NodePort | 10.8.25.196 | 80 | 32104 | Orthanc API |

Access & Management

View all resources:

kubectl get all -n sapoche

Check specific workload type:

# All consumers
kubectl get pods -n sapoche | grep consumer

# All workers
kubectl get pods -n sapoche | grep worker

# Backend services (PHP, Nginx, Redis, scheduler, Orthanc)
kubectl get pods -n sapoche | grep -E "sp-bnkd|sp-bknd|redis|scheduler|orthanc"

# Frontend (web app, design system, DICOM viewer)
kubectl get pods -n sapoche | grep -E "frontend|diag-"

View HPAs:

# All HPAs
kubectl get hpa -n sapoche

# Detailed HPA status
kubectl describe hpa sp-bnkd-php -n sapoche
kubectl describe hpa keda-hpa-consumer-spc-pdf-generate -n sapoche

View logs:

# Consumer logs
kubectl logs -f deployment/consumer-lis-testresult -n sapoche

# Worker logs
kubectl logs -f deployment/worker-sapoche-pdf -n sapoche

# Backend logs (with HPA)
kubectl logs -f deployment/sp-bnkd-php -n sapoche

# Frontend logs
kubectl logs -f deployment/sapoche-frontend-web -n sapoche

Scale workloads:

# Manual scaling (HPA will override if configured)
kubectl scale deployment worker-sapoche-pdf -n sapoche --replicas=4

# View current scaling
kubectl get hpa -n sapoche -w

Restart services:

# Restart backend (HPA-managed)
kubectl rollout restart deployment sp-bnkd-php -n sapoche

# Restart specific consumer
kubectl rollout restart deployment consumer-lis-testresult -n sapoche

# Restart frontend
kubectl rollout restart deployment sapoche-frontend-web -n sapoche

Monitoring

Resource usage:

kubectl top pods -n sapoche --sort-by=memory
kubectl top pods -n sapoche --sort-by=cpu

Check deployment status:

kubectl get deployments -n sapoche

View HPA metrics:

kubectl get hpa -n sapoche
kubectl describe hpa sp-bnkd-php -n sapoche

View events:

# Events are sorted oldest first, so tail shows the 20 most recent
kubectl get events -n sapoche --sort-by='.lastTimestamp' | tail -20

Check KEDA scaled objects:

kubectl get scaledobjects -n sapoche
kubectl describe scaledobjects -n sapoche

Data Flow

External Requests
    ↓
APISIX Gateway (apisix namespace)
    ↓
bknd-services-nginx (NodePort 32702)
    ↓
sp-bknd-nginx (load balancer, 3 replicas)
    ↓
sp-bnkd-php (PHP application, HPA range 1-20 replicas, currently 4-8)
    ↓
redis-sp (cache)
    ↓
External Databases / Storage

Events → Consumers (with KEDA HPAs) → Process → Workers (10+ replicas) → Background Jobs

Special Components

PHP Backend with HPA

  • Deployment: sp-bnkd-php
  • Current: 4-8 replicas
  • HPA Range: 1-20 replicas
  • Scaling Triggers: CPU 85%, Memory 160Mi
  • Purpose: Main application logic
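
The replica count the HPA aims for follows the formula in the Kubernetes HorizontalPodAutoscaler documentation: desired = ceil(currentReplicas × currentMetric / targetMetric), clamped to the min/max range. The sketch below applies it to the CPU figures reported for sp-bnkd-php (46% against an 85% target). The helper name `hpa_desired_replicas` is hypothetical, and the real controller also applies a tolerance band and stabilization windows and takes the highest result across CPU and memory, so treat the output as an approximation.

```shell
# Hypothetical helper (not part of kubectl): core HPA formula, integer inputs only.
# Args: current_replicas current_metric target_metric min_replicas max_replicas
hpa_desired_replicas() {
  current=$1
  metric=$2
  target=$3
  min=$4
  max=$5
  # Ceiling division: ceil(current * metric / target)
  desired=$(( (current * metric + target - 1) / target ))
  # Clamp to the HPA's configured range
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

# 8 replicas averaging 46% CPU against an 85% target, HPA range 1-20
hpa_desired_replicas 8 46 85 1 20   # prints 5
```

At the reported load the formula points toward fewer replicas than the 8 currently running, which is consistent with the 4-8 range observed in practice.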

Orthanc DICOM Server

  • Medical imaging server (PACS)
  • 4 replicas for high availability
  • Read-only API exposed on NodePort 32104
  • Stores and retrieves DICOM images (X-rays, CT scans, MRI, etc.)

DICOM Viewer

  • Web-based medical image viewer
  • Exposed on NodePort 30103
  • Displays DICOM images from Orthanc

Worker Scaling

  • worker-sapoche-default: 10 replicas (high-volume queue)
  • worker-sapoche-notifications: 10 replicas (high notification load)
  • worker-booking: 4 replicas (critical booking operations)

KEDA-Based Auto-Scaling

4 consumers use KEDA for queue-depth based scaling:

  • Monitors queue depth in message broker
  • Scales between the configured minimum (currently 1 for all four consumers) and the maximum based on pending messages
  • More efficient than CPU/memory-based scaling for event consumers
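
As a rough illustration of queue-depth scaling: KEDA hands the queue length to the HPA as an external metric with a per-replica target, so the replica count is approximately ceil(queueDepth / threshold), bounded by min and max. This sketch assumes the threshold of 20 shown in the HPA metrics is a per-replica average target, which is typical for KEDA queue scalers but not confirmed from the ScaledObject definitions here. `queue_desired_replicas` is a hypothetical helper.

```shell
# Hypothetical helper: approximate replica count for a queue-driven consumer.
# Args: queue_depth per_replica_threshold min_replicas max_replicas
queue_desired_replicas() {
  depth=$1
  threshold=$2
  min=$3
  max=$4
  # Ceiling division: one replica per `threshold` pending messages
  desired=$(( (depth + threshold - 1) / threshold ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

# consumer-spc-pdf-generate (range 1-15) with 45 pending messages
queue_desired_replicas 45 20 1 15   # prints 3
```

With an empty queue the function returns the minimum, which matches the replica counts currently reported for three of the four KEDA-managed consumers.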

Production Considerations

High Availability

Well Configured:

  • PHP backend: 4-8 replicas with HPA (scales to 20)
  • Notifications: 10 replicas
  • Default workers: 10 replicas
  • Orthanc DICOM: 4 replicas
  • Frontend: 2 replicas

Single Points of Failure:

  • Redis: 1 replica (consider Redis Sentinel/Cluster)
  • Scheduler: 1 replica
  • Most consumers: 1 replica (only 4 of the 14 have KEDA scaling for bursts)
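
To keep this list current, single-replica deployments can be pulled from the deployment listing instead of being tracked by hand. The sketch below is a small filter over `kubectl get deployments --no-headers` output, where the second column is READY in the form ready/desired. The function name `single_replica_deployments` is hypothetical.

```shell
# Hypothetical helper: reads `kubectl get deployments --no-headers` output on
# stdin and prints the names of deployments whose desired replica count is 1.
single_replica_deployments() {
  awk '{ split($2, ready, "/"); if (ready[2] == 1) print $1 }'
}

# Usage against the cluster (not executed here):
#   kubectl get deployments -n sapoche --no-headers | single_replica_deployments
```

KEDA-managed consumers sitting at their minimum also appear in the output, so cross-check the result against `kubectl get hpa -n sapoche` before treating an entry as a single point of failure.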

Auto-Scaling Status

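| Workload | Type | Current | Min | Max | Status |
|---|---|---|---|---|---|
| PHP Backend | CPU/Memory | 4-8 | 1 | 20 | Active |
| PDF Generate | Queue | 2 | 1 | 15 | KEDA Active |
| Checkup Orders | Queue | 1 | 1 | 8 | KEDA Active |
| LIS Test Result | Queue | 1 | 1 | 4 | KEDA Active |
| PDF Examination | Queue | 1 | 1 | 3 | KEDA Active |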

Recommendations

  1. Redis High Availability:

    • Currently 1 replica - single point of failure
    • Consider Redis Sentinel (3 nodes) or Redis Cluster
    • Critical for caching and session management
  2. Consumer Resilience:

    • Most consumers at 1 replica
    • KEDA HPAs configured for 4 consumers, 3 of which are currently at their minimum of 1
    • Consider baseline of 2 replicas for critical consumers
  3. Monitoring Priorities:

    • Monitor HPA scaling events
    • Track queue depths (KEDA metrics)
    • Monitor PDF generation times
    • Alert on Redis failures
    • Track Orthanc storage capacity
  4. Resource Optimization:

    • Review scaled-to-0 deployments:
      • consumer-pat-dtc-homekit-register
      • consumer-revenue
      • diag-design-system-nginx
    • Remove if permanently unused
  5. Scaling Thresholds:

    • PHP Backend HPA target: CPU 85% (consider lowering to 70% for faster response)
    • KEDA queue thresholds: 20 messages (review if appropriate)
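
To make the threshold recommendation concrete: the HPA ignores deviations inside a tolerance band around the target, so scale-out only begins once average utilization exceeds target × (1 + tolerance). The sketch below assumes the Kubernetes default tolerance of 10%, which this cluster may override. `scale_up_trigger` is a hypothetical helper.

```shell
# Hypothetical helper: utilization (in percent) above which the HPA starts
# adding replicas, ASSUMING the default 10% tolerance band.
scale_up_trigger() {
  target=$1
  # target * 1.1, computed in tenths of a percent to stay in integer arithmetic
  tenths=$(( target * 11 ))
  echo "$(( tenths / 10 )).$(( tenths % 10 ))"
}

scale_up_trigger 85   # current target: prints 93.5
scale_up_trigger 70   # proposed target: prints 77.0
```

Under that assumption the current 85% target leaves little headroom before pods saturate, while a 70% target would start adding replicas at roughly 77% average CPU.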

Troubleshooting

Consumer not processing:

# Check consumer logs
kubectl logs -f deployment/consumer-lis-testresult -n sapoche

# Check if KEDA scaler is working
kubectl describe scaledobject consumer-lis-testresult -n sapoche

# Check queue connection
kubectl exec deployment/consumer-lis-testresult -n sapoche -- env | grep QUEUE

# Restart consumer
kubectl rollout restart deployment/consumer-lis-testresult -n sapoche

Worker backlog:

# Check worker logs
kubectl logs deployment/worker-sapoche-pdf -n sapoche --tail=100

# Scale up workers temporarily
kubectl scale deployment/worker-sapoche-pdf -n sapoche --replicas=5

# Check Redis queue depth (queue_name is a placeholder; substitute the actual list key)
kubectl exec -it deployment/redis-master -n sapoche -- redis-cli LLEN queue_name

PHP application issues:

# Check PHP pods
kubectl get pods -n sapoche | grep sp-bnkd-php

# Check HPA status
kubectl describe hpa sp-bnkd-php -n sapoche

# Check PHP logs
kubectl logs -f deployment/sp-bnkd-php -n sapoche

# Check Nginx logs
kubectl logs -f deployment/sp-bknd-nginx -n sapoche

# Execute into PHP pod
kubectl exec -it deployment/sp-bnkd-php -n sapoche -- bash

HPA not scaling:

# Check HPA events
kubectl describe hpa sp-bnkd-php -n sapoche

# Check metrics server
kubectl top nodes
kubectl top pods -n sapoche

# Check KEDA operator
kubectl get pods -n keda
kubectl logs -f -n keda deployment/keda-operator

DICOM issues:

# Check Orthanc API (4 replicas)
kubectl logs -f deployment/orthanc-read-api -n sapoche

# Test Orthanc API
kubectl port-forward -n sapoche service/orthanc-read-api-service 8042:80
# Access http://localhost:8042

# Check storage capacity
kubectl exec -it deployment/orthanc-read-api -n sapoche -- df -h

Frontend issues:

# Check frontend pods
kubectl get pods -n sapoche | grep sapoche-frontend-web

# Check frontend logs
kubectl logs -f deployment/sapoche-frontend-web -n sapoche

# Test frontend service
kubectl port-forward -n sapoche service/sapoche-frontend-web 8080:80
# Access http://localhost:8080

Performance Metrics

Current Scale (Production Load)

  • PHP Backend: 4-8 replicas (auto-scaling based on load)
  • Workers:
    • Default queue: 10 replicas (high volume)
    • Notifications: 10 replicas (high volume)
    • Booking: 4 replicas (critical path)
    • Others: 1-3 replicas
  • Consumers: Mostly 1 replica; 4 of them use KEDA for burst scaling
  • Frontend: 2 replicas

Total Pod Count

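~74-78 pods running in the sapoche namespace at the replica counts listed above (15 consumer, 43 worker, 13-17 backend, 3 frontend); the total varies further with HPA/KEDA scaling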
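
The total can be recomputed from the deployment listing rather than maintained by hand. This sketch sums the ready count from the READY column (ready/desired) of `kubectl get deployments --no-headers` output. The function name `total_ready_pods` is hypothetical, and pods not owned by a Deployment would not be counted.

```shell
# Hypothetical helper: reads `kubectl get deployments --no-headers` output on
# stdin and prints the sum of ready replicas across all listed deployments.
total_ready_pods() {
  awk '{ split($2, ready, "/"); total += ready[1] } END { print total + 0 }'
}

# Usage against the cluster (not executed here):
#   kubectl get deployments -n sapoche --no-headers | total_ready_pods
```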

Important Notes

PRODUCTION ENVIRONMENT:

  • This is a CRITICAL PRODUCTION namespace
  • Serves medical platform - exercise extreme caution
  • Changes should be tested in staging first
  • Monitor HPA scaling during changes
  • Have rollback plan ready