spc--lis
Overview
- Namespace: spc--lis
- Purpose: Sapoche Laboratory Information System (LIS) - PRODUCTION
- Age: 416 days (~14 months)
- Status: Active - Critical medical laboratory system
- Workloads: 26 deployments (24 active, 2 scaled to 0)
- Environment: PRODUCTION - Handles all laboratory test processing
Architecture
The Laboratory Information System (LIS) manages the complete laboratory workflow:
- Main Application: REST API backend (6 replicas with HPA)
- Event Consumers: Process laboratory events and results (12 deployments, 11 active)
- Batch Publishers: Async job publishers for various workflows (11 deployments, 10 active)
- Cron Jobs: Scheduled tasks
- Observability: OpenTelemetry collector for tracing
Auto-Scaling Configuration
HorizontalPodAutoscalers (4 HPAs)
| HPA Name | Target | Min | Max | Current | Metrics | Type |
|---|---|---|---|---|---|---|
| spc--lis--be--app--prod | Main app | 2 | 20 | 6 | CPU: 46%/80%, Mem: 500Mi | Standard HPA |
| keda-hpa-consumer-lis-sample-status | Sample status consumer | 1 | 10 | 1 | Queue: 0/10 | KEDA |
| keda-hpa-consumer-lis-work-order | Work order consumer | 1 | 4 | 1 | Queue: 0/10 | KEDA |
| keda-hpa-consumer-pdf-webhook | PDF webhook consumer | 2 | 15 | 2 | Queue: 0/50 | KEDA |
Scaling Summary:
- Main app auto-scales based on CPU/memory load
- 3 consumers use KEDA for queue-based autoscaling
- PDF webhook maintains minimum 2 replicas for availability
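The main app's scaling can be reasoned about with the rule Kubernetes documents for HPAs: desired = ceil(current replicas x current metric / target metric), clamped to the min/max bounds. The sketch below applies that rule to the CPU figures from the HPA table. The function name is hypothetical, the 90% load is an invented scenario, and the controller's tolerance band and stabilization window are deliberately ignored.

```shell
#!/bin/sh
# Sketch of the documented HPA rule (integer arithmetic only):
#   desired = ceil(current_replicas * current_metric / target_metric)
# clamped to [min, max]. Tolerance and stabilization are NOT modelled.
hpa_desired_replicas() {
  cur=$1; metric=$2; target=$3; lo=$4; hi=$5
  # Ceiling division without floating point.
  desired=$(( (cur * metric + target - 1) / target ))
  if [ "$desired" -lt "$lo" ]; then desired=$lo; fi
  if [ "$desired" -gt "$hi" ]; then desired=$hi; fi
  echo "$desired"
}

# CPU reading from the table: 6 replicas at 46% against an 80% target.
hpa_desired_replicas 6 46 80 2 20   # prints 4
# Hypothetical peak: 6 replicas at 90% CPU, 80% vs 70% target.
hpa_desired_replicas 6 90 80 2 20   # prints 7
hpa_desired_replicas 6 90 70 2 20   # prints 8
```

On the CPU reading alone the rule gives 4, below the current 6. An HPA with several metrics takes the largest result, so the memory metric or the scale-down stabilization window may be what holds the count at 6.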
Workload Categories
Main Application (1 deployment)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| spc--lis--be--app--prod | 6/6 | Running + HPA | Main LIS API (auto-scales 2-20) |
The main application handles:
- Laboratory test ordering
- Sample tracking
- Result entry and verification
- Quality control
- Integration with lab analyzers
- RESTful API for frontend and integrations
Event Consumers (12 deployments, 11 active)
Process laboratory-related events from message queues:
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| consumer-iris-test-result | 1/1 | Running | IRIS test result integration |
| consumer-lis-auto-verify | 1/1 | Running | Automatic result verification |
| consumer-lis-qc-data | 1/1 | Running | Quality control data processing |
| consumer-lis-sample-status | 1/1 | Running + HPA | Sample status updates (scales 1-10) |
| consumer-lis-test-results-sync | 1/1 | Running | Test result synchronization |
| consumer-lis-vid-attachment-upload | 2/2 | Running | Visit ID attachment uploads (2 replicas) |
| consumer-lis-work-order | 1/1 | Running + HPA | Work order processing (scales 1-4) |
| consumer-order | 1/1 | Running | Order processing |
| consumer-pdf-webhook | 2/2 | Running + HPA | PDF webhook events (scales 2-15) |
| consumer-status-attune | 1/1 | Running | Attune device status |
| consumer-status-attune-forwarder | 1/1 | Running | Attune status forwarder |
Scaled to 0:
- consumer-lis-attune-evoke-ai (AI integration, inactive)
Batch Publishers (11 deployments, 10 active)
Publish async jobs for background processing:
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| wrk--batch-publisher | 1/1 | Running | General batch job publisher |
| wrk--batch-publisher-audit | 1/1 | Running | Audit log publishing |
| wrk--batch-publisher-lis-auto-verify | 1/1 | Running | Auto-verification jobs |
| wrk--batch-publisher-lis-qc-data | 1/1 | Running | QC data jobs |
| wrk--batch-publisher-lis-test-result | 1/1 | Running | Test result processing jobs |
| wrk--batch-publisher-lis-test-result-corp | 1/1 | Running | Corporate test results |
| wrk--batch-publisher-lis-test-result-repush | 1/1 | Running | Result re-push jobs |
| wrk--batch-publisher-lis-test-results-sync | 1/1 | Running | Result sync jobs |
| wrk--batch-publisher-sample-status | 1/1 | Running | Sample status jobs |
| wrk--batch-publisher-work-order | 1/1 | Running | Work order jobs |
Scaled to 0:
- wrk--batch-publisher-lis-attune-evoke-ai (AI integration, inactive)
Supporting Services (2 deployments)
| Name | Replicas | Status | Purpose |
|---|---|---|---|
| spc--lis--be--cron--prod | 1/1 | Running | Scheduled cron jobs |
| spc-lis-otel-collector | 1/1 | Running | OpenTelemetry trace collector |
Services
| Name | Type | Cluster IP | Ports | NodePort | Purpose |
|---|---|---|---|---|---|
| spc--lis--be--app--prod | NodePort | 10.8.27.117 | 80 | 32705 | Main LIS API |
| spc-lis-otel-collector | ClusterIP | 10.8.24.224 | 4317, 4318 | - | OTLP trace collection |
Access & Management
View all resources:
kubectl get all -n spc--lis
Check main application:
# View app pods
kubectl get pods -n spc--lis | grep "app--prod"
# Check HPA status
kubectl describe hpa spc--lis--be--app--prod -n spc--lis
# View logs
kubectl logs -f deployment/spc--lis--be--app--prod -n spc--lis
Check consumers:
# All consumers
kubectl get pods -n spc--lis | grep consumer
# KEDA-scaled consumers
kubectl get hpa -n spc--lis | grep keda
# Consumer logs
kubectl logs -f deployment/spc--lis--be--consumer-lis-sample-status--prod -n spc--lis
Check batch publishers:
# All workers
kubectl get pods -n spc--lis | grep "wrk--"
# Worker logs
kubectl logs -f deployment/spc--lis--be--wrk--batch-publisher-lis-test-result--prod -n spc--lis
Scaling:
# View current scaling
kubectl get hpa -n spc--lis
# Manual scale (HPA will override)
kubectl scale deployment spc--lis--be--app--prod -n spc--lis --replicas=10
# Check KEDA scaled objects
kubectl get scaledobjects -n spc--lis
Restart services:
# Restart main app
kubectl rollout restart deployment/spc--lis--be--app--prod -n spc--lis
# Restart specific consumer
kubectl rollout restart deployment/spc--lis--be--consumer-pdf-webhook--prod -n spc--lis
# Restart all consumers
kubectl get deployments -n spc--lis | grep consumer | awk '{print $1}' | xargs -I {} kubectl rollout restart deployment/{} -n spc--lis
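The one-liner above restarts every consumer at the same moment. A more cautious variant, sketched here, restarts them one at a time and waits for each rollout to finish. The helper names (`run`, `restart_consumers`) and the 300s timeout are illustrative choices, and the script only prints commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Print commands instead of running them unless DRY_RUN=0,
# so the sequence can be rehearsed against production safely.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@" </dev/null   # keep kubectl from consuming the loop's stdin
  fi
}

# Reads `kubectl get deployments -o name` output on stdin and restarts
# each consumer in turn, waiting for readiness before the next one.
restart_consumers() {
  ns=$1
  grep consumer | while read -r deploy; do
    run kubectl rollout restart "$deploy" -n "$ns"
    run kubectl rollout status "$deploy" -n "$ns" --timeout=300s
  done
}

# Usage (dry run by default):
#   kubectl get deployments -n spc--lis -o name | restart_consumers spc--lis
```

Going one deployment at a time trades speed for a smaller blast radius: if a consumer fails to come back, `rollout status` times out and the problem shows up before the next consumer is restarted.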
Monitoring
Resource usage:
kubectl top pods -n spc--lis --sort-by=memory
kubectl top pods -n spc--lis --sort-by=cpu
HPA metrics:
# All HPAs
kubectl get hpa -n spc--lis
# Detailed HPA status
kubectl describe hpa spc--lis--be--app--prod -n spc--lis
kubectl describe hpa keda-hpa-spc--lis--be--consumer-pdf-webhook--prod -n spc--lis
Deployment status:
kubectl get deployments -n spc--lis
Events:
# Sorted oldest first, so tail shows the most recent events
kubectl get events -n spc--lis --sort-by='.lastTimestamp' | tail -20
Traces (OpenTelemetry):
# Check OTEL collector
kubectl logs -f deployment/spc-lis-otel-collector -n spc--lis
# Port forward to access OTEL endpoints
kubectl port-forward -n spc--lis deployment/spc-lis-otel-collector 4317:4317
Data Flow
External Requests (via APISIX/Traefik)
↓
spc--lis--be--app--prod (NodePort 32705)
↓
Main LIS API (2-20 replicas via HPA, currently 6)
↓
Database (external)
↓
Events Published to Message Queue
↓
Consumers Process Events
↓
Batch Publishers Create Background Jobs
↓
Workers Process Jobs (in other namespaces)
↓
Results, Notifications, PDFs
OpenTelemetry Tracing
Application → OTEL Collector (4317/4318) → Backend (Grafana/Jaeger)
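For the application-to-collector hop, a workload would typically address the collector by its Service DNS name. The sketch below builds that address and exports the standard OpenTelemetry SDK environment variables. It assumes the default `cluster.local` DNS suffix, and whether the LIS workloads are actually configured through these variables is not stated in this document.

```shell
#!/bin/sh
# Build the in-cluster OTLP endpoint for the collector Service.
# Assumes the default cluster DNS suffix "cluster.local".
otlp_endpoint() {
  service=$1; namespace=$2; port=$3
  echo "http://${service}.${namespace}.svc.cluster.local:${port}"
}

# 4317 is the OTLP/gRPC port, 4318 the OTLP/HTTP port.
OTEL_EXPORTER_OTLP_ENDPOINT=$(otlp_endpoint spc-lis-otel-collector spc--lis 4317)
OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_EXPORTER_OTLP_ENDPOINT OTEL_EXPORTER_OTLP_PROTOCOL

echo "$OTEL_EXPORTER_OTLP_ENDPOINT"
# prints http://spc-lis-otel-collector.spc--lis.svc.cluster.local:4317
```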
Laboratory Workflow
1. Test Ordering
- Orders created via API
- Work orders published to queue
- consumer-order / consumer-lis-work-order process
2. Sample Collection & Tracking
- Sample status updates
- consumer-lis-sample-status processes (KEDA-scaled)
- Barcode scanning and tracking
3. Analysis & Results
- Analyzer integration (IRIS, Attune devices)
- consumer-iris-test-result processes results
- Auto-verification via consumer-lis-auto-verify
4. Quality Control
- QC data processing
- consumer-lis-qc-data handles QC events
- wrk--batch-publisher-lis-qc-data publishes QC jobs
5. Result Verification & Reporting
- Manual/auto verification
- PDF generation via consumer-pdf-webhook
- Result synchronization across systems
6. Attachments & Documents
- Visit attachments upload (2 replicas for reliability)
- PDF webhooks (2-15 replicas based on load)
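When a stage of this workflow stalls, it helps to jump straight to the consumers involved. The lookup below is a small sketch of that idea. The stage keys are hypothetical labels, only consumers named in the workflow above are listed, and the full deployment names assume the `spc--lis--be--<name>--prod` pattern seen in the kubectl examples.

```shell
#!/bin/sh
# Map a workflow stage to the consumers this document names for it.
# Stage keys are hypothetical labels chosen for this sketch.
consumers_for_stage() {
  case "$1" in
    ordering)    echo "consumer-order consumer-lis-work-order" ;;
    sample)      echo "consumer-lis-sample-status" ;;
    analysis)    echo "consumer-iris-test-result consumer-lis-auto-verify" ;;
    qc)          echo "consumer-lis-qc-data" ;;
    reporting)   echo "consumer-pdf-webhook" ;;
    attachments) echo "consumer-lis-vid-attachment-upload consumer-pdf-webhook" ;;
    *)           echo "unknown stage: $1" >&2; return 1 ;;
  esac
}

# Print (not run) the log command for each consumer in one stage.
# The deployment name pattern is an assumption, see the note above.
for c in $(consumers_for_stage qc); do
  echo "kubectl logs deployment/spc--lis--be--${c}--prod -n spc--lis --tail=100"
done
```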
Production Considerations
High Availability
Well Configured:
- Main API: 6 replicas with HPA (scales to 20)
- Critical consumers: 2 replicas (vid-attachment-upload, pdf-webhook)
- KEDA autoscaling for queue-based consumers
Single Points of Failure:
- Most consumers: 1 replica
- All batch publishers: 1 replica
- Cron job: 1 replica
- OTEL collector: 1 replica
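The single-replica list can be regenerated from live data rather than maintained by hand. This is a minimal sketch that filters `kubectl get deployments --no-headers` output on the READY column, which has the form ready/desired. The function name is hypothetical. HPA- or KEDA-managed deployments that currently sit at 1 replica will also appear, so the output needs a manual pass.

```shell
#!/bin/sh
# Print deployments whose desired replica count is exactly 1.
# Input (stdin): `kubectl get deployments --no-headers` rows, where
# column 1 is NAME and column 2 is READY in the form "<ready>/<desired>".
single_replica_deployments() {
  awk '{ split($2, r, "/"); if (r[2] == 1) print $1 }'
}

# Usage:
#   kubectl get deployments -n spc--lis --no-headers | single_replica_deployments
```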
Auto-Scaling Configuration
| Workload | Type | Current | Min | Max | Scaling Metric |
|---|---|---|---|---|---|
| Main API | Standard HPA | 6 | 2 | 20 | CPU 80%, Mem 500Mi |
| Sample Status Consumer | KEDA | 1 | 1 | 10 | Queue depth 10 |
| Work Order Consumer | KEDA | 1 | 1 | 4 | Queue depth 10 |
| PDF Webhook Consumer | KEDA | 2 | 2 | 15 | Queue depth 50 |
Recommendations
- Main API Scaling:
  - Currently at 6 replicas (30% of max capacity)
  - Consider lowering the CPU target from 80% to 70% so scale-out starts earlier
  - Monitor during peak hours
- Consumer Reliability:
  - Most consumers run a single replica; consider a baseline of 2 for the critical ones
  - KEDA autoscaling is configured, but the scaled consumers currently sit at their minimum
  - Review queue thresholds (10/50 messages)
- Batch Publisher Resilience:
  - All at 1 replica - single points of failure
  - Consider 2 replicas for critical publishers:
    - lis-test-result
    - lis-test-result-corp
    - sample-status
- Observability:
  - OTEL collector at 1 replica
  - Consider 2+ replicas or use a DaemonSet
  - Monitor trace collection lag
- Resource Cleanup:
  - 2 deployments scaled to 0 (AI integration)
  - Review and remove if permanently unused
- Monitoring Priorities:
  - Main API response times
  - Queue depths and consumer lag
  - PDF generation success rate
  - Auto-verification accuracy
  - Sample tracking accuracy
Troubleshooting
Main API issues:
# Check API pods
kubectl get pods -n spc--lis | grep "app--prod"
# Check HPA status
kubectl describe hpa spc--lis--be--app--prod -n spc--lis
# Check logs
kubectl logs -f deployment/spc--lis--be--app--prod -n spc--lis --tail=100
# Test API endpoint
kubectl port-forward -n spc--lis service/spc--lis--be--app--prod 8080:80
# Access http://localhost:8080
Consumer not processing:
# Check consumer status
kubectl get pods -n spc--lis | grep consumer-lis-sample-status
# Check KEDA scaler
kubectl describe scaledobject spc--lis--be--consumer-lis-sample-status--prod -n spc--lis
# Check logs
kubectl logs -f deployment/spc--lis--be--consumer-lis-sample-status--prod -n spc--lis
# Restart consumer
kubectl rollout restart deployment/spc--lis--be--consumer-lis-sample-status--prod -n spc--lis
HPA not scaling:
# Check HPA events
kubectl describe hpa spc--lis--be--app--prod -n spc--lis
# Check metrics server
kubectl top nodes
kubectl top pods -n spc--lis
# Check KEDA operator
kubectl get pods -n keda
kubectl logs -n keda deployment/keda-operator
PDF generation delays:
# Check PDF webhook consumer
kubectl get hpa -n spc--lis | grep pdf-webhook
kubectl describe hpa keda-hpa-spc--lis--be--consumer-pdf-webhook--prod -n spc--lis
# Check consumer logs
kubectl logs -f deployment/spc--lis--be--consumer-pdf-webhook--prod -n spc--lis
# Check queue depth (via KEDA)
kubectl describe scaledobject spc--lis--be--consumer-pdf-webhook--prod -n spc--lis
Result synchronization issues:
# Check sync consumer
kubectl logs -f deployment/spc--lis--be--consumer-lis-test-results-sync--prod -n spc--lis
# Check sync batch publisher
kubectl logs -f deployment/spc--lis--be--wrk--batch-publisher-lis-test-results-sync--prod -n spc--lis
# Restart both
kubectl rollout restart deployment/spc--lis--be--consumer-lis-test-results-sync--prod -n spc--lis
kubectl rollout restart deployment/spc--lis--be--wrk--batch-publisher-lis-test-results-sync--prod -n spc--lis
Tracing issues:
# Check OTEL collector
kubectl logs -f deployment/spc-lis-otel-collector -n spc--lis
# Check collector metrics
kubectl port-forward -n spc--lis deployment/spc-lis-otel-collector 8888:8888
# Access http://localhost:8888/metrics
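Once the metrics endpoint is reachable, a quick way to spot export problems is to total a failure counter across all exporters. The helper below is a sketch that sums any counter from Prometheus text format. The metric name in the usage line, `otelcol_exporter_send_failed_spans`, is an assumption: names and suffixes vary between collector versions, so check the actual `/metrics` output first. It also assumes lines carry no trailing timestamp.

```shell
#!/bin/sh
# Sum a metric across all label sets in Prometheus text format (stdin).
# Matches lines that START with the given name, so a "_total" suffix
# also matches; "# HELP" / "# TYPE" comment lines are skipped.
sum_metric() {
  awk -v m="$1" '
    $0 !~ /^#/ && index($0, m) == 1 { total += $NF }
    END { printf "%d\n", total }
  '
}

# Usage, with the port-forward above running in another terminal:
#   curl -s http://localhost:8888/metrics | sum_metric otelcol_exporter_send_failed_spans
```

A total that keeps growing between two runs suggests the collector cannot reach its backend. A total of 0 may also mean the metric name did not match.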
Performance Metrics
Current Scale (Production Load)
- Main API: 6 replicas (moderate load, can scale to 20)
- Consumers:
  - Critical: 2 replicas each (attachment upload, PDF webhook)
  - Standard: 1 replica each (sample status and work order can scale out via KEDA)
- Batch Publishers: 1 replica each
- Total Pods: ~30-35 pods in namespace
Scaling Behavior
- Main API HPA: Scales on CPU (80% target) and memory (500Mi target)
- KEDA HPAs: Scale based on queue depth
  - Sample status: 10 messages per replica
  - Work order: 10 messages per replica
  - PDF webhook: 50 messages per replica
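The "messages per replica" figures translate into replica counts roughly as ceil(backlog / target), clamped to each consumer's min and max. The sketch below works that through for the three KEDA-scaled consumers. It assumes the thresholds act as per-replica average targets, which is how KEDA queue-length scalers commonly behave; the queue technology is not stated here, and the backlog of 120 is a made-up figure.

```shell
#!/bin/sh
# desired = ceil(backlog / target_per_replica), clamped to [min, max].
keda_desired_replicas() {
  q=$1; per_replica=$2; lo=$3; hi=$4
  desired=$(( (q + per_replica - 1) / per_replica ))
  if [ "$desired" -lt "$lo" ]; then desired=$lo; fi
  if [ "$desired" -gt "$hi" ]; then desired=$hi; fi
  echo "$desired"
}

# Columns: name, target per replica, min, max (from the scaling tables).
backlog=120   # hypothetical queue depth
while read -r name per_replica lo hi; do
  echo "$name: $(keda_desired_replicas "$backlog" "$per_replica" "$lo" "$hi")"
done <<'EOF'
sample-status 10 1 10
work-order 10 1 4
pdf-webhook 50 2 15
EOF
# prints:
#   sample-status: 10
#   work-order: 4
#   pdf-webhook: 3
```

At that backlog both 10-message consumers are already pinned at their maximum, which is the kind of result to look for when reviewing the thresholds.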
Integration Points
External Systems
- Lab Analyzers:
  - IRIS (hematology analyzer)
  - Attune (flow cytometer)
  - Consumer-based integration
- Corporate Systems:
  - Test result synchronization
  - Corporate reporting
- PDF Generation:
  - Webhook-based PDF generation
  - Result reports, labels, certificates
- Patient Portal:
  - Test result delivery
  - Notification integration
Important Notes
PRODUCTION ENVIRONMENT:
- This is a CRITICAL PRODUCTION system handling medical laboratory data
- Downtime directly impacts patient care and laboratory operations
- Changes must be tested in staging first
- Coordinate with laboratory operations team
- Monitor carefully during deployments
- Have immediate rollback plan ready
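A rollback plan is easier to execute under pressure if it is scripted ahead of time. The following sketch wraps the standard `kubectl rollout` subcommands. The helper names and the 300s timeout are illustrative, and it only prints the commands unless `DRY_RUN=0` is set, so it can be rehearsed before a deployment.

```shell
#!/bin/sh
# Print commands instead of running them unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Roll a deployment back to the previous revision, or to a specific
# revision when a third argument is given, then wait for readiness.
rollback() {
  deploy=$1; ns=$2; revision=${3:-}
  run kubectl rollout history "deployment/$deploy" -n "$ns"
  if [ -n "$revision" ]; then
    run kubectl rollout undo "deployment/$deploy" -n "$ns" --to-revision="$revision"
  else
    run kubectl rollout undo "deployment/$deploy" -n "$ns"
  fi
  run kubectl rollout status "deployment/$deploy" -n "$ns" --timeout=300s
}

# Usage (dry run by default):
#   rollback spc--lis--be--app--prod spc--lis
#   DRY_RUN=0 rollback spc--lis--be--app--prod spc--lis 12
```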
Compliance: Laboratory data is subject to strict regulatory requirements (HIPAA, local medical regulations)