Production1 Cluster (asia-southeast1)
Cluster Information
- Cluster Name: gke_diagvn_asia-southeast1_production1
- Project: diagvn
- Region: Asia Southeast 1 (Singapore)
- Zone: Regional (multi-zone)
- Environment: Production
- Kubernetes Version: v1.30.9-gke.1127000 / v1.30.3-gke.1969002
Control Plane
- API Server: https://35.185.183.71
- Status: Running
- Components:
  - GLBCDefaultBackend
  - KubeDNS
  - Metrics-server
Cluster Resources
Nodes (10 nodes across 4 node pools)
| Node Name | Node Pool | Machine Type | CPU | Memory | Age | Usage |
|---|---|---|---|---|---|---|
| gke-production1-diag-vpn-enable-pool-cf6245a3-ckjf | diag-vpn-enable-pool | e2-standard-4 | 4 | 16GB | 224d | CPU: 15%, Mem: 56% |
| gke-production1-high-performance-a0df873e-g9nd | high-performance | t2d-standard-2 | 2 | 8GB | 10h | CPU: 26%, Mem: 68% |
| gke-production1-high-performance-a0df873e-sjtp | high-performance | t2d-standard-2 | 2 | 8GB | 16d | CPU: 18%, Mem: 57% |
| gke-production1-high-performance-a0df873e-vg7v | high-performance | t2d-standard-2 | 2 | 8GB | 14d | CPU: 39%, Mem: 62% |
| gke-production1-production-asia-south-7b3f9129-hijo | production-asia-southeast1-b | e2-standard-4 | 4 | 16GB | 381d | CPU: 29%, Mem: 84% |
| gke-production1-production-asia-south-8d9044df-jiw3 | production-asia-southeast1-b | e2-standard-4 | 4 | 16GB | 381d | CPU: 33%, Mem: 63% |
| gke-production1-production-asia-south-8d9044df-uilb | production-asia-southeast1-b | e2-standard-4 | 4 | 16GB | 381d | CPU: 31%, Mem: 65% |
| gke-production1-production-asia-south-e4fd6e82-e28p | production-asia-southeast1-b | e2-standard-4 | 4 | 16GB | 381d | CPU: 47%, Mem: 93% (WARNING) |
| gke-production1-production-asia-south-e4fd6e82-ex49 | production-asia-southeast1-b | e2-standard-4 | 4 | 16GB | 381d | CPU: 98% (CRITICAL), Mem: 93% (WARNING) |
| gke-production1-production-chrome-spo-409316b8-hlrk | production-chrome-spot (Spot) | e2-standard-8 | 8 | 32GB | 4h | CPU: 5%, Mem: 4% |
Total Capacity: 38 vCPUs, 152 GB RAM (sum of the 10 nodes listed above)
CRITICAL: Node ex49 has 98% CPU usage and 93% memory usage
WARNING: Node e28p has 93% memory usage
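The alerts above can be reproduced from live metrics. The following is a minimal sketch, not the tooling that generated this report: the function name `flag_node_pressure` and the default thresholds (90% for WARNING, 95% for CRITICAL) are assumptions chosen to match the two alerts listed here. It reads the output of `kubectl top nodes --no-headers` on stdin and classifies each node by the higher of its CPU and memory percentages.

```shell
# Classify nodes from `kubectl top nodes --no-headers` output on stdin.
# Expected columns: NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
# $1 = warning threshold in percent (assumed default 90)
# $2 = critical threshold in percent (assumed default 95)
flag_node_pressure() {
  awk -v warn="${1:-90}" -v crit="${2:-95}" '
    {
      cpu = $3; mem = $5
      sub(/%/, "", cpu); sub(/%/, "", mem)
      cpu += 0; mem += 0                      # force numeric comparison
      peak = (cpu > mem) ? cpu : mem
      if (peak >= crit)      level = "CRITICAL"
      else if (peak >= warn) level = "WARNING"
      else next                               # healthy node, print nothing
      printf "%s %s cpu=%d%% mem=%d%%\n", level, $1, cpu, mem
    }'
}
```

Typical use would be `kubectl top nodes --no-headers | flag_node_pressure`, optionally passing different thresholds as the first and second arguments.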
Node Pools Summary
| Pool | Machine Type | Spot | Nodes | Total vCPUs | Total Memory | Purpose |
|---|---|---|---|---|---|---|
| diag-vpn-enable-pool | e2-standard-4 | No | 1 | 4 | 16GB | VPN-enabled workloads |
| high-performance | t2d-standard-2 | No | 3 | 6 | 24GB | High-performance workloads |
| production-asia-southeast1-b | e2-standard-4 | No | 5 | 20 | 80GB | Main production workloads |
| production-chrome-spot | e2-standard-8 | Yes | 1 | 8 | 32GB | Chrome headless (cost-optimized) |
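Cluster-wide capacity follows from this table by multiplying each pool's node count by its per-node size. A minimal sketch of that arithmetic, assuming the rows are transcribed by hand as `pool nodes vcpus_per_node mem_gb_per_node` (the `pool_totals` name and the input format are illustrative only):

```shell
# Sum capacity from "pool nodes vcpus_per_node mem_gb_per_node" rows on stdin.
pool_totals() {
  awk '{ cpu += $2 * $3; mem += $2 * $4 }
       END { printf "%d vCPUs, %d GB RAM\n", cpu, mem }'
}

# Rows transcribed from the Node Pools Summary table above.
pool_totals <<'EOF'
diag-vpn-enable-pool 1 4 16
high-performance 3 2 8
production-asia-southeast1-b 5 4 16
production-chrome-spot 1 8 32
EOF
# prints: 38 vCPUs, 152 GB RAM
```

These are nominal machine-type sizes; allocatable capacity on each node is lower once system reservations are subtracted.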
Workload Overview
- Total Deployments: 375
- Total Namespaces: 88
- Active Namespaces with Workloads: 66
- HorizontalPodAutoscalers: 40
- StatefulSets: Multiple (databases, etcd, monitoring)
Access
To connect to this cluster:
# Switch context
kubectx gke_diagvn_asia-southeast1_production1
# Verify connection
kubectl cluster-info
# View nodes
kubectl get nodes
# View all resources
kubectl get all --all-namespaces
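If the context is not yet present in the local kubeconfig, it has to be fetched first. GKE context names follow the pattern `gke_PROJECT_LOCATION_CLUSTER`, so the pieces can be recovered from the name itself. The sketch below only prints the corresponding `gcloud` command rather than running it; the helper name `gke_credentials_cmd` is hypothetical, and `--region` is assumed because this cluster is regional.

```shell
# Split a GKE context name (gke_PROJECT_LOCATION_CLUSTER) and print the
# gcloud command that would fetch credentials for it. Nothing is executed.
gke_credentials_cmd() (
  rest="${1#gke_}"            # PROJECT_LOCATION_CLUSTER
  project="${rest%%_*}"
  rest="${rest#*_}"           # LOCATION_CLUSTER
  location="${rest%%_*}"
  cluster="${rest#*_}"
  printf 'gcloud container clusters get-credentials %s --region %s --project %s\n' \
    "$cluster" "$location" "$project"
)

gke_credentials_cmd gke_diagvn_asia-southeast1_production1
# prints: gcloud container clusters get-credentials production1 --region asia-southeast1 --project diagvn
```

Where `kubectx` is not installed, `kubectl config use-context gke_diagvn_asia-southeast1_production1` switches context with kubectl alone.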
External Access Points
| Service | Namespace | External IP | Ports | Purpose |
|---|---|---|---|---|
| apisix-gateway | apisix | 34.87.114.121 | 80, 443, 2222 | API Gateway |
| traefik | default | 34.126.178.249 | 80, 443 | Primary ingress controller |
| workflow-management | workflow-management | 34.124.238.59 | 5678 | n8n workflow automation |
Application Namespaces (Organized by Domain)
Infrastructure & Platform Services
| Namespace | Purpose | Deployments | Status |
|---|---|---|---|
| apisix | API Gateway | Multiple | Active |
| etcd | Key-value store | StatefulSet | Active |
| monitoring | Prometheus/Grafana stack | 10 deployments | Active |
| keda | Event-driven autoscaling | 3 | Active |
| sentry | Error tracking | Active | Active |
| sentry-relay | Error tracking relay | 2 | Active |
| netsuite-producer | NetSuite ERP Event Producer | 0 | Empty |
Patient Portal (PAT) Services
| Namespace | Purpose | Deployments | Status |
|---|---|---|---|
| pat--authenticate | Authentication service | Multiple | Active |
| pat--booking--be | Booking backend | 14 | Active |
| pat--patient-mgt--be | Patient management | Multiple | Active |
| pat--test-result--be | Test results | 21 | Active |
| pat--notification--be | Notifications | 8 | Active |
| pat--reminder-booking--be | Booking reminders | Multiple | Active |
| pat--dtc--be | DTC backend | Multiple | Active |
| pat--dlq--be | Dead letter queue | Multiple | Active |
| pat--webapp | Patient web application | Multiple | Active |
Referral/Results (RFD) Services
| Namespace | Purpose | Deployments | Status |
|---|---|---|---|
| rfd--webapp | RFD web application | 7 | Active |
| rfd--user-mgt | User management | 7 | Active |
| rfd--notification | Notifications | 8 | Active |
| rfd--test-library--be | Test library backend | Multiple | Active |
| rfd--doctor-kyc | Doctor KYC | Multiple | Active |
| rfd--doctor-statement | Doctor statements | Multiple | Active |
| rfd--doctor-test-result | Doctor test results | Multiple | Active |
| rfd--order-history | Order history | Multiple | Active |
| rfd--dashboard | Dashboard | Multiple | Active |
| rfd--mixpanel-integration | Analytics integration | Multiple | Active |
Sapoche Clinic (SPC) Services
| Namespace | Purpose | Deployments | Status |
|---|---|---|---|
| spc--lis | Laboratory Information System | 26 | Active |
| spc--pos | Point of Sale | 18 | Active |
| spc--webapp | Clinic web application | 7 | Active |
| spc--appointment--be | Appointments | 8 | Active |
| spc--audit | Audit logging | Multiple | Active |
| spc--delivery--be | Delivery management | Multiple | Active |
| spc--promotion--be | Promotions | Multiple | Active |
| spc--booking-qs | Booking queue system | Multiple | Active |
| spc--noti-centre--be | Notification center | Multiple | Active |
| spc--pdf-generate--be | PDF generation | 10 | Active |
| spc--purchase-order--be | Purchase orders | Multiple | Active |
| spc--websocket-server | WebSocket server | Multiple | Active |
Sapoche Core Services
| Namespace | Purpose | Deployments | Status |
|---|---|---|---|
| sapoche | Main Sapoche application | 39 | Active |
| sapoche-micro-finance | Finance microservice | 1 | Active |
| sapoche-micro-search-pid | Patient ID search | 1 | Active |
| sapoche-micro-search-vid | Visit ID search | 1 | Active |
Data Streaming Services
| Namespace | Purpose | Deployments | Status |
|---|---|---|---|
| data-streaming | Core data streaming | 3 | Inactive (Scaled to 0) |
| data-streaming-consumers | Consumer services (6 namespaces) | Multiple | Active |
| ds-ns-producer | NetSuite producer | 3 | Active |
| prod-ds-producer | Production data streaming | Multiple | Active |
| netsuite-consumer | NetSuite consumer | 11 | Active |
DIAG Internal Applications
| Namespace | Purpose | Deployments | Status |
|---|---|---|---|
| diagcorp | DIAG corporate portal | 4 (all scaled to 0) | Inactive |
| diagpass | DIAG employee pass system | 3 (all scaled to 0) | Inactive |
| diag-accounting-team | Accounting tools | 1 | Active |
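Namespaces marked "scaled to 0" can be confirmed against the live cluster. A minimal sketch, assuming three-column input of namespace, deployment name, and `spec.replicas` (the function name `scaled_to_zero_by_namespace` is illustrative):

```shell
# Count deployments with spec.replicas == 0 per namespace.
# Expects rows of: namespace name replicas
scaled_to_zero_by_namespace() {
  awk '$3 == 0 { n[$1]++ }
       END { for (ns in n) printf "%s %d\n", ns, n[ns] }' | LC_ALL=C sort
}
```

The input can be produced with `kubectl get deployments --all-namespaces --no-headers -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,REPLICAS:.spec.replicas` and piped into the function.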
Supporting Services
| Namespace | Purpose | Deployments | Status |
|---|---|---|---|
| imp | IMP application | 26 | Active |
| ecg | ECG services | 3 + CronJob | Active |
| signature | Signature services | 1 active, 4 scaled to 0 | Partially Active |
| webportal | Web portal | Multiple | Active |
| short-links | URL shortener | Multiple | Active |
| dynamic-links | Dynamic link generator | Multiple | Active |
| chrome-headless | Headless Chrome for automation | 4 | Active |
| external-container | External containers | Multiple | Active |
| scalar-docs | API documentation | Multiple | Active |
| workflow-management | n8n workflow automation | 1 | Active |
| auto-assign-bot | Automated assignment | Multiple | Active |
| slackbot | Slack integration | Multiple | Active |
| zendesk | Customer support integration | Multiple | Active |
| fingerprint | Biometric authentication | Multiple | Active |
| nocodb | No-code database | Multiple | Active |
| pmm | Performance monitoring | Multiple | Active |
Critical Alerts
Resource Pressure
IMMEDIATE ACTION REQUIRED:
Node gke-production1-production-asia-south-e4fd6e82-ex49:
- CPU Usage: 98% (CRITICAL)
- Memory Usage: 93% (CRITICAL)
- Age: 381 days
- Recommended Actions:
  - Identify high-consuming pods: kubectl top pods --all-namespaces --sort-by=cpu
  - Consider horizontal scaling of workloads
  - Investigate whether pods can be migrated to other nodes
  - Add more nodes to the pool, or raise the resource requests of existing pods so the scheduler spreads them more evenly
Node gke-production1-production-asia-south-e4fd6e82-e28p:
- Memory Usage: 93% (WARNING)
- Recommended Action: Monitor closely, prepare to scale
Performance Optimization
- Node Distribution: 5 nodes in main pool show varying load (29-98% CPU)
- Spot Node: Chrome spot node significantly underutilized (5% CPU, 4% mem)
- HPAs: 40 HPAs configured - verify they're responding appropriately
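One way to check whether HPAs still have headroom is to look for any that are already at their replica ceiling. The sketch below assumes five-column input of namespace, name, `spec.minReplicas`, `spec.maxReplicas`, and `status.currentReplicas`; the function name `hpas_at_max` is hypothetical, and an HPA pinned at its maximum is only a hint that it may need a higher limit, not proof of a problem.

```shell
# Flag HPAs whose current replica count has reached maxReplicas.
# Expects rows of: namespace name minReplicas maxReplicas currentReplicas
# Rows with a non-numeric current count (for example <none>) are skipped.
hpas_at_max() {
  awk '$5 ~ /^[0-9]+$/ && $5 + 0 >= $4 + 0 {
         printf "%s/%s at max (%d/%d)\n", $1, $2, $5, $4
       }'
}
```

The input can be produced with `kubectl get hpa --all-namespaces --no-headers -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,MIN:.spec.minReplicas,MAX:.spec.maxReplicas,CUR:.status.currentReplicas`.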
Important Notes
- This is a production cluster - exercise extreme caution when making changes
- CRITICAL resource pressure on node ex49 requires immediate attention
- Large scale: 375 deployments across 88 namespaces
- Multiple major applications: PAT (patient portal), RFD (referral), SPC (clinic), Sapoche
- Microservices architecture with domain-based namespace organization
- 40 HPAs configured for auto-scaling
Quick Commands
# Check critical node
kubectl top node gke-production1-production-asia-south-e4fd6e82-ex49
kubectl describe node gke-production1-production-asia-south-e4fd6e82-ex49
# Find high-consuming pods
kubectl top pods --all-namespaces --sort-by=cpu | head -20
kubectl top pods --all-namespaces --sort-by=memory | head -20
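# Sum CPU (millicores) and memory (Mi) usage per namespace, highest CPU first
kubectl top pods --all-namespaces --no-headers | awk '{cpu[$1]+=$3; mem[$1]+=$4} END {for (ns in cpu) print ns, cpu[ns] "m", mem[ns] "Mi"}' | sort -k2 -rn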
# View HPAs
kubectl get hpa --all-namespaces
# View all LoadBalancer services
kubectl get svc --all-namespaces --field-selector spec.type=LoadBalancer
# List pods scheduled on the critical node
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=gke-production1-production-asia-south-e4fd6e82-ex49
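`kubectl top pods` cannot filter by node, so finding the heaviest pods on one node means joining the node's pod list with the usage listing. A minimal sketch under stated assumptions: the helper name `top_pods_on_node` is hypothetical, both inputs are saved to files first, and the pod list file is non-empty.

```shell
# usage: top_pods_on_node PODS_FILE TOP_FILE
#   PODS_FILE: output of
#     kubectl get pods --all-namespaces --no-headers --field-selector spec.nodeName=NODE
#   TOP_FILE: output of
#     kubectl top pods --all-namespaces --no-headers
# Prints "namespace/pod cpu memory" for pods on the node, highest CPU first.
top_pods_on_node() {
  awk 'NR == FNR { on_node[$1 "/" $2] = 1; next }   # first file: remember pods
       ($1 "/" $2) in on_node {                     # second file: keep matches
         printf "%s/%s %s %s\n", $1, $2, $3, $4
       }' "$1" "$2" |
    LC_ALL=C sort -t' ' -k2 -rn
}
```

Because the two listings are captured at slightly different times, pods that start or stop in between may be missing from the result.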
Architecture Overview
Production1 cluster follows a microservices architecture organized by business domains:
- PAT: Patient-facing portal services (9 namespaces)
- RFD: Referral and test results management (10 namespaces)
- SPC: Sapoche clinic management system (12 namespaces)
- Sapoche: Core Sapoche platform and microservices (4 namespaces)
- Data Streaming: Event streaming and integration services (8 namespaces)
- Infrastructure: Supporting platform services (monitoring, API gateway, databases)
- DIAG Internal: Corporate applications (diagcorp, diagpass, accounting)
Each domain has multiple microservices deployed in separate namespaces for isolation and independent scaling.
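The domain grouping is visible in the namespace names themselves: PAT, RFD, and SPC namespaces use a `domain--service` pattern. A minimal sketch that counts namespaces per domain prefix, assuming one namespace name per line on stdin (the function name `count_by_domain` and the `(other)` bucket for names without `--` are illustrative choices):

```shell
# Group namespace names by the prefix before the first "--".
# Names without "--" (for example sapoche or monitoring) go to "(other)".
count_by_domain() {
  awk '{
         i = index($1, "--")
         domain = (i > 0) ? substr($1, 1, i - 1) : "(other)"
         n[domain]++
       }
       END { for (d in n) printf "%s %d\n", d, n[d] }' | LC_ALL=C sort
}
```

The input can be produced with `kubectl get namespaces --no-headers -o custom-columns=NAME:.metadata.name` and the counts compared with the per-domain figures listed above.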