chrome-headless

Overview

Namespace: chrome-headless
Purpose: Headless Chrome Services for PDF/Screenshot Generation - PRODUCTION
Age: ~682-705 days
Status: Active - Multiple Chrome instances for automation
Workloads: 4 deployments (4-5 pods, 1 in unknown state)
Environment: PRODUCTION - Browser automation and rendering

Architecture

Multiple headless Chrome instances for different purposes:

Primary: Main Chrome headless service (1 replica)
Backup 1: Backup Chrome instance (1 replica)
Backup 2: Second backup instance (1 replica)
Backup SVIP: SVIP-specific Chrome instance (1-2 replicas, 1 in unknown state)

Auto-Scaling Configuration

Not Applicable:

All deployments at fixed 1 replica each
No HPA configured
Manual scaling only
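
Since scaling is manual, a replica change comes down to a `kubectl scale` call. The sketch below wraps that call with a basic check on the replica count. The function name `scale_chrome` and the `KUBECTL` override (which lets you substitute `echo` for a dry run) are illustrative assumptions, not existing tooling.

```shell
# Hypothetical helper: manually scale one deployment in the chrome-headless namespace.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the command without running it.
scale_chrome() {
  deployment="$1"
  replicas="$2"
  # Reject anything that is not a non-negative integer.
  case "$replicas" in
    ''|*[!0-9]*) echo "replicas must be a non-negative integer" >&2; return 1 ;;
  esac
  "${KUBECTL:-kubectl}" scale "deployment/$deployment" --replicas="$replicas" -n chrome-headless
}

# Example (dry run, prints the command arguments instead of executing them):
#   KUBECTL=echo scale_chrome chrome-headless--app--prod 2
```

Chrome pods are memory-heavy, so check node headroom with `kubectl top nodes` before raising replica counts.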

Workload Categories

Headless Chrome Services (4 deployments)

Name	Replicas	Status	Purpose
chrome-headless--app--prod	1/1	Running	Primary Chrome headless
chrome-headless-backup-1--app--prod	1/1	Running	Backup instance 1
chrome-headless-backup-2--app--prod	1/1	Running	Backup instance 2 (recent update)
chrome-headless-backup-svip--app--prod	1/1 (2 pods)	Running/Unknown	SVIP Chrome (1 pod in unknown state)

Pod Issues

ContainerStatusUnknown:

chrome-headless-backup-svip--app--prod-7ff7985d9f-rgdsv (382 days old, 5 restarts)
New pod created 65 days ago to replace
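
To spot pods like the one above without scanning the full listing, the STATUS column can be filtered directly. This is a minimal sketch; the function name `unhealthy_pods` is hypothetical, and it assumes the default `kubectl get pods` column layout (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Reads `kubectl get pods --no-headers` output on stdin and prints the name and
# status of every pod whose STATUS column (field 3) is not "Running".
unhealthy_pods() {
  awk '$3 != "Running" { print $1, $3 }'
}

# Example:
#   kubectl get pods -n chrome-headless --no-headers | unhealthy_pods
```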

Services

Name	Type	Cluster IP	Ports	Purpose
chrome-headless--app--prod	ClusterIP	10.8.25.8	3000	Primary Chrome service
chrome-headless-backup-1--app--prod	ClusterIP	10.8.24.30	3000	Backup 1 service
chrome-headless-backup-2--app--prod	ClusterIP	10.8.25.191	3000	Backup 2 service
chrome-headless-backup-svip--app--prod	ClusterIP	10.8.27.94	3000	SVIP service
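
Cluster IPs can change if a Service is recreated, so callers inside the cluster are usually better off using the Service DNS names. The sketch below prints one URL per service in the table, assuming the default cluster domain `cluster.local`; the function name `chrome_service_urls` is hypothetical.

```shell
# Prints the in-cluster URL for each Chrome service, using the standard
# <service>.<namespace>.svc.cluster.local DNS form.
# Assumes the default cluster domain "cluster.local".
chrome_service_urls() {
  for svc in \
    chrome-headless--app--prod \
    chrome-headless-backup-1--app--prod \
    chrome-headless-backup-2--app--prod \
    chrome-headless-backup-svip--app--prod
  do
    echo "http://$svc.chrome-headless.svc.cluster.local:3000"
  done
}
```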

Access & Management

View all resources:

kubectl get all -n chrome-headless

Check pod status:

# View all pods
kubectl get pods -n chrome-headless

# Check for issues
kubectl get pods -n chrome-headless | grep -v "Running"

# View logs
kubectl logs -f deployment/chrome-headless--app--prod -n chrome-headless

Test Chrome service:

# Port forward to test (this command blocks; leave it running)
kubectl port-forward -n chrome-headless service/chrome-headless--app--prod 3000:3000

# In a second terminal, test the endpoint (example path)
curl http://localhost:3000/health

Restart services:

# Restart primary
kubectl rollout restart deployment/chrome-headless--app--prod -n chrome-headless

# Restart all
kubectl rollout restart deployment --all -n chrome-headless

Monitoring

Pod metrics:

kubectl top pods -n chrome-headless

# Sort by memory (Chrome is memory-intensive)
kubectl top pods -n chrome-headless --sort-by=memory

Check pod status:

# All pods
kubectl get pods -n chrome-headless -o wide

# Check restarts
kubectl get pods -n chrome-headless -o json | jq '.items[] | {name: .metadata.name, restarts: .status.containerStatuses[0].restartCount}'
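
Where `jq` is not available, the RESTARTS column of the plain listing gives the same information. A minimal sketch, assuming the default column layout; the function name `restart_heavy_pods` and the default threshold of 5 are illustrative.

```shell
# Reads `kubectl get pods --no-headers` output on stdin and prints pods whose
# RESTARTS column (field 4) is at or above the given threshold (default 5).
restart_heavy_pods() {
  awk -v min="${1:-5}" '$4 + 0 >= min + 0 { print $1, $4 }'
}

# Example:
#   kubectl get pods -n chrome-headless --no-headers | restart_heavy_pods 5
```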

Events:

# Events are sorted oldest first, so take the tail to see the most recent 20
kubectl get events -n chrome-headless --sort-by='.lastTimestamp' | tail -20

Data Flow

Internal Services (SPC, IMP, etc.)
    ↓
Chrome Headless Services (Port 3000)
    ↓
Puppeteer/Chrome API
    ↓
HTML Rendering / PDF Generation
    ↓
Return Generated PDF/Screenshot

Chrome Headless Workflow

1. Primary Service

chrome-headless--app--prod (1 replica)
Main Chrome instance
Handles general requests
Deployment is 686 days old (very stable)

2. Backup Services

backup-1: General backup instance
backup-2: Recently updated backup (4h ago)
Load distribution across instances
Failover capability
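
This document does not say how callers choose between the primary and backup services. One simple client-side approach is to probe each service in order and use the first that answers. The sketch below illustrates that idea only; the function names `probe` and `pick_chrome_service` are hypothetical, and the `/health` path follows the example endpoint used in the test commands in this document.

```shell
# Hypothetical health probe: succeeds if the service answers on /health.
# Assumes the default cluster domain and the example /health path.
probe() {
  curl -fsS --max-time 3 "http://$1.chrome-headless.svc.cluster.local:3000/health" >/dev/null 2>&1
}

# Prints the first service that passes the probe, trying them in the order given.
pick_chrome_service() {
  for svc in "$@"; do
    if probe "$svc"; then
      echo "$svc"
      return 0
    fi
  done
  echo "no Chrome service responded" >&2
  return 1
}

# Example (run from inside the cluster):
#   pick_chrome_service chrome-headless--app--prod \
#     chrome-headless-backup-1--app--prod chrome-headless-backup-2--app--prod
```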

3. SVIP Service

Dedicated Chrome instance for SVIP customers
Higher priority workloads
One pod in unknown state (needs attention)

4. Use Cases

PDF generation from HTML
Screenshot capture
Web page rendering
Print preview generation
Test result PDF creation

Production Considerations

High Availability

Multiple Instances:

4 separate deployments (primary + 3 backups)
Each at 1 replica (No HA per service)
Load distribution across services

Issues:

One pod in ContainerStatusUnknown state (SVIP)
No redundancy per service (all 1 replica)
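
The single-replica gap can be checked from the deployment listing. A minimal sketch, assuming the default `kubectl get deployments` columns (NAME, READY, UP-TO-DATE, AVAILABLE, AGE); the function name `single_replica_deployments` is hypothetical.

```shell
# Reads `kubectl get deployments --no-headers` output on stdin and prints every
# deployment whose desired replica count (the part after "/" in READY) is below 2.
single_replica_deployments() {
  awk '{ split($2, r, "/"); if (r[2] + 0 < 2) print $1, "desired=" r[2] }'
}

# Example:
#   kubectl get deployments -n chrome-headless --no-headers | single_replica_deployments
```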

Recommendations

Fix SVIP Pod:
- URGENT: chrome-headless-backup-svip pod in unknown state
- Delete the old pod: kubectl delete pod chrome-headless-backup-svip--app--prod-7ff7985d9f-rgdsv -n chrome-headless
- Scale down old ReplicaSet if needed

Resource Monitoring:
- Chrome is memory-intensive
- Monitor memory usage closely
- Set appropriate memory limits
- Watch for OOM kills

Scaling Strategy:
- Consider HPA if load increases
- Chrome pods are expensive (high memory)
- May need multiple replicas per service for HA

Performance Tuning:
- Monitor response times
- Check Chrome process limits
- Adjust memory/CPU resources
- Consider page pool size limits

Service Separation:
- SVIP has dedicated service (good)
- Consider separate services for different workload types
- Monitor usage patterns per service
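
The memory-limit and HPA recommendations map onto two standard kubectl commands, `kubectl set resources` and `kubectl autoscale`. The sketch below shows their shape only. Every number in it (requests, limits, replica bounds, CPU target) is a placeholder to be tuned against observed usage, and the function names and `KUBECTL` override are hypothetical.

```shell
# Hypothetical values throughout: tune requests, limits, and replica bounds to observed usage.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview commands without running them.

# Set CPU/memory requests and a memory limit on a deployment.
set_chrome_resources() {
  "${KUBECTL:-kubectl}" set resources "deployment/$1" -n chrome-headless \
    --requests=cpu=500m,memory=1Gi --limits=memory=2Gi
}

# Create a CPU-based HPA for a deployment (needs metrics-server and CPU requests set).
autoscale_chrome() {
  "${KUBECTL:-kubectl}" autoscale "deployment/$1" -n chrome-headless \
    --min=2 --max=4 --cpu-percent=70
}

# Example (dry run):
#   KUBECTL=echo set_chrome_resources chrome-headless--app--prod
#   KUBECTL=echo autoscale_chrome chrome-headless--app--prod
```

Changing resources alters the pod template and therefore triggers a rolling restart. With a single replica, that can mean a brief interruption for the affected service.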

Troubleshooting

Pod in unknown state:

# Check the problem pod
kubectl describe pod chrome-headless-backup-svip--app--prod-7ff7985d9f-rgdsv -n chrome-headless

# Delete problematic pod
kubectl delete pod chrome-headless-backup-svip--app--prod-7ff7985d9f-rgdsv -n chrome-headless

# Restart the deployment to force pod recreation
kubectl rollout restart deployment/chrome-headless-backup-svip--app--prod -n chrome-headless

Chrome service not responding:

# Check pod logs
kubectl logs -f deployment/chrome-headless--app--prod -n chrome-headless

# Check for OOM kills
kubectl describe pod -n chrome-headless | grep -i "oom\|killed"

# Check resource usage
kubectl top pods -n chrome-headless

# Restart service
kubectl rollout restart deployment/chrome-headless--app--prod -n chrome-headless

High memory usage:

# Check memory usage
kubectl top pods -n chrome-headless --sort-by=memory

# Check resource limits
kubectl describe deployment chrome-headless--app--prod -n chrome-headless | grep -A 10 "Limits\|Requests"

# Check for OOM events
kubectl get events -n chrome-headless | grep -i "oom"
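
To turn the `kubectl top` listing into a simple alert, the MEMORY column can be compared against a threshold. A minimal sketch, assuming values are reported with `Mi` or `Gi` suffixes; the function name `memory_heavy_pods` and the default threshold of 1024 Mi are illustrative.

```shell
# Reads `kubectl top pods --no-headers` output on stdin and prints pods whose
# MEMORY column (field 3) exceeds the given threshold in Mi (default 1024).
# Assumes memory is reported with a Mi or Gi suffix.
memory_heavy_pods() {
  awk -v limit="${1:-1024}" '{
    mem = $3
    if (mem ~ /Gi$/) { sub(/Gi$/, "", mem); mem = mem * 1024 }
    else             { sub(/Mi$/, "", mem) }
    if (mem + 0 > limit + 0) print $1, $3
  }'
}

# Example:
#   kubectl top pods -n chrome-headless --no-headers | memory_heavy_pods 1024
```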

PDF generation failures:

# Check logs for errors
kubectl logs deployment/chrome-headless--app--prod -n chrome-headless --tail=100 | grep -i "error\|fail"

# Test Chrome service (port-forward blocks; leave it running)
kubectl port-forward -n chrome-headless service/chrome-headless--app--prod 3000:3000
# Then, in a second terminal:
# curl http://localhost:3000/health

# Check recent logs from all Chrome deployments
for svc in chrome-headless--app--prod chrome-headless-backup-1--app--prod chrome-headless-backup-2--app--prod chrome-headless-backup-svip--app--prod; do
  echo "=== $svc ==="
  kubectl logs "deployment/$svc" -n chrome-headless --tail=20
done

Performance Metrics

Current Scale

Total Deployments: 4
Total Pods: 4-5 (1 in unknown state)
Memory: High (Chrome is memory-intensive)
Age: 682-705 days (very mature)

Stability

Primary: pod 382 days old (very stable)
Backup 1: pod 382 days old with 9 restarts (moderate)
Backup 2: Recently updated (4h ago)
SVIP: 65 days old, 1 pod stuck in unknown state

Architecture Notes

Headless Chrome: Puppeteer/Chrome for server-side rendering
Multiple Services: Primary + backups for redundancy
SVIP Separation: Dedicated service for VIP customers
Resource Intensive: High memory requirements per pod
Critical Service: Used for PDF generation across multiple applications
