chrome-headless

Overview

  • Namespace: chrome-headless
  • Purpose: Headless Chrome Services for PDF/Screenshot Generation - PRODUCTION
  • Age: ~682-705 days
  • Status: Active - Multiple Chrome instances for automation
  • Workloads: 4 deployments (4-5 pods, 1 in unknown state)
  • Environment: PRODUCTION - Browser automation and rendering

Architecture

Multiple headless Chrome instances for different purposes:

  • Primary: Main Chrome headless service (1 replica)
  • Backup 1: Backup Chrome instance (1 replica)
  • Backup 2: Second backup instance (1 replica)
  • Backup SVIP: SVIP-specific Chrome instance (1 replica; 2 pods exist, 1 of them in unknown state)

Auto-Scaling Configuration

Not Applicable:

  • All deployments at fixed 1 replica each
  • No HPA configured
  • Manual scaling only
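
Because scaling is manual, a replica change goes through `kubectl scale`. The following is a minimal sketch, not an existing tool in this namespace: `scale_deployment` is a hypothetical helper, and the `KUBECTL` override is an assumption added so the command can be previewed with `KUBECTL=echo` before it touches production.

```shell
# Hypothetical helper: scale one deployment in the chrome-headless namespace.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the command.
KUBECTL="${KUBECTL:-kubectl}"
NAMESPACE="chrome-headless"

scale_deployment() {
  # $1 = deployment name, $2 = desired replica count (digits only)
  case "$2" in
    ''|*[!0-9]*)
      echo "usage: scale_deployment <deployment> <replicas>" >&2
      return 1
      ;;
  esac
  "$KUBECTL" scale "deployment/$1" --replicas="$2" -n "$NAMESPACE"
}
```

For example, `scale_deployment chrome-headless--app--prod 2` would run `kubectl scale deployment/chrome-headless--app--prod --replicas=2 -n chrome-headless`.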

Workload Categories

Headless Chrome Services (4 deployments)

| Name | Replicas | Status | Purpose |
|---|---|---|---|
| chrome-headless--app--prod | 1/1 | Running | Primary Chrome headless |
| chrome-headless-backup-1--app--prod | 1/1 | Running | Backup instance 1 |
| chrome-headless-backup-2--app--prod | 1/1 | Running | Backup instance 2 (recent update) |
| chrome-headless-backup-svip--app--prod | 1/1 (2 pods) | Running/Unknown | SVIP Chrome (1 pod in unknown state) |

Pod Issues

ContainerStatusUnknown:

  • chrome-headless-backup-svip--app--prod-7ff7985d9f-rgdsv (382 days old, 5 restarts)
  • New pod created 65 days ago to replace

Services

| Name | Type | Cluster IP | Ports | Purpose |
|---|---|---|---|---|
| chrome-headless--app--prod | ClusterIP | 10.8.25.8 | 3000 | Primary Chrome service |
| chrome-headless-backup-1--app--prod | ClusterIP | 10.8.24.30 | 3000 | Backup 1 service |
| chrome-headless-backup-2--app--prod | ClusterIP | 10.8.25.191 | 3000 | Backup 2 service |
| chrome-headless-backup-svip--app--prod | ClusterIP | 10.8.27.94 | 3000 | SVIP service |

Access & Management

View all resources:

kubectl get all -n chrome-headless

Check pod status:

# View all pods
kubectl get pods -n chrome-headless

# Check for issues
kubectl get pods -n chrome-headless | grep -v "Running"

# View logs
kubectl logs -f deployment/chrome-headless--app--prod -n chrome-headless

Test Chrome service:

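# Port forward to test (blocks until interrupted with Ctrl-C)
kubectl port-forward -n chrome-headless service/chrome-headless--app--prod 3000:3000

# In a second terminal, test the endpoint (example path; adjust to the service's actual routes)
curl http://localhost:3000/health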

Restart services:

# Restart primary
kubectl rollout restart deployment/chrome-headless--app--prod -n chrome-headless

# Restart all
kubectl rollout restart deployment --all -n chrome-headless

Monitoring

Pod metrics:

kubectl top pods -n chrome-headless

# Sort by memory (Chrome is memory-intensive)
kubectl top pods -n chrome-headless --sort-by=memory

Check pod status:

# All pods
kubectl get pods -n chrome-headless -o wide

# Check restarts
kubectl get pods -n chrome-headless -o json | jq '.items[] | {name: .metadata.name, restarts: .status.containerStatuses[0].restartCount}'
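
When `jq` is not available, restart counts can also be read from the plain pod listing. This is a minimal sketch; `restarts_over` is a hypothetical helper name, and it assumes the default `kubectl get pods --no-headers` column layout, where RESTARTS is the fourth column.

```shell
# Reads `kubectl get pods --no-headers` output on stdin and prints
# "NAME RESTARTS" for pods restarted more than $1 times (default 0).
restarts_over() {
  awk -v min="${1:-0}" 'NF >= 4 && $4 + 0 > min + 0 { print $1, $4 }'
}
```

Usage: `kubectl get pods -n chrome-headless --no-headers | restarts_over 3`. The RESTARTS column counts restarts across all containers in a pod, whereas the `jq` query above reads only the first container.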

Events:

# Sorted oldest to newest, so tail shows the 20 most recent events
kubectl get events -n chrome-headless --sort-by='.lastTimestamp' | tail -20

Data Flow

Internal Services (SPC, IMP, etc.)
        ↓
Chrome Headless Services (Port 3000)
        ↓
Puppeteer/Chrome API
        ↓
HTML Rendering / PDF Generation
        ↓
Return Generated PDF/Screenshot

Chrome Headless Workflow

1. Primary Service

  • chrome-headless--app--prod (1 replica)
  • Main Chrome instance
  • Handles general requests
  • Deployment is 686 days old (very stable)

2. Backup Services

  • backup-1: General backup instance
  • backup-2: Recently updated backup (4h ago)
  • Load distribution across instances
  • Failover capability

3. SVIP Service

  • Dedicated Chrome instance for SVIP customers
  • Higher priority workloads
  • One pod in unknown state (needs attention)

4. Use Cases

  • PDF generation from HTML
  • Screenshot capture
  • Web page rendering
  • Print preview generation
  • Test result PDF creation

Production Considerations

High Availability

Multiple Instances:

  • 4 separate deployments (primary + 3 backups)
  • Each at 1 replica (No HA per service)
  • Load distribution across services

Issues:

  • One pod in ContainerStatusUnknown state (SVIP)
  • No redundancy per service (all 1 replica)

Recommendations

  1. Fix SVIP Pod:

    • URGENT: chrome-headless-backup-svip pod in unknown state
    • Delete the old pod: kubectl delete pod chrome-headless-backup-svip--app--prod-7ff7985d9f-rgdsv -n chrome-headless
    • Scale down old ReplicaSet if needed
  2. Resource Monitoring:

    • Chrome is memory-intensive
    • Monitor memory usage closely
    • Set appropriate memory limits
    • Watch for OOM kills
  3. Scaling Strategy:

    • Consider HPA if load increases
    • Chrome pods are expensive (high memory)
    • May need multiple replicas per service for HA
  4. Performance Tuning:

    • Monitor response times
    • Check Chrome process limits
    • Adjust memory/CPU resources
    • Consider page pool size limits
  5. Service Separation:

    • SVIP has dedicated service (good)
    • Consider separate services for different workload types
    • Monitor usage patterns per service
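
For the resource-monitoring recommendation, memory requests and limits can be set from the command line with `kubectl set resources`. This is a minimal sketch: `set_memory` is a hypothetical helper, the `KUBECTL` override is an assumption that allows a preview with `KUBECTL=echo`, and no specific memory values are implied by this document.

```shell
# Hypothetical helper: set the memory request and limit on one deployment.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the command.
KUBECTL="${KUBECTL:-kubectl}"
NAMESPACE="chrome-headless"

set_memory() {
  # $1 = deployment name, $2 = memory request, $3 = memory limit
  if [ "$#" -ne 3 ]; then
    echo "usage: set_memory <deployment> <request> <limit>" >&2
    return 1
  fi
  "$KUBECTL" set resources "deployment/$1" \
    --requests="memory=$2" --limits="memory=$3" -n "$NAMESPACE"
}
```

A call such as `set_memory chrome-headless--app--prod 1Gi 2Gi` uses placeholder values only; size them from observed `kubectl top pods` figures. Changing resources modifies the pod template, so the deployment rolls out new pods.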

Troubleshooting

Pod in unknown state:

# Check the problem pod
kubectl describe pod chrome-headless-backup-svip--app--prod-7ff7985d9f-rgdsv -n chrome-headless

# Delete problematic pod
kubectl delete pod chrome-headless-backup-svip--app--prod-7ff7985d9f-rgdsv -n chrome-headless

# Restart the deployment to force recreation of its pods
kubectl rollout restart deployment/chrome-headless-backup-svip--app--prod -n chrome-headless

Chrome service not responding:

# Check pod logs
kubectl logs -f deployment/chrome-headless--app--prod -n chrome-headless

# Check for OOM kills
kubectl describe pod -n chrome-headless | grep -i "oom\|killed"

# Check resource usage
kubectl top pods -n chrome-headless

# Restart service
kubectl rollout restart deployment/chrome-headless--app--prod -n chrome-headless

High memory usage:

# Check memory usage
kubectl top pods -n chrome-headless --sort-by=memory

# Check resource limits
kubectl describe deployment chrome-headless--app--prod -n chrome-headless | grep -A 10 "Limits\|Requests"

# Check for OOM events
kubectl get events -n chrome-headless | grep -i "oom"

PDF generation failures:

# Check logs for errors
kubectl logs deployment/chrome-headless--app--prod -n chrome-headless --tail=100 | grep -i "error\|fail"

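# Test Chrome service (port-forward blocks until interrupted with Ctrl-C)
kubectl port-forward -n chrome-headless service/chrome-headless--app--prod 3000:3000
# In a second terminal (example path; adjust to the service's actual routes):
# curl http://localhost:3000/health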

# Check all Chrome services
for svc in chrome-headless--app--prod chrome-headless-backup-1--app--prod chrome-headless-backup-2--app--prod chrome-headless-backup-svip--app--prod; do
  echo "=== $svc ==="
  kubectl logs "deployment/$svc" -n chrome-headless --tail=20
done

Performance Metrics

Current Scale

  • Total Deployments: 4
  • Total Pods: 4-5 (1 in unknown state)
  • Memory: High (Chrome is memory-intensive)
  • Age: 682-705 days (very mature)

Stability

  • Primary: pod is 382 days old (very stable)
  • Backup 1: pod is 382 days old with 9 restarts (moderate)
  • Backup 2: Recently updated (4h ago)
  • SVIP: replacement pod is 65 days old; the older pod is stuck in unknown state

Architecture Notes

  • Headless Chrome: Puppeteer/Chrome for server-side rendering
  • Multiple Services: Primary + backups for redundancy
  • SVIP Separation: Dedicated service for VIP customers
  • Resource Intensive: High memory requirements per pod
  • Critical Service: Used for PDF generation across multiple applications