PHP-FPM (FastCGI Process Manager) is the component responsible for managing PHP worker processes in your application. Its configuration is often copy-pasted without being truly understood, and that’s exactly what causes production issues, especially in cloud-native environments like Kubernetes.

In this article, I’ll show you why you need to adapt your PHP-FPM configuration to the underlying infrastructure, and how I simplified scalability management by combining PHP-FPM in static mode with Kubernetes’ Horizontal Pod Autoscaler (HPA).
Understanding PHP-FPM Process Manager Modes
PHP-FPM offers three modes via the pm (process manager) directive:
pm = dynamic (the default)
```ini
pm = dynamic
pm.max_children = 50
pm.start_servers = 5
pm.min_spare_servers = 5
pm.max_spare_servers = 35
pm.max_requests = 500
```
PHP-FPM starts with a minimum number of workers and spawns new ones as load increases, up to pm.max_children. When load drops, it kills the excess workers.
Pros:
- Memory-efficient when traffic is low
- Automatically adapts to load
Cons:
- Spawning new workers introduces latency during sudden traffic spikes
- Complex tuning: you need to balance 4 parameters correctly
- Unpredictable behavior under heavy load
pm = ondemand
```ini
pm = ondemand
pm.max_children = 50
pm.process_idle_timeout = 10s
```
Workers are only created on demand and killed after an idle timeout. Even more memory-efficient, but even slower to absorb traffic spikes.
pm = static
```ini
pm = static
pm.max_children = 10
```
A fixed number of workers is started when the container launches and never changes.
The Problem with dynamic in Kubernetes
On a VM, dynamic mode makes sense: resources are shared, and PHP-FPM manages its own memory footprint based on load.
In Kubernetes, the reasoning is different. Each pod has explicitly defined resource requests and limits:
```yaml
resources:
  requests:
    memory: "256Mi"
    cpu: "250m"
  limits:
    memory: "512Mi"
```
With pm = dynamic, PHP-FPM will attempt to create up to pm.max_children workers, potentially triggering an OOMKill if memory limits are exceeded, and introducing variable latency depending on whether workers are already warm or not.
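If you suspect this is already happening, the pod records the reason for the last container termination. A quick check, assuming the container has already been restarted at least once:

```bash
# Prints "OOMKilled" if the previous container instance exceeded its memory limit
kubectl get pod <pod-name> \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
```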
Horizontal scaling is the HPA’s job, not PHP-FPM’s.
My Approach: static + HPA
The idea is straightforward:
One pod = a fixed number of PHP-FPM workers. When more capacity is needed, Kubernetes creates more pods.
The Memory Contract: memory_limit × pm.max_children
This is where everything comes together. PHP exposes the memory_limit directive which caps the memory each worker can consume:
```ini
; php.ini
memory_limit = 128M
```
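A quick way to confirm what a running pod actually loaded (the Deployment name php-app matches the HPA example later in this article; adjust to yours). Note that checking with php -r queries the CLI SAPI, which can read a different php.ini than FPM, so ask the FPM binary directly:

```bash
# Print the memory_limit that the FPM SAPI loads inside the container
kubectl exec deploy/php-app -- php-fpm -i | grep memory_limit
```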
The calculation then becomes deterministic:
pm.max_children = available pod memory / memory_limit
Example with a 512Mi pod and ~64Mi system overhead:
(512 - 64) / 128 ≈ 3.5 → rounded down to 3 workers
These three settings form a coherent contract between PHP, PHP-FPM, and Kubernetes:
```ini
; php.ini
memory_limit = 128M

; php-fpm.conf
pm = static
pm.max_children = 3
```

```yaml
# Kubernetes Deployment
resources:
  limits:
    memory: "512Mi"
```
In the worst case (every worker consuming its memory_limit simultaneously), the pod uses 3 × 128Mi + ~64Mi overhead ≈ 448Mi, safely under the 512Mi limit. No OOMKill.
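If you’d rather not hard-code the result, the same arithmetic can be done when the container starts, by reading the pod’s memory limit from the cgroup filesystem. A minimal entrypoint sketch, assuming cgroup v2 and the official php:fpm image layout (the zz-pool.conf path and the 64Mi overhead figure are assumptions to adapt to your setup):

```sh
#!/bin/sh
set -eu

# Assumptions: 128 MiB memory_limit per worker, ~64 MiB reserved for
# everything else in the container (matching the example above).
WORKER_MB=128
OVERHEAD_MB=64

# Pod memory limit in MiB (cgroup v2 exposes bytes, or "max" when unlimited)
LIMIT_BYTES=$(cat /sys/fs/cgroup/memory.max)
if [ "$LIMIT_BYTES" = "max" ]; then
  LIMIT_MB=512   # fallback when no limit is set
else
  LIMIT_MB=$((LIMIT_BYTES / 1024 / 1024))
fi

# pm.max_children = (pod memory - overhead) / memory_limit, never below 1
CHILDREN=$(( (LIMIT_MB - OVERHEAD_MB) / WORKER_MB ))
if [ "$CHILDREN" -lt 1 ]; then CHILDREN=1; fi

# Pool files are included alphabetically, so this zz- file overrides www.conf
cat > /usr/local/etc/php-fpm.d/zz-pool.conf <<EOF
[www]
pm = static
pm.max_children = $CHILDREN
EOF

exec php-fpm -F
```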
The classic mistake is setting pm.max_children without accounting for memory_limit, or worse, leaving memory_limit = -1 (unlimited) thinking it boosts performance. It’s a recipe for OOMKills.
HPA Configuration
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-app
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
When the average CPU across your pods exceeds 70%, the HPA creates new pods, each with its workers already warm and ready to serve traffic.
Why This Is Better
| | dynamic on K8s | static + HPA |
|---|---|---|
| Startup latency | Variable (workers spawned on demand) | None (workers ready immediately) |
| Memory predictability | Low | High |
| Tuning complexity | 4 parameters to calibrate | 1 parameter |
| Scalability ownership | PHP-FPM (inside the pod) | Kubernetes (HPA) |
| OOMKill risk | High | Controlled |
The Most Common Mistakes I See
1. Copy-pasting pm.max_children = 50 without thinking
If your memory_limit is 128M, 50 workers = 6.4 GB of potential memory usage. If your pod has 512Mi, that’s an OOMKill waiting to happen.
2. Leaving pm = dynamic by default in Kubernetes
This is the default setting in most PHP Docker images. It was never designed for containers with fixed resource boundaries.
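One way to make the override explicit is to bake it into the image instead of relying on the stock pool file. A sketch assuming the official php:fpm image, where ini fragments live in /usr/local/etc/php/conf.d/ and pool files in /usr/local/etc/php-fpm.d/ (loaded alphabetically, which is why a zz- prefix wins over the default www.conf):

```dockerfile
FROM php:8.3-fpm

# Per-worker memory cap: the php.ini side of the contract
RUN echo "memory_limit = 128M" > /usr/local/etc/php/conf.d/memory-limit.ini

# Static pool sized for the pod (the pm = static block shown earlier)
COPY zz-static-pool.conf /usr/local/etc/php-fpm.d/zz-static-pool.conf
```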
3. Setting memory_limit = -1
Removing the per-process memory limit completely defeats the ability to reason about your pod’s memory usage. You can no longer predict how many workers are safe to run.
4. Not configuring pm.max_requests
Without this directive, PHP workers run indefinitely and accumulate memory leaks. A periodic restart after N requests is good hygiene:
```ini
pm.max_requests = 500
```
5. Ignoring PHP-FPM metrics in the HPA
CPU-based scaling works well for CPU-bound workloads. But if your app is I/O-bound (waiting on DB queries, external APIs), CPU stays low even under heavy load. In that case, exposing PHP-FPM metrics (via pm.status_path) and scaling on active worker count can be more relevant:
```ini
pm.status_path = /status
```
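Be aware that this needs a metrics pipeline the CPU example does not: typically a php-fpm exporter sidecar scraping the status endpoint, plus an adapter (prometheus-adapter, for instance) that republishes the value to the custom metrics API. With that in place, the HPA metric block could look like the sketch below; the metric name and target value are illustrative, not standard names:

```yaml
metrics:
  - type: Pods
    pods:
      metric:
        name: phpfpm_active_processes   # published by a php-fpm exporter + metrics adapter
      target:
        type: AverageValue
        averageValue: "2"               # scale out when pods average ~2 busy workers out of 3
```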
Bonus: Adapt Your Readiness Probe
With pm = static, your workers are ready immediately at startup. Make sure your readiness probe reflects that:
```yaml
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
```
No need for a high initialDelaySeconds: the pod is operational as soon as PHP-FPM has started.
Conclusion
The rule I apply:
- VM / bare metal → pm = dynamic, let PHP-FPM manage load
- Kubernetes → pm = static, let the HPA manage scalability
This clear separation of responsibilities makes the system more predictable, easier to monitor, and eliminates an entire category of memory-related bugs.
The key insight is that memory_limit, pm.max_children, and your pod’s memory limit are not independent settings: they form a contract. Define one, derive the others.
PHP-FPM configuration isn’t glamorous, but it’s often the difference between an app that holds under load and one that OOMKills at 200 req/s.
