Spinnaker is Down with the Error "OOM command not allowed when used memory > 'maxmemory'" (Redis)


Issue

The Spinnaker environment is down and inaccessible, and returns the following Redis error:

{"timestamp":1601481892xxx,"status":500,"error":"Internal Server Error","message":"OOM command not allowed when used memory > 'maxmemory'.; nested exception is redis.clients.jedis.exceptions.JedisDataException: OOM command not allowed when used memory > 'maxmemory'."}

Cause

The error "OOM command not allowed when used memory > 'maxmemory'" means that Redis was configured with a memory limit (maxmemory) and that limit has been reached.

In other words, Redis memory is full and it cannot store any new data.
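To confirm that the limit has actually been reached, the output of the Redis INFO memory command can be compared against the configured limit. The following is a minimal sketch, not part of the original procedure: the helper names are hypothetical, and the sample text is made-up data standing in for real output such as what redis-cli info memory would print.

```python
def parse_redis_info(info_text):
    """Parse 'key:value' lines from Redis INFO output into a dict of strings."""
    fields = {}
    for line in info_text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Memory".
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def memory_usage_ratio(info_text):
    """Return used_memory / maxmemory, or None when no limit is configured."""
    fields = parse_redis_info(info_text)
    used = int(fields["used_memory"])
    limit = int(fields.get("maxmemory", "0"))
    if limit == 0:
        # maxmemory of 0 means Redis has no configured memory limit.
        return None
    return used / limit


# Illustrative sample only; these values are not taken from a real system.
SAMPLE_INFO = """# Memory
used_memory:1073741824
used_memory_human:1.00G
maxmemory:1073741824
maxmemory_human:1.00G
maxmemory_policy:noeviction
"""

if __name__ == "__main__":
    ratio = memory_usage_ratio(SAMPLE_INFO)
    if ratio is None:
        print("No maxmemory limit is configured")
    else:
        print(f"Redis is using {ratio:.0%} of maxmemory")
```

A ratio at or near 1.0 is consistent with the OOM error above.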

Solution

In this situation, Redis memory usage is growing without bound. The most likely cause is that Orca, the orchestration engine, stores all pipeline executions forever, which is its default behavior.

This can be solved by configuring the following settings in the Orca service.
Add the content below to the ~/.hal/$DEPLOYMENT/profiles/orca-local.yml file ($DEPLOYMENT is typically default), and then run hal deploy apply to deploy the changes.

Note: If the orca-local.yml file doesn't exist, create it.

pollers:
  oldPipelineCleanup:
    enabled: true                  # This enables old pipeline execution cleanup (default: false)
    intervalMs: 3600000            # How many milliseconds between pipeline cleanup runs (default: 1hr or 3600000 milliseconds)
    thresholdDays: 30              # How old a pipeline execution must be to be deleted (default: 30)
    minimumPipelineExecutions: 5   # How many executions to keep around (default: 5)

tasks:
  daysOfExecutionHistory: 180      # How many days to keep old task executions around
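
The interaction between thresholdDays and minimumPipelineExecutions can be illustrated with a small sketch. This is one plausible reading of the two settings (always keep the newest N executions, then delete the remaining ones that are older than the threshold), written only for illustration; it is not Orca's actual code, and Orca's real selection logic may differ in detail. The function name and sample dates are hypothetical.

```python
from datetime import datetime, timedelta


def select_for_cleanup(start_times, now, threshold_days=30, minimum_executions=5):
    """Return the start times of executions that would be deleted.

    Assumed rule (illustrative only): the newest `minimum_executions`
    executions are always kept; of the rest, only those that started
    more than `threshold_days` before `now` are deleted.
    """
    newest_first = sorted(start_times, reverse=True)
    candidates = newest_first[minimum_executions:]
    cutoff = now - timedelta(days=threshold_days)
    return [started for started in candidates if started < cutoff]


if __name__ == "__main__":
    now = datetime(2020, 10, 1)
    # Hypothetical executions of one pipeline, by age in days.
    ages_in_days = (1, 10, 40, 50, 60, 70, 80)
    executions = [now - timedelta(days=age) for age in ages_in_days]
    for started in select_for_cleanup(executions, now):
        print("would delete execution started", started.date())
```

Under this reading, a pipeline that runs rarely still retains its last five executions no matter how old they are, while a pipeline that runs often loses only executions older than 30 days.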

As Tested On Version

2.20.x, 2.21.x