August 17, 2016 Marie H.

SaltStack Orchestration and the Reactor System

Photo by Kirill Sh on Unsplash

When Highstate Isn't Enough

state.highstate works great when your servers can all apply their configuration independently. Most of the time, that's the right model — declare what each server should look like, apply it, done. But there's a class of deployment problem where order matters across machines. Take a web application deployment: you want to drain a server from the load balancer before deploying new code to it, run the deployment, verify it started correctly, then add it back. You can't express that with highstate because highstate has no concept of "do this on server A, then do that on server B, then check a condition."

That's what Salt orchestration solves. And the reactor system extends that further: event-driven automation that responds to things happening in your infrastructure rather than waiting for you to kick off a run.

Orchestration with salt-run

Orchestration runs from the master using the salt-run command rather than salt. The distinction matters: salt runs things on minions and collects results; salt-run invokes runners, which are Python modules that execute on the master itself.
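The difference is easy to see on the command line (web01 stands in for any minion ID; both functions are built-in):

```shell
# Execution module: published to the targeted minion, result collected by the master
salt 'web01' test.ping

# Runner module: runs entirely on the master — here, listing connected minions
salt-run manage.up
```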

salt-run state.orchestrate orch.deploy

This runs the orchestration state defined at /srv/salt/orch/deploy.sls. Orchestration SLS files look like state files but they operate at a higher level — they target minions as a master-side operation and can sequence work across multiple targets.

Here's a real orchestration file for deploying an application update:

# /srv/salt/orch/deploy.sls

remove_from_lb:
  salt.function:
    - name: cmd.run
    - tgt: 'G@role:loadbalancer'
    - tgt_type: compound
    - arg:
      - 'haproxy-disable backend web01'

deploy_app:
  salt.state:
    - tgt: 'web01'
    - sls:
      - app.deploy
    - require:
      - salt: remove_from_lb

run_smoke_tests:
  salt.function:
    - name: cmd.run
    - tgt: 'web01'
    - arg:
      - '/opt/app/bin/smoke-test.sh'
    - require:
      - salt: deploy_app

add_to_lb:
  salt.function:
    - name: cmd.run
    - tgt: 'G@role:loadbalancer'
    - tgt_type: compound
    - arg:
      - 'haproxy-enable backend web01'
    - require:
      - salt: run_smoke_tests

Each step uses require to enforce ordering. salt.function calls an execution module on the target. salt.state applies a state SLS. salt.runner would invoke another runner on the master.
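For completeness, a salt.runner step would look like this sketch — note there's no tgt, because runners execute on the master itself (manage.up is a built-in runner that lists connected minions):

```yaml
# Hypothetical orchestration step invoking a runner on the master
check_fleet:
  salt.runner:
    - name: manage.up
```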

If a step fails, every step that requires it (directly or transitively) is skipped, so a chain like this one halts at the point of failure. This is why orchestration is useful for deployments — if the smoke test fails, the server doesn't get added back to the load balancer.

To run it with verbose output:

salt-run state.orchestrate orch.deploy -l debug

The Event Bus

Salt maintains an event bus on the master. Every significant action emits an event: a minion comes online, a job completes, a state fails, a custom event fires. You can watch the bus in real time:

salt-run state.event pretty=True

Run this in one terminal while doing things in another. You'll see events like:

salt/minion/web01/start
salt/job/20160817143022/ret/web01
salt/auth

This is extremely useful for debugging reactor triggers — you can see exactly what event tag fires and what data comes with it before writing the reactor config.
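As an illustration, an auth event (the kind the reactor below keys on) carries a payload roughly like the following — exact fields vary by Salt version, but act and id are the ones you'll typically match on:

```
salt/auth	{
    "_stamp": "2016-08-17T14:30:22.123456",
    "act": "pend",
    "id": "web01",
    "pub": "-----BEGIN PUBLIC KEY-----..."
}
```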

The Reactor System

The reactor listens for events on the bus and triggers responses. Configure it in the master config. I keep a separate file at /etc/salt/master.d/reactor.conf:

reactor:
  - 'salt/minion/*/start':
    - /srv/reactor/minion_start.sls
  - 'salt/auth':
    - /srv/reactor/auth.sls

The reactor SLS file at /srv/reactor/minion_start.sls:

# When a new minion comes online, automatically apply its highstate
apply_highstate:
  local.state.highstate:
    - tgt: {{ data['id'] }}

data['id'] is the minion ID from the event payload. The local. prefix runs an execution module on the targeted minions. The other prefixes: runner. invokes a runner on the master, wheel. invokes a wheel module on the master (key management lives there), and caller. runs an execution module locally, for reactors configured on a minion.

The result: any new minion that completes key acceptance and comes online will automatically get its highstate applied. For auto-scaling groups this is the difference between a self-configuring fleet and one that needs manual intervention after every scale-out event.

Sending Custom Events

From within a state or a running minion, you can fire custom events to the bus:

salt 'web01' event.send 'myapp/deploy/complete' '{"version": "1.4.2", "status": "ok"}'

Or from inside a state:

notify_deploy_complete:
  module.run:
    - name: event.send
    - tag: myapp/deploy/complete
    - data:
        version: {{ pillar['app_version'] }}
        status: ok

Then your reactor can listen for myapp/deploy/complete and trigger downstream actions — notifying a monitoring system, triggering a test run on a staging environment, whatever makes sense for your workflow. This is where Salt starts to feel less like a config management tool and more like infrastructure middleware.
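Wiring that up is the same two-piece pattern as before — a reactor.conf entry plus a reactor SLS. The SLS path, monitor01 target, and notify-slack.sh script below are placeholders; note that for minion-fired custom events, the payload you passed to event.send sits under data['data']:

```yaml
# /etc/salt/master.d/reactor.conf (appended to the existing reactor list)
reactor:
  - 'myapp/deploy/complete':
    - /srv/reactor/deploy_complete.sls
```

```yaml
# /srv/reactor/deploy_complete.sls — hypothetical downstream action
notify_monitoring:
  local.cmd.run:
    - tgt: 'monitor01'
    - arg:
      - '/opt/scripts/notify-slack.sh "deployed {{ data["data"]["version"] }}"'
```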

A More Complex Reactor: Auth with Auto-Accept

A pattern I used for auto-scaling: automatically accept minion keys that match a naming convention, then apply highstate.

/srv/reactor/auth.sls:

{% if data['act'] == 'pend' and data['id'].startswith('web-asg-') %}
# Reactor SLS files don't support requisites, so each reaction must
# stand alone. Once the key is accepted, the minion connects and fires
# salt/minion/<id>/start, and the minion_start reactor above applies
# highstate from there.
accept_minion:
  wheel.key.accept:
    - match: {{ data['id'] }}
{% endif %}

The Jinja2 conditional checks the event data before triggering anything. Only auto-scaling web minions get auto-accepted. Everything else still requires manual key acceptance. This is the kind of nuance that makes the reactor system worth learning — it's flexible enough to express real operational policies, not just simple triggers.

Honest Assessment

Orchestration and the reactor are genuinely powerful. They're also where Salt's complexity starts to bite. A few things I've learned:

The reactor fires asynchronously. If you fire an event and the reactor triggers a state, that state runs in the background — there's no built-in way to wait for it from the triggering context. If you need synchronous cross-minion coordination with return values, orchestration (salt-run) is the right tool, not the reactor.
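One useful middle ground: have the reactor do nothing but kick off an orchestration, so the event stays a thin trigger and the sequenced logic lives in an orch file. A sketch (orch.post_deploy is a hypothetical SLS; newer Salt releases pass runner arguments under an args list instead):

```yaml
# /srv/reactor/post_deploy.sls — delegate the real work to orchestration
kick_off_post_deploy:
  runner.state.orchestrate:
    - mods: orch.post_deploy
```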

Debugging failed reactor triggers requires correlating event bus output (salt-run state.event pretty=True) with the master log (/var/log/salt/master). The master log shows reactor evaluation errors; the event bus shows you whether the trigger fired at all.

Orchestration adds a real bottleneck: everything runs through the master, sequentially per require chain. For large fleets this can be slow. Batch operations that don't need cross-minion coordination are almost always better handled with a batched highstate run (salt --batch-size 25% '*' state.highstate).

My advice: don't reach for orchestration until you genuinely need it. If you're deploying to a single server type and order doesn't matter across machines, highstate with batching is simpler and less brittle. Orchestration earns its complexity when you have actual cross-service dependencies that need to be respected during deployments. The reactor earns its complexity when you're managing auto-scaling infrastructure and manual intervention doesn't scale. For everything else, keep it simple.