PowerShell Fleet Automation from a Linux Bastion
I have a confession: the first thing I do when I sit down at a Windows box is install Cygwin. I grew up on Linux — it's where I think clearly, where my muscle memory lives, where I reach instinctively when something needs to be automated. A Windows desktop without a proper terminal feels like trying to work in oven mitts.
This creates an interesting problem when your job is automating a few hundred Windows servers in a customer's VMware environment with no existing configuration management, no Ansible Tower, and no budget or timeline for building one. The Windows admins are comfortable in PowerShell. I'm comfortable in bash. The servers aren't reachable from the internet. The deadline is not comfortable.
My solution: spin up a Linux VM in their VMware environment, use it as the automation bastion, and drive everything from there via WinRM and Ansible's Windows modules. Write as much of the logic as possible in bash and Python on the Linux side; push only what has to be PowerShell down to the Windows hosts.
This post is about how that worked in practice.
The Setup: Linux Bastion in a Windows World
The VMware environment had several clusters with Windows Server 2012–2016 VMs across multiple subnets. WinRM (Windows Remote Management) was already enabled on most hosts — it's on by default in newer Windows Server versions and was something the Windows team had enabled years earlier for ad-hoc remote management. That was the wire I needed.
I deployed an Ubuntu 20.04 VM into their management VLAN. Network access to the Windows hosts on WinRM ports (5985 HTTP, 5986 HTTPS) was already permitted between management and workload VLANs. I installed Ansible, Python, and a handful of utilities, pointed it at a dynamic inventory built from their CMDB export, and had a working orchestration platform in an afternoon.
The Linux bastion gave me:
- Ansible with ansible.windows for structured automation with retries, inventory management, and idempotent state
- Python for pre/post processing, report generation, and anything needing complex logic
- Bash for orchestrating multi-step workflows, parallel execution, and the kind of glue code that would be awkward in PowerShell
- Git for version-controlling everything, including the PowerShell scripts we pushed to hosts
- pssh / parallel-ssh for the rare cases where I needed raw parallel shell access without Ansible overhead
The Windows hosts ran PowerShell. That was their job — execute what the bastion told them to execute, report results, exit cleanly.
Inventory from the CMDB
The customer had a CMDB that was loosely accurate. I wrote a Python script to pull a CSV export and generate an Ansible inventory:
#!/usr/bin/env python3
import csv
import json
import sys

def cmdb_to_inventory(csv_path):
    # Shaped for Ansible's YAML inventory plugin, which also reads .json
    # files: top-level groups, with hosts as name -> hostvars mappings.
    # (The dynamic-inventory "_meta" script format won't parse as a static
    # -i file, so hostvars live directly under each host here.)
    inventory = {
        "windows": {
            "hosts": {},
            "vars": {
                "ansible_connection": "winrm",
                "ansible_winrm_transport": "ntlm",
                "ansible_winrm_server_cert_validation": "ignore",
                "ansible_port": 5985,
            },
        },
        "windows_iis": {"hosts": {}},
        "windows_sql": {"hosts": {}},
    }
    with open(csv_path) as f:
        for row in csv.DictReader(f):
            if row['os_type'] != 'Windows':
                continue
            hostname = row['hostname'].strip().lower()
            inventory['windows']['hosts'][hostname] = {
                "ansible_host": row['ip_address'].strip(),
                "env": row.get('environment', 'unknown'),
                "role": row.get('role', 'unknown'),
            }
            if 'IIS' in row.get('role', ''):
                inventory['windows_iis']['hosts'][hostname] = {}
            if 'SQL' in row.get('role', ''):
                inventory['windows_sql']['hosts'][hostname] = {}
    return inventory

if __name__ == '__main__':
    print(json.dumps(cmdb_to_inventory(sys.argv[1]), indent=2))
python3 cmdb_to_inventory.py servers.csv > inventory.json
ansible -i inventory.json windows -m win_ping
The win_ping sweep was always the first thing I ran against a new environment. It told me which hosts were actually reachable and which CMDB entries were stale. In this environment, about 12% of the inventory was gone — decommissioned servers that nobody had removed from the CMDB. Better to find that out with win_ping than mid-deployment.
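That triage is easy to script on the bastion. A small sketch, assuming the sweep is run with Ansible's --one-line flag so each host reports on a single line (`host | SUCCESS => {...}` for reachable hosts, `host | UNREACHABLE! => {...}` for dead ones):

```python
#!/usr/bin/env python3
"""Split a `win_ping --one-line` sweep into live and stale hosts.

Assumes the one-line callback format, e.g.:
  web01 | SUCCESS => {"changed": false, "ping": "pong"}
  app07 | UNREACHABLE! => {"changed": false, "msg": "timed out"}
"""

def split_ping_results(lines):
    live, stale = [], []
    for line in lines:
        if ' | ' not in line:
            continue  # skip blank lines and stray warnings
        host, status = line.split(' | ', 1)
        # Anything that isn't SUCCESS (UNREACHABLE, FAILED) counts as stale
        (live if status.startswith('SUCCESS') else stale).append(host.strip())
    return live, stale
```

Pipe the sweep's stdout through it and the stale list is exactly what goes back to whoever owns the CMDB.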
WinRM Authentication
The customer used NTLM authentication (domain environment, no Kerberos configured for remote management, which is common in mid-sized shops). The Ansible winrm connection with ntlm transport worked fine from Linux once I had pywinrm and requests_ntlm installed:
pip3 install pywinrm requests-ntlm
For the credentials themselves, I used Ansible Vault rather than plaintext variables:
ansible-vault create group_vars/windows/vault.yml
# ansible_user: DOMAIN\svcaccount
# ansible_password: <password>
Then the playbooks use --ask-vault-pass or a vault password file. This is basic hygiene but worth stating: storing domain credentials in plaintext in a repo is a category of mistake that haunts environments for years.
The Core Pattern: Ansible Wrapper, PowerShell Payload
Most tasks followed the same structure: Ansible handles targeting, retry logic, and result collection; PowerShell does the Windows-specific work.
A typical playbook for collecting system state across the fleet:
---
- name: Collect Windows host inventory
  hosts: windows
  gather_facts: false
  tasks:
    - name: Get OS and hardware info
      ansible.windows.win_powershell:
        script: |
          $os = Get-CimInstance Win32_OperatingSystem
          $cs = Get-CimInstance Win32_ComputerSystem
          $disks = Get-CimInstance Win32_LogicalDisk | Where-Object {$_.DriveType -eq 3}
          @{
              hostname = $env:COMPUTERNAME
              os_name = $os.Caption
              os_build = $os.BuildNumber
              total_ram_gb = [math]::Round($cs.TotalPhysicalMemory / 1GB, 1)
              cpu_cores = $cs.NumberOfLogicalProcessors
              disks = $disks | ForEach-Object {
                  @{
                      drive = $_.DeviceID
                      size_gb = [math]::Round($_.Size / 1GB, 1)
                      free_gb = [math]::Round($_.FreeSpace / 1GB, 1)
                      free_pct = [math]::Round(($_.FreeSpace / $_.Size) * 100, 1)
                  }
              }
          } | ConvertTo-Json -Depth 3
      register: host_info

    - name: Save result
      ansible.builtin.copy:
        content: "{{ host_info.output[0] }}"
        dest: "/tmp/inventory/{{ inventory_hostname }}.json"
      delegate_to: localhost
Running this across 200 hosts took about 4 minutes with forks = 50 in ansible.cfg. The results landed in /tmp/inventory/ as JSON files on the Linux bastion, where I could process them with jq, Python, or feed them into reports.
This is the pattern I kept coming back to: collect JSON from PowerShell, process it on Linux. PowerShell's ConvertTo-Json is genuinely good. The Linux side is where I'm faster and where the tooling for data processing is richer.
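Here is that Linux-side half in miniature: a sketch that assumes the per-host JSON files written by the play above (hostname plus a disks list) and flags anything under a free-space threshold. Note the ConvertTo-Json quirk it has to absorb: a host with a single data disk serializes `disks` as a bare object, not a one-element array.

```python
#!/usr/bin/env python3
"""Flag fleet hosts with low free disk from the per-host JSON files
collected above. A sketch: assumes /tmp/inventory/<host>.json files
shaped like the win_powershell output (hostname, disks)."""
import glob
import json

def low_disk_report(result_dir, threshold_pct=15.0):
    alerts = []
    for path in sorted(glob.glob(f"{result_dir}/*.json")):
        with open(path) as f:
            info = json.load(f)
        disks = info.get("disks") or []
        if isinstance(disks, dict):
            disks = [disks]  # ConvertTo-Json collapses single-element pipelines
        for disk in disks:
            if disk["free_pct"] < threshold_pct:
                alerts.append((info["hostname"], disk["drive"], disk["free_pct"]))
    return alerts
```

Point it at /tmp/inventory and print the tuples, and you have a disk-space report for the whole fleet in a few seconds of bastion CPU time.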
Parallel Execution for Operational Tasks
For tasks that needed to run fast across large groups — patching checks, service restarts, disk space alerts — I used Ansible's async + poll: 0 to fire-and-forget and then gather results:
- name: Trigger Windows Update check on all hosts
  hosts: windows
  gather_facts: false
  tasks:
    - name: Check for available updates (async)
      ansible.windows.win_powershell:
        script: |
          $session = New-Object -ComObject Microsoft.Update.Session
          $searcher = $session.CreateUpdateSearcher()
          $result = $searcher.Search("IsInstalled=0 and Type='Software'")
          @{
              pending_updates = $result.Updates.Count
              titles = $result.Updates | ForEach-Object { $_.Title }
          } | ConvertTo-Json
      async: 300
      poll: 0
      register: update_check_job

    - name: Wait for update checks to complete
      ansible.builtin.async_status:
        jid: "{{ update_check_job.ansible_job_id }}"
      register: job_result
      until: job_result.finished
      retries: 30
      delay: 10
The Windows Update COM object is slow — checking for updates on a single host can take 30–90 seconds. Running it synchronously across 200 hosts would serialize into an hours-long operation. Async brought it down to roughly the time it took the slowest host.
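The per-host payloads come back as the same JSON-in-output shape as the inventory play, so the post-processing again lands on Linux. A hypothetical summarizer (my naming, not part of the playbook), assuming the payloads have already been parsed into a dict keyed by hostname:

```python
#!/usr/bin/env python3
"""Rank hosts by pending-update count from the async update sweep.

A sketch: assumes the per-host JSON (pending_updates, titles) has been
parsed into a plain dict of hostname -> result."""

def summarize_updates(results, top=5):
    """Return (fleet_total, worst), where worst lists the `top` hosts
    with the most pending updates, descending."""
    total = sum(r["pending_updates"] for r in results.values())
    worst = sorted(
        results.items(),
        key=lambda kv: kv[1]["pending_updates"],
        reverse=True,
    )[:top]
    return total, [(host, r["pending_updates"]) for host, r in worst]
```

The "worst" list is what decides patching order: the hosts furthest behind go into the earliest batches, where there's the most time to deal with surprises.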
Bash Orchestration for Multi-Stage Workflows
Some workflows were too procedural for a single Ansible play — things like "drain this server from the load balancer, apply updates, verify it came back up, add it back." I wrote these as bash scripts on the bastion that called Ansible as a subprocess:
#!/bin/bash
# rolling_patch.sh — apply patches to a group with LB drain/restore
set -euo pipefail

HOSTS_FILE=$1
BATCH_SIZE=${2:-5}

if [[ ! -f "$HOSTS_FILE" ]]; then
    echo "Usage: $0 <hosts_file> [batch_size]"
    exit 1
fi

mapfile -t ALL_HOSTS < "$HOSTS_FILE"
TOTAL=${#ALL_HOSTS[@]}

echo "Starting rolling patch: $TOTAL hosts, batch size $BATCH_SIZE"

for ((i=0; i<TOTAL; i+=BATCH_SIZE)); do
    BATCH=("${ALL_HOSTS[@]:$i:$BATCH_SIZE}")
    BATCH_STR=$(IFS=,; echo "${BATCH[*]}")

    echo ""
    echo "── Batch $((i/BATCH_SIZE + 1)): ${BATCH[*]}"

    echo "  Draining from load balancer..."
    # \$env:COMPUTERNAME must reach PowerShell literally; unescaped, bash
    # would expand $env (empty) before Ansible ever sees the script
    ansible -i inventory.json "$BATCH_STR" \
        -m ansible.windows.win_powershell \
        -a "script='Set-HostLBState -Hostname \$env:COMPUTERNAME -State Drain'" \
        --become

    echo "  Applying patches..."
    ansible-playbook -i inventory.json apply_patches.yml \
        --limit "$BATCH_STR" \
        --extra-vars "reboot_after=true"

    echo "  Verifying health..."
    ansible -i inventory.json "$BATCH_STR" \
        -m ansible.windows.win_powershell \
        -a "script='Test-ServiceHealth'" \
        | grep -E "(FAILED|unreachable|ok)" || true

    echo "  Restoring to load balancer..."
    ansible -i inventory.json "$BATCH_STR" \
        -m ansible.windows.win_powershell \
        -a "script='Set-HostLBState -Hostname \$env:COMPUTERNAME -State Active'" \
        --become

    echo "  Batch complete. Sleeping 30s before next batch..."
    sleep 30
done

echo ""
echo "Rolling patch complete."
This is exactly the kind of script that's natural to write in bash but awkward in PowerShell — it orchestrates Ansible runs, captures exit codes, handles batching, logs progress to stdout where I can watch it in a tmux session. The actual Windows work happens inside Ansible; the bash is just the conductor.
When I Had to Go Full PowerShell
Some things genuinely required native PowerShell with no good Ansible wrapper — usually anything involving COM objects, registry depth, or Windows-specific APIs that ansible.windows hadn't abstracted. For those I'd write a .ps1 file on the bastion, push it to the host, run it, and collect the results:
- name: Push and run diagnostic script
  block:
    - name: Copy script to host
      ansible.windows.win_copy:
        src: scripts/collect_wmi_deep.ps1
        dest: C:\temp\collect_wmi_deep.ps1

    - name: Execute script
      ansible.windows.win_shell: |
        # Out-File defaults to UTF-16LE; UTF-8 is kinder to Linux tooling
        C:\temp\collect_wmi_deep.ps1 | Out-File C:\temp\wmi_results.json -Encoding UTF8
      register: script_run

    - name: Fetch results
      # fetch works against Windows hosts; there is no win_fetch module
      ansible.builtin.fetch:
        src: C:\temp\wmi_results.json
        dest: /tmp/results/{{ inventory_hostname }}_wmi.json
        flat: true

    - name: Cleanup
      ansible.windows.win_file:
        path: C:\temp\collect_wmi_deep.ps1
        state: absent
The fetch-and-process-on-Linux pattern meant I didn't have to write data analysis in PowerShell. Push the data collection logic to Windows, pull the JSON back to Linux, process with Python. Each side does what it's best at.
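One gotcha in that pull: Windows PowerShell 5.1's Out-File writes UTF-16LE unless told otherwise, and even its UTF8 mode prepends a BOM, so a naive json.load() on the fetched file can fail. A small BOM-sniffing loader on the Linux side (a sketch; load_windows_json is my name, not an Ansible or stdlib function):

```python
#!/usr/bin/env python3
"""Load JSON fetched from Windows hosts regardless of how PowerShell
encoded it: UTF-16 (either byte order), UTF-8 with BOM, or plain UTF-8."""
import codecs
import json

def load_windows_json(path):
    with open(path, 'rb') as f:
        raw = f.read()
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        text = raw.decode('utf-16')     # the BOM tells the codec the byte order
    elif raw.startswith(codecs.BOM_UTF8):
        text = raw.decode('utf-8-sig')  # strip the UTF-8 BOM
    else:
        text = raw.decode('utf-8')
    return json.loads(text)
```

Every downstream report generator called this instead of json.load directly, which meant nobody had to remember which script on which host wrote with which encoding.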
The Cygwin Confession
I mentioned Cygwin at the top, and it's worth saying more. Part of why the Linux bastion approach worked so well here was that it encoded a philosophy I've always operated with: the tools you're fastest with are the right tools, and it's worth investing in getting your preferred tools into whatever environment you're working in rather than adapting to the environment's defaults.
On Windows desktops that I had to work from directly — jumping onto a server for manual diagnosis — I'd install Cygwin and be functional in minutes. grep, awk, curl, ssh, proper tab completion, a sane shell. The Windows team found this either impressive or baffling depending on who you asked. One of the senior Windows admins watched me grep through an IIS log with a regex from a bash prompt on his own server and asked if I'd broken something.
The Linux bastion was just the same instinct at infrastructure scale: rather than adapt my entire workflow to Windows PowerShell (which is genuinely capable, but not where my fluency is), I moved the environment closer to how I work. The Windows servers ran their workloads. The Linux bastion ran the automation. That division of responsibility was clean, and it worked.
What I'd Do Differently
WinRM over HTTP on port 5985 is fine inside a secure management VLAN. I'd push harder for HTTPS (5986) with certificate auth in environments where the management traffic traverses less-trusted segments. NTLM over cleartext is not something I'd accept outside a tightly controlled VLAN.
The CMDB-generated inventory was a constant source of friction — stale entries, incorrect IPs, missing role tags. In a longer engagement I'd invest more in building a live inventory from Active Directory or from the hypervisor's own API (vSphere has a good one) rather than a periodic CSV export.
The bash orchestration scripts grew organically and got messy. A proper Ansible role structure with task files and proper variable management would have been cleaner for anything that ran more than a few times. Bash is great for glue; it doesn't scale well as the primary automation layer once the workflow complexity grows.
But the core pattern — Linux bastion, Ansible with win_powershell, JSON in and out, process results on Linux — I'd use again without hesitation. It got a fleet of Windows servers into a managed state in a fraction of the time it would have taken to build native PowerShell remoting infrastructure, and it let me work in the environment where I'm most effective.
