Automating VMware with Ansible
When I started at Penn Engineering, the VMware infrastructure was managed the way most long-running enterprise VMware environments are managed: through the vSphere Web Client, by hand, by whoever needed to do the thing. There was documentation, some of it accurate, describing what VMs existed and what they were for. There was no automation, no version control for VM configurations, and no consistent way to push a change across all the machines that needed it.
The DR project gave us a forcing function to fix this. You can't reliably automate disaster recovery if you don't know exactly what your source environment looks like. Bringing the VMware fleet under Ansible control was the prerequisite.
Prerequisites: pyvmomi
The first thing I'll tell you because I wasted time on it: before any VMware Ansible module will work, you need pyvmomi installed on the Ansible control node. It's the Python SDK for the VMware vSphere API, and the community.vmware collection uses it internally.
pip install pyvmomi
This is not mentioned prominently enough in the documentation. You'll install the collection, write a perfectly valid playbook, run it, and get a confusing error about a missing Python module. Install pyvmomi first.
pip install pyvmomi
ansible-galaxy collection install community.vmware
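A quick sanity check I now run on any new control node: if pyVmomi imports cleanly, the collection's modules will be able to find it (the printed message is just illustrative):

python3 -c "from pyVmomi import vim; print('pyvmomi OK')"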
Authentication
Every task in the community.vmware collection needs connection details for your vCenter server. These go in playbook vars and should always be pulled from Ansible Vault for anything that touches production:
vars:
  vcenter_hostname: vcenter.internal.upenn.edu
  vcenter_username: "{{ vault_vcenter_username }}"
  vcenter_password: "{{ vault_vcenter_password }}"
  vcenter_datacenter: Penn-Engineering-DC
  validate_certs: false  # Internal CA, not in standard trust store
validate_certs: false is a pragmatic choice for internal infrastructure with a private CA. If you can add your CA to the trust store, do that instead.
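For completeness, here is roughly what that looks like on a RHEL-family control node (the certificate filename is hypothetical; Debian-family systems use update-ca-certificates instead):

# Add the internal CA to the control node's system trust store (RHEL family)
sudo cp penn-internal-ca.crt /etc/pki/ca-trust/source/anchors/
sudo update-ca-trust extract

After that, validate_certs: true works and removes a whole class of silent misconfiguration.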
Dynamic Inventory with vmware_vm_inventory
Static inventory files don't work at scale. Penn Engineering had hundreds of VMs across multiple clusters and datacenters. The vmware_vm_inventory plugin discovers all of them from vCenter automatically and groups them by datacenter, cluster, host, folder, and resource pool.
The inventory plugin config goes in a YAML file whose name must end in vmware.yml or vmware.yaml (I named mine vmware.yml):
plugin: community.vmware.vmware_vm_inventory
hostname: vcenter.internal.upenn.edu
username: "{{ lookup('env', 'VCENTER_USERNAME') }}"
password: "{{ lookup('env', 'VCENTER_PASSWORD') }}"
validate_certs: false
with_tags: true
hostnames:
  - config.name
properties:
  - name
  - config.guestId
  - summary.runtime.powerState
  - summary.guest.ipAddress
  - config.hardware.numCPU
  - config.hardware.memoryMB
Run with ansible-inventory -i vmware.yml --list to see what it discovers. The with_tags: true option pulls vSphere tags, which becomes important for grouping.
After discovery, you have groups like datacenter_Penn_Engineering_DC, cluster_Research_Compute, folder_DR_Protected. Your playbooks target these groups instead of manually maintained host lists.
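A play can then target those discovered groups directly. A minimal sketch, assuming the plugin sanitized our cluster name to cluster_Research_Compute (check the ansible-inventory output for the exact group name):

- name: Smoke-test the research cluster group
  hosts: cluster_Research_Compute
  gather_facts: false
  tasks:
    # No connection to the VM is needed; this just proves the group resolves
    - name: Report each discovered VM
      debug:
        var: inventory_hostname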
Gathering VM Information
A common first task is auditing what you have. vmware_guest_info is the module for this:
- name: Gather VM inventory
  hosts: localhost
  gather_facts: false
  vars:
    vcenter_hostname: vcenter.internal.upenn.edu
    vcenter_username: "{{ vault_vcenter_username }}"
    vcenter_password: "{{ vault_vcenter_password }}"
  tasks:
    - name: Get info for all VMs
      community.vmware.vmware_guest_info:
        hostname: "{{ vcenter_hostname }}"
        username: "{{ vcenter_username }}"
        password: "{{ vcenter_password }}"
        datacenter: Penn-Engineering-DC
        name: "{{ item }}"
        validate_certs: false
      loop: "{{ groups['all'] }}"
      register: vm_info

    - name: Write VM report
      copy:
        content: "{{ vm_info.results | to_nice_yaml }}"
        dest: /tmp/vm_inventory_report.yaml
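The registered results can also be sliced before writing the report. One example, assuming the per-VM instance facts include hw_name and hw_power_status, which is what vmware_guest_info returns:

    - name: List powered-off VMs
      debug:
        msg: >-
          {{ vm_info.results | map(attribute='instance')
             | selectattr('hw_power_status', 'equalto', 'poweredOff')
             | map(attribute='hw_name') | list }}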
The gather_facts Performance Trick
When you're running playbooks against 50 or 100 VMs, Ansible's default behavior of gathering OS facts on every host adds significant overhead — it's an SSH connection and a facts module execution per host before your tasks even start. For tasks where you don't need OS facts (anything that talks to vCenter rather than to the VM's OS directly, or bulk operations where you already know what you need), turn it off:
- name: Apply vSphere tags to DR-protected VMs
  hosts: all
  gather_facts: false
  tasks:
    ...
On a 100-VM play, this can cut runtime from 8 minutes to under 2 minutes.
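When some plays genuinely need OS facts, fact caching is the middle ground: pay the gathering cost once and reuse it on later runs. A sketch of the ansible.cfg settings (path and timeout are illustrative):

[defaults]
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_fact_cache
fact_caching_timeout = 86400

With gathering = smart, hosts that already have cached facts are skipped on subsequent runs.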
Tagging VMs Programmatically
vSphere tags are how we grouped VMs for the DR project. Rather than maintaining manual group membership, we applied a dr-protected tag to every VM that CloudEndure needed to replicate, and the inventory plugin picked that up as a group automatically.
- name: Tag DR-protected VMs
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Attach dr-protected tag to each VM
      community.vmware.vmware_tag_manager:
        hostname: "{{ vcenter_hostname }}"
        username: "{{ vcenter_username }}"
        password: "{{ vcenter_password }}"
        validate_certs: false
        tag_names:
          - category: DR
            tag: dr-protected
        object_name: "{{ item }}"
        object_type: VirtualMachine
        state: present
      loop: "{{ dr_protected_vms }}"
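One caveat: vmware_tag_manager attaches tags but does not create them, so the category and the tag have to exist first. A sketch using vmware_category and vmware_tag, assuming the category_results.category_id return shape documented for vmware_category:

    - name: Ensure the DR category exists
      community.vmware.vmware_category:
        hostname: "{{ vcenter_hostname }}"
        username: "{{ vcenter_username }}"
        password: "{{ vcenter_password }}"
        validate_certs: false
        category_name: DR
        category_cardinality: single
        state: present
      register: dr_category

    - name: Ensure the dr-protected tag exists
      community.vmware.vmware_tag:
        hostname: "{{ vcenter_hostname }}"
        username: "{{ vcenter_username }}"
        password: "{{ vcenter_password }}"
        validate_certs: false
        tag_name: dr-protected
        category_id: "{{ dr_category.category_results.category_id }}"
        state: present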
Configuring Linux VMs via vmware_vm_shell
For Linux VMs, vmware_vm_shell lets you run commands inside the guest via VMware Tools (no SSH required):
- name: Install monitoring agent on Linux VMs
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Run installer via VMware Tools
      community.vmware.vmware_vm_shell:
        hostname: "{{ vcenter_hostname }}"
        username: "{{ vcenter_username }}"
        password: "{{ vcenter_password }}"
        validate_certs: false
        datacenter: Penn-Engineering-DC
        vm_id: "{{ item }}"
        vm_username: "{{ vault_vm_username }}"
        vm_password: "{{ vault_vm_password }}"
        vm_shell: /bin/bash
        vm_shell_args: "-c 'curl -s https://monitoring.internal.upenn.edu/install.sh | bash'"
        wait_for_process: true
        timeout: 300
      loop: "{{ groups['linux_vms'] }}"
This requires VMware Tools to be installed and running in the guest. It's slower than SSH and has less output visibility, but it works without needing network access or firewall exceptions to the VM.
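Since everything here hinges on Tools, it is worth checking its state before launching shell tasks. A sketch with vmware_guest_tools_info; I'm assuming the vm_tools_running_status field in its return payload, so verify against your collection version:

    - name: Check VMware Tools status
      community.vmware.vmware_guest_tools_info:
        hostname: "{{ vcenter_hostname }}"
        username: "{{ vcenter_username }}"
        password: "{{ vcenter_password }}"
        validate_certs: false
        datacenter: Penn-Engineering-DC
        name: "{{ item }}"
      loop: "{{ groups['linux_vms'] }}"
      register: tools_status

    - name: Fail early on VMs without running Tools
      fail:
        msg: "VMware Tools not running on {{ item.item }}"
      when: item.vmtools_info.vm_tools_running_status != 'guestToolsRunning'
      loop: "{{ tools_status.results }}"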
Windows VMs via WinRM
For Windows VMs, the approach is WinRM rather than VMware Tools shell. WinRM needs to be enabled on each Windows machine, and Ansible ships a script for this:
# Run this once per Windows VM (can be pushed via Group Policy or a startup script)
$url = "https://raw.githubusercontent.com/ansible/ansible/devel/examples/scripts/ConfigureRemotingForAnsible.ps1"
$file = "$env:temp\ConfigureRemotingForAnsible.ps1"
(New-Object -TypeName System.Net.WebClient).DownloadFile($url, $file)
powershell.exe -ExecutionPolicy ByPass -File $file
This enables WinRM over HTTPS, creates a self-signed cert, and opens the firewall. In a managed environment you'd push this via GPO rather than running it interactively.
Once WinRM is configured, Windows hosts work like any other Ansible target using win_* modules:
- name: Configure Windows VMs
  hosts: windows_vms
  gather_facts: false
  vars:
    ansible_connection: winrm
    ansible_winrm_transport: ntlm
    ansible_winrm_server_cert_validation: ignore
  tasks:
    - name: Ensure Windows Update service is running
      win_service:
        name: wuauserv
        state: started
        start_mode: auto

    - name: Install required Windows features
      win_feature:
        name: RSAT-AD-Tools
        state: present
        include_management_tools: true
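Before running real plays, a one-liner confirms WinRM is actually reachable (the connection vars above can live in group_vars for the windows_vms group so ad hoc commands pick them up too):

ansible windows_vms -i vmware.yml -m win_ping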
Creating VMs with vmware_guest
Provisioning new VMs from a template is the job of vmware_guest:
- name: Provision VM from template
  community.vmware.vmware_guest:
    hostname: "{{ vcenter_hostname }}"
    username: "{{ vcenter_username }}"
    password: "{{ vcenter_password }}"
    validate_certs: false
    datacenter: Penn-Engineering-DC
    cluster: Research-Compute-Cluster
    folder: /Penn-Engineering-DC/vm/Managed
    name: "{{ vm_name }}"
    template: RHEL8-Template
    state: poweredon
    hardware:
      num_cpus: 4
      memory_mb: 8192
    disk:
      - size_gb: 50
        type: thin
        datastore: vsanDatastore
    networks:
      - name: VM-Network-10.200
        ip: "{{ vm_ip }}"
        netmask: 255.255.255.0
        gateway: 10.200.0.1
    customization:
      hostname: "{{ vm_name }}"
      dns_servers:
        - 10.200.0.10
        - 10.200.0.11
Guest customization (the customization block) requires VMware Tools in the template. Without it, the VM comes up with the template's original hostname and IP.
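Two optional vmware_guest parameters are worth adding in provisioning pipelines: wait_for_ip_address and wait_for_customization, which keep the task from returning until the guest reports an IP and the customization spec has finished applying. Appended to the task above:

    # Block until the guest has an IP and customization has completed
    wait_for_ip_address: true
    wait_for_customization: true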
What This Enabled
Once the VMware fleet was under Ansible control, the DR project became much more tractable. We could run the CloudEndure agent installation playbook reliably across all protected VMs. We could audit VM configurations consistently. We could push policy changes across the environment without logging into vCenter and clicking through GUI dialogs for each machine. The inventory plugin meant our playbooks always reflected the actual current state of the environment rather than a stale spreadsheet.
The main lesson is the one I opened with: install pyvmomi first, before you do anything else.
