September 5, 2019 Marie H.

Automating VMware with Ansible

When I started at Penn Engineering, the VMware infrastructure was managed the way most long-running enterprise VMware environments are managed: through the vSphere Web Client, by hand, by whoever needed to do the thing. There was documentation, some of it accurate, describing what VMs existed and what they were for. There was no automation, no version control for VM configurations, and no consistent way to push a change across all the machines that needed it.

The DR project gave us a forcing function to fix this. You can't reliably automate disaster recovery if you don't know exactly what your source environment looks like. Bringing the VMware fleet under Ansible control was the prerequisite.

Prerequisites: pyvmomi

The first thing I'll tell you, because I wasted time on it: before any VMware Ansible module will work, you need pyvmomi installed on the Ansible control node. It's the Python SDK for the VMware vSphere API, and the community.vmware collection uses it internally.

pip install pyvmomi

This is not mentioned prominently enough in the documentation. You'll install the collection, write a perfectly valid playbook, run it, and get a confusing error about a missing Python module. Install pyvmomi first.

ansible-galaxy collection install community.vmware
pip install pyvmomi

Authentication

Every task in the community.vmware collection needs connection details for your vCenter server. These go in playbook vars and should always be pulled from Ansible Vault for anything that touches production:

vars:
  vcenter_hostname: vcenter.internal.upenn.edu
  vcenter_username: "{{ vault_vcenter_username }}"
  vcenter_password: "{{ vault_vcenter_password }}"
  vcenter_datacenter: Penn-Engineering-DC
  validate_certs: false  # Internal CA, not in standard trust store

validate_certs: false is a pragmatic choice for internal infrastructure with a private CA. If you can add your CA to the trust store, do that instead.
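
The vault_ variables referenced above live in an encrypted vars file. A minimal sketch, assuming a group_vars/all/vault.yml layout (the path and service-account name here are illustrative, not our actual values):

```yaml
# group_vars/all/vault.yml — created and encrypted with ansible-vault
# (path and account name are illustrative placeholders)
vault_vcenter_username: svc-ansible@vsphere.local
vault_vcenter_password: "change-me"
```

Create it with ansible-vault create group_vars/all/vault.yml and supply --ask-vault-pass (or a vault password file) when running playbooks.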

Dynamic Inventory with vmware_vm_inventory

Static inventory files don't work at scale. Penn Engineering had hundreds of VMs across multiple clusters and datacenters. The vmware_vm_inventory plugin discovers all of them from vCenter automatically and groups them by datacenter, cluster, host, folder, and resource pool.

The inventory plugin config goes in a YAML file whose name must end in vmware.yml or vmware.yaml for the plugin to pick it up (I named mine vmware.yml):

plugin: community.vmware.vmware_vm_inventory
hostname: vcenter.internal.upenn.edu
username: "{{ lookup('env', 'VCENTER_USERNAME') }}"
password: "{{ lookup('env', 'VCENTER_PASSWORD') }}"
validate_certs: false
with_tags: true
hostnames:
  - config.name
properties:
  - name
  - config.guestId
  - summary.runtime.powerState
  - summary.guest.ipAddress
  - config.hardware.numCPU
  - config.hardware.memoryMB

Run with ansible-inventory -i vmware.yml --list to see what it discovers. The with_tags: true option pulls vSphere tags, which becomes important for grouping; note that it requires the vSphere Automation SDK for Python on the control node, in addition to pyvmomi.

After discovery, you have groups like datacenter_Penn_Engineering_DC, cluster_Research_Compute, folder_DR_Protected. Your playbooks target these groups instead of manually maintained host lists.
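
Once those groups exist, plays can target them directly. A sketch, assuming the DR_Protected folder group from above and SSH-reachable RHEL guests (the group name depends on your vCenter folder naming):

```yaml
# Target a group discovered by the inventory plugin rather than a static list.
- name: Apply security errata to DR-protected VMs
  hosts: folder_DR_Protected
  become: true
  tasks:
    - name: Install security updates only
      ansible.builtin.yum:
        name: "*"
        state: latest
        security: true
```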

Gathering VM Information

A common first task is auditing what you have. vmware_guest_info is the module for this:

- name: Gather VM inventory
  hosts: localhost
  gather_facts: false
  vars:
    vcenter_hostname: vcenter.internal.upenn.edu
    vcenter_username: "{{ vault_vcenter_username }}"
    vcenter_password: "{{ vault_vcenter_password }}"
  tasks:
    - name: Get info for all VMs
      community.vmware.vmware_guest_info:
        hostname: "{{ vcenter_hostname }}"
        username: "{{ vcenter_username }}"
        password: "{{ vcenter_password }}"
        datacenter: Penn-Engineering-DC
        name: "{{ item }}"
        validate_certs: false
      loop: "{{ groups['all'] }}"
      register: vm_info

    - name: Write VM report
      copy:
        content: "{{ vm_info.results | to_nice_yaml }}"
        dest: /tmp/vm_inventory_report.yaml
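
The raw dump is verbose. A condensed per-VM summary can be templated out of the registered results; the field names below (instance.hw_name, instance.hw_power_status, instance.ipv4) are what vmware_guest_info returned in my environment — verify them against your own output before depending on them:

```yaml
# Sketch: flatten the registered results into one CSV line per VM.
- name: Write condensed power-state report
  ansible.builtin.copy:
    content: |
      {% for r in vm_info.results %}
      {{ r.instance.hw_name }},{{ r.instance.hw_power_status }},{{ r.instance.ipv4 | default('no-ip') }}
      {% endfor %}
    dest: /tmp/vm_summary.csv
```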

The gather_facts Performance Trick

When you're running playbooks against 50 or 100 VMs, Ansible's default behavior of gathering OS facts on every host adds significant overhead — it's an SSH connection and a facts module execution per host before your tasks even start. For tasks where you don't need OS facts (anything that talks to vCenter rather than to the VM's OS directly, or bulk operations where you already know what you need), turn it off:

- name: Apply vSphere tags to DR-protected VMs
  hosts: all
  gather_facts: false
  tasks:
    ...

On a 100-VM play, this can cut runtime from 8 minutes to under 2 minutes.
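
If a later task in the same play does need facts, they can be gathered selectively rather than re-enabling full gathering. A sketch using the builtin setup module's subset filter:

```yaml
# With gather_facts: false at the play level, pull just the minimal
# fact subset on demand for the tasks that actually need it.
- name: Gather minimal facts only
  ansible.builtin.setup:
    gather_subset:
      - min
```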

Tagging VMs Programmatically

vSphere tags are how we grouped VMs for the DR project. Rather than maintaining manual group membership, we applied a dr-protected tag to every VM that CloudEndure needed to replicate, and the inventory plugin picked that up as a group automatically.

- name: Tag DR-protected VMs
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Attach the dr-protected tag to each VM
      community.vmware.vmware_tag_manager:
        hostname: "{{ vcenter_hostname }}"
        username: "{{ vcenter_username }}"
        password: "{{ vcenter_password }}"
        validate_certs: false
        tag_names:
          - category: DR
            tag: dr-protected
        object_name: "{{ item }}"
        object_type: VirtualMachine
        state: present
      loop: "{{ dr_protected_vms }}"
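
One gotcha: vmware_tag_manager attaches tags but does not create them, so the category and tag need to exist first. A sketch using vmware_category and vmware_tag — the category_results.category_id return key is what I saw in our version of community.vmware, so check the module docs for yours:

```yaml
- name: Ensure the DR tag category exists
  community.vmware.vmware_category:
    hostname: "{{ vcenter_hostname }}"
    username: "{{ vcenter_username }}"
    password: "{{ vcenter_password }}"
    validate_certs: false
    category_name: DR
    category_cardinality: multi
    state: present
  register: dr_category

- name: Ensure the dr-protected tag exists
  community.vmware.vmware_tag:
    hostname: "{{ vcenter_hostname }}"
    username: "{{ vcenter_username }}"
    password: "{{ vcenter_password }}"
    validate_certs: false
    # category_id comes from the registered result above; verify the
    # return structure against your collection version.
    category_id: "{{ dr_category.category_results.category_id }}"
    tag_name: dr-protected
    state: present
```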

Configuring Linux VMs via vmware_vm_shell

For Linux VMs, vmware_vm_shell lets you run commands inside the guest via VMware Tools (no SSH required):

- name: Install monitoring agent on Linux VMs
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Run installer via VMware Tools
      community.vmware.vmware_vm_shell:
        hostname: "{{ vcenter_hostname }}"
        username: "{{ vcenter_username }}"
        password: "{{ vcenter_password }}"
        validate_certs: false
        datacenter: Penn-Engineering-DC
        vm_id: "{{ item }}"
        vm_username: "{{ vault_vm_username }}"
        vm_password: "{{ vault_vm_password }}"
        vm_shell: /bin/bash
        vm_shell_args: "-c 'curl -s https://monitoring.internal.upenn.edu/install.sh | bash'"
        wait_for_process: true
        timeout: 300
      loop: "{{ groups['linux_vms'] }}"

This requires VMware Tools to be installed and running in the guest. It's slower than SSH and has less output visibility, but it works without needing network access or firewall exceptions to the VM.
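
One workaround for the limited output visibility: redirect command output to a file in the guest, then pull it back over the same VMware Tools channel with vmware_guest_file_operation. A sketch, assuming the installer above logged to /tmp/agent-install.log (a path I've made up for illustration):

```yaml
- name: Fetch the installer log from the guest
  community.vmware.vmware_guest_file_operation:
    hostname: "{{ vcenter_hostname }}"
    username: "{{ vcenter_username }}"
    password: "{{ vcenter_password }}"
    validate_certs: false
    datacenter: Penn-Engineering-DC
    vm_id: "{{ item }}"
    vm_username: "{{ vault_vm_username }}"
    vm_password: "{{ vault_vm_password }}"
    # Copy a file from inside the guest back to the control node.
    fetch:
      src: /tmp/agent-install.log
      dest: "/tmp/{{ item }}-agent-install.log"
  loop: "{{ groups['linux_vms'] }}"
```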

Windows VMs via WinRM

For Windows VMs, the approach is WinRM rather than VMware Tools shell. WinRM needs to be enabled on each Windows machine — VMware ships a script for this:

# Run this once per Windows VM (can be pushed via Group Policy or a startup script)
$url = "https://raw.githubusercontent.com/ansible/ansible/devel/examples/scripts/ConfigureRemotingForAnsible.ps1"
$file = "$env:temp\ConfigureRemotingForAnsible.ps1"
(New-Object -TypeName System.Net.WebClient).DownloadFile($url, $file)
powershell.exe -ExecutionPolicy ByPass -File $file

This enables WinRM over HTTPS, creates a self-signed cert, and opens the firewall. In a managed environment you'd push this via GPO rather than running it interactively.
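
Before running real plays, a quick smoke test confirms WinRM is actually reachable. A sketch — the connection vars mirror the play below, and the port assumes the default WinRM-over-HTTPS listener:

```yaml
- name: Verify WinRM connectivity
  hosts: windows_vms
  gather_facts: false
  vars:
    ansible_connection: winrm
    ansible_winrm_transport: ntlm
    ansible_winrm_server_cert_validation: ignore
    ansible_port: 5986
  tasks:
    - name: Ping over WinRM (round-trips a trivial module, no changes made)
      win_ping:
```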

Once WinRM is configured, Windows hosts work like any other Ansible target using win_* modules:

- name: Configure Windows VMs
  hosts: windows_vms
  gather_facts: false
  vars:
    ansible_connection: winrm
    ansible_winrm_transport: ntlm
    ansible_winrm_server_cert_validation: ignore
  tasks:
    - name: Ensure Windows Update service is running
      win_service:
        name: wuauserv
        state: started
        start_mode: auto

    - name: Install required Windows features
      win_feature:
        name: RSAT-AD-Tools
        state: present
        include_management_tools: true

Creating VMs with vmware_guest

Provisioning new VMs from a template is vmware_guest:

- name: Provision VM from template
  community.vmware.vmware_guest:
    hostname: "{{ vcenter_hostname }}"
    username: "{{ vcenter_username }}"
    password: "{{ vcenter_password }}"
    validate_certs: false
    datacenter: Penn-Engineering-DC
    cluster: Research-Compute-Cluster
    folder: /Penn-Engineering-DC/vm/Managed
    name: "{{ vm_name }}"
    template: RHEL8-Template
    state: poweredon
    hardware:
      num_cpus: 4
      memory_mb: 8192
    disk:
      - size_gb: 50
        type: thin
        datastore: vsanDatastore
    networks:
      - name: VM-Network-10.200
        ip: "{{ vm_ip }}"
        netmask: 255.255.255.0
        gateway: 10.200.0.1
    customization:
      hostname: "{{ vm_name }}"
      dns_servers:
        - 10.200.0.10
        - 10.200.0.11

Guest customization (the customization block) requires VMware Tools in the template. Without it, the VM comes up with the template's original hostname and IP.
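
It's also worth registering the result: vmware_guest returns an instance dict describing the new VM, which follow-up tasks can use. The exact keys vary, so inspect the registered variable with debug before depending on specific fields:

```yaml
- name: Provision and capture the result
  community.vmware.vmware_guest:
    # ... same parameters as above ...
  register: deploy

- name: Show what vCenter reported for the new VM
  ansible.builtin.debug:
    var: deploy.instance
```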

What This Enabled

Once the VMware fleet was under Ansible control, the DR project became much more tractable. We could run the CloudEndure agent installation playbook reliably across all protected VMs. We could audit VM configurations consistently. We could push policy changes across the environment without logging into vCenter and clicking through GUI dialogs for each machine. The inventory plugin meant our playbooks always reflected the actual current state of the environment rather than a stale spreadsheet.

The main lesson: install pyvmomi first, before you do anything else.