Infrastructure as Code (IaC) Part 2: Configuration Management With Ansible
Part 2 of our Infrastructure as Code series, where we use Ansible to configure and deploy a complete Kubernetes cluster on the infrastructure we built with Terraform, demonstrating the power of combining infrastructure provisioning with configuration management.

In this second installment of our Infrastructure as Code series, we’ll bridge the gap between infrastructure provisioning and application deployment by introducing Ansible for configuration management. While Terraform excels at creating and managing infrastructure resources, Ansible shines at configuring and maintaining the software stack that runs on top of that infrastructure.
Ansible is an agentless automation platform that uses SSH to execute tasks across multiple systems simultaneously. Unlike infrastructure provisioning tools, Ansible focuses on configuration management, application deployment, and orchestration. It uses simple, human-readable YAML playbooks to describe automation jobs, making it accessible to both developers and system administrators.
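To make the playbook format concrete, here is a minimal, hypothetical example (the webservers group and the nginx package are illustrative placeholders, not part of this series):
---
# Illustrative playbook: install nginx on a hypothetical "webservers" inventory group
- name: Configure web servers
  hosts: webservers        # inventory group to target
  become: true             # escalate privileges (like sudo)
  tasks:
    - name: Ensure nginx is installed
      ansible.builtin.apt:
        name: nginx
        state: present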
In this tutorial, we’ll take the 5-VM cluster we created in Part 1 and transform it into a fully functional Kubernetes cluster using Ansible. You’ll learn how to:
- Integrate Terraform and Ansible for end-to-end infrastructure automation
- Build reusable Ansible roles for consistent configuration management
- Automate Kubernetes cluster deployment using kubeadm and container runtime setup
- Implement dynamic inventory generation that adapts to changing infrastructure
- Create deployment pipelines that combine infrastructure provisioning with application readiness
This hands-on approach demonstrates a real-world DevOps pattern where infrastructure provisioning and configuration management work together seamlessly. By the end of this tutorial, you’ll have a complete automation pipeline that can provision virtual machines and configure them into a production-ready Kubernetes cluster with a single command.
Prerequisites and Current State
This tutorial picks up right where Part 1 of our Infrastructure as Code series: Introduction to Terraform left off. To follow along, you’ll need to have already completed that Terraform tutorial and have your 5-VM cluster infrastructure ready. This cluster, made up of one master node and four worker nodes, will be the environment where we deploy Kubernetes.
It’s crucial that you complete Part 1 first because the Ansible configuration we’re about to build relies on the specific VM setup, networking, and cloud-init templates established there. The seamless integration we’ll demonstrate between Terraform and Ansible depends on both tools working with the exact same infrastructure state and SSH key configuration.
Install Ansible
First, we must install the required packages for Ansible:
# On Ubuntu/Debian:
sudo apt update
sudo apt install ansible
# On RHEL/CentOS/Fedora:
sudo dnf install ansible
# or on older systems:
sudo yum install ansible
# On macOS:
brew install ansible
# Alternative: Install via pip (works on any OS with Python):
pip install ansible
Also install required system packages:
# On Ubuntu/Debian:
sudo apt install jq ssh-client
# On RHEL/CentOS/Fedora:
sudo dnf install jq openssh-clients
# On macOS:
brew install jq
# (ssh is already included)
Ansible Collections
Ansible collections are distribution formats for packaging and distributing Ansible content including playbooks, roles, modules, and plugins. They extend Ansible’s core functionality by providing specialized modules for specific platforms and services. For our Kubernetes deployment, we need collections that can interact with Kubernetes APIs and provide additional system administration capabilities.
# Install Kubernetes collection (required for control-plane role)
ansible-galaxy collection install kubernetes.core
# Install community.general collection (often useful)
ansible-galaxy collection install community.general
Python Dependencies
# Install Python Kubernetes client (required by kubernetes.core collection)
pip install kubernetes
# Alternative: install via system package manager
# On Ubuntu/Debian:
sudo apt install python3-kubernetes
# On RHEL/CentOS/Fedora:
sudo dnf install python3-kubernetes
Project Structure
Before diving into Ansible configuration, we need to restructure our project to accommodate both Terraform and Ansible components. This organization follows DevOps best practices by separating infrastructure provisioning from configuration management while maintaining clear relationships between components.
Understanding how Ansible organizes automation content is crucial for building maintainable configurations. Ansible uses several key concepts:
- Roles: Reusable units of automation that group related tasks, variables, and files
- Playbooks: YAML files that define which roles to apply to which hosts
- Inventory: Files that define the hosts and groups that Ansible will manage
Our project structure separates these concerns while enabling seamless integration between Terraform’s infrastructure provisioning and Ansible’s configuration management.
First, create a parent directory to house both your Terraform and Ansible projects. Move your existing introduction-to-terraform directory into this new structure, then create the Ansible directory alongside it.
Your final project structure should look like this:
infrastructure-as-code/
├── Makefile
├── introduction-to-terraform/
| ├── main.tf # Primary resource definitions
| ├── variables.tf # Input variable declarations
| ├── outputs.tf # Output value definitions
| ├── locals.tf # Local value computations
| └── cloud-init/ # VM initialization templates
| ├── user-data.tpl # User and SSH configuration
| └── network-config.tpl # Static IP configuration
└── configuration-with-ansible/
├── ansible.cfg # Ansible configuration file with SSH settings and output formatting
├── generate_inventory.sh # Script to parse Terraform output and generate Ansible inventory
├── inventory.ini # Generated inventory file (created by generate_inventory.sh)
├── site.yml # Main Ansible playbook that orchestrates all roles
└── roles/ # Directory containing all Ansible roles
├── common/ # Role for common tasks across all nodes
│ └── tasks/
│ └── main.yml # Disables swap, loads kernel modules, sets sysctl parameters
├── containerd/ # Role for container runtime installation and configuration
│ └── tasks/
│ └── main.yml # Installs containerd and configures systemd cgroup driver
├── kubernetes/ # Role for Kubernetes component installation
│ └── tasks/
│ └── main.yml # Installs kubelet, kubeadm, kubectl with version pinning
├── control-plane/ # Role for Kubernetes master node setup
│ └── tasks/
│ └── main.yml # Runs kubeadm init, sets up kubeconfig, installs Calico CNI
└── worker/ # Role for Kubernetes worker node setup
└── tasks/
└── main.yml # Joins worker nodes to the cluster using kubeadm join
Configuring Ansible for Dynamic Infrastructure
Ansible’s configuration file (ansible.cfg) controls how Ansible behaves when connecting to and managing remote hosts. When working with dynamic infrastructure, where IP addresses and host details change frequently, specific configuration optimizations become essential for reliability and performance.
The configuration settings we’ll implement address several challenges common in automated infrastructure environments:
- SSH connection optimization reduces overhead when managing multiple hosts simultaneously
- Security settings handle the dynamic nature of VM IP addresses and SSH keys
- Performance tuning enables faster execution across multiple nodes
- User configuration accommodates the cloud-init user setup from our Terraform deployment
These settings ensure Ansible can reliably connect to and manage the VMs created by Terraform, even when those VMs are destroyed and recreated with different SSH host keys.
Create a new file configuration-with-ansible/ansible.cfg:
[defaults]
host_key_checking = False
pipelining = True
gathering = smart
fact_caching = memory
stdout_callback = yaml
bin_ansible_callbacks = True
[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes
Dynamic Inventory Generation
Static inventory files work well for stable infrastructure, but they become a maintenance burden when working with dynamic infrastructure that changes frequently. In our Terraform-Ansible integration, VM IP addresses and host details are determined at provision time, making static inventory files impractical.
Dynamic inventory generation solves this problem by programmatically extracting infrastructure details from Terraform’s state and converting them into Ansible inventory format. This approach ensures that Ansible always has current, accurate information about the infrastructure it needs to manage, eliminating manual inventory maintenance and reducing the potential for configuration drift.
Create a new file named generate_inventory.sh in the configuration-with-ansible directory:
#!/usr/bin/env bash
set -e
TF_OUTPUT_JSON="$1"
INVENTORY_FILE="$2"
if [[ ! -f "$TF_OUTPUT_JSON" ]]; then
echo "Error: Terraform output file $TF_OUTPUT_JSON not found"
exit 1
fi
# Extract SSH configuration from Terraform outputs
SSH_USER=$(jq -r '.ssh_user.value // "ubuntu"' "$TF_OUTPUT_JSON")
SSH_KEY=$(jq -r '.ssh_private_key_path.value // "~/.ssh/id_rsa"' "$TF_OUTPUT_JSON")
# Extract IPs and create inventory entries with SSH config
masters=$(jq -r '.master_ips.value // {} | to_entries[] | "\(.key) ansible_host=\(.value)"' "$TF_OUTPUT_JSON")
workers=$(jq -r '.worker_ips.value // {} | to_entries[] | "\(.key) ansible_host=\(.value)"' "$TF_OUTPUT_JSON")
# Create inventory file
{
echo "[masters]"
if [[ -n "$masters" ]]; then
echo "$masters"
fi
echo ""
echo "[workers]"
if [[ -n "$workers" ]]; then
echo "$workers"
fi
echo ""
echo "[all:vars]"
echo "ansible_user=$SSH_USER"
echo "ansible_ssh_private_key_file=$SSH_KEY"
echo "ansible_ssh_common_args='-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null'"
} > "$INVENTORY_FILE"
echo "Generated inventory file: $INVENTORY_FILE"
cat "$INVENTORY_FILE"
This script demonstrates several important bash scripting and JSON processing techniques:
- Error handling: The set -e directive ensures the script exits immediately if any command fails
- JSON parsing: Uses jq to extract specific values from Terraform’s JSON output, with fallback defaults using the // operator
- String processing: Constructs Ansible inventory entries by combining host names with IP addresses
- File generation: Creates a properly formatted INI-style inventory file with host groups and global variables
The script separates master and worker nodes into distinct inventory groups, enabling Ansible to apply different roles to different node types. The [all:vars] section provides SSH configuration that applies to all hosts, ensuring consistent connection behavior across the entire cluster.
Testing Connectivity
After generating the inventory, verify that Ansible can successfully connect to all nodes:
# Make the script executable
chmod +x configuration-with-ansible/generate_inventory.sh
# Test the inventory generation (assuming Terraform has been applied)
cd infrastructure-as-code
configuration-with-ansible/generate_inventory.sh introduction-to-terraform/terraform_output.json configuration-with-ansible/inventory.ini
If everything is working properly, the generate_inventory.sh script should have generated an inventory file and returned something similar to:
Generated inventory file: configuration-with-ansible/inventory.ini
[masters]
master-1 ansible_host=192.168.122.100
[workers]
worker-1 ansible_host=192.168.122.101
worker-2 ansible_host=192.168.122.102
worker-3 ansible_host=192.168.122.103
worker-4 ansible_host=192.168.122.104
[all:vars]
ansible_user=ubuntu
ansible_ssh_private_key_file=~/.ssh/id_rsa
ansible_ssh_common_args='-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null'
Now, use ansible to perform a ping test on the hosts in your newly created inventory file:
# Test Ansible connectivity
ANSIBLE_CONFIG=configuration-with-ansible/ansible.cfg ansible -i \
configuration-with-ansible/inventory.ini all -m ping
You should get output similar to:
PLAY [Ansible Ad-Hoc] *******************************************************************************
TASK [ping] *****************************************************************************************
ok: [master-1]
ok: [worker-3]
ok: [worker-2]
ok: [worker-1]
ok: [worker-4]
PLAY RECAP ******************************************************************************************
master-1 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
worker-1 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
worker-2 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
worker-3 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
worker-4 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Building Reusable Ansible Roles
Ansible roles provide a structured way to organize automation content into reusable components. Each role encapsulates a specific piece of functionality, such as installing containerd, configuring networking, or setting up a database, making it easy to compose complex automation by combining simple, focused roles.
Roles promote consistency across environments by ensuring the same configuration steps are applied identically every time. They also enable collaboration by providing clear interfaces and documentation for automation components. In our Kubernetes deployment, we’ll create specialized roles for each layer of the technology stack, building from basic system configuration up to application-ready cluster components.
The Common Role
The common role establishes the foundational system configuration required for Kubernetes nodes. This includes package installation, kernel parameter tuning, and system service configuration that must be consistent across all cluster members. By implementing these prerequisites in a shared role, we ensure that both master and worker nodes start from an identical, known-good state.
Create a new file configuration-with-ansible/roles/common/tasks/main.yml:
- name: Log system information
ansible.builtin.debug:
msg:
- "=== Node Information ==="
- "Node: {{ inventory_hostname }}"
- "OS: {{ ansible_distribution }} {{ ansible_distribution_version }}"
- "Kernel: {{ ansible_kernel }}"
- "Memory: {{ ansible_memtotal_mb }}MB"
- "CPU: {{ ansible_processor_vcpus }} cores"
- "Architecture: {{ ansible_architecture }}"
- "========================"
- name: Update apt cache
ansible.builtin.apt:
update_cache: yes
cache_valid_time: 3600
- name: Install required packages
  ansible.builtin.apt:
    name:
      - python3-pip
      - python3-setuptools
      - python3-kubernetes
      - python3-yaml
      - apt-transport-https
      - ca-certificates
      - curl
      - gnupg
      - lsb-release
    state: present
  register: apt_result
- name: Disable swap
  ansible.builtin.command: swapoff -a
  when: ansible_swaptotal_mb > 0
- name: Remove swap entry from /etc/fstab
ansible.builtin.lineinfile:
path: /etc/fstab
regexp: '^\s*([^#]\S+\s+\S+\s+swap\s+)'
state: absent
- name: Ensure required kernel modules are loaded
ansible.builtin.modprobe:
name: "{{ item }}"
state: present
loop:
- br_netfilter
- overlay
- name: Ensure sysctl settings for Kubernetes networking
ansible.builtin.sysctl:
name: "{{ item.key }}"
value: "{{ item.value }}"
state: present
reload: yes
loop:
- { key: 'net.bridge.bridge-nf-call-iptables', value: 1 }
- { key: 'net.ipv4.ip_forward', value: 1 }
- { key: 'net.bridge.bridge-nf-call-ip6tables', value: 1 }
- name: Log package installation results
ansible.builtin.debug:
msg: "Installed packages: {{ apt_result.stdout_lines | default([]) }}"
when: apt_result is defined and apt_result.changed
This role performs several critical system-level configurations:
- Package management: Installs Python libraries and system tools required by subsequent roles and Kubernetes components
- Swap disabling: Kubernetes requires swap to be disabled for proper memory management and performance
- Kernel module loading: Enables container networking features (br_netfilter) and overlay filesystem support (overlay)
- Network parameter tuning: Configures kernel parameters for proper container networking and IP forwarding
These configurations address Kubernetes’ specific requirements for the underlying operating system, ensuring that the cluster components can function correctly once installed.
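If you want to spot-check these settings after the role runs, the following commands (run on any node over SSH) should confirm them:
# Swap should report 0 across the board
free -h | grep -i swap
# Both kernel modules should be listed
lsmod | grep -E 'br_netfilter|overlay'
# All three sysctl values should be 1
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward net.bridge.bridge-nf-call-ip6tables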
The Containerd Role
The containerd role installs and configures the container runtime that Kubernetes will use to manage application containers. Containerd is a high-performance container runtime that implements the Container Runtime Interface (CRI) specification, making it compatible with Kubernetes. This role ensures that the container runtime is properly integrated with systemd for process management and cgroup handling.
Create a new file configuration-with-ansible/roles/containerd/tasks/main.yml:
---
# roles/containerd/tasks/main.yml
- name: Install containerd
ansible.builtin.apt:
name: containerd
state: present
update_cache: yes
- name: Configure containerd with systemd cgroup driver
ansible.builtin.shell: |
mkdir -p /etc/containerd
containerd config default | sed 's/SystemdCgroup = false/SystemdCgroup = true/' > /etc/containerd/config.toml
args:
creates: /etc/containerd/config.toml
- name: Restart containerd
ansible.builtin.service:
name: containerd
state: restarted
enabled: yes
The systemd cgroup driver configuration is particularly important as it ensures that containerd and Kubernetes use the same cgroup hierarchy, preventing resource management conflicts.
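A quick way to verify this on any node is to inspect the generated configuration and the service state:
# Should print: SystemdCgroup = true
grep SystemdCgroup /etc/containerd/config.toml
# Should show the service as active (running)
systemctl status containerd --no-pager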
The Kubernetes Role
The Kubernetes role installs the core Kubernetes components (kubelet, kubeadm, and kubectl) from the official Kubernetes package repository. This role carefully manages package versions to ensure cluster consistency and includes version pinning to prevent unexpected upgrades that could destabilize the cluster.
Create a new file configuration-with-ansible/roles/kubernetes/tasks/main.yml:
---
# roles/kubernetes/tasks/main.yml
- name: Update apt cache
ansible.builtin.apt:
update_cache: yes
- name: Install required packages
ansible.builtin.apt:
name:
- apt-transport-https
- ca-certificates
- curl
- gpg
state: present
- name: Create keyrings directory
ansible.builtin.file:
path: /etc/apt/keyrings
state: directory
mode: '0755'
- name: Add Kubernetes apt key
ansible.builtin.shell: |
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
args:
creates: /etc/apt/keyrings/kubernetes-apt-keyring.gpg
- name: Add Kubernetes apt repository
ansible.builtin.apt_repository:
repo: "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /"
state: present
filename: kubernetes
- name: Install kubelet, kubeadm, kubectl
ansible.builtin.apt:
name:
- kubelet=1.28.*
- kubeadm=1.28.*
- kubectl=1.28.*
state: present
update_cache: yes
- name: Hold Kubernetes packages
ansible.builtin.dpkg_selections:
name: "{{ item }}"
selection: hold
loop:
- kubelet
- kubeadm
- kubectl
- name: Enable and start kubelet
ansible.builtin.service:
name: kubelet
enabled: yes
state: started
This role manages the installation and configuration of Kubernetes components:
- Repository management: Adds the official Kubernetes APT repository with proper GPG key verification for security
- Package installation: Installs specific versions of kubelet (node agent), kubeadm (cluster bootstrapping tool), and kubectl (command-line interface)
- Version pinning: Uses dpkg_selections to prevent automatic package updates that could break cluster compatibility
- Service management: Enables the kubelet service so it can be started by kubeadm during cluster initialization
Version pinning is crucial in Kubernetes environments because minor version differences between cluster components can cause compatibility issues or unexpected behavior.
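You can confirm the holds and the installed versions on any node with standard apt and Kubernetes tooling:
# Should list kubelet, kubeadm, and kubectl as held
apt-mark showhold
# Should report a 1.28.x version
kubectl version --client
kubeadm version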
The Control Plane Role
The control plane role transforms a prepared node into a Kubernetes master by initializing the cluster control plane components. This role handles the complex bootstrap process that creates the cluster’s initial state, configures administrative access, and installs essential cluster networking components.
Create a new file configuration-with-ansible/roles/control-plane/tasks/main.yml:
---
# roles/control-plane/tasks/main.yml
- name: Check if kubeadm has already run
ansible.builtin.stat:
path: /etc/kubernetes/admin.conf
register: kubeadm_init_stat
- name: Initialize Kubernetes control plane
ansible.builtin.command: kubeadm init --pod-network-cidr=192.168.0.0/16
when: not kubeadm_init_stat.stat.exists
register: kubeadm_init_result
- name: Create .kube directory for ubuntu user
ansible.builtin.file:
path: /home/ubuntu/.kube
state: directory
owner: ubuntu
group: ubuntu
mode: '0755'
- name: Copy kubeconfig for ubuntu user
ansible.builtin.copy:
src: /etc/kubernetes/admin.conf
dest: /home/ubuntu/.kube/config
remote_src: yes
owner: ubuntu
group: ubuntu
mode: '0600'
- name: Generate kubeadm join command
ansible.builtin.shell: kubeadm token create --print-join-command
register: join_command_result
when: not kubeadm_init_stat.stat.exists or ansible_play_hosts | length > 1
- name: Save join command to file
ansible.builtin.copy:
content: "{{ join_command_result.stdout }}"
dest: /tmp/kubeadm_join_cmd.sh
mode: '0755'
when: join_command_result is defined and join_command_result.stdout is defined
- name: Install Calico CNI
ansible.builtin.shell: kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/calico.yaml
become_user: ubuntu
environment:
KUBECONFIG: /home/ubuntu/.kube/config
when: not kubeadm_init_stat.stat.exists
register: calico_install
- name: Wait for Calico Controller to be ready
ansible.builtin.command:
cmd: kubectl rollout status deployment/calico-kube-controllers -n kube-system --timeout=300s
become_user: ubuntu
environment:
KUBECONFIG: /home/ubuntu/.kube/config
changed_when: false
when: not kubeadm_init_stat.stat.exists
- name: Wait for Calico Node DaemonSet to be ready
ansible.builtin.command:
cmd: kubectl rollout status daemonset/calico-node -n kube-system --timeout=300s
become_user: ubuntu
environment:
KUBECONFIG: /home/ubuntu/.kube/config
changed_when: false
when: not kubeadm_init_stat.stat.exists
- name: Display Calico installation result
ansible.builtin.debug:
var: calico_install
when: not kubeadm_init_stat.stat.exists
- name: Verify system pods are running
ansible.builtin.shell: kubectl get pods -n kube-system
become_user: ubuntu
environment:
KUBECONFIG: /home/ubuntu/.kube/config
register: system_pods
when: not kubeadm_init_stat.stat.exists
- name: Display system pods status
ansible.builtin.debug:
var: system_pods.stdout_lines
when: not kubeadm_init_stat.stat.exists and system_pods is defined
This role orchestrates the complex process of creating a Kubernetes control plane:
- Idempotency checking: Uses the presence of /etc/kubernetes/admin.conf to determine if cluster initialization has already occurred
- Cluster initialization: Runs kubeadm init with a specific pod network CIDR that’s compatible with Calico CNI
- User access configuration: Sets up kubectl access for the ubuntu user by copying and configuring the kubeconfig file
- Join token generation: Creates the kubeadm join command that worker nodes will use to join the cluster
- Network plugin installation: Downloads and applies the Calico CNI manifest to enable pod-to-pod networking
The Calico CNI (Container Network Interface) plugin is essential because Kubernetes doesn’t include built-in networking for pod communication across nodes. Calico provides this functionality using BGP routing and network policies.
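After the role completes, a few kubectl checks from the master (using the kubeconfig the role copied for the ubuntu user) confirm that Calico is healthy and the control-plane node is Ready:
# Run on the master node as the ubuntu user
kubectl get pods -n kube-system -l k8s-app=calico-node -o wide
kubectl get deployment calico-kube-controllers -n kube-system
# The control-plane node should report Ready once the CNI is up
kubectl get nodes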
The Worker Node Role
The worker node role handles the process of joining additional nodes to an existing Kubernetes cluster. This role retrieves the join command generated by the control plane and executes it on worker nodes, establishing secure communication with the cluster and registering the node as available for workload scheduling.
Create a new file configuration-with-ansible/roles/worker/tasks/main.yml:
---
# roles/worker/tasks/main.yml
- name: Check if node is already joined
ansible.builtin.stat:
path: /etc/kubernetes/kubelet.conf
register: kubelet_conf_stat
- name: Fetch join command from master
ansible.builtin.slurp:
src: /tmp/kubeadm_join_cmd.sh
delegate_to: "{{ groups['masters'][0] }}"
register: join_cmd_content
when: not kubelet_conf_stat.stat.exists
- name: Join the node to the cluster
ansible.builtin.shell: "{{ join_cmd_content.content | b64decode | trim }}"
when: not kubelet_conf_stat.stat.exists and join_cmd_content is defined
This role manages the worker node join process:
- Join status checking: Examines the /etc/kubernetes/kubelet.conf file to determine if the node has already joined a cluster
- Command retrieval: Uses Ansible’s delegation feature to fetch the join command from the master node without requiring shared storage
- Secure joining: Executes the kubeadm join command which establishes encrypted communication with the control plane and registers the node
The delegate_to directive is particularly important here, as it allows worker nodes to retrieve information from the master node dynamically, eliminating the need for external coordination mechanisms or shared file systems.
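Once the worker role has run on every node, a single check from the master confirms that all of them registered with the cluster:
# Run on the master node; all five nodes should appear and reach Ready within a minute or two
kubectl get nodes -o wide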
Orchestrating the Deployment
The main playbook (site.yml) serves as the orchestration layer that coordinates the application of our roles across different node types. This playbook demonstrates key Ansible concepts including host groups, role sequencing, and privilege escalation. By organizing the deployment into logical phases, we ensure that dependencies are satisfied and that cluster initialization occurs in the correct order.
The playbook structure reflects the natural dependency hierarchy of Kubernetes cluster deployment: all nodes need basic system configuration first, then the control plane must be established before worker nodes can join. This sequencing is critical for successful cluster initialization.
Create a new file configuration-with-ansible/site.yml:
- hosts: all
become: true
roles:
- common
- containerd
- kubernetes
- hosts: masters
become: true
roles:
- control-plane
- hosts: workers
become: true
roles:
- worker
In Ansible, become: true tells Ansible to run tasks with elevated privileges, typically as the root user. This is similar to using sudo on the command line.
Why is this needed?
Many system-level tasks (like installing packages, modifying system files, or configuring services) require root access. By setting become: true, you ensure these tasks have the necessary permissions.
Note: If the user Ansible connects with over SSH does not have sudo privileges, become: true will not work.
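If you ever need finer control, become can also be scoped to individual tasks instead of the whole play; the sketch below is purely illustrative:
---
# Illustrative only: privilege escalation per task rather than per play
- hosts: workers
  tasks:
    - name: Runs as the regular SSH user (ubuntu)
      ansible.builtin.command: whoami
      changed_when: false

    - name: Runs as root via privilege escalation
      ansible.builtin.command: whoami
      become: true
      changed_when: false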
Integration with Terraform
Creating a seamless integration between Terraform and Ansible requires careful coordination of the deployment pipeline. The Makefile provides an automation layer that orchestrates the entire infrastructure lifecycle, from initial provisioning through application deployment to final cleanup.
The automation ensures that infrastructure changes are immediately followed by configuration updates, maintaining consistency between the desired state defined in code and the actual deployed state.
Create a new file in the project root called Makefile:
.PHONY: plan apply inventory wait-for-ssh ansible deploy destroy clean-ssh
TF_DIR := introduction-to-terraform
ANSIBLE_DIR := configuration-with-ansible
INVENTORY := $(ANSIBLE_DIR)/inventory.ini
TF_OUTPUT := $(TF_DIR)/terraform_output.json
plan:
cd $(TF_DIR) && terraform init && terraform plan
apply:
cd $(TF_DIR) && terraform init && terraform apply -auto-approve
cd $(TF_DIR) && terraform output -json > terraform_output.json
inventory: apply
$(ANSIBLE_DIR)/generate_inventory.sh $(TF_OUTPUT) $(INVENTORY)
wait-for-ssh: inventory
$(ANSIBLE_DIR)/wait_for_ssh.sh $(INVENTORY)
ansible: wait-for-ssh
ANSIBLE_CONFIG=$(ANSIBLE_DIR)/ansible.cfg ansible-playbook -i $(INVENTORY) $(ANSIBLE_DIR)/site.yml
deploy: apply inventory wait-for-ssh ansible
destroy:
cd $(TF_DIR) && terraform destroy -auto-approve
$(MAKE) clean-ssh
clean-ssh:
@echo "Clearing SSH known_hosts entries for libvirt VMs..."
@bash -c 'for ip in {100..104}; do ssh-keygen -f "$$HOME/.ssh/known_hosts" -R "192.168.122.$$ip" 2>/dev/null || true; done'
@echo "SSH known_hosts cleaned"
For more information on Makefiles and further examples, check out this resource.
We also need a helper script that ensures all VMs are up and reachable before the automation continues. Create a new file configuration-with-ansible/wait_for_ssh.sh:
#!/usr/bin/env bash
set -e
INVENTORY_FILE="$1"
if [[ ! -f "$INVENTORY_FILE" ]]; then
echo "Error: Inventory file $INVENTORY_FILE not found"
exit 1
fi
echo "Waiting for SSH to be available on all VMs..."
# Extract IPs from inventory file
ips=$(grep -E "ansible_host=" "$INVENTORY_FILE" | sed 's/.*ansible_host=\([0-9.]*\).*/\1/')
for ip in $ips; do
echo "Waiting for SSH on $ip..."
timeout=120
while ! nc -z "$ip" 22 2>/dev/null && [ $timeout -gt 0 ]; do
sleep 2
timeout=$((timeout-2))
done
if [ $timeout -le 0 ]; then
echo "Timeout waiting for SSH on $ip"
exit 1
else
echo "SSH available on $ip"
fi
done
echo "All VMs are ready for SSH connections"
Don’t forget to make the script executable:
chmod +x configuration-with-ansible/wait_for_ssh.sh
Testing and Validation
After deploying your Kubernetes cluster, it’s essential to verify that all components are functioning correctly before deploying workloads. This validation process ensures cluster health and helps identify any configuration issues early.
Verifying Cluster Functionality
Connect to your master node and run these validation commands:
# SSH to the master node (replace IP with your master's IP)
ssh -i ~/.ssh/id_rsa ubuntu@192.168.122.100
# Check cluster status
kubectl get nodes
# Verify all nodes are in Ready state
kubectl get nodes -o wide
# Check pod status across all namespaces
kubectl get pods --all-namespaces
# Verify Calico networking is working
kubectl get pods -n kube-system | grep calico
Running Test Workloads
Deploy a simple test application to validate cluster functionality:
# Create a test deployment
kubectl create deployment nginx-test --image=nginx:latest --replicas=3
# Expose the deployment as a service
kubectl expose deployment nginx-test --port=80 --target-port=80
# Check if pods are distributed across worker nodes
kubectl get pods -o wide
# Test service connectivity
kubectl get svc nginx-test
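To confirm the service actually answers requests, you can curl it from a short-lived pod inside the cluster (the curl image below is just one convenient choice), then remove the test resources:
# Launch a throwaway pod, curl the service by its DNS name, and clean the pod up automatically
kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl -- curl -s http://nginx-test
# Remove the test deployment and service when finished
kubectl delete deployment nginx-test
kubectl delete service nginx-test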
Troubleshooting Common Issues
If you encounter problems, check these common areas:
- Node connectivity: Ensure all nodes can communicate on the pod network CIDR
- Container runtime: Verify containerd is running on all nodes with systemctl status containerd
- Kubelet status: Check kubelet logs with journalctl -u kubelet -f
- CNI networking: Verify Calico pods are running in the kube-system namespace
Cleanup and Tear Down
Proper cleanup is essential when working with dynamic infrastructure, especially in development and testing environments. The cleanup process must handle both the application layer (Kubernetes cluster) and the infrastructure layer (virtual machines) while managing ancillary effects like SSH known_hosts entries.
Ansible Considerations for Infrastructure Destruction
Unlike some configuration management tools, Ansible doesn’t automatically track and reverse the changes it makes. When destroying infrastructure, it’s often more efficient to destroy the underlying VMs rather than attempting to reverse all configuration changes. However, in production environments, you might want to create specific “cleanup” playbooks for graceful service shutdown and data preservation.
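As a sketch of what such a cleanup playbook might contain (it is not part of this tutorial’s repository), resetting kubeadm state on every node before a rebuild could look like this:
---
# cleanup.yml -- illustrative sketch only
- hosts: all
  become: true
  tasks:
    - name: Reset any existing kubeadm state
      ansible.builtin.command: kubeadm reset -f
      ignore_errors: true

    - name: Remove leftover CNI configuration
      ansible.builtin.file:
        path: /etc/cni/net.d
        state: absent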
SSH Known_hosts Management
When VMs are destroyed and recreated, their SSH host keys change, leading to SSH connection warnings. The clean-ssh target in our Makefile proactively removes these entries, preventing connection issues in subsequent deployments.
Complete Cleanup Workflow
To tear down the entire environment:
# Destroy everything and clean SSH entries
make destroy
# Or run individual steps
make clean-ssh # Clean SSH known_hosts only
cd introduction-to-terraform && terraform destroy # Destroy infrastructure only
This approach ensures complete environment cleanup while maintaining the ability to quickly rebuild the infrastructure for testing or development purposes.
Best Practices and Next Steps
The Terraform and Ansible integration pattern we’ve implemented represents a powerful foundation for production infrastructure automation. However, several considerations become important as you scale this approach or adapt it for production use.
Security Considerations for Production Deployments
Production environments require additional security measures:
- SSH key management: Implement proper key rotation and use dedicated service accounts rather than personal SSH keys
- Network segmentation: Configure firewalls and network policies to restrict communication between cluster components
- Secrets management: Use tools like HashiCorp Vault or Kubernetes secrets for sensitive configuration data
- RBAC implementation: Configure Kubernetes Role-Based Access Control to limit user and service permissions
Scaling the Approach for Larger Clusters
As your infrastructure grows, consider these optimizations:
- Ansible parallelism: Tune the forks setting in ansible.cfg to manage more nodes simultaneously (see the example after this list)
- Role parameterization: Add variables to roles for different environment configurations (dev, staging, production)
- Inventory grouping: Create more sophisticated inventory groups for different node types or environments
- State management: Consider using remote state storage for Terraform and Ansible facts caching for improved performance
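For example, raising the fork count is a one-line change in the [defaults] section of ansible.cfg (the value below is only an illustration; Ansible’s default is 5):
[defaults]
forks = 20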
Adding Monitoring and Logging Roles
Extend the automation with additional roles for operational capabilities:
- Prometheus monitoring: Create roles for metrics collection and alerting
- Log aggregation: Implement centralized logging with tools like Fluentd or Logstash
- Backup automation: Add roles for automated backup and disaster recovery procedures
Version Management and GitOps Integration
For production-ready deployments, implement version control and deployment automation:
- Git workflow: Store all configuration in version control with proper branching strategies
- CI/CD integration: Automate testing and deployment using tools like GitLab CI or GitHub Actions
- Immutable infrastructure: Consider implementing blue-green deployments for safer production updates
- Configuration drift detection: Implement monitoring to detect when actual configuration diverges from desired state
Conclusion
The integration of Terraform and Ansible represents a powerful paradigm in Infrastructure as Code that addresses the complete infrastructure lifecycle. By combining Terraform’s declarative infrastructure provisioning with Ansible’s flexible configuration management, we’ve created an automation pipeline that can consistently deploy complex, multi-tier applications from bare metal to production-ready state.
Benefits of the Terraform + Ansible Approach
This integrated approach offers several key advantages:
- Separation of concerns: Infrastructure provisioning and application configuration are handled by tools optimized for each task
- Flexibility: Changes to infrastructure or application configuration can be made independently
- Reusability: Ansible roles can be applied to infrastructure provisioned by any tool, not just Terraform
- Testability: Each layer can be tested independently, improving reliability and debugging capability
- Scalability: The pattern scales from development environments to large production deployments
When to Use This Pattern vs Alternatives
This Terraform-Ansible pattern works best when:
- You need complex, multi-step configuration that goes beyond basic package installation
- Your infrastructure spans multiple environments or cloud providers
- You require fine-grained control over the configuration process
- Your team has expertise in both infrastructure and configuration management
Alternative approaches like cloud-init, Helm charts, or container-based deployments may be more appropriate for simpler use cases or when working within specific ecosystems like Kubernetes-native applications.
Further Learning Resources
To deepen your understanding of Infrastructure as Code and expand on the concepts covered in this series:
- Terraform Documentation: HashiCorp’s official documentation for advanced provider usage and state management
- Ansible Documentation: Red Hat’s Ansible documentation for advanced playbook patterns and enterprise features
- Kubernetes the Hard Way: Kelsey Hightower’s tutorial for understanding Kubernetes internals
- Infrastructure as Code Patterns: Explore advanced patterns in Kief Morris’s “Infrastructure as Code” book
- GitOps Practices: Learn about ArgoCD and Flux for Kubernetes-native deployment automation
The foundation you’ve built with this two-part series provides a solid base for exploring more advanced DevOps patterns and tools. Whether you’re managing a homelab or preparing for production deployments, the principles of declarative infrastructure and automated configuration management will serve you well as you continue to build reliable, scalable systems.
Stay Tuned…
In Part 3 of this series, we’ll take the next logical step by enhancing our Kubernetes cluster with production-ready components including MetalLB load balancing, Istio service mesh for traffic management, and persistent storage solutions, all automated through Ansible.
As always, you can find all the code examples and configuration files from this tutorial in our GitHub repository.

Aaron Mathis
Systems administrator and software engineer specializing in cloud development, AI/ML, and modern web technologies. Passionate about building scalable solutions and sharing knowledge with the developer community.