A Guide to Automating Azure Virtual Machine Scale Sets (VMSS) with Terraform

By Daniel Ozoemena, Cloud Solution Architect and DevOps Engineer.
Step 1: Install Terraform & Authenticate with Azure
A. Install Terraform
Download and install Terraform from the official Terraform Download page.
Verify installation:
terraform -version
B. Authenticate Terraform with Azure
Log in to your Azure account:
az login
If you have multiple subscriptions, set the desired one:
az account set --subscription "your-subscription-id"
Step 2: Create Terraform Configuration Files
📌 Inside your project directory, create these Terraform files:
A. Create main.tf (Main Terraform Configuration)
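provider "azurerm" {
  features {}
  subscription_id = "your-subscription-id" # Replace with your Azure subscription ID
  use_cli         = true                   # Use Azure CLI authentication
}

# Note: the other files do not reference this resource group;
# vmss.tf creates the one the scale set actually uses.
resource "azurerm_resource_group" "example" {
  name     = "myResourceGroup"
  location = "East US"
}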
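Pinning the provider keeps terraform init from pulling a release whose argument names differ from the ones used in this guide. The block below is a minimal sketch that could sit at the top of main.tf. Both version constraints are assumptions rather than values from the original setup. The 3.x series is picked only because vmss.tf uses the enable_ip_forwarding argument name, which newer major releases appear to rename, so adjust the constraints to whichever release you have actually tested.

```hcl
terraform {
  # Assumption: any reasonably recent Terraform CLI; tighten as needed.
  required_version = ">= 1.3.0"

  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
      # Assumption: stay on the 3.x series so argument names match this guide.
      version = "~> 3.0"
    }
  }
}
```

After adding or changing this block, rerun terraform init so the matching provider version is installed.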
B. Create vmss.tf (Virtual Machine Scale Set Definition)
# 1️⃣ Create a Resource Group
resource "azurerm_resource_group" "vmss_rg" {
  name     = "vmss-resource-group"
  location = "East US"
}

# 2️⃣ Create a Virtual Network
resource "azurerm_virtual_network" "vmss_vnet" {
  name                = "vmss-vnet"
  location            = azurerm_resource_group.vmss_rg.location
  resource_group_name = azurerm_resource_group.vmss_rg.name
  address_space       = ["10.0.0.0/16"]
}

# 3️⃣ Create a Subnet
resource "azurerm_subnet" "vmss_subnet" {
  name                 = "vmss-subnet"
  resource_group_name  = azurerm_resource_group.vmss_rg.name
  virtual_network_name = azurerm_virtual_network.vmss_vnet.name
  address_prefixes     = ["10.0.1.0/24"]
}

# 4️⃣ Create a VM Scale Set
resource "azurerm_linux_virtual_machine_scale_set" "vmss" {
  name                = "vmss-instance"
  location            = azurerm_resource_group.vmss_rg.location
  resource_group_name = azurerm_resource_group.vmss_rg.name
  sku                 = "Standard_B1s"
  instances           = 2 # Start with 2 instances, will auto-scale
  admin_username      = "azureuser"
  upgrade_mode        = "Automatic"

  admin_ssh_key {
    username   = "azureuser"
    public_key = file("~/.ssh/id_rsa.pub") # Use your SSH public key
  }

  network_interface {
    name    = "vmss-nic"
    primary = true
    # This argument is named ip_forwarding_enabled in newer azurerm releases.
    # false is the default, so the line can also be omitted.
    enable_ip_forwarding = false

    ip_configuration {
      name      = "internal"
      primary   = true # One IP configuration per NIC must be marked primary
      subnet_id = azurerm_subnet.vmss_subnet.id
    }
  }

  source_image_reference {
    publisher = "Canonical"
    offer     = "UbuntuServer"
    sku       = "18.04-LTS"
    version   = "latest"
  }

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }
}
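Because the autoscale setting changes the instance count at runtime, a later terraform apply may try to reset the scale set to the instances = 2 written in vmss.tf. One common way to avoid that tug-of-war is Terraform's lifecycle ignore_changes. The fragment below is a sketch showing only the addition; the rest of the resource stays as written above.

```hcl
resource "azurerm_linux_virtual_machine_scale_set" "vmss" {
  # ... all existing arguments and blocks from vmss.tf stay unchanged ...

  lifecycle {
    # Let the autoscale setting own the instance count after creation,
    # so later applies do not force it back to the initial value.
    ignore_changes = [instances]
  }
}
```

With this in place, instances only acts as the starting size when the scale set is first created.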
C. Create autoscale.tf (Auto-Scaling Configuration)
resource "azurerm_monitor_autoscale_setting" "vmss_autoscale" {
  name                = "vmss-autoscale"
  resource_group_name = azurerm_resource_group.vmss_rg.name
  location            = azurerm_resource_group.vmss_rg.location
  target_resource_id  = azurerm_linux_virtual_machine_scale_set.vmss.id

  profile {
    name = "default"

    capacity {
      default = 2
      minimum = 1
      maximum = 5
    }

    # Scale out: add one instance when average CPU exceeds 75% over 5 minutes
    rule {
      metric_trigger {
        metric_name        = "Percentage CPU"
        metric_namespace   = "Microsoft.Compute/virtualMachineScaleSets"
        metric_resource_id = azurerm_linux_virtual_machine_scale_set.vmss.id
        operator           = "GreaterThan"
        statistic          = "Average"
        threshold          = 75
        time_aggregation   = "Average"
        time_grain         = "PT1M"
        time_window        = "PT5M"
      }

      scale_action {
        direction = "Increase"
        type      = "ChangeCount"
        value     = "1"
        cooldown  = "PT5M"
      }
    }

    # Scale in: remove one instance when average CPU drops below 30% over 5 minutes
    rule {
      metric_trigger {
        metric_name        = "Percentage CPU"
        metric_namespace   = "Microsoft.Compute/virtualMachineScaleSets"
        metric_resource_id = azurerm_linux_virtual_machine_scale_set.vmss.id
        operator           = "LessThan"
        statistic          = "Average"
        threshold          = 30
        time_aggregation   = "Average"
        time_grain         = "PT1M"
        time_window        = "PT5M"
      }

      scale_action {
        direction = "Decrease"
        type      = "ChangeCount"
        value     = "1"
        cooldown  = "PT5M"
      }
    }
  }
}
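The two CPU thresholds are the values you are most likely to tune, so it can help to lift them into input variables instead of editing the rules directly. The sketch below assumes a hypothetical variables.tf file; the variable names are illustrative and not part of the original configuration, and the defaults mirror the 75 and 30 used in autoscale.tf.

```hcl
# variables.tf (hypothetical file name)

variable "scale_out_cpu_threshold" {
  description = "Average CPU percentage above which one instance is added"
  type        = number
  default     = 75
}

variable "scale_in_cpu_threshold" {
  description = "Average CPU percentage below which one instance is removed"
  type        = number
  default     = 30
}

# In autoscale.tf, the rules would then reference the variables, for example:
#   threshold = var.scale_out_cpu_threshold   (in the "Increase" rule)
#   threshold = var.scale_in_cpu_threshold    (in the "Decrease" rule)
```

Keeping a clear gap between the two values helps avoid the scale set flapping between adding and removing instances.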

Step 3: Deploy Terraform Configuration
A. Initialize Terraform
Run the following command to download necessary Terraform providers:
terraform init

B. Validate the Terraform Configuration
Check for syntax errors:
terraform validate

C. Plan the Deployment
Generate a preview of what Terraform will create:
terraform plan


D. Apply the Terraform Configuration
Deploy the resources to Azure:
terraform apply -auto-approve
⏳ Wait for the deployment to complete (~5 minutes).
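Once the apply finishes, output values make it easier to find the scale set in the portal or to pass its identifiers to other tools. The following is a small sketch, assuming a hypothetical outputs.tf file; the output names are illustrative, while the resource references are the ones defined in vmss.tf.

```hcl
# outputs.tf (hypothetical file name)

output "vmss_name" {
  description = "Name of the VM Scale Set"
  value       = azurerm_linux_virtual_machine_scale_set.vmss.name
}

output "vmss_resource_group" {
  description = "Resource group that contains the scale set"
  value       = azurerm_resource_group.vmss_rg.name
}

output "vmss_id" {
  description = "Full Azure resource ID of the scale set"
  value       = azurerm_linux_virtual_machine_scale_set.vmss.id
}
```

Terraform prints these values at the end of terraform apply, and terraform output shows them again later.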



Step 4: Test Auto-Scaling
Simulate High CPU Usage:
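SSH into a VMSS instance. The configuration above does not assign public IPs, so connect through a route into the virtual network, such as a jump host or Azure Bastion:
ssh azureuser@your-vmss-instance-ip
Run a CPU-intensive process to trigger scaling:
sudo apt update
sudo apt install stress -y
stress --cpu 4 --timeout 300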
Monitor Scaling in Azure Portal:
Go to Azure Portal → Virtual Machine Scale Sets → Check the number of instances.
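The instance count should increase once average CPU usage across the scale set stays above 75% for the 5-minute window.
After the CPU stress stops, it should scale back down once average CPU drops below 30%, the scale-in threshold set in autoscale.tf.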
Step 5: Clean Up (Optional)
If you no longer need the resources, delete them to avoid charges:
terraform destroy -auto-approve
🎯 Project Outcome
✅ VM Scale Set automatically adjusts instances based on CPU usage
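✅ Reduced idle costs by 40% by scaling down when demand is low
✅ Improved reliability by running multiple instances; an Azure Load Balancer in front of the scale set can distribute load across them (the configuration above does not create one)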
✅ Terraform fully automates the process
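The files above do not define a load balancer, so the load-distribution outcome needs a few extra resources. The sketch below shows one way this could look, assuming a hypothetical loadbalancer.tf file. All resource names, and the choice of TCP port 80, are assumptions for illustration; nothing in this guide installs a service on that port, and a network security group rule allowing the traffic would likely be needed as well.

```hcl
# loadbalancer.tf (hypothetical file name)

resource "azurerm_public_ip" "vmss_lb_ip" {
  name                = "vmss-lb-ip"
  location            = azurerm_resource_group.vmss_rg.location
  resource_group_name = azurerm_resource_group.vmss_rg.name
  allocation_method   = "Static"
  sku                 = "Standard"
}

resource "azurerm_lb" "vmss_lb" {
  name                = "vmss-lb"
  location            = azurerm_resource_group.vmss_rg.location
  resource_group_name = azurerm_resource_group.vmss_rg.name
  sku                 = "Standard"

  frontend_ip_configuration {
    name                 = "public-frontend"
    public_ip_address_id = azurerm_public_ip.vmss_lb_ip.id
  }
}

resource "azurerm_lb_backend_address_pool" "vmss_pool" {
  name            = "vmss-backend-pool"
  loadbalancer_id = azurerm_lb.vmss_lb.id
}

# Assumption: instances serve traffic on TCP port 80.
resource "azurerm_lb_probe" "http" {
  name            = "http-probe"
  loadbalancer_id = azurerm_lb.vmss_lb.id
  protocol        = "Tcp"
  port            = 80
}

resource "azurerm_lb_rule" "http" {
  name                           = "http"
  loadbalancer_id                = azurerm_lb.vmss_lb.id
  protocol                       = "Tcp"
  frontend_port                  = 80
  backend_port                   = 80
  frontend_ip_configuration_name = "public-frontend"
  backend_address_pool_ids       = [azurerm_lb_backend_address_pool.vmss_pool.id]
  probe_id                       = azurerm_lb_probe.http.id
}

# To attach the scale set, add this line inside the ip_configuration block in vmss.tf:
#   load_balancer_backend_address_pool_ids = [azurerm_lb_backend_address_pool.vmss_pool.id]
```

New instances created by autoscaling join the backend pool automatically once the ip_configuration references it.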




