Eduard Keilholz

Hi, my name is Eduard Keilholz. I'm a Microsoft developer working at 4DotNet in The Netherlands. I like to speak at conferences about all and nothing, mostly Azure (or other cloud) related topics.


I received the Microsoft MVP Award for Azure

HexMaster's Blog
Some thoughts about software development, cloud, azure, ASP.NET Core and maybe a little bit more...

Create Custom VM Images Using Packer

In my previous post, I explained what Azure DevOps scale set agents are and how important they can be for your CI/CD process. I then showed the advantage of customizing VM images for the Virtual Machine Scale Set (VMSS), so it spins up VMs from an always up-to-date image. That approach, however, was fairly manual. Much of it was automated and ran every week through a scheduled Azure DevOps pipeline, but the pipeline itself spelled out every step by hand, and there is a more efficient way to do this.

In this post, I will show you how to use Packer to make the entire process easier to write, easier to maintain, more resilient and, maybe best of all, way faster. Packer is an open-source tool that automates building VM images. The pipeline in my previous post does a lot of work by hand that is simply required from a technical perspective. Packer takes this work away and lets you focus on the software to install on the VM image.

This time, I will not limit the image to the dotnet SDKs and the Azure CLI, but also install Node.js and Docker, simply because it is easy to do and convenient to have them on your image. The deployment script itself was written by my colleague Marnix van Valen.

Getting work done

So first things first. The Packer build runs a bash script on a temporary VM. This script contains the commands that install your desired software. The script I use (written by my colleague Marnix van Valen) looks like so:

echo '==== Add Microsoft package source ===='
wget https://packages.microsoft.com/config/ubuntu/20.04/packages-microsoft-prod.deb -O packages-microsoft-prod.deb
dpkg -i ./packages-microsoft-prod.deb
rm ./packages-microsoft-prod.deb
apt-get update

echo '==== Update from package sources ===='
apt-get upgrade -y

echo '==== Install dependencies ===='
DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt-get -y install tzdata
apt-get install -y wget apt-transport-https software-properties-common curl

echo '==== PowerShell ===='
apt-get update && apt-get install -y powershell

echo '==== Azure CLI ===='
curl -sL https://aka.ms/InstallAzureCLIDeb | bash

echo '==== dotnet 3.1 ===='
# dotnet 3.1 is used by azure devops itself
apt-get install -y dotnet-sdk-3.1

echo '==== dotnet 6 ===='
apt-get install -y dotnet-sdk-6.0

echo '==== dotnet 7 ===='
apt-get install -y dotnet-sdk-7.0

echo '==== Node.js ===='
curl -sL https://deb.nodesource.com/setup_16.x | bash
apt-get install -y nodejs
apt-get install -y build-essential

echo '==== Docker ===='
# https://docs.docker.com/engine/install/ubuntu/
apt-get remove -y docker docker-engine docker.io containerd runc
apt-get install -y ca-certificates curl gnupg lsb-release
sudo mkdir -m 0755 -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

echo '==== Kubectl ===='
# https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/#install-using-native-package-management
apt-get install -y ca-certificates curl
sudo curl -fsSLo /etc/apt/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubectl

I stored the bash file above as pipelines/deployment/deploy.sh.

The echo statements and comments in the script guide you through what’s going on. Again, I like to use Linux agents, so the base image I use is a Linux (Ubuntu) image. When you want to install software on such a machine, it is easiest to rely on a package manager. However, the default package sources do not contain all the packages we need. I therefore add Microsoft’s package source first, so that all the tools we need are available through the package manager. Then the package index is updated and, finally, the install commands are executed.

Note that Docker is installed almost at the end of the script. Although the commands look a bit complicated, they can be copied and pasted from the Docker documentation. This is exactly what makes Packer so friendly to use: the installation steps are plain shell commands.

One of the advantages of writing such a bash script is that you can provision a VM at any time and run the script, just to test and make sure everything runs fine. Once you’re satisfied, you can now focus on creating a (new) pipeline.
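
As a rough sketch of such a test run, the snippet below checks whether the tools the script is supposed to install ended up on the PATH. The function name missing_tools and the list of executables (pwsh for PowerShell, az for the Azure CLI) are my own assumptions for illustration; adjust the list to whatever your script installs.

```shell
#!/bin/sh
# Print every command from the argument list that is NOT on the PATH.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}

# Executables the deployment script is expected to provide (assumed list).
missing=$(missing_tools pwsh az dotnet node docker kubectl)

if [ -n "$missing" ]; then
  # Unquoted on purpose so the names are printed on a single line.
  echo "Missing tools:" $missing
else
  echo "All expected tools are installed"
fi
```

Running this on a freshly provisioned VM right after deploy.sh gives a quick signal before you spend time on a full pipeline run.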

Starting the pipeline

First of all, I want the pipeline to run whenever changes are applied to its source. To prevent the images from becoming outdated, I also want the pipeline to run on a weekly basis. This way, a fresh image is created every week, and also whenever changes to the pipeline or bash script are pushed.

trigger:
  branches:
    include:
      - "*"

schedules:
  - cron: "0 0 * * 0" # Weekly, Sunday at 00:00 (UTC)
    displayName: Weekly Sunday night build
    branches:
      include:
        - main
    always: true

Then I load in a variables file just to make the pipeline a little more flexible and re-usable for multiple projects:

variables:
  - template: image-builder-variables.yml

name: $(pipelineName)

Pipeline variables

Parameters and variables make pipelines more reusable over time. For this pipeline, I use a central variables file, so when I want to repeat this exercise for another project, I only have to change the values in that file and I am (or should be) good to go.

The pipeline variables file stored in pipelines/image-builder-variables.yml looks like so:

variables:
  - name: azureLocation
    value: westeurope
  - name: azureLocationAbbreviation
    value: weu
  - name: date
    value: $[format('{0:yyyyMMdd}', pipeline.startTime)]
  - name: targetSubscriptionId
    value: azure-subscription-guid
  - name: targetResourceGroup
    value: name-of-resource-group
  - name: targetStorageAccount
    value: name-of-storage-account
  - name: serviceConnectionName
    value: name-of-azdo-service-connection

Versioning

I really like the idea of semantic versioning. Taking control of version numbers is important, and being able to generate predictable, reliable version numbers is valuable. For this pipeline, I use GitVersion to generate a version number based on the commits (and commit messages) in Git. This gives me predictable version numbers, and lets me influence the version number by adding information to Git commit messages. In this case, I use the short SHA that GitVersion outputs to create a VM image with a unique name. Executing GitVersion is the first stage of my multi-stage pipeline:

stages:
  - stage: versionize
    displayName: Determine version number
    jobs:
      - job: determine_version
        displayName: Determine version
        steps:
          - template: steps/gitversion.yml

And the pipelines/steps/gitversion.yml looks like so:

steps:
  - checkout: self
    fetchDepth: 0

  - task: gitversion/setup@0
    displayName: Install GitVersion
    inputs:
      versionSpec: "5.9.x"

  - task: gitversion/execute@0
    displayName: Determine Version
    inputs:
      useConfigFile: true
      configFilePath: ./gitversion.yml

  - bash: |
      echo '##vso[task.setvariable variable=assemblyVersion]$(GitVersion.AssemblySemVer)'
      echo '##vso[task.setvariable variable=packageVersion]$(GitVersion.MajorMinorPatch)'
      echo '##vso[task.setvariable variable=semanticVersion]$(GitVersion.SemVer)'
      echo '##vso[task.setvariable variable=versionNumber]$(GitVersion.MajorMinorPatch)'      
    displayName: Setting version variables

  - bash: |
      echo '##vso[task.setvariable variable=versionNumberOutput;isOutput=true]$(GitVersion.MajorMinorPatch)'
      echo '##vso[task.setvariable variable=versionShortSha;isOutput=true]$(GitVersion.ShortSha)'
      echo '##vso[build.updatebuildnumber]Deployment-$(GitVersion.SemVer)'      
    displayName: Output version variables
    name: versioning

To use GitVersion, you must have the GitVersion extension installed in your Azure DevOps organization. You can also see that it relies on a configuration file, so that it generates version numbers that fit your needs. My configuration in the file ./gitversion.yml looks like so:

assembly-versioning-scheme: MajorMinorPatch
assembly-file-versioning-scheme: MajorMinorPatchTag
assembly-informational-format: "{InformationalVersion}"
mode: Mainline
tag-prefix: "[vV]"
continuous-delivery-fallback-tag: ci
major-version-bump-message: '\+semver:\s?(breaking|major)'
minor-version-bump-message: '\+semver:\s?(feature|minor)'
patch-version-bump-message: '\+semver:\s?(fix|patch)'
no-bump-message: '\+semver:\s?(none|skip)'
legacy-semver-padding: 4
build-metadata-padding: 4
commits-since-version-source-padding: 4
commit-message-incrementing: Enabled
branches: {}
ignore:
  sha: []
increment: Inherit
commit-date-format: yyyy-MM-dd
merge-message-formats: {}
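
To get a feel for the four bump-message patterns in this configuration, here is a small sketch that classifies a commit message with the same expressions. The function name bump_kind and the order in which the patterns are tried are assumptions of mine; GitVersion's own evaluation logic is not reproduced here. I also wrote [[:space:]] instead of \s, because POSIX grep does not define \s.

```shell
#!/bin/sh
# Report which bump pattern from gitversion.yml a commit message matches.
bump_kind() {
  msg=$1
  if printf '%s\n' "$msg" | grep -Eq '\+semver:[[:space:]]?(breaking|major)'; then
    echo major
  elif printf '%s\n' "$msg" | grep -Eq '\+semver:[[:space:]]?(feature|minor)'; then
    echo minor
  elif printf '%s\n' "$msg" | grep -Eq '\+semver:[[:space:]]?(fix|patch)'; then
    echo patch
  elif printf '%s\n' "$msg" | grep -Eq '\+semver:[[:space:]]?(none|skip)'; then
    echo none
  else
    # No +semver directive found in the message.
    echo unspecified
  fi
}

# Example commit message (made up for illustration).
bump_kind "Add docker to the image +semver: feature"   # prints "minor"
```

So adding "+semver: feature" to a commit message is how you would ask for a minor version bump with this configuration.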

In this case, GitVersion is not the only input for the image version. The GitVersion output is determined solely by commits to the Git repository, but the pipeline also runs on a scheduled trigger. When the source code has not changed, GitVersion generates the same version number (and short SHA) over and over again. Since you cannot overwrite an image (especially when it is in use), we need a version that is unique for each pipeline run. This is why I also append the Azure DevOps build ID. The resulting version is still generated and predictable, but also unique for every pipeline run.
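
The naming idea can be sketched in a few lines of shell. The function name image_name and the sample values are hypothetical; in a real pipeline run the build ID is available to scripts as the BUILD_BUILDID environment variable.

```shell
#!/bin/sh
# Compose a unique image name from the GitVersion short SHA and the build ID.
# $1 = short SHA, $2 = Azure DevOps build ID
image_name() {
  printf 'build-agent-%s%s\n' "$1" "$2"
}

# Two scheduled runs on the same commit still get different names,
# because the build ID differs (sample values, made up for illustration).
image_name "abc1234" "512"   # prints build-agent-abc1234512
image_name "abc1234" "519"   # prints build-agent-abc1234519
```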

For more information about the configuration of GitVersion, I refer to the GitVersion documentation.

Creating the image

The next stage generates the VM image. Packer needs a storage account to be available, so the first step is to create one. This step is idempotent: when the storage account already exists, the command still succeeds. Then Packer is executed.

- stage: image_builder
  displayName: Build Custom VM Image
  dependsOn: versionize
  variables:
    versionSha: $[format('{0}{1}', stageDependencies.versionize.determine_version.outputs['versioning.versionShortSha'], variables['Build.BuildId'])]
  jobs:
    - job: build
      displayName: Build Image
      steps:
        - task: AzureCLI@2
          displayName: Create storage account
          continueOnError: true
          inputs:
            azureSubscription: $(serviceConnectionName)
            scriptType: pscore
            scriptLocation: inlineScript
            inlineScript: |
              az storage account create --name $(targetStorageAccount) --location $(azureLocation) --kind StorageV2 --sku Standard_LRS --resource-group $(targetResourceGroup)              

        - task: PackerBuild@1
          displayName: Build Image with Packer
          inputs:
            templateType: "builtin"
            ConnectedServiceName: $(serviceConnectionName)
            isManagedImage: true
            managedImageName: "build-agent-$(versionSha)"
            location: $(azureLocation)
            storageAccountName: $(targetStorageAccount)
            azureResourceGroup: $(targetResourceGroup)
            baseImageSource: "default"
            baseImage: "Canonical:0001-com-ubuntu-server-focal:20_04-lts:linux"
            packagePath: "pipelines/deployment"
            deployScriptPath: "deploy.sh"
            additionalBuilderParameters: '{"vm_size":"Standard_D2_v3"}'
            skipTempFileCleanupDuringVMDeprovision: false

        - task: AzureCLI@2
          displayName: Remove versioned images
          continueOnError: true
          inputs:
            azureSubscription: $(serviceConnectionName)
            scriptType: pscore
            scriptLocation: inlineScript
            failOnStandardError: false
            inlineScript: |
              az resource delete --ids $(az resource list --tag Temporary=True --query "[].id" --output tsv)              

        - task: AzureCLI@2
          displayName: Tag image
          inputs:
            azureSubscription: $(serviceConnectionName)
            scriptType: pscore
            scriptLocation: inlineScript
            inlineScript: |
              az tag create --resource-id /subscriptions/$(targetSubscriptionId)/resourcegroups/$(targetResourceGroup)/providers/Microsoft.Compute/images/build-agent-$(versionSha) --tags Temporary=True              

The pipeline stage above creates the storage account that Packer will use. Then the image is created. Note that Ubuntu Server 20.04 is used as the base image. Also, look at additionalBuilderParameters, where the VM size is configured. This is what you pay for while the image is being built, so pick a size that handles the workload well while keeping costs fairly low.
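
One thing that can trip up the storage account step is the account name: Azure only accepts 3 to 24 characters, lowercase letters and digits. The check below is a minimal sketch you could run before the create command; the function name and the sample names are made up for illustration.

```shell
#!/bin/sh
# Succeeds when the name satisfies the Azure storage account naming rule:
# 3 to 24 characters, lowercase letters and digits only.
is_valid_storage_account_name() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]{3,24}$'
}

# Sample names (hypothetical).
for name in "buildagentimages01" "Build-Agent-Images"; do
  if is_valid_storage_account_name "$name"; then
    echo "$name: ok"
  else
    echo "$name: invalid storage account name"
  fi
done
```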

The last two tasks manage the cleanup of old images. Note that the removal task has continueOnError set to true and failOnStandardError set to false, because it is allowed to fail. I use Azure tags to find the resources to remove. In this case, I set a tag Temporary with the value True to mark an image as volatile. I first remove all resources that have this tag. When that command completes, I set the same tag (same name and value) on the newly created image. This means that this image will be removed the next time the pipeline runs.

VM Images cannot be removed when VM instances running that image are still active. This is why I allow the removal process to fail. If the image cannot be removed, that’s fine. The VMSS will be updated at a later stage, and future pipeline runs will remove the image at a later time.
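
If you prefer bash over PowerShell Core for this cleanup, the same idea could look roughly like the sketch below. The function name delete_temporary_resources is my own; the az commands are the ones used in the pipeline above. The sketch also skips the delete call when nothing carries the tag, and it tolerates a failing delete in the same spirit as continueOnError.

```shell
#!/bin/sh
# Remove every resource tagged Temporary=True, tolerating failures.
delete_temporary_resources() {
  ids=$(az resource list --tag Temporary=True --query "[].id" --output tsv)

  if [ -z "$ids" ]; then
    echo "No temporary resources found"
    return 0
  fi

  # $ids is left unquoted on purpose: every ID becomes its own argument.
  az resource delete --ids $ids \
    || echo "Some resources could not be removed (image still in use?)"
}
```

You would call delete_temporary_resources from an AzureCLI@2 task with scriptType set to bash, so the Azure CLI is already logged in through the service connection.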

Updating VMSS

The final step is to update the VMSS so it uses the new image when new VM instances are created. This can be done with a single Azure CLI command. The stage looks like so:

- stage: set_usage
  displayName: Configure VMSS
  dependsOn:
    - versionize
    - image_builder
  condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
  variables:
    versionSha: $[format('{0}{1}',stageDependencies.versionize.determine_version.outputs['versioning.versionShortSha'], variables['Build.BuildId']) ]
  jobs:
    - job: activate
      displayName: Activate new image
      steps:
        - task: AzureCLI@2
          displayName: Azure CLI
          inputs:
            azureSubscription: $(serviceConnectionName)
            scriptType: pscore
            scriptLocation: inlineScript
            inlineScript: |
              az vmss update --resource-group $(targetResourceGroup) --name $(targetResourceGroup)-vmss --set virtualMachineProfile.storageProfile.imageReference.id=/subscriptions/$(targetSubscriptionId)/resourceGroups/$(targetResourceGroup)/providers/Microsoft.Compute/images/build-agent-$(versionSha)              

Note that this stage has a condition. For testing purposes, I want the pipeline to run when commits are pushed to a pull request branch. This way I can see the effect of changes in a running pipeline. The last stage, which updates the VMSS settings, only runs once the pull request is approved and the code is merged into my main branch.

The command is an az vmss update command. You need to specify a resource group, the name of the VMSS, and a reference to the (newly created) image.
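
The image reference is the full Azure resource ID of the managed image. As a small sketch of how that ID is put together, the helper below builds it from its three parts. The function name image_resource_id and the sample values are hypothetical.

```shell
#!/bin/sh
# Build the resource ID of a managed image.
# $1 = subscription ID, $2 = resource group, $3 = image name
image_resource_id() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Compute/images/%s\n' \
    "$1" "$2" "$3"
}

# Sample values, made up for illustration.
image_resource_id "00000000-0000-0000-0000-000000000000" "my-agents-rg" "build-agent-abc1234512"
```

The output of this helper is the value you would pass to --set virtualMachineProfile.storageProfile.imageReference.id= in the az vmss update command.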

That’s it. You’re good to go. Your image is now updated and the VMSS will spin up new VM instances with your new VM image as their source.