Eduard Keilholz

Hi, my name is Eduard Keilholz. I'm a Microsoft developer working at 4DotNet in The Netherlands. I like to speak at conferences about all and nothing, mostly Azure (or other cloud) related topics.
HexMaster's Blog
Some thoughts about software development, cloud, azure, ASP.NET Core and maybe a little bit more...

Container Apps - Scaling with KEDA (part 3)


When I explain scaling challenges and ways to handle changes in workload demand, I always use a bank as a comparison. Imagine you are a digital bank. Your customers need a service where they can change their address or request a new bank card whenever one is lost. These services probably don't need to scale very flexibly, because their workload demand hardly fluctuates. However, there will also be a service that handles financial transactions. This service has a very different workload footprint, because the number of incoming financial transactions fluctuates heavily over the course of 24 hours, and also, for example, during holiday seasons.

The problem

This post explains how I solved the financial transaction problem, but this time for the PollStar solution. If you did not read the previous posts in this series: PollStar is a tool where you can create polls (a question with predefined answer options) and push those polls to the attendees of your session. Creating and maintaining sessions and polls doesn't happen that often. However, the number of attendees affects the demand for the Votes Service dramatically. When 10,000 attendees join your session, the service needs to be able to handle 10,000 vote requests in a very short time.

Distributing the workload

The first part of my solution is to eliminate the synchronous handling of vote requests. The first version of the Votes Service handled each request immediately, calculated a summary for the poll, and communicated that summary over Azure Web PubSub for real-time updates. I solved this by introducing an Azure Service Bus queue. The Votes Service (which scales with HTTP traffic) now stores incoming votes on a Service Bus queue. The vote is not processed yet, but it has arrived at the server and is scheduled for processing. As a response, I return an HTTP 202 Accepted to indicate the vote was successfully received.
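The accept-and-enqueue idea can be sketched roughly like this. This is a pure-Python illustration, not the actual Votes Service code: an in-memory queue stands in for the Azure Service Bus votes queue, and the field names are hypothetical.

```python
import json
import queue

# In-memory stand-in for the Azure Service Bus 'votes' queue (illustrative only).
votes_queue: "queue.Queue[dict]" = queue.Queue()

def handle_vote_request(body: str) -> int:
    """Validate the raw request, enqueue the vote, and return an HTTP status code."""
    try:
        vote = json.loads(body)
    except json.JSONDecodeError:
        return 400  # Bad Request: body is not valid JSON
    if "pollId" not in vote or "answer" not in vote:
        return 400  # Bad Request: required fields are missing
    # The vote is not processed yet; it is only scheduled for processing.
    votes_queue.put(vote)
    return 202  # Accepted: received and queued

status = handle_vote_request('{"pollId": "p1", "answer": "B"}')  # 202
```

The key design choice is that the HTTP response no longer waits for the vote to be processed; it only confirms that the vote has been safely queued.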

Persisting incoming votes

Then I created an Azure Function with a Service Bus queue trigger on the votes queue. This Azure Function grabs a batch of those vote requests (16 by default, I believe) from the queue and processes them. Processing the votes, in this case, means validating and storing the incoming votes in a votes repository. I use Azure Table Storage because it can handle large amounts of data, and because this is a hobby project it needs to be as cheap as possible ;)
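A rough sketch of that batch-processing step, again in pure Python with a plain list standing in for Azure Table Storage (the validation rule and names are illustrative, not the real code):

```python
# In-memory stand-in for the Azure Table Storage votes repository (illustrative).
votes_repository: list = []

VALID_ANSWERS = {"A", "B", "C"}  # hypothetical answer options for the active poll

def process_vote_batch(batch: list) -> int:
    """Validate each queued vote and persist the valid ones; returns the stored count."""
    stored = 0
    for vote in batch:
        if vote.get("answer") not in VALID_ANSWERS:
            continue  # drop votes that fail validation
        votes_repository.append(vote)
        stored += 1
    return stored

stored = process_vote_batch([
    {"pollId": "p1", "answer": "A"},
    {"pollId": "p1", "answer": "X"},  # invalid answer, dropped
])  # stored == 1
```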

Generating a summary

Now that the votes are stored in a repository, let's make this system a little more sophisticated. When an attendee enters a session (you can group one or more polls in a session), the currently active poll is displayed and a summary is retrieved to show the processed votes. This summary is produced by tallying all processed votes in a simple table like so:

Answer  Votes
A        4
B       12
C       37

The type of repository (Azure Table Storage) does not support queries that produce this result directly, so I need to go through all the votes and calculate the result myself. This is somewhat time-consuming, and running through the entire table of votes is also expensive. So I introduced a new Azure Service Bus queue to schedule summary calculations. This queue again triggers an Azure Function, which calculates the summary and stores the summary information in a repository, but also in a cache layer.
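The summary calculation itself boils down to counting votes per answer for one poll. A minimal pure-Python sketch (the real code scans Azure Table Storage; the field names here are the same hypothetical ones as above):

```python
from collections import Counter

def calculate_summary(votes: list, poll_id: str) -> dict:
    """Scan all stored votes and count the votes per answer for one poll."""
    return dict(Counter(v["answer"] for v in votes if v["pollId"] == poll_id))

all_votes = [
    {"pollId": "p1", "answer": "A"},
    {"pollId": "p1", "answer": "B"},
    {"pollId": "p1", "answer": "B"},
    {"pollId": "p2", "answer": "C"},  # different poll, excluded
]
summary = calculate_summary(all_votes, "p1")  # {'A': 1, 'B': 2}
```

Because this full scan runs in a queue-triggered Function rather than in the request path, the cost is paid once per scheduled calculation instead of on every read.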

Accessing the votes summary

Now, whenever this Azure Function calculates a new votes summary for a certain poll, the summary is pushed over Azure Web PubSub to all attendees in real time. When new attendees join the session, the votes summary is requested from the Votes Service. This service now uses the cache-aside pattern: it retrieves the summary from the cache when available, or from the repository when the value is not in the cache.
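The cache-aside read path can be sketched like this, with plain dictionaries standing in for the cache and the summary repository (illustrative only, not the actual service code):

```python
# Stand-ins for a cache layer and the summary repository (illustrative).
cache: dict = {}
summary_repository: dict = {"p1": {"A": 4, "B": 12, "C": 37}}

def get_summary(poll_id: str):
    """Cache-aside: try the cache first, fall back to the repository, populate the cache."""
    if poll_id in cache:
        return cache[poll_id]  # cache hit: no repository access needed
    summary = summary_repository.get(poll_id)
    if summary is not None:
        cache[poll_id] = summary  # populate the cache for subsequent readers
    return summary

first = get_summary("p1")   # cache miss: read from the repository, cache the result
second = get_summary("p1")  # cache hit: served from the cache
```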

Configuring scaling

Now that the entire process of handling votes and generating vote summaries is offloaded to a distributed process, there is still the need to scale this process whenever a large number of votes must be processed. To do so I use KEDA. KEDA (Kubernetes Event-Driven Autoscaler) is a tool that you can install on top of Kubernetes; it connects to 'event mechanisms' like Azure Service Bus, Event Grid, and a large number of other systems. It determines the amount of work to be done and scales a container in a Kubernetes environment exactly as you configured it.

KEDA comes with Azure Container Apps out of the box, so there is no need for you to install it. The only thing you need to do is make sure it can connect to your event mechanism and tell it how to scale. For the connection part, you can take advantage of the secrets that come with Azure Container Apps to store the connection information to, in this case, Azure Service Bus.

resource funcContainerApp 'Microsoft.App/containerApps@2022-03-01' = {
  name: '${defaultResourceName}-func'
  location: location
  identity: {
    type: 'SystemAssigned'
  }
  properties: {
    managedEnvironmentId: containerAppEnvironments.id
    configuration: {
      activeRevisionsMode: 'Single'
      secrets: [
        {
          name: 'servicebus-connection-string'
          value: serviceBusConnectionString
        }
      ]
      ...

The snippet above shows part of a Bicep file that I use to deploy the Container App. You can see that I add a servicebus-connection-string to the array of Container App secrets.

Then, in the scale section of the Container App properties in Bicep, you can configure how KEDA connects to Azure Service Bus and how it scales the Container App, like so:

scale: {
  minReplicas: 1
  maxReplicas: 6
  rules: [
    {
      name: 'event-driven'
      custom: {
        type: 'azure-servicebus'
        metadata: {
          queueName: 'votes'
          messageCount: '20'
        }
        auth: [
          {
            secretRef: 'servicebus-connection-string'
            triggerParameter: 'connection'
          }
        ]
      }
    }
  ]
}

The container scales between a minimum of 1 and a maximum of 6 replicas, and it does so using a custom scale rule. This rule is of type azure-servicebus and attaches to a queue called votes, the queue that accepts incoming vote requests before they are processed. The messageCount property is set to 20, meaning the service scales up when 20 or more unprocessed messages are waiting in the queue. When this happens, the service is probably not keeping up with the number of incoming votes, so it's time to scale up. Finally, in the auth section, you see a reference to the secret created earlier, which provides the Service Bus connection.
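Conceptually, the azure-servicebus scaler targets roughly messageCount queued messages per replica, and the replica count is clamped between minReplicas and maxReplicas. The actual decision is made by KEDA and the underlying autoscaler, but the arithmetic can be approximated like this:

```python
import math

def desired_replicas(queue_length: int, message_count: int = 20,
                     min_replicas: int = 1, max_replicas: int = 6) -> int:
    """Approximate KEDA's target: one replica per `message_count` queued
    messages, clamped between the configured minimum and maximum."""
    wanted = math.ceil(queue_length / message_count)
    return max(min_replicas, min(max_replicas, wanted))

desired_replicas(0)    # 1  (never below minReplicas)
desired_replicas(45)   # 3  (ceil(45 / 20))
desired_replicas(500)  # 6  (capped at maxReplicas)
```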

When the pressure is off the valve, KEDA automatically scales the container back down until its minimum number of replicas is reached.