Run Entity Framework Core Migrations with Kubernetes Jobs and Managed Identity

Database migrations can be run in several places. One option is to run them as a deployment step on a build server. The disadvantage of that approach is a larger window between the migration being applied and the application being ready to run. This becomes an even bigger issue with GitOps strategies and pull-based releases, where the gap between the build server and the deployment is longer. There's also a security issue in having to manage database access and IP openings for the build agent.

It's also possible to run the migration as part of the application's startup. This avoids the security issue because the application applies the migration over its own connection. The disadvantage shows up with long-running migrations, which cause long startup times and a myriad of other issues.
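For comparison, the in-process approach usually amounts to calling Database.Migrate() at startup. A minimal sketch, where AppDbContext and the "Default" connection string name are illustrative stand-ins for your own:

// Program.cs (minimal API): apply pending migrations before serving traffic.
// AppDbContext is a placeholder; substitute your application's DbContext.
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddDbContext<AppDbContext>(options =>
    options.UseSqlServer(builder.Configuration.GetConnectionString("Default")));
var app = builder.Build();

// Apply any migrations the database is missing; with a long-running
// migration this is exactly where startup stalls.
using (var scope = app.Services.CreateScope())
{
    var db = scope.ServiceProvider.GetRequiredService<AppDbContext>();
    db.Database.Migrate();
}

app.Run();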

The golden middle way: running the migration as a job! The migration is applied from the same environment the application runs in, just before the application goes live. Rollback is effectively automatic: if the migration fails, the rollout stops and the previous working version of the deployment keeps serving.

Create an Image containing the Migration Script

The first step is to create an image containing the migrations to run.

  • Generate the migration script with the dotnet ef command using the --idempotent flag.
  • Copy the script into the image.
  • Run the migration with a simple C# program. The program uses the Microsoft.Data.SqlClient package and connects to the database using the managed identity, which gives the migration script the same privileges as the service itself.
  • Build and push the image to your container registry.

I've written a C# script that targets Azure SQL servers. Your script might look like this.

using System.CommandLine;
using System.Text.RegularExpressions;
using System.Transactions;
using Microsoft.Data.SqlClient;
using Serilog;

Log.Logger = new LoggerConfiguration()
    .WriteTo.Console()
    .CreateLogger();

var envOption = new Option<string>(
    "--env",
    getDefaultValue: () => Environment.GetEnvironmentVariable("ASPNETCORE_ENVIRONMENT") ?? "dev",
    description: "The environment (e.g., dev, test, prod)");
var dbNameOption = new Option<string>(
    "--db-name",
    description: "The name of the database");
var commandTimeoutOption = new Option<int>(
    "--command-timeout",
    getDefaultValue: () => 120,
    description: "The command timeout in seconds");

var rootCommand = new RootCommand("Database migration init container")
{ envOption, dbNameOption, commandTimeoutOption };

rootCommand.SetHandler((env, dbName, timeout) =>
{
    Log.Information("Running database migration for {DbName} in {Env}", dbName, env);
    try
    {
        // Apply scripts in a deterministic order; EnumerateFiles makes no ordering guarantees.
        var files = Directory.EnumerateFiles("sql-scripts", "*.sql", SearchOption.AllDirectories)
            .OrderBy(file => file, StringComparer.Ordinal);
        var dbServer = $"myservice-{env}-sqlserver";

        // Active Directory Default resolves the managed identity in the cluster
        // (or your az login credentials when running locally).
        var connectionString = $"Server=tcp:{dbServer}.database.windows.net,1433;Database={dbName};Authentication=Active Directory Default;Command Timeout={timeout}";

        foreach (var file in files)
        {
            Log.Information("Running: {File}", file);
            // Split on GO batch separators (lines containing only GO), not on the
            // substring "GO", which could appear inside identifiers or strings.
            var scriptSegments = Regex
                .Split(File.ReadAllText(file), @"^\s*GO\s*$", RegexOptions.Multiline | RegexOptions.IgnoreCase)
                .Where(segment => !string.IsNullOrWhiteSpace(segment))
                .ToList();

            // Give the transaction the same timeout as the commands; the default
            // scope timeout of one minute is too short for long migrations.
            using var scope = new TransactionScope(TransactionScopeOption.Required, TimeSpan.FromSeconds(timeout));
            using var connection = new SqlConnection(connectionString);
            connection.Open();
            foreach (var scriptSegment in scriptSegments)
            {
                using var command = new SqlCommand(scriptSegment, connection);
                command.ExecuteNonQuery();
            }
            scope.Complete();
            Log.Information("Completed: {File}", file);
        }
    }
    catch (Exception e)
    {
        Log.Error(e, "Error running migration");
        Environment.Exit(1);
    }
    Environment.Exit(0);
}, envOption, dbNameOption, commandTimeoutOption);

await rootCommand.InvokeAsync(args);
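To try the script locally, sign in with az login (Active Directory Default authentication falls back to Azure CLI credentials when no managed identity is available) and run it against a dev database:

dotnet run -- --env dev --db-name myservice --command-timeout 300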

We want to use this script as a base image and build separate images per service containing the SQL migrations to run. To build the base image:

# Build Stage
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
WORKDIR /src

# Copy everything and restore as distinct layers
COPY . .
RUN dotnet restore

# Build and publish the application
RUN dotnet publish -c Release -o /app/publish

# Runtime Stage
FROM mcr.microsoft.com/dotnet/runtime:8.0
WORKDIR /app
COPY --from=build /app/publish .

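# Run the migration as a non-root user for least privilege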
USER 64198
ENTRYPOINT ["dotnet", "MigrateDatabase.dll"]

Build and push the image to your registry with docker build --file Dockerfile --tag myregistry.azurecr.io/database-migrator:some-tag . followed by docker push myregistry.azurecr.io/database-migrator:some-tag.

Copy the migration scripts and build the image

It's time to generate the migrations and copy them into a per-service image. Create the migration script by running dotnet-ef migrations script --idempotent --output artifacts/sql-scripts; the --idempotent flag produces a script that checks the __EFMigrationsHistory table, so it is safe to run against a database at any migration level. You can also copy other SQL scripts that you want applied at deployment into this folder.

FROM myregistry.azurecr.io/database-migrator:some-tag  
COPY /artifacts/sql-scripts sql-scripts
CMD ["--db-name", "myservice"]
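Because the base image defines the ENTRYPOINT, the CMD here is appended as arguments, so the container runs dotnet MigrateDatabase.dll --db-name myservice. The --env argument is left to its default, which reads the ASPNETCORE_ENVIRONMENT variable set in the Job manifest below.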

Why use a Job?

I initially wanted to run the migration as an init container. Running the deployment with an init container is a simple configuration and would work for a single deployment. However, I wanted to start several deployments that all depend on this one migration, and in that case we need a Kubernetes Job: it runs once, and every deployment can wait for it. Setting up a managed identity for a service is already described in my previous blog post on the matter.

Set up a manifest like this for the migration job:

apiVersion: batch/v1
kind: Job
metadata:
  name: myservice-database-migrator
  labels:
    component: database
spec:
  ttlSecondsAfterFinished: 60
  template:
    metadata:
      labels:
        component: database
        azure.workload.identity/use: 'true'
    spec:
      serviceAccountName: myservice-workload-identity
      containers:
        - name: myservice-database-migrator
          image: myregistry.azurecr.io/myservice-database-migrator:latest
          env:
            - name: ASPNETCORE_ENVIRONMENT
              value: dev
      restartPolicy: OnFailure
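After applying the manifest, you can follow the job with standard kubectl commands (the file name below is just an example):

kubectl apply -f myservice-database-migrator-job.yaml
kubectl get job myservice-database-migrator
kubectl logs job/myservice-database-migrator --follow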

We want the application containers to wait for the migration job to finish before they start. Here we can use an init container with groundnuty's k8s-wait-for image and point it at the job we want to wait for. Add an init container like this to each dependent deployment's pod spec:

initContainers:
  - name: wait-for-database-migration-job
    image: groundnuty/k8s-wait-for:v2.0
    args:
      - "job"
      - "myservice-database-migrator"

Grant the init container's service account permission to read the job status:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: myservice-job-reader-binding-cluster
subjects:
  - kind: ServiceAccount
    name: myservice-workload-identity
    namespace: default
roleRef:
  kind: ClusterRole
  name: job-reader-cluster
  apiGroup: rbac.authorization.k8s.io
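The binding references a job-reader-cluster ClusterRole, which must also exist; a minimal one granting read access to jobs looks like this:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: job-reader-cluster
rules:
  - apiGroups: ['batch']
    resources: ['jobs']
    verbs: ['get', 'list', 'watch']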

Conclusion

With everything set up, we can deploy the application to the cluster and the database migration runs before everything else. The same technique applies to other kinds of migrations, and Kubernetes makes it easy to chain and manage them in an orderly fashion.