View all tutorials
Running migrations for an Aurora Serverless cluster
July 29, 2021
Mohammed Ali Chherawalla
Software Engineer
Contents

Amazon Aurora Serverless enables you to run your database in the cloud without managing any database capacity. It automatically starts up, scales capacity up or down based on your application’s needs. However, Aurora Serverless does not have a public endpoint. There is no way to make the cluster public.

Hence, running migrations for Aurora Serverless is challenging since we can’t directly access it over the internet. It can only be accessed by

  • using a VPN into your VPC;
  • running your migration script in a VPC;
  • launching an EC2 instance and installing the DB client to work with the database; or
  • use the Aurora Web Data API

The above options involve a lot of additional work for a seemingly solved problem or incur additional cost. For example, provisioning an EC2 instance just to run database migrations for your “serverless” app seems like an anti-pattern and an unnecessary expense.

AWS Lambdas, through some trickery, however, can access the Serverless cluster. Why not delegate the execution of the migrations to a Lambda? This is a cost-effective approach since you’ll only pay for computing when it’s invoked - in the true Serverless spirit.

In this tutorial, we’ll set up and run database migrations for a ToDo application against a PostgreSQL Aurora Serverless Cluster.

This tutorial assumes that you are familiar with the following tools or frameworks

Starter Project

Please clone the starter project present here. This project contains the setup to create a PostgreSQL Aurora serverless cluster using the serverless-framework.

In this tutorial, we will add support to run database migrations on successful deployments.

Getting started

This tutorial is broken down into the following parts

  • Creating migration resources
  • Writing utility functions to run migrations
  • Creating a migration file
  • Writing a lambda for database migrations
  • Registering the Lambda using serverless
  • Writing and registering a post-deployment script
  • Updating the webpack config

Creating migration resources

Migrations require versioning. I prefer a database-first migration approach. This involves creating a version folder and adding all the .sql files related to the migration in that folder.

Step 1

Create the folder for the first migration


mkdir -p migrations/resources/v1

Step 2

Create a trigger to update the value of the updated_at column whenever an entry in the database is updated.

Create a new file 01_updated_at_trigger.sql


touch migrations/resources/v1/01_updated_at_trigger.sql

Copy the snippet below into the newly created file


CREATE OR REPLACE FUNCTION trigger_set_timestamp()
RETURNS TRIGGER AS $$
BEGIN
  NEW.updated_at = NOW();
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

Step 3

Create a users table to store the users.
Create a new file 02_create_users.sql


touch migrations/resources/v1/02_create_users.sql

Copy the snippet below into the newly created file


CREATE TABLE IF NOT EXISTS users (
 id SERIAL PRIMARY KEY,
 name TEXT NOT NULL, 
 user_ref TEXT NOT NULL,
    created_at timestamp NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at timestamp NULL DEFAULT CURRENT_TIMESTAMP,
    deleted_at timestamp NULL
);
CREATE INDEX IF NOT EXISTS users__idx__user_ref ON users (user_ref);
CREATE TRIGGER set_timestamp
BEFORE UPDATE ON users
FOR EACH ROW
EXECUTE PROCEDURE trigger_set_timestamp();

Step 4

Create a lists table to store the lists.

Create a new file 03_create_lists.sql


touch migrations/resources/v1/03_create_lists.sql

Copy the snippet below into the newly created file


CREATE TABLE IF NOT EXISTS lists (
 id SERIAL PRIMARY KEY,
 name TEXT NOT NULL,
 user_id INT NOT NULL,
    created_at timestamp NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at timestamp NULL DEFAULT CURRENT_TIMESTAMP,
    deleted_at timestamp NULL,
 CONSTRAINT lists__fk_user_id FOREIGN KEY (user_id) REFERENCES users (id)
);
CREATE INDEX IF NOT EXISTS list__idx__name ON lists (name);
CREATE TRIGGER set_timestamp
BEFORE UPDATE ON lists
FOR EACH ROW
EXECUTE PROCEDURE trigger_set_timestamp();

Step 5

Create a notes table to store the todos.

Create a new file 04_create_notes.sql


touch migrations/resources/v1/04_create_notes.sql

Copy the snippet below into the newly created file


CREATE TABLE IF NOT EXISTS notes (
 id SERIAL,
 note TEXT NOT NULL,
 deadline timestamp WITH time zone NOT NULL,
 list_id INT NOT NULL,
 done SMALLINT NOT NULL DEFAULT 0,
    created_at timestamp NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at timestamp NULL DEFAULT CURRENT_TIMESTAMP,
    deleted_at timestamp NULL,
CONSTRAINT notes__fk_list_id FOREIGN KEY (
  list_id
) REFERENCES lists (
  id
) ON DELETE CASCADE ON UPDATE CASCADE
);
CREATE INDEX IF NOT EXISTS notes__idx__list_id ON notes (list_id);
CREATE INDEX IF NOT EXISTS notes__idx__note ON notes ("note");
CREATE TRIGGER set_timestamp
BEFORE UPDATE ON notes
FOR EACH ROW
EXECUTE PROCEDURE trigger_set_timestamp();

Commit all the code you’ve written so far.


git add .
git commit -m 'add migrations resources'

Writing utility functions to run migrations

We will use the sequelize framework to handle migrations but it doesn’t come with out-of-the-box support for database-first migrations. To support this, we will write a few utility functions.

Create a new folder for migrations utils using the code below


mkdir -p migrations/utils

Create a new file


touch migrations/utils/index.js

Copy the snippet below into the newly created file


const fs = require('fs');
const shell = require('shelljs');
// 1
function getVersion(currentFileName) {
  let version = 1;
  shell.ls(`./migrations`).forEach((item, index) => {
    if (item === currentFileName) {
      version = index + 1;
    }
  });
  return version;
}
// 2
async function migrate(currentFileName, queryInterface) {
  const migrationResourceDir = './migrations/resources/v';
  const version = getVersion(currentFileName.split('/')[currentFileName.split('/').length - 1]);
  const directories = shell.ls(`${migrationResourceDir}${version}`);
  for (let index = 0; index < directories.length; index++) {
    const fileName = directories[index];
    console.log('migrating: ', fileName);
    await queryInterface.sequelize
      .query(fs.readFileSync(`${migrationResourceDir}${version}/${fileName}`, 'utf-8'))
      .catch((e) => {
        console.log(e);
        const error = e.original.sqlMessage;
        if (error.startsWith('Table') && error.endsWith('already exists')) {
          // If the database is already built add this migration to sequelizeMeta table.
          return;
        }
        throw e;
      });
  }
}
module.exports = {
  migrate,
  getVersion
};

  1. Based on the position of the current filename in the migrations folder, we will get the version number that we need to target.
  2. After getting the version number, we will execute all the .sql files present in the migration/resources/v${versionNumber} directory.

Again commit all the code you wrote using the following git commands


git add .
git commit -m 'add migrations utils'

Quick interruption: Discover why top C-level execs are subscribing to LeadReads. Dive into the world of digital products – you'll be glad you did!

Join here.

Creating a migration file

Sequelize stores the name of the migration file in the SequelizeMeta table to remember which migrations it ran. So don’t rename the files once you’ve run the migrations.

Run the following command to create a new migration file


npx sequelize migration:generate --name init-db

The numbers represent the current date. This makes sure that the migrations are run in order of creation. Hence the file name will be different from the one that you generate. Copy the snippet below into the newly created file in the migrations folder


module.exports = {
  up: (queryInterface) => {
    const { migrate } = require('./utils/index');
    return migrate(__filename, queryInterface);
  },
  down: () => Promise.reject(new Error('error'))
};

Commit your code using the following git commands


git add .
git commit -m 'add migrations to initalise db'

Writing a lambda to handle database migrations

If the aurora serverless cluster is inaccessible via the internet then how can the Lambda access it?

  1. Deploy the AWS Lambda in the same VPC as your database

https://github.com/wednesday-solutions/aurora-serverless-cluster-migrations-starter/blob/develop/serverless.yml#L18-L24

2.  Create a Nat Gateway in a public subnet


https://github.com/wednesday-solutions/aurora-serverless-cluster-migrations-starter/blob/develop/serverless.yml#L30-L33

3. Create an Internet Gateway and a route pointing to it


https://github.com/wednesday-solutions/aurora-serverless-cluster-migrations-starter/blob/develop/serverless.yml#L29-L33

For details on its working please take a look at this article: https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function

Let’s start writing our Lambda function!

We should create a folder for each of our lambdas. I like to categorize functions by the operation they perform.

Step 1: Create the following folder structure.


mkdir -p functions/database/migrations

Step 1

Create a new project in that directory


cd functions/database/migrations
yarn init -y

A package.json file will be created.

Step 2

Create the index.js file


touch index.js

Copy the snippet below in the newly created file


import 'source-map-support/register'; //1
/**
 *
 * DatabaseMigrations
 *
 */
import shell from 'shelljs'; //2
exports.handler = async (event, context, callback) => {
  console.log(JSON.stringify(event));
 // 3
  shell.exec(`node_modules/sequelize-cli/lib/sequelize db:migrate --config config/config.js`);
};

  1. Registering the source map makes it easier to debug the application in production.
  2. The shelljs library allows us to run shell commands from nodejs. You can read more about it here.
  3. Invokes the sequelize cli to run the database migrations.

Commit your code using the following git commands


git add .
git commit -m 'add database migrations lambda'

Registering Lambdas with the serverless-framework

You need to register the AWS Lambdas with the serverless framework.

Step 1

Create a folder for lambdas in the resources folder.


mkdir -p resources/lambdas

Step 2

Create a functions.yml


touch resources/lambdas/functions.yml

Step 3

Copy the following snippet in the newly created file


databaseMigrations:  #1 
  handler: functions/database/migrations/index.handler #2
  role: LambdaServiceRole #3

  1. Name of the Lambda
  2. Path to the handler
  3. IAM Role of the Lambda
Step 4

You need to register the functions in the serverless.yml.

Paste the following snippet above the custom section in the serverless.yml


functions: ${file(./resources/lambdas/functions.yml)}

Commit your code using the following git commands


git add .
git commit -m 'register the Lambdas in the serverless.yml'

Writing and registering a post-deployment script

The database migrations need to be run after a successful deployment.

Step 1

Create the post-deployment.js file


touch scripts/post-deployment.js

Step 2

Copy the following code into the newly created file


function migrate(serverless) {
  // 1
  return `npx sls invoke --function databaseMigrations --stage=${serverless.variables.options.stage}`;
}
module.exports = migrate;

On successful deployment, we will invoke the databaseMigrations Lambda function.

Step 3

We need to tell the serverless framework to run the post-deployment.js file after a successful deployment. Paste the following code snippet in the scripts section


custom:
 scripts:
     hooks:
       'aws:deploy:finalize:cleanup': ${file(./scripts/post-deployment.js)}

Commit your code using the following git commands


git add .
git commit -m 'Add the post-deployment script'

Updating the webpack config

To be able to run the migrations from the Lambda should have the following dependencies forcefully included

  • sequelize-cli
  • sequelize
  • pg

Copy-paste the snippet below in the serverless.yml custom.webpack section to facilitate this


webpack:
    includeModules:
      forceInclude:
        - sequelize-cli
        - sequelize
        - pg
    excludeFiles: ./**/*.test.js

Commit your code using the following git commands


git add .
git commit -m 'Update webpack config'

Viola, you’re done!

The starter project already includes a Continuous Deployment workflow. You will need to add the Github secrets and push them to the master branch.

Now sit back and watch the magic unfurl!

Where to go from here

A managed relational database service that scales up and down on-demand allows you to focus solely on writing the business logic for your application. Using Aurora Serverless is a move in the right direction.

To see how you can resolve mutations directly off-of the database in your AWS AppSync application that uses Aurora Serverless as a data source please take a look at this repository.

I hope you enjoyed reading this tutorial on running migrations for an aurora serverless cluster. If you have any questions or comments, please join the conversation on twitter.