AWS Glue: Unleashing the Power of Custom Code and Continuous Deployment
September 5, 2023
Mohammed Ali Chherawalla
CTO

AWS Glue is a powerful, fully managed extract, transform, and load (ETL) service that makes it easy to prepare and load your data for analytics. You can get even more out of it by pairing custom code with continuous deployment. In this tutorial, we'll show you exactly how to do that.

What is AWS Glue?

AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development. It provides all the capabilities needed for data integration so you can start analyzing your data and putting it to use in minutes instead of months.

What use cases does it handle very well?

AWS Glue shines in scenarios where you need to clean, enrich and move data across various data stores. It's especially useful when dealing with large amounts of disparate data, where manual coding would be time-consuming and error-prone.

What are the limitations of the visual editor?

While the visual editor in AWS Glue is a great tool for building ETL jobs, it does have its limitations. It may not provide the flexibility needed for complex transformations or specific use cases. Additionally, it might not be the best fit for developers who prefer coding over visual interfaces.

Why should I use custom code?

Custom code allows you to tailor your ETL jobs to your specific needs, providing flexibility and control that the visual editor might not offer. It enables you to handle complex transformations and unique use cases, making your ETL jobs more efficient and effective.

What we’re going to do

In this tutorial, we'll walk you through the process of setting up a continuous deployment (CD) pipeline for your AWS Glue job using GitHub Actions. We'll also show you how to automate the building of a library, which will be pushed to an S3 bucket. This library will then be used within the Glue job.

Prerequisites

Before we start, make sure you have a basic understanding of the following:

  • GitHub Actions
  • YAML
  • AWS CLI

How we’re going to do it

Here's a step-by-step guide on how we'll proceed:

Step 1: Configure local setup for AWS Glue using Jupyter notebooks

We'll start by setting up your local environment for AWS Glue using Jupyter Notebooks. This will allow you to write, test, and debug your Glue scripts locally.

Step 2: Set up a CD pipeline for our Glue job

Next, we'll set up a CD pipeline for our Glue job using GitHub Actions. This will ensure that every time there's a merge to the dev branch, the script in AWS will be updated.
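The deployment step of such a pipeline can be sketched in Python. This is a minimal sketch under stated assumptions, not the workflow from the full guide: it assumes the Glue job reads its script from an S3 location, and the `deploy_script` helper, bucket, and key are hypothetical. The function accepts any object with an `upload_file(Filename, Bucket, Key)` method, which is the shape of the boto3 S3 client's `upload_file`.

```python
from pathlib import Path


def deploy_script(s3_client, script_path, bucket, key):
    """Upload a local Glue script to S3 and return its s3:// URI.

    s3_client is expected to behave like boto3.client("s3").
    bucket and key are hypothetical; use the location your Glue job reads from.
    """
    path = Path(script_path)
    if not path.is_file():
        # Fail early so a broken checkout never overwrites the deployed script.
        raise FileNotFoundError(f"Glue script not found: {path}")
    s3_client.upload_file(Filename=str(path), Bucket=bucket, Key=key)
    return f"s3://{bucket}/{key}"
```

A workflow step that runs on merges to the dev branch could call something like `deploy_script(boto3.client("s3"), "jobs/my_job.py", "my-glue-bucket", "scripts/my_job.py")`, with AWS credentials supplied through repository secrets. All of those names are placeholders.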

Step 3: Create a library

After that, we'll create a library that will contain common functionalities used in our Glue job.
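To illustrate what such a library might contain, here is a minimal sketch of a small helper module. The module name `glue_common` and both functions are hypothetical; the functionality your own library holds will depend on what your jobs share.

```python
"""glue_common/cleaning.py: hypothetical shared helpers for Glue jobs."""
import re


def normalize_column_name(name):
    """Convert a column name to lowercase snake_case, e.g. 'Order ID' -> 'order_id'."""
    # Collapse every run of non-alphanumeric characters into one underscore.
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip())
    return cleaned.strip("_").lower()


def drop_empty_records(records, required_fields):
    """Keep only the records where every required field is present and non-empty."""
    return [
        record
        for record in records
        if all(record.get(field) not in (None, "") for field in required_fields)
    ]
```

Keeping helpers like these as plain functions with no AWS dependencies means they can be unit tested locally before they ever run inside a Glue job.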

Step 4: Automate Building of the library in GitHub Actions

We'll then automate the building of the library using GitHub Actions. This will ensure that the latest version of the library is always available for our Glue job.
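The build step itself can be small. The sketch below is one possible approach, assuming the library is a pure-Python package shipped as a zip archive; the `build_library_zip` helper and the paths passed to it are hypothetical.

```python
import zipfile
from pathlib import Path


def build_library_zip(package_dir, output_zip):
    """Zip a Python package so its top-level folder sits at the archive root."""
    package_dir = Path(package_dir).resolve()
    output_zip = Path(output_zip)
    output_zip.parent.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(output_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        for file_path in sorted(package_dir.rglob("*.py")):
            # Store paths relative to the package's parent,
            # e.g. glue_common/cleaning.py, so the package stays importable.
            arcname = file_path.relative_to(package_dir.parent).as_posix()
            archive.write(file_path, arcname)
    return output_zip
```

A GitHub Actions step could run this on every merge and then upload the archive to the S3 bucket. The Glue job can pick the archive up through its `--extra-py-files` job parameter, which takes S3 paths to additional Python modules.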

Step 5: Run this on AWS

Now that we have our library ready, we’ll run our Glue job on AWS. We’ll do this by creating a pull request on GitHub, or if we’re confident, pushing it to the main branch directly. Once our changes are reflected on AWS, we’ll hit run, either via the notebook or from the actions drop-down.
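Besides the notebook and the console, a run can also be started programmatically. The following is a minimal sketch, assuming a client shaped like `boto3.client("glue")`, whose `start_job_run` and `get_job_run` calls it relies on. The `run_glue_job` helper, the job name, and the set of states treated as final are choices made for this sketch.

```python
import time

# States this sketch treats as final for a job run.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR"}


def run_glue_job(glue_client, job_name, arguments=None, poll_seconds=30):
    """Start a Glue job run and block until it reaches a terminal state.

    glue_client is expected to behave like boto3.client("glue").
    Returns a (run_id, final_state) tuple.
    """
    response = glue_client.start_job_run(JobName=job_name, Arguments=arguments or {})
    run_id = response["JobRunId"]
    while True:
        run = glue_client.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]
        state = run["JobRunState"]
        if state in TERMINAL_STATES:
            return run_id, state
        time.sleep(poll_seconds)
```

For example, `run_glue_job(boto3.client("glue"), "my-glue-job")` would return once the hypothetical job `my-glue-job` finishes, which makes it easy to fail a pipeline step when the final state is not `SUCCEEDED`.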

Step 6: Use common functionality from the library

Finally, we'll show you how to use the common functionalities from the library in your Glue job. This will help you keep your Glue scripts clean and efficient.
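The import mechanism can be tried out locally. Glue's `--extra-py-files` job parameter adds the listed archives to the Python path before the script runs, so the sketch below simulates that by putting a zip file on `sys.path`. The package name `glue_common_demo`, its function, and the sample row are hypothetical stand-ins for the real library and data.

```python
import sys
import tempfile
import zipfile
from pathlib import Path

# Build a throwaway archive that stands in for the library zip stored in S3.
work_dir = Path(tempfile.mkdtemp())
library_zip = work_dir / "glue_common_demo.zip"
with zipfile.ZipFile(library_zip, "w") as archive:
    archive.writestr(
        "glue_common_demo/__init__.py",
        "def normalize_column_name(name):\n"
        "    return name.strip().lower().replace(' ', '_')\n",
    )

# Local simulation: a zip on sys.path makes the package inside it importable.
sys.path.insert(0, str(library_zip))

from glue_common_demo import normalize_column_name

# The job script stays short because the shared logic lives in the library.
rows = [{"Order ID": 1, "Customer Name": "Example"}]
cleaned = [{normalize_column_name(k): v for k, v in row.items()} for row in rows]
print(cleaned)  # [{'order_id': 1, 'customer_name': 'Example'}]
```

Inside a real Glue job only the import and the calls to the library would appear in the script; the archive is built and uploaded by the pipeline.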

Now that you've come this far, are you ready to dive into the step-by-step tutorial and start building your continuous deployment pipeline for AWS Glue? The comprehensive guide walks through each of these steps in detail.

Wednesday is a boutique consultancy based in India & Singapore.
