Data Masking with GitHub Actions Part 2 - Masking Algorithm

Estimated: 30 mins

Bytebase is a database DevSecOps platform designed for developers, security, DBA, and platform engineering teams. While it offers an intuitive GUI for managing database schema changes and access control, some teams may want to integrate Bytebase into their existing DevOps platforms using the Bytebase API.

In the previous tutorial, you learned how to set up a GitHub Action that utilizes the Bytebase API to define data masking policies. In this tutorial, we will explore how to customize both the masking algorithm and semantic types.


This is Part 2 of our tutorial series on implementing automated database masking using GitHub Actions.

Overview

In this tutorial, you'll learn how to automate database masking algorithms and semantic types using GitHub Actions and the Bytebase API. This integration allows you to:

  • Manage data masking rules as code
  • Automatically apply masking policies when PRs are merged

Here is a merged pull request as an example.

The complete code for this tutorial is available at: database-security-github-actions-example

This tutorial skips the setup. If you haven't set up Bytebase and the GitHub Action yet, follow the Setup Instructions section in the previous tutorial.

Masking Algorithm

You can define your own data masking algorithm based on a predefined masking type, such as Full mask, Range mask, MD5 mask, or Inner/Outer mask.

In Bytebase console

Go to Data Access > Data Masking, click Masking Algorithm and click Add. You can create a new masking algorithm with a name and description, and later it can be used in the definition of semantic types.


In GitHub Workflow

In the GitHub workflow bb-masking-2.yml, find the step Apply masking algorithm, which applies the masking algorithm through the Bytebase API. All masking algorithms should be defined in a single file, masking/masking-algorithm.json, under the repository root. The step calls the Bytebase API as follows:

response=$(curl -s -w "\n%{http_code}" --request PATCH "${BYTEBASE_API_URL}/settings/bb.workspace.masking-algorithm?allow_missing=true" \
--header "Authorization: Bearer ${BYTEBASE_TOKEN}" \
--header "Content-Type: application/json" \
--data @"$CHANGED_FILE")
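Because the curl call appends the HTTP status code on its own line (via -w "\n%{http_code}"), the workflow can separate the response body from the status before deciding whether the step succeeded. The following is a minimal sketch of that split; check_response is a hypothetical helper name, and treating any 2xx status as success is an assumption, not something the tutorial's workflow is confirmed to do.

```shell
# Hypothetical helper: split a response captured with -w "\n%{http_code}"
# into body and status code, and fail on any non-2xx status.
check_response() {
  response="$1"
  # The status code is the last line; everything before it is the body.
  status=$(printf '%s\n' "$response" | tail -n 1)
  body=$(printf '%s\n' "$response" | sed '$d')
  if [ "$status" -ge 200 ] && [ "$status" -lt 300 ]; then
    echo "OK ($status)"
    return 0
  fi
  # Print the status and the API's response body to stderr for debugging.
  echo "Request failed with status $status" >&2
  echo "$body" >&2
  return 1
}
```

In the workflow step, this could be called as check_response "$response" right after the curl command, so that a rejected request fails the job instead of passing silently.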

When you change masking/masking-algorithm.json, the workflow applies the updated masking algorithm. To verify, go to the Bytebase console, click Data Access > Data Masking, and open the Masking Algorithm page, where the applied algorithm is listed.
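Since the file is sent to the API verbatim with --data @"$CHANGED_FILE", a syntax error in the JSON would only surface as a failed request. A small pre-check can catch that earlier. The sketch below is illustrative: validate_json is a hypothetical helper, it assumes python3 is available on the runner, and it only checks that the file is well-formed JSON, not that it matches the schema Bytebase expects.

```shell
# Hypothetical helper: verify that a file exists and contains well-formed JSON.
# Assumes python3 is installed; does NOT validate the Bytebase setting schema.
validate_json() {
  file="$1"
  if [ ! -f "$file" ]; then
    echo "File not found: $file" >&2
    return 1
  fi
  # python3 -m json.tool exits non-zero when the input is not valid JSON.
  if ! python3 -m json.tool "$file" > /dev/null 2>&1; then
    echo "Invalid JSON: $file" >&2
    return 1
  fi
  echo "Valid JSON: $file"
}
```

Running validate_json "$CHANGED_FILE" before the curl call lets the step stop early with a clear message when the file is malformed.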

Semantic Type

You can define semantic types and apply them to columns across different tables. Columns with the same semantic type are masked with the same masking algorithm. For example, you can define a semantic type mobile and apply it to every phone number column. You can then assign a masking algorithm such as range 4-10 as the partial-level masking for the semantic type mobile.

In Bytebase console

Go to Data Access > Data Masking, click Semantic Types and click Add. You can create a new semantic type with a name and description, and select the masking algorithm.


In GitHub Workflow

Find the step Apply semantic type, which applies the semantic types through the Bytebase API. All semantic types should be defined in a single file, masking/semantic-type.json, under the repository root. The step calls the Bytebase API as follows:

response=$(curl -s -w "\n%{http_code}" --request PATCH "${BYTEBASE_API_URL}/settings/bb.workspace.semantic-types?allow_missing=true" \
   --header "Authorization: Bearer ${BYTEBASE_TOKEN}" \
   --header "Content-Type: application/json" \
   --data @"$CHANGED_FILE")

When you change masking/semantic-type.json, the workflow applies the updated semantic types. To verify, go to the Bytebase console, click Data Access > Data Masking, and open the Semantic Types page, where the applied semantic type is listed.
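The two steps differ only in which file changed and which setting they patch. That relationship can be sketched as a lookup from file path to setting name. The function setting_for_file below is a hypothetical helper for illustration; the file paths and setting names are the ones used in this tutorial, but the actual workflow may organize this differently.

```shell
# Hypothetical helper: map a changed file to the Bytebase setting it updates.
# Returns a non-zero status for files that are not masking configuration.
setting_for_file() {
  case "$1" in
    masking/masking-algorithm.json) echo "bb.workspace.masking-algorithm" ;;
    masking/semantic-type.json)     echo "bb.workspace.semantic-types" ;;
    *) return 1 ;;
  esac
}
```

With this in place, a single step could build the request URL as "${BYTEBASE_API_URL}/settings/$(setting_for_file "$CHANGED_FILE")?allow_missing=true" and skip any file for which the lookup fails.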

Next Steps

You have now applied a data masking algorithm and a semantic type using GitHub Actions and the Bytebase API. In the next part of this tutorial, you'll learn how to use data classification and global masking with GitHub Actions. Stay tuned!
