ODE – Migrating Configuration Data

by | Aug 7, 2015

ODE – Migrating Configuration Data

by Aug 7, 2015

We have recently been discussing various ways we can promote our configuration data for Optimal Data Engine (ODE) from one environment to the next.  Our config data is the heart and soul of ODE, it is the foundation for the entire engine.
The config in ODE is the relational data model we have built that holds all of the configuration required to make ODE run.  It includes definitions for all source and targets, as well as any mappings.  Being such an integral part of ODE, it is essential we are able seamlessly promote new or updated configuration code from one environment to the next.  Ultimately, it comes down to making this process as clean and easy for our customers as possible.  We don’t want them to have to deal with the nitty gritty of a code release.  We want to package our config up in a nice little bundle, deliver it, and have them install it without breaking a sweat.
Over a number of discussions a list of different options was collected on potential ways to capture changes and migrate the config code:

  • Compare scripts between environments to identify differences
  • A flag column to identify new config entries to be promoted
  • A back-end logging mechanism
  • A trigger based mechanism
  • A SQL merge statement
  • SQL insert, update and delete scripts
  • Version or release numbers to indicate a group of changes to be promoted
  • Version or release dates to indicate a range of changes to be promoted
  • Changes pushed as code into GIT and then pulled into the new environment

It obviously wasn’t as simple as picking one item off the list.  Many options had different merits, and the end result ended up being a bit of a mash between the options and they continue to evolve as discussions continue.  Two key challenges became evident during these discussions.  Firstly, we need to get all our ducks in a line so we know what we are releasing, and that we are also releasing it in the right order.  Secondly, we need to devise a simple solution for packaging it all up and releasing it into the next environment.  None of this should impact the customer, or cause them any undue stress.

Challenge One – Versioning

Multiple development streams can occur at any one time on the config and this means each section of code can be at a different development phase to another.  We need a way to distinguish between what is ready to be released and what is still being developed and tested.  We also have to take into account the various dependencies that may exist between different pieces of config and ensure that piece one gets released before piece two etc.  This makes it unrealistic to release the entire config code, as it currently stands, into the next environment.  We need a process to select and choose what we want to release and when we want to release it.
In order to successfully do this, we need to develop a tightly controlled versioning process, firstly to keep track of what has already been released, and secondly to bundle together specific changes into releases while also taking into account dependencies between pieces of config.  The simplest approach to this is to include a version column on all of our config tables.  This could also be called a release number.  We toyed with the idea of using version date, which would enable us to select all config after a specified date, however a version number gave us slightly more flexibility as we could distinguish between two new pieces of config that may have the same version date.  The version number will be controlled by the developer, instead of a trigger for instance, as we need control of what changes go in what release.  While a trigger may give us automation, it limits us in distinguishing between concurrent changes that need to be implemented separately.
In some cases, our config data may be created or updated on a regular basis.  Decision tables are a prime example of this.  Keeping control of versioning and when to release new rules from this table may become rather chaotic.  For this challenge, we have discussed the idea of an “in progress” table where new rules and mappings in development are first created in their “in progress” table, only moving to the final decision table once they are ready for release.
It is essential we keep a history of all changes that have happened in the config tables.  For this, we can use a trigger that automatically populates a history table.  This provides us with a number of benefits including being able to determine if a row has been inserted, updated or deleted, and also gives us a base for rolling back a release.  An alternative approach would be to create a view over the top of the history tables, with only the current rows.  This means we would only have to maintain one table.
We also need to know what is the current release in subsequent environments.  We can do this by implementing an audit table that keeps track of what release has been implemented in an environment.  This table can track valuable information such as who commissioned the release and when the release happened.

Challenge Two – Packaging

The next step is determining how we are going to package the code so we can effortlessly promote and execute the code without unnecessary manual steps.  We decided the best approach for this is to create a script that automatically generates a SQL merge statement containing any data pertaining to that release number.  Using simple dynamic SQL, we can extract all the data we need from the config tables by filtering on the release number.  Each version number encompasses one release, so each row that has a version of 5 will be released as a whole.
This code could then be pushed to Git, ready for another developer to pull it down and execute in the subsequent environment.  A SQL merge statement gives us the flexibility to handle inserts, updates and deletes while also being able to package it all up nicely into a single piece of code.
While this approach makes it easy for a customer to accept or install a new release of config data, it also enables us to rollback any release using a combination of the version number and config history tables.  Should a release fail, or if we decide to “clean out” a development version, we can delete everything with that particular version number and then re-implement any updates using the config history tables.  A delete script can easily be written for any release using the version number as a filter, and the update script from the history table into the current config can be a standardised script.
That’s the original plan formulated, now comes the fun task of putting it all into action.  No doubt we will come across a few hurdles, and we will have to adapt accordingly with some ideas replaced by others.
Until next time, Nic!

Other steps in the ODE Journey

And sometimes a hop and a skip and a jump
Data suggests you should get the COVID-19 vaccine

Data suggests you should get the COVID-19 vaccine

Data suggests you should get the COVID-19 vaccine Data for this New Zealand focused blog was collected from various worldwide sources between the 18th and 31st August 2021. A list of sources can be found at the end. NZ is currently vaccinating solely with the...

read more
Performing MERGE in PostgreSQL

Performing MERGE in PostgreSQL

photo by kolibri5 Some RDBMS support SQL statement MERGE, which also referred as upsert. This is a very common task in data warehousing: we compare the source dataset with what has been recorded in DW, insert new values and update those which have changed since the...

read more
It Is Ok To Make Something For A One Off.

It Is Ok To Make Something For A One Off.

I have done a bit of web scraping lately both to automate some of the tasks I do regularly and some to get data for projects. I have learned one big thing from doing this it is ok to do something for a one off ... sometimes. There are times when working on scripts...

read more
Word Clouds with Python.

Word Clouds with Python.

I have been using python for a while now and have used it for a number of different projects, but this one surprised me. Word clouds are something I have not used before. Then, I had the idea of using it for a project a while ago and struggled with finding a way to...

read more
Motivation Via Data

Motivation Via Data

How to Use Data to Motivate.I have found myself not motivated to get up and out walking for a while now and know I need to do something about it. However,  even with this realization that I have not done anything about it for the last few weeks. So thought I would try...

read more
Working With Typography. 101

Working With Typography. 101

We use words each day of our lives it is only when we use these same words on a screen that we think about how we display them. The ways you can play with typography are endless. You can go from the simply use of contrasting fonts for the body text and headings to the...

read more
Adobe How To – Photoshop Part 2

Adobe How To – Photoshop Part 2

Carrying on from part 1 where I covered how to remove objects from images in multiple ways I am going to cover some more things that can help you out in Photoshop first up is one of the big things in Photoshop, that being masking. The 2 main ways to mask that I use...

read more
Google Analytics… Why use it?

Google Analytics… Why use it?

Google Analytics I have been using Google Analytics for a while now mostly to look at the traffic on our site. Why I have been doing this is for a few reasons one of which is just to keep the team up-to-date with our top blogs. Why I do this and why it is important...

read more
Adobe How To – Photoshop Part 1

Adobe How To – Photoshop Part 1

Much like Illustrator Photoshop is a great tool for design. Photoshop is not just for editing photos, but for working with images to create works of art for whatever project you happen to be working on. There are many great tools in Photoshop to help you in your...

read more
Getting Past The Wall.

Getting Past The Wall.

Sometimes the hardest things about design work is not the design itself, but the countless hours where you are trying to think of what to even make. Sometimes it’s easy you sit at your desk and boom you are going. Other days you are sitting there coming up with ideas...

read more
Learning Python

Learning Python

The time has arrived where I've needed to learn more Python for a project I'm working on.  Here, I share some resources, thoughts and tips in case you find yourself also needing to learn Python, too. Python is an interpreted object-oriented language. This means you...

read more
Adobe How to – Illustrator

Adobe How to – Illustrator

Most Adobe programs have some things that are easy to do once you know how to do them, but if you don’t it’s not very intuitive so I thought I would do a list of things and tell or show you how to do them. Paste in place This is a very easy thing to do and can be very...

read more
Git 101

Git 101

Git is an unbelievably useful tool for version control. It is used to keep a backup of your work that you can revert to if needed. Working with git you are able to make branches of your work. A branch is an independent line of development used when working on any new...

read more
How to read and work with data in python

How to read and work with data in python

I've found that most of the tutorials for python data work are in videos.  While this can be good, it's not as easy to skim through for that one thing you are stuck on. So I thought I would cover some basics: how to read a csv and some simple data visualization. Read...

read more
How to install Oracle Apex on Windows EC2

How to install Oracle Apex on Windows EC2

I recently created an Apex application for a fun project, which gave me the chance to install Apex on windows EC2. Here are the steps I followed to install Apex, having decided to use ORDS (Web Listener) to run Apex. The version of Oracle Application Express I used is...

read more
Data Cloud Summit 2020

Data Cloud Summit 2020

At the end of last year, I attended a virtual conference organised by Snowflake for the Asia-Pacific region. The beauty of online event is that you can watch sessions on demand, which is handy as many interesting sessions were scheduled to be streamed at the same...

read more
BEAM*:  Requirements gathering for Agile Data Warehouses

BEAM*: Requirements gathering for Agile Data Warehouses

We've been BEAM*ing for over 7 years now, and really believe it's a solid way to gather business requirements when building Agile data warehouses.  Here's why. What is BEAM✲? BEAM✲ stands for Business Event Analysis & Modelling, and it’s a methodology for...

read more
How to create a timer item in Oracle Apex

How to create a timer item in Oracle Apex

In this blog, I'll walk you through how to create a timer item in your Oracle Apex application. In this example, I am creating a timer clock that starts from 00:00 that counts till the page is active. You can make modifications to your timer item to match your...

read more
First Steps with PostgreSQL

First Steps with PostgreSQL

Most of my career I have spent in big organisations developing and maintaining enterprise-wide solutions. Therefore, so far I’ve mostly worked with MS SQL Server and only touched OracleDB a bit. But for a new, smaller project, I've uses PostgreSQL for the first time...

read more
3 reasons why I like Python

3 reasons why I like Python

3 reasons why I enjoy coding with Python, and what I use it for (beginner)   I can’t be called a code monkey when compared to my colleagues, but recently I’ve found Python to be really helpful with some of my work tasks, so I signed up for a Udemy course to learn...

read more
0 Comments

Trackbacks/Pingbacks

  1. Data Vault blogs you need to read | OptimalBI - […] ODE – Migrating Configuration Data […]
  2. ODE – More on Migrating Configuration Data | OptimalBI - […] on Nic’s excellent blog on Migrating Configuration Data, I would like to elaborate a bit on the style, which he…
Submit a Comment

Your email address will not be published. Required fields are marked *