ODE – Migrating Configuration Data

by | Aug 7, 2015

ODE – Migrating Configuration Data

by Aug 7, 2015

We have recently been discussing various ways we can promote our configuration data for Optimal Data Engine (ODE) from one environment to the next.  Our config data is the heart and soul of ODE, it is the foundation for the entire engine.
The config in ODE is the relational data model we have built that holds all of the configuration required to make ODE run.  It includes definitions for all source and targets, as well as any mappings.  Being such an integral part of ODE, it is essential we are able seamlessly promote new or updated configuration code from one environment to the next.  Ultimately, it comes down to making this process as clean and easy for our customers as possible.  We don’t want them to have to deal with the nitty gritty of a code release.  We want to package our config up in a nice little bundle, deliver it, and have them install it without breaking a sweat.
Over a number of discussions a list of different options was collected on potential ways to capture changes and migrate the config code:

  • Compare scripts between environments to identify differences
  • A flag column to identify new config entries to be promoted
  • A back-end logging mechanism
  • A trigger based mechanism
  • A SQL merge statement
  • SQL insert, update and delete scripts
  • Version or release numbers to indicate a group of changes to be promoted
  • Version or release dates to indicate a range of changes to be promoted
  • Changes pushed as code into GIT and then pulled into the new environment

It obviously wasn’t as simple as picking one item off the list.  Many options had different merits, and the end result ended up being a bit of a mash between the options and they continue to evolve as discussions continue.  Two key challenges became evident during these discussions.  Firstly, we need to get all our ducks in a line so we know what we are releasing, and that we are also releasing it in the right order.  Secondly, we need to devise a simple solution for packaging it all up and releasing it into the next environment.  None of this should impact the customer, or cause them any undue stress.

Challenge One – Versioning

Multiple development streams can occur at any one time on the config and this means each section of code can be at a different development phase to another.  We need a way to distinguish between what is ready to be released and what is still being developed and tested.  We also have to take into account the various dependencies that may exist between different pieces of config and ensure that piece one gets released before piece two etc.  This makes it unrealistic to release the entire config code, as it currently stands, into the next environment.  We need a process to select and choose what we want to release and when we want to release it.
In order to successfully do this, we need to develop a tightly controlled versioning process, firstly to keep track of what has already been released, and secondly to bundle together specific changes into releases while also taking into account dependencies between pieces of config.  The simplest approach to this is to include a version column on all of our config tables.  This could also be called a release number.  We toyed with the idea of using version date, which would enable us to select all config after a specified date, however a version number gave us slightly more flexibility as we could distinguish between two new pieces of config that may have the same version date.  The version number will be controlled by the developer, instead of a trigger for instance, as we need control of what changes go in what release.  While a trigger may give us automation, it limits us in distinguishing between concurrent changes that need to be implemented separately.
In some cases, our config data may be created or updated on a regular basis.  Decision tables are a prime example of this.  Keeping control of versioning and when to release new rules from this table may become rather chaotic.  For this challenge, we have discussed the idea of an “in progress” table where new rules and mappings in development are first created in their “in progress” table, only moving to the final decision table once they are ready for release.
It is essential we keep a history of all changes that have happened in the config tables.  For this, we can use a trigger that automatically populates a history table.  This provides us with a number of benefits including being able to determine if a row has been inserted, updated or deleted, and also gives us a base for rolling back a release.  An alternative approach would be to create a view over the top of the history tables, with only the current rows.  This means we would only have to maintain one table.
We also need to know what is the current release in subsequent environments.  We can do this by implementing an audit table that keeps track of what release has been implemented in an environment.  This table can track valuable information such as who commissioned the release and when the release happened.

Challenge Two – Packaging

The next step is determining how we are going to package the code so we can effortlessly promote and execute the code without unnecessary manual steps.  We decided the best approach for this is to create a script that automatically generates a SQL merge statement containing any data pertaining to that release number.  Using simple dynamic SQL, we can extract all the data we need from the config tables by filtering on the release number.  Each version number encompasses one release, so each row that has a version of 5 will be released as a whole.
This code could then be pushed to Git, ready for another developer to pull it down and execute in the subsequent environment.  A SQL merge statement gives us the flexibility to handle inserts, updates and deletes while also being able to package it all up nicely into a single piece of code.
While this approach makes it easy for a customer to accept or install a new release of config data, it also enables us to rollback any release using a combination of the version number and config history tables.  Should a release fail, or if we decide to “clean out” a development version, we can delete everything with that particular version number and then re-implement any updates using the config history tables.  A delete script can easily be written for any release using the version number as a filter, and the update script from the history table into the current config can be a standardised script.
That’s the original plan formulated, now comes the fun task of putting it all into action.  No doubt we will come across a few hurdles, and we will have to adapt accordingly with some ideas replaced by others.
Until next time, Nic!

Other steps in the ODE Journey

And sometimes a hop and a skip and a jump
Qlik Replicate – Task Metadata

Qlik Replicate – Task Metadata

Recently I was working for a client and had a specific issue which led me to probe around various aspects of Qlik Replicate tasks, attempting to understand how Qlik Replicate works. While I did not manage to achieve what I was specifically attempting I did have a look...

read more
SAS Functions Book

SAS Functions Book

Over the years, we've been building up our comic-style SAS function one-page guides, drip-feeding them to you.  Well, guess what?!  We put them all together and made a book so they're all in the one place, to make referencing them easier for you. Thank you...

read more
Zen and the art of database model maintenance

Zen and the art of database model maintenance

So there you are having created a work of beauty in Sparx Enterprise Architect that models all of your databases, and then the fly lands in the ointment.  There's a new version of "stuff" coming down the pipeline and you need to update the model or things are...

read more
Checking the File Path from Database in PowerShell

Checking the File Path from Database in PowerShell

On one of my recent projects I was analysing my customer's data for migration. Their legacy application could handle documents. They were stored on the organisation's shared folder, and the file path was recorded in the database with the rest of data. As a part of the...

read more
The best lockdown baking recipes

The best lockdown baking recipes

The nation has been hit by a baking craze!  Lockdown gave some of us more time to do some of the simple things in life, like baking, and, if you were lucky enough to get flour and yeast you could create some delicious treats! I've always enjoyed baking, so loved...

read more
Snowflake and Qlik: End-to-End BI Solution

Snowflake and Qlik: End-to-End BI Solution

Recently I watched a webinar organised by Snowflake, it was called "Data Warehouse Automation, Ingestion and Industry Leading Analytics with Snowflake and Qlik". In the past I could never find time for webinars, and it's amazing how life can give you opportunities to...

read more
Being social while social distancing

Being social while social distancing

The world has been social distancing for some time now, and will be for a bit longer, depending where you live. To help put off or at least mitigate going stir crazy I thought I would share what I have been doing to keep up some sort of social life.   Before the lock...

read more
Our How To blogs

Our How To blogs

We do cool sh!t with data and over the years we have shared a lot of blogs to help you do cool sh!t with data too.  If you want to jump right in and see if we have something that piques your interest here are all our how to blogs.  If you...

read more
Free ETL tools in 2020

Free ETL tools in 2020

Traditionally business intelligence is an enterprise solution, as only big businesses have multiple sources that require integration and analysis. It is assumed that the rest are small fishes that could be able to analyse their data with simpler accessible tools or...

read more
SAS Utilities: cleanwork

SAS Utilities: cleanwork

When a user starts and runs code in a SAS session, a number of temporary folders and files are created in the SASWork and SASUtil locations.  If that session completed and exits normally these folders and files are removed, however some SAS sessions exit abnormally,...

read more
1080 By The Numbers

1080 By The Numbers

If you have lived in New Zealand for long enough, I am sure you would have heard of the poison 1080, used by the Department of Conservation (DOC) as a form of pest control. 1080 can be used in bait stations or by airdrop, which is used for many of the remote and hard...

read more
Keeping your team connected when they’re working remotely

Keeping your team connected when they’re working remotely

Lockdown started on the March 26 here in New Zealand, and following the Prime Minister's announcement on April 20th, we're looking at just an extra 2 business days in Level 4, before we move to Level 3 after the ANZAC long weekend. Level 3 means that...

read more
Adobe equivalent Free Open Source Software (Pt2)

Adobe equivalent Free Open Source Software (Pt2)

In part two of my Adobe equivalent Free Open Source Software (FOSS) blog I thought I would look at the video editing pipeline and find some free alternatives for your video editing needs. You can read Part One which covers Photoshop, Illustrator and InDesign here....

read more
Difference between ETL and CDC

Difference between ETL and CDC

Reporting and analytics computations are usually very resource-consuming, therefore they are never executed on the same server where the crucial business application is running. Instead, data is copied into dedicated servers to be dissected and studied for insights....

read more
Keeping your brain busy while in lock down

Keeping your brain busy while in lock down

Many places all over the world are currently in self-isolation mode in response to the COVID-19 pandemic. In New Zealand, lock down started on Thursday March 26, which means that some people are able to work from home. At OptimalBI, we are working from home and on the...

read more
Check out our new website!

Check out our new website!

Woohoo we've got a new website!  If you're reading this blog you're already on it and we reckon you should click a few extra buttons than you were planning to and go check it out... Data really does make anything possible, and we're here to help you harness data...

read more
Planning for the COVID-19 Pandemic at OptimalBI

Planning for the COVID-19 Pandemic at OptimalBI

We find ourselves in unprecedented times. All across the globe the impact of COVID-19 is being felt, it’s heart-breaking reading the stories of loss, the impact on societies, businesses and individuals as a result of measures put in place to restrict the momentum of...

read more
Building Services and Applications for Generic Data

Building Services and Applications for Generic Data

One of the things many developers love about MongoDB and other NoSQL databases is the ability to store flexible objects in their native format. When you walk around the office singing its praises, experienced developers chime in and tell you that this is a quick path...

read more
What is data literacy and why should you care?

What is data literacy and why should you care?

You might have heard this term thrown about a bit recently.  Perhaps it's because you've heard politicians referring to it when discussing educational reform, or you've noticed how your job has you doing more with spreadsheets than you did five years ago.  What ever...

read more
What is algorithmic bias and why should you care?

What is algorithmic bias and why should you care?

What is algorithmic bias? A systematic error in computer algorithms which leads to unfair outcomes and giving privileges to one of the groups of users over all others. An algorithm is a set of instructions designed to perform a specific task.  This can be a simple...

read more
0 Comments

Trackbacks/Pingbacks

  1. Data Vault blogs you need to read | OptimalBI - […] ODE – Migrating Configuration Data […]
  2. ODE – More on Migrating Configuration Data | OptimalBI - […] on Nic’s excellent blog on Migrating Configuration Data, I would like to elaborate a bit on the style, which he…
Submit a Comment

Your email address will not be published. Required fields are marked *