I started working at OptimalBI a few months ago. My area of expertise is Data Warehouse development, using Microsoft SQL Server. I was, therefore, a good candidate to test how difficult it is to start using ODE (Optimal Data Engine).
ODE is an open source application that can be used to build an automated data warehouse environment. The core of ODE is built around the Data Vault approach. Despite having experience in Data Warehousing, Data Vault was new for me. First, I had to learn Data Vault theory. I got it quite quickly as there is no major conceptual change to what I already know, it’s just a new way of organising data structures. So, I was ready to start doing Vault things myself quite quickly.
I was told that ODE is a cool tool which builds Data Vaults. I thought: “Wow, I’ll just click the big button and it’ll do everything for me!” Not really. It’s still just a tool, and as with any tool, the results are dependent on whoever is using it. ODE is a tool which helps to build a Data Vault quickly. It does the dirty jobs while developers are still doing the cool stuff; development!
Development with ODE is config-driven. First, you need to understand the business concepts; then you turn them into Data Vault ensembles, which is called modelling; then you describe a model to ODE using simple SQL commands. There are lots of them, as you need to describe not only objects but relationships between them as well. Once the ODE configuration engine has the data, it can create the Vault.
I’ve built my first Data Vault using the AdventureWorks database as a source with some help from the ODE.ninja website. I’ve picked up how to create hubs and satellites very quickly. ODE.ninja describes it well. Links were not so easy, I had to ask lots of questions. Don’t worry, I’ve put my experience together in my first ODE.ninja post, so you don’t need to struggle. Then I hooked up the new Vault objects to the data sources. I didn’t need to schedule a source update and I didn’t need to release my configuration, so after a few more iterations I could create Data Vault tables very quickly.
Since then, I’ve also experienced using ODE in a work environment with the real data. Currently, I’m a member of an Agile BI team. Our project’s target is to build a Data Vault with ODE to replace an existing data warehouse. So far we have had created some ensembles, identified the way to develop and release Data Vault objects in a painless way and wrote some standards to keep everything consistent. We even implemented some business rules and built two reports. ODE has proved to be a usable tool.
Things I like in ODE:
- Current flags, start and end dates and source system ID. ODE does it all. There’s no need to implement this logic every time or even copy and paste a piece of code for it. This saves a lot of time.
- Incremental releases. ODE is nicely aligned with the Agile methodology. Every object you describe in ODE configuration has a release number. This allows the release of small portions of development to another environment.
- It’s easy to find out how the Data Vault objects are configured from the configuration. This is important when you deal with multiple source systems and complicated logic.
- It’s free and it’s a good tool to learn Data Vault!
Things I’m missing from ODE:
- A GUI. Currently, ODE doesn’t have one. To get the ensemble into the Data Vault, I need to run a number of different stored procedures. I also need to pick up a numeric key from the output to use it as an input to another stored procedure. Luckily, I’m good at SQL, so it’s not a problem, however, ODE would be much better with it.
- There is no easy one-page manual. ODE.ninja helps a lot, but not all topics are covered.
- The business logic should be implemented in the staging layers via stored procedures. Currently, it requires good SQL programming skills.
- While the configuration release is easy, all the database code (views, stored procedures, etc) is not so easy to promote between the environments.
Overall, once I got used to ODE, I found that Data Vault development is more fun than old star schemas!
Kate – Data Masseuse
Kate blogs about the nuts and bolts of data warehouse development