This week I had an amazing opportunity to learn Data Vault modelling from the DV guru Hans Hultgren here in New Zealand, and I didn’t miss it!
The Data Vault (DV) methodology is based on the idea of Data Warehouse ensemble modelling. Ensemble modelling has only three types of tables: hubs, links, and satellites. These are enough to build a scalable and reliable Enterprise Data Warehouse.
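To make the three table types a bit more concrete, here is a minimal Python sketch of how a hub, a link, and a satellite relate to each other. It is only an illustration: the class names, field names, and sample values (such as `CUST-001` and the `crm` source) are my own assumptions, not something taken from the course material, and a real Data Vault would be built as database tables.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Hub:
    """One row per unique business key of a core business concept."""
    business_key: str
    load_date: datetime
    record_source: str  # hypothetical audit column


@dataclass
class Link:
    """One row per unique relationship between two or more hubs."""
    hub_keys: tuple  # business keys of the hubs being related
    load_date: datetime
    record_source: str


@dataclass
class Satellite:
    """Descriptive attributes of a hub or link, versioned by load date."""
    parent_key: object  # key of the hub or link being described
    load_date: datetime
    attributes: dict


# Hypothetical sample data: a customer, a product, and a purchase.
loaded = datetime(2024, 1, 1)
customer = Hub("CUST-001", loaded, "crm")
product = Hub("PROD-042", loaded, "erp")
purchase = Link((customer.business_key, product.business_key), loaded, "sales")
customer_details = Satellite(
    customer.business_key, loaded, {"name": "Alice", "city": "Wellington"}
)
```

In this sketch the hub carries nothing but the key, the link carries nothing but the relationship, and all descriptive context lives in the satellite, which is why new attributes can be added without touching the hubs and links.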
And this isn’t just something I’ve been told: I have seen how it works on a DV project. In our project team, we had a few modelling workshops where we designed new ensembles. After we designed and implemented the Data Vault objects, we were able to produce a number of reports very quickly and had a positive response from a variety of business users. But I really wanted to consolidate my knowledge, learn some advanced techniques and get answers to a couple of hard questions I had; that’s why I was really glad to meet Hans.
Hans explained that there are many different ways to model ensembles, and Data Vault is just one of them. He thinks that DV is the best, but he also doesn’t mind if, after the course, we pick another technique or even invent our own based on what we’ve learned. All Data Warehouses are different, and so are the people who design them. The only important thing is to use just one ensemble modelling style within a Data Warehouse.
Apart from all the theory we learned over the three days, there was a nice opportunity to practise modelling in teams. Hans gave us one business case per day. We designed a draft Business Vault together in teams of three to five people, and I think this is how it really should work. It will probably take many years of practice before I am able to design a perfect Data Warehouse on my own, but modelling as a team gives the benefits instantly. We all had good ideas based on what we had learned during the day, and we discussed which solution was best for the hard parts. Based on the review comments from Hans, I think that there is no completely wrong answer in the Data Vault world. In an Agile environment, the team could start from any close-enough solution and optimise it later.
After the course, we were left with two books. One of them is Hans’ book on Data Vault. I guess there are even more answers there; I’ve seen a chapter called “Advanced Concepts”, and that’s what I’m going to read next. The other is more of a course book. Hans asked us to make notes in it during the course. I enjoyed this approach, as it keeps all my notes aligned with what Hans said and showed us, and it will make it much easier to recall the course at any time.
I found out that Hans’ course changes slightly over time. He adds something to it whenever he finds a better solution than the one he used before. He used to say that it doesn’t matter whether you use a concatenated key or a multipart business key when a single field cannot guarantee enterprise-wide uniqueness of the business key. Now he says that concatenated keys have proved to be more scalable and provide a future-oriented solution. Hans said that the Data Vault methodology is constantly developing. Many companies and organisations in Australia and New Zealand are using Data Vault and provide a lot of feedback, so there is a good community of people interested in keeping it one of the best modern Data Warehouse methodologies.
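The difference between the two key styles is easy to show in a few lines of Python. This is only a sketch under my own assumptions: the `concatenated_key` helper, the `|` delimiter, the trimming and upper-casing, and the country-plus-customer-number example are all hypothetical and not taken from Hans’ material.

```python
def concatenated_key(*parts: str, delimiter: str = "|") -> str:
    """Join the parts of a business key into a single string.

    Trimming and upper-casing are assumed normalisation rules; the
    delimiter keeps ("1", "23") and ("12", "3") from colliding.
    """
    cleaned = [part.strip().upper() for part in parts]
    return delimiter.join(cleaned)


# Multipart business key: the hub would need one column per part.
multipart_key = ("nz", "1001")

# Concatenated business key: the hub needs a single key column,
# however many parts the source systems contribute.
single_key = concatenated_key(*multipart_key)
```

With the concatenated form, adding another part to the key later (say, a source system code) changes only how the string is built, not the shape of the hub, which is one way to read the "more scalable" argument.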
I’d like to say a big thank you to Hans for teaching me this year. We are all looking forward to seeing him again in New Zealand soon!
Masseuse of all the Data – Kate
Kate blogs about the details that make Data Warehouses work.
We run regular Data Vault courses for business analysts, data architects, and business intelligence developers in Wellington and Auckland. Find out more here.