Here at OptimalBI, Data Vault is a real buzzword. We love this methodology. First we became experts; now we help our clients implement it on their systems and teach other enthusiasts. Not heard of Data Vault, or still unsure exactly what it is? This blog is here to help.
Data Vault is a methodology for building Data Warehouses. A Data Warehouse is a keystone of Business Intelligence, which is all about making sense of the data you have collected.
A Data Warehouse is a store for the data. Once stored, the data is available for visualisation, analytics, insights, and your favourite, reports. However, a Data Warehouse is not a simple database; it's a database or set of databases designed for several purposes at once. It should store data and its history; it should integrate and conform the data from all of the organisation's sources; it should apply organisation-wide business rules to the stored data and its aggregations; and it should make that data easily available to any type of visualisation or analysis tool. Data Vault is a methodology designed to support all of these purposes effectively.
Why is Data Vault so hot? It was developed to take the best from existing Data Warehouse techniques and improve on them. It is as robust and stable as other well-known Data Warehouse methodologies, but it is also flexible, adaptable and repeatable. A Data Vault can therefore change as business requirements change, whilst keeping the old business rules available for reporting. In addition, fields can be added one at a time, in an order that depends on their priority and the complexity of their rules, without interrupting the rest of the Data Warehouse.
How do you know if your database is a Data Vault? It’s simple. Data Vault has three types of tables:
- Hubs, which represent core business concepts.
- Links, which are the relationships between these concepts.
- Satellites, which store all the attributes.
The process of decomposing the source data into these three types of tables, while keeping the natural relationships within the data, is called Unified Decomposition. Taken together, a core business concept and its attributes are called an Ensemble. That's why the process of designing the Data Vault architecture is called Ensemble Modelling.
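To make the three table types a bit more concrete, here is a minimal Python sketch of the decomposition. Everything in it is an assumption made for illustration: the customer and product source rows, the names `new_vault`, `load`, `hub_customer`, `hub_product`, `link_purchase` and `sat_customer`, the load dates, and the use of in-memory dictionaries and lists in place of real database tables. It is not how any particular Data Vault engine works, only one way to picture the idea.

```python
from datetime import date


def new_vault():
    """Return empty in-memory stand-ins for the three table types (hypothetical names)."""
    return {
        "hub_customer": {},   # business key -> hub row
        "hub_product": {},    # business key -> hub row
        "link_purchase": {},  # (customer key, product key) -> link row
        "sat_customer": [],   # customer attribute history, append-only
    }


def load(vault, rows, load_date):
    """Decompose flat source rows into hub, link and satellite rows."""
    for row in rows:
        c_key, p_key = row["customer_id"], row["product_id"]

        # Hubs: one row per distinct business key.
        vault["hub_customer"].setdefault(
            c_key, {"customer_id": c_key, "loaded": load_date}
        )
        vault["hub_product"].setdefault(
            p_key, {"product_id": p_key, "loaded": load_date}
        )

        # Links: one row per distinct relationship between hubs.
        vault["link_purchase"].setdefault(
            (c_key, p_key),
            {"customer_id": c_key, "product_id": p_key, "loaded": load_date},
        )

        # Satellites: append a row only when the attributes differ from the
        # latest stored version; existing rows are never overwritten.
        attrs = {"name": row["customer_name"], "city": row["customer_city"]}
        history = [s for s in vault["sat_customer"] if s["customer_id"] == c_key]
        if not history or history[-1]["attrs"] != attrs:
            vault["sat_customer"].append(
                {"customer_id": c_key, "loaded": load_date, "attrs": attrs}
            )
    return vault


# Made-up source data: two loads, with customer C1 moving city in between.
vault = new_vault()
load(vault, [
    {"customer_id": "C1", "customer_name": "Ana", "customer_city": "Wellington", "product_id": "P1"},
    {"customer_id": "C1", "customer_name": "Ana", "customer_city": "Wellington", "product_id": "P2"},
    {"customer_id": "C2", "customer_name": "Ben", "customer_city": "Auckland", "product_id": "P1"},
], date(2024, 1, 1))
load(vault, [
    {"customer_id": "C1", "customer_name": "Ana", "customer_city": "Auckland", "product_id": "P1"},
], date(2024, 2, 1))
```

After both loads, the customer hub still holds two rows and the link table three, while the customer satellite holds three rows: C1's Wellington version is kept alongside the newer Auckland one. Real implementations usually add more to each row, such as a record source, but the split of keys, relationships and attributes is the same.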
I've heard about Big Data; does Data Vault work with it? Big Data usually refers to huge volumes of unstructured data. If the data is structured, then it's just volume. Data Vault should be able to consume any amount of structured data, depending on the implementation. Even if it's truly unstructured Big Data, Data Vault can work with it via a Shuttle construct. In this case the Big Data remains separate from the Data Vault, but the two can easily be reported on and analysed together.
I hope that's enough to start your journey into the world of Data Vault. Check out our other blogs about Data Vault and our engine for simple and controllable Data Vault implementation.
Masseuse of all the Data – Kate
Kate blogs about the details that make the Data Warehouses work