[Image caption: Good luck parsing these rules into your data warehouse! Source: Wikipedia]
You may have noticed some posts appearing about Optimal Data Engine (ODE), our open-source and language-agnostic tool for better data warehousing. Aside from the technical breakthroughs, what made our ODE journey possible were new ways of engaging with stakeholders to understand what they really want. The key moment was our discovery of BEAM✲ for interactive “modelstorming” (data modelling + brainstorming) to build requirements that stakeholders and data people both understand.
We’re left with a problem though: once we know what stakeholders want from their data, we still have to put it together! Sometimes it’s as simple as loading a column from a source system, but often there are complex business rules that determine how a column is constructed. Documenting these has always been a pain, and we’ve found neither tools nor methodologies to make it as easy as BEAM✲ made overall requirements.
Getting stakeholders to talk about business rules is easy, but wrangling that discussion into useful artefacts for building rules is hard. We could just dictate that they communicate better, but the first move is incumbent on us: we have to clarify our expectations of what a good business rule should look like. And to know what to say, we need some minimum conditions a good business rule must meet when it reaches code.
Based on discussions around the office, my own experience with rules, and a bit of reading, here’s a start:
- Code business rules to cover all cases, including errors; convert complex conditions like ranges to Y/N conditions.
- Choose a “failsafe” default output of every rule that is acceptable to the business.
- If the above are not possible, maintain the audit trail to make your choice of what to do transparent.
None of this is an “official position”; I’m just throwing out ideas to bring the technical and business understandings of “good business rules” closer together, with some easy methods to close the remaining gaps. The same design pattern (“close the communication gap so the tool is easy/obvious”) worked for Lawrence Corr and BEAM✲, after all.
How can you help us? If you know of any techniques that achieve this mutual understanding of business rules, please let us know in the comments below!
What is a business rule?
In general, business rules “define or constrain some aspect of business and always resolve to true or false”, according to a pretty common definition. In the world of data warehousing, business rules implement those definitions or constraints on data, ie the values of some attributes (columns) of some records (rows) in a table.
In data terms, a business rule takes a set of inputs from existing data and generates new data based on the possible combinations across the inputs. While we often think of business rules as changing or transforming data, their impact on existing records in a data warehouse (an organisation’s persistent memory) is a separate consideration. For example, to maintain the ability to audit every record, a data quality business rule might retain “incorrect” values even as it presents “corrected” values to users.
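To make that concrete, here’s a minimal sketch in Python of a data quality rule that presents a “corrected” value while retaining the original for audit. The record layout and the “balances cannot be negative” rule are made up for illustration:

```python
# A minimal sketch of a data-quality business rule that corrects a value
# for presentation while retaining the original for the audit trail.
# The record layout and the "no negative balances" rule are illustrative.

def apply_balance_rule(record: dict) -> dict:
    """Return a new record with a corrected balance, keeping the original."""
    original = record["total_balance"]
    corrected = max(original, 0)  # hypothetical rule: balances cannot be negative
    return {
        **record,
        "total_balance": corrected,          # what users see
        "total_balance_original": original,  # kept for auditability
    }

print(apply_balance_rule({"customer_id": 42, "total_balance": -500}))
# {'customer_id': 42, 'total_balance': 0, 'total_balance_original': -500}
```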
A trivial example to illustrate the problem and potential solutions
A simple example of a business rule, as a business user might state it, is: “any customer with a balance across all accounts greater than $1 million is a high-value customer.”
If we want to calculate a “high-value customer” flag in a data warehouse, we need the following data inputs:
- An ID for each customer
- Their total balance across all accounts
To implement this business rule using these inputs, we could create (with stakeholders) a decision table recording all the possible combinations of inputs and their outcome in terms of the rule, for example:
| Total balance > $1 million | High-value customer |
| --- | --- |
| Y | Y |
| N | N |
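In code, a decision table like this can live as explicit data rather than as scattered if-statements. Here’s one way it might look in Python (the column and function names are just illustrative):

```python
# The decision table as an explicit data structure: every combination of
# condition results maps to an outcome. Names are illustrative.

HIGH_VALUE_TABLE = {
    # (total balance > $1 million,) : high-value customer?
    (True,):  True,
    (False,): False,
}

def is_high_value(total_balance: float) -> bool:
    return HIGH_VALUE_TABLE[(total_balance > 1_000_000,)]

assert is_high_value(1_500_000) is True
assert is_high_value(900_000) is False
```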
Let’s complicate things a little. Imagine the VIP department adds another category of customer, called super-high-value, so the rule now reads: “any customer with a balance across all accounts greater than $1 million is a high-value customer; any customer with a balance across all accounts greater than $2 million is a super-high-value customer.”
We might choose to alter our decision table so it looks like this:
| Total balance > $1 million | High-value customer | Total balance > $2 million | Super-high-value |
| --- | --- | --- | --- |
| Y | Y | Y | Y |
| N | N | N | N |
| Y | Y | N | N |
| N | N | Y | Y |
In terms of our input data (the two balance comparisons) we have covered all of the cases, but you might spot the problem: the last case doesn’t make a lot of sense. How can a customer have a total balance over $2 million but not also over $1 million? While we’d like to say “this never happens in the real world”, we’ve all seen data that didn’t make sense.
How to fix a bad business rule
We could ignore this problem and hope to avoid the consequences, but it’s better to defend in advance. Below I outline how you might improve the coding of business rules, but you’ll obviously need further business input to do a good job. Working out a good process to capture that business input is our next job…
1) Check the business logic covers all the cases
As always with data problems, the best place to fix a business rule applied to data is at the source, which in this case means checking whether the decision table fully encodes the logic in the rule as stated by the business. Doing this, you might get them to say: “any customer with a balance across all accounts greater than $1 million but less than $2 million is a high-value customer; any customer with a balance across all accounts greater than $2 million is a super-high-value customer. If a customer’s balance somehow tests as greater than $2 million but not greater than $1 million, that is a data error and they should not be flagged as either.”
This might change the decision table to look like this (changes from above in bold):
| Total balance > $1 million | High-value customer | Total balance > $2 million | Super-high-value |
| --- | --- | --- | --- |
| Y | **N** | Y | Y |
| N | N | N | N |
| Y | Y | N | N |
| N | N | Y | **N** |
What we did here was to check what happens in the case where the total balance is greater than $1 million but less than $2 million, and in the process also covered off the (admittedly unlikely) case that “shouldn’t ever happen”. Again, we’ve all been told “this column only ever has values A and B”, only to have “C%)^p” turn up one afternoon. The best alternative is to code your business rule to explicitly cover all the possible combinations of its conditions. While it seems like a lot of work (you’ll have 2^n rows in a decision table based on Y/N conditions, where n is the number of conditions) it will pay off in the long run.
If your business rule is based on Y/N conditions, an easy check is the number of rows according to the formula above. If you have business rules based on ranges, in many cases it’s probably a good idea to represent these as combinations of simple Y/N conditions, as I did above for “customers with balances between $1 million and $2 million”. Then you need to check you have fully represented all possible cases feeding the range calculation, which should be easy enough with a reusable unit test for each data type (eg negative to positive infinity for reals).
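As a sketch of such a check, here’s how you might enumerate all 2^n combinations of Y/N conditions in Python and flag any the decision table fails to cover. The table structure is illustrative, not how ODE stores rules:

```python
from itertools import product

# The two-condition decision table from above: (> $1M, > $2M) -> (high, super).
# Illustrative structure; in practice this might live in config or a table.
DECISION_TABLE = {
    (True,  True):  ("N", "Y"),   # > $2M: super-high-value only
    (False, False): ("N", "N"),
    (True,  False): ("Y", "N"),   # between $1M and $2M: high-value
    (False, True):  ("N", "N"),   # "impossible" data error: flag neither
}

def uncovered_cases(table: dict, n_conditions: int) -> list:
    """Return every Y/N combination the decision table does not handle."""
    return [combo for combo in product([True, False], repeat=n_conditions)
            if combo not in table]

# With n conditions there must be 2**n rows; anything missing is a gap.
assert len(DECISION_TABLE) == 2 ** 2
assert uncovered_cases(DECISION_TABLE, 2) == []
```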
2) Explicitly code in a “failsafe” default
You might have a business rule that is ambiguous, or perhaps it is safe to assume that all business rules are ambiguous to a degree, at least until we find the equivalent of BEAM✲. Where turning a rule into a set of Y/N conditions is not feasible or desirable, you are in effect committing either to incomplete coverage of the cases, or to greater reliance on more complex conditions to cover all the cases. There is risk in both.
If you must do this, the business rule should specify a “safe” default outcome. Imagine it is possible for an “unknown” balance to make it into the table of customer total account balances discussed above. While these should fail the “greater than” tests, perhaps your software freaks out comparing “unknown” to an integer and returns “unknown”.
In that case, work with the business to define a safe default so any unspecified cases do not escape processing:
| Total balance > $1 million | High-value customer | Total balance > $2 million | Super-high-value |
| --- | --- | --- | --- |
| Y | N | Y | Y |
| N | N | N | N |
| Y | Y | N | N |
| N | N | Y | N |
| ? | N | ? | N |
Here we added another row to the table to deal with an “unknown” result being returned from our Y/N conditions (I have assumed both conditions only return “unknown” together, but they might not). The business and technical outcomes are now aligned: whenever something weird happens, we default towards safety as defined by the business.
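As a sketch, here’s how that failsafe row might look in code. Treating an unknown balance as None is an assumption about how the source feed behaves, and the decision table is the illustrative one from earlier:

```python
# A "failsafe" default: any case the rule does not explicitly cover falls
# through to a business-approved safe outcome. Treating a missing balance
# as None is an assumption about the source feed.

DECISION_TABLE = {
    (True,  True):  ("N", "Y"),   # super-high-value
    (False, False): ("N", "N"),
    (True,  False): ("Y", "N"),   # high-value
    (False, True):  ("N", "N"),   # data error: flag neither
}
FAILSAFE = ("N", "N")             # the "?" row: default to safety

def classify(total_balance):
    """Return (high_value, super_high_value) flags for a balance."""
    if total_balance is None:     # unknown input cannot pass either test
        return FAILSAFE
    key = (total_balance > 1_000_000, total_balance > 2_000_000)
    return DECISION_TABLE.get(key, FAILSAFE)

assert classify(None) == ("N", "N")       # weird input defaults to safety
assert classify(2_500_000) == ("N", "Y")
assert classify(1_500_000) == ("Y", "N")
```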
The business value of doing this extra work is that you proactively protect against risk. What if marketing tacked “and send all our super-high-value customers a gold watch!” onto the end of the rule above? We’d want to be quite sure that a failure to apply the business rule didn’t feed ambiguous output into another business event.
3) Treat failures and exceptions as a data source
If you can’t do either of the above (ie “fix in advance”), an acceptable approach is to not apply the business rule to the failing cases, leaving them as “unknown” or missing. The question then becomes how to “correct” this problem. There might sometimes be a case for correcting records directly inside the data warehouse, but a safer approach is to store both the original failed application of the business rule and the correction. This means knowledge workers are less likely to be held accountable for failing to take business action (eg “send a gold watch to customer X”) when, for a time, we had no idea customer X was super-high-value. In other words, we maintain auditability.
You still have a choice to make when taking this approach: where do you “correct” the data presented to knowledge workers out of the data warehouse? It is quite possible that, while you send unknown cases to a holding area and continue to load the offending tables, the original problem (eg a missing balance for a customer) is corrected in the source system while you figure out how to alter the business rule.
This creates the possibility of a collision when both incoming corrections try to become the “definitive” record. In an asynchronous load system (Data Vault, for example) it is an article of faith to “keep loading”, so we expect this problem in Optimal Data Engine. There are several ways to deal with this architecturally; all involve giving one type of correction priority over the other (we’ll publish our technical solution when we agree on it). Where possible, we should avoid this trade-off by working with stakeholders to implement option 1 or 2 above.
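As a sketch of the “give one correction priority” idea, here’s one way it might look. The record structure and the priority order (a source-system reload beating a manual fix) are assumptions for illustration, not our eventual technical solution:

```python
# Resolving colliding corrections by explicit priority while keeping every
# version for the audit trail. The priority order and record layout are
# assumptions for illustration only.

from datetime import datetime

PRIORITY = {"source_reload": 2, "manual_fix": 1, "original": 0}

corrections = [
    {"customer_id": 42, "balance": None,      "kind": "original",
     "loaded_at": datetime(2017, 5, 1)},
    {"customer_id": 42, "balance": 2_100_000, "kind": "manual_fix",
     "loaded_at": datetime(2017, 5, 2)},
    {"customer_id": 42, "balance": 2_200_000, "kind": "source_reload",
     "loaded_at": datetime(2017, 5, 3)},
]

# Keep loading everything; only the *presented* record is chosen by priority
# (ties broken by load time), so nothing is ever overwritten.
definitive = max(corrections, key=lambda r: (PRIORITY[r["kind"]], r["loaded_at"]))
print(definitive["balance"])  # 2200000
```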
Summary
To recap, the ways I see to safely implement business rules on data are (in decreasing order of resilience and increasing order of eventual complexity and technical pain):
- Code the business rule to cover all cases, including unknown/error results of tests on each condition. Where possible, convert complex conditions (like “between 10 and 20”) to a set of Y/N conditions (“>10”, “<20”) to cover all the cases.
- Choose a “failsafe” default output of the rule that is acceptable to the business, which will be used whenever the rule returns an ambiguous result.
- If the above are not possible, preserve the failing application of the business rule to each record and make an intentional choice about how to feed corrections back in, avoiding collisions with other potential resolutions of the failure.
All of these presume we have a deep understanding of the actual business rule, which of course may not be the case. Hopefully, the discussion above gives you some ideas to help make the issue of incomplete coverage or a missing “failsafe” obvious to stakeholders, so they can make an informed choice. As soon as we find a solid technique for documenting those subsequent decisions (or build it ourselves), we’ll let you know!
Keep asking better questions,
Shaun – @shaunmcgirr
Shaun blogs about analytics, machine learning and how data solves problems in the real world.