A data warehouse, by its very nature, creates a security conflict. On the one hand, the goal of every data warehouse is to make valuable data accessible. Most of the hard issues in data warehousing have to do with how to make the data more understandable, more available, and easier to access.
As Kimball pointed out almost two decades ago, there is a constant tension between securing data and making information accessible. I encounter this trade-off at every client. Unfortunately, my clients usually treat this dispute as a cold war. Typically the Business Intelligence (BI) practitioners have an informal exemption from ordinary security policy, but the security team would prefer to shut BI down.
A basic principle of information security is Least Privilege:
Each user must be able to access only the information and resources that are necessary for [their] legitimate purpose.
Often least privilege is misinterpreted as “data that is not shared is never lost”. I am accustomed to security-conscious people objecting that the warehouse is going to give people information that they did not have before. That is correct: there is an information security risk in giving people access to data.
However, experts agree that information security needs to be based on risk analysis. And that risk trade-off is clear: bad decision making poses a greater risk than possibly leaking data. But in most organisations the information security risk is owned by a different group than the one responsible for improving performance.
These decisions always revolve around the words “legitimate purpose” in the definition of Least Privilege. Decision makers need to have the right information available at the right time to inform the right choices. In turn, those people have to take individual responsibility to use information appropriately. I am not aware of any other credible options. People should not be browsing data for their amusement. They should, however, be allowed within reasonable limits to decide for themselves what information to use to make decisions.
I recommend the following for data warehouse security.
Decide on reasonable limits; some data should be guarded closely. My opinion is that most organisations have very little data that should not be made available to all of their employees. Once you have classified your data, hide the sensitive data attributes rather than entire rows. Kimball identifies a problem with filtering rows: by hiding an entire row you’ve altered the aggregates. This means that different people see different totals, and possibly get different forecasts. Filtering rows damages the common warehouse principle of single-version-of-the-truth.
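The effect on aggregates can be shown with a small sketch. Everything in it is an assumption made for illustration: the sales rows, the column names, and the choice of "customer" as the sensitive attribute are hypothetical, and a real warehouse would enforce this in the database or BI tool rather than in application code.

```python
# Hypothetical fact rows; names and values are illustrative only.
SALES = [
    {"region": "North", "customer": "Acme Ltd", "amount": 1200},
    {"region": "North", "customer": "Bolt plc", "amount": 800},
    {"region": "South", "customer": "Crane Co", "amount": 500},
]

# Assumed outcome of the data classification exercise.
SENSITIVE_COLUMNS = {"customer"}


def filter_rows(rows, allowed_regions):
    """Row-level security: drop every row outside the user's regions."""
    return [r for r in rows if r["region"] in allowed_regions]


def mask_columns(rows, sensitive=SENSITIVE_COLUMNS):
    """Attribute-level security: keep every row, hide sensitive values."""
    return [
        {k: ("***" if k in sensitive else v) for k, v in r.items()}
        for r in rows
    ]


def total(rows):
    """Sum the amount measure over whatever rows the user can see."""
    return sum(r["amount"] for r in rows)


full_total = total(SALES)                                   # 2500
row_filtered_total = total(filter_rows(SALES, {"North"}))   # 2000
masked_total = total(mask_columns(SALES))                   # 2500
```

The row-filtered user sees a total of 2000 while everyone else sees 2500. The masked view hides the customer names but still agrees with the full total, so every user works from the same figures.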
Assign your decision makers to groups and apply all security at the group level. A group's privileges are often a subset or superset of another group's, but that isn't always the case. Human resources staff need different access than the board of directors, not less or more access.
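As a rough sketch of group-level security, the mapping below grants column visibility to groups and never to individuals. The group names, user names, and columns are hypothetical, chosen so that neither group's access is a subset of the other's.

```python
# Hypothetical groups and the columns each may see. Permissions are
# attached to groups only; users gain access through membership.
GROUP_COLUMNS = {
    "hr": {"department", "salary_total"},
    "board": {"department", "revenue_forecast"},
}

# Hypothetical group membership.
USER_GROUPS = {
    "alice": {"hr"},
    "bob": {"board"},
}


def visible_columns(user):
    """Union of the columns granted to every group the user belongs to."""
    columns = set()
    for group in USER_GROUPS.get(user, set()):
        columns |= GROUP_COLUMNS[group]
    return columns


def project(row, user):
    """Return only the attributes of a row that the user may see."""
    allowed = visible_columns(user)
    return {k: v for k, v in row.items() if k in allowed}


row = {"department": "Sales", "salary_total": 410000, "revenue_forecast": 2500000}
alice_view = project(row, "alice")  # department and salary_total only
bob_view = project(row, "bob")      # department and revenue_forecast only
```

A user with no group membership sees nothing, which keeps the default aligned with least privilege.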
Present data to your decision makers in a tool that allows you to track who accessed what. The user should have to authenticate in a way that identifies exactly who they are, not just which group they belong to. Any web-based tool will do this automatically. If you are using spreadsheets, make those spreadsheets fetch live data from a database so the spreadsheet is a report definition rather than a data store.
It is good practice to audit access to your warehouse. However, I recommend aggressively managing those access logs. Left unchecked, access logs can grow very large; I have seen them exceed the size and cost of the warehouse proper. Sophisticated log processing is one answer, but the best return-on-investment comes from a short on-line retention period followed by archival storage.
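A short on-line retention period could be sketched as follows. The 30-day window, the log entry fields, and the function name are assumptions for illustration; the right period depends on your own audit requirements. Each entry records the individual user, not only their group.

```python
from datetime import datetime, timedelta

# Assumed on-line retention period; adjust to your own policy.
ONLINE_RETENTION = timedelta(days=30)


def split_for_archive(entries, now, retention=ONLINE_RETENTION):
    """Partition access-log entries into those kept on-line and those
    old enough to move to archival storage."""
    cutoff = now - retention
    online = [e for e in entries if e["at"] >= cutoff]
    archive = [e for e in entries if e["at"] < cutoff]
    return online, archive


# Hypothetical access-log entries: who accessed what, and when.
log = [
    {"user": "alice", "object": "hr_summary", "at": datetime(2024, 2, 20)},
    {"user": "bob", "object": "sales_forecast", "at": datetime(2024, 1, 1)},
]

online, archive = split_for_archive(log, now=datetime(2024, 3, 1))
```

Here the February entry stays on-line and the January entry moves to the archive. In practice the archive step would write to cheaper storage before the rows are deleted from the on-line log.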
Don’t try to fly under the security radar. Communicate with the security people early and come to an agreement. Work to adjust security policy so that decision makers have access to the best available data.