People talked a lot about Data Science in my coaching and mentoring sessions last year, and there was a lot of confusion, specifically about the technical skills needed to be a data scientist. My last blog covered the Computer Science t-skills; this blog looks at the Statistical t-skills.
This blog will be relevant for those of you starting in your first business intelligence management role, or wanting to become a top business intelligence professional.
We can’t assume university-level Statistics or Mathematics in business intelligence; lots of analysts are in the team due to natural untrained ability or interest. It is a good idea to regularly assess statistical t-skills and set development plans.
Before starting an assessment, decide which statistical t-skills are most important given the analysis and reporting needed for your business’s strategy. I suggest the following 10 statistical t-skills as a starting point; feel free to amend as required:
- Data Cleaning: the ability to identify incorrect data within a dataset and work with the appropriate people to get the source data corrected (cleaned).
- Data Shaping/Modelling: building hierarchical relationships between different datasets, with a view to creating a full relationship model for the whole data structure (which may span multiple data sources).
- Data Profiling: assess existing data and produce summary statistics giving insight into data behaviour and possible uses for the data.
- Exploratory Analysis: analysis of what the data can tell us: its characteristics and relationships, and which of these can be represented visually. I include market segmentation in exploratory analysis.
- Explanatory Analysis: the inverse of exploratory analysis, often initiated by a single information request; analysis into the cause of a data characteristic or relationship.
- Forecasting: predictions of the future using historic and current data as the base.
- Broad Tool Skills: the ability to use the available toolsets broadly, i.e., able to perform all the above t-skills in one suite of products. The analyst has a full understanding of the available toolset’s functionality and ideally has a full understanding of more than one toolset’s functionality.
- Hypothesis Testing: analysis over a subset of data to look for evidence that a stated condition is true for the full population dataset.
- Documentation: able to document the process followed to complete a piece of analysis, to enable understanding and instruct another independent colleague to repeat the process if needed.
- Communication: use effective communication (non-verbal, listening, stress management, assertive communication, etc.) with key stakeholders, business colleagues and the interested public.
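To make one of these t-skills concrete, here is a minimal sketch of Data Profiling using only Python's standard library. The function name `profile_column` and the sales figures are hypothetical, made up purely for illustration; in practice you would use whatever toolset your team already has.

```python
import statistics

def profile_column(values):
    """Return summary statistics for one numeric column.

    Assumption for this sketch: None marks a missing value.
    """
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "stdev": statistics.stdev(present),  # sample standard deviation
    }

# Hypothetical monthly sales figures with one missing entry
monthly_sales = [120, 135, None, 150, 145, 160]
summary = profile_column(monthly_sales)

for name, value in summary.items():
    print(f"{name}: {value}")
```

A profile like this gives a quick view of data behaviour (range, centre and spread), and the missing count is a natural prompt for the Data Cleaning conversation with the people who own the source data.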
Use the Statistics_T_Skills_template to assess the depth of knowledge in each t-skill. The following guidelines may help:
- The basic level implies that the individual has started learning the t-skill but spends most of their time searching training guides, manuals or blogs on the internet (at the very worst the individual asks colleagues at every step of the process without doing any research themselves)
- The intermediate level implies that the individual has learned a few of the key deliverables of each t-skill; however, when asked to complete a piece of work a bit out of the ordinary they will return to searching training guides, manuals or other on-line advice and information
- The advanced level implies that the individual has a good grasp of the t-skill; only when asked to complete a complex piece of work will they need to search training guides, manuals or other on-line advice and information, and
- The expert level implies the individual has a full understanding of the t-skill and has had substantial experience in delivering work using the t-skill.
A development plan should focus where t-skill depth is low or the individual wants to improve.
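As a rough sketch of how an assessment could feed a development plan, the snippet below picks out the t-skills to focus on. The function `development_focus`, the example ratings and the choice of "advanced" as the target level are all assumptions made for illustration; set the target to suit your own team.

```python
# The four assessment levels, ordered from lowest to highest depth
LEVELS = ["basic", "intermediate", "advanced", "expert"]

def development_focus(assessment, wants_to_improve=(), target="advanced"):
    """List t-skills that are below the target level or that the
    individual has asked to improve, lowest depth first."""
    threshold = LEVELS.index(target)
    focus = [
        skill
        for skill, level in assessment.items()
        if LEVELS.index(level) < threshold or skill in wants_to_improve
    ]
    # Stable sort: skills at the same level keep their original order
    return sorted(focus, key=lambda skill: LEVELS.index(assessment[skill]))

# Hypothetical assessment for one analyst
assessment = {
    "Data Cleaning": "advanced",
    "Forecasting": "basic",
    "Hypothesis Testing": "intermediate",
    "Communication": "expert",
    "Documentation": "advanced",
}

plan = development_focus(assessment, wants_to_improve={"Documentation"})
print(plan)  # ['Forecasting', 'Hypothesis Testing', 'Documentation']
```

Documentation appears in the plan even though it is already rated advanced, because the individual asked to improve it.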
There is one warning: not all people are self-aware, and some will exaggerate or under-represent their skills. Check out Wikipedia for more information on this effect.
Counter the effect with mentoring, peer review and robust sign-off processes.
If you are looking for excellent customer segmentation training you cannot beat Professor Goutam Chakraborty. In New Zealand, SAS New Zealand arranges his training. If you are fortunate enough to find a suitable course run by Professor Chakraborty, I recommend you send at least one analyst.
Good luck. Mel.
“Practice makes Permanent” – whether good or bad habits, why not make them good.