Not a day goes by, it seems, without an email from a vendor extolling the virtues of their analytical software and providing white papers on how others have benefited from their product. With so many pushing their wares, how do you figure out which analytical software is right for you?
Here are a few things to look for when you’re kicking the tyres.
What do you want to be able to do?
Before looking at all the shiny new toys out there, the very first thing you need to do is determine what your business actually needs now and what you anticipate it will need in the future. This should align with your strategic business plan. Without a comprehensive understanding of your business objectives, you run the risk of spending up large on something that looks very cool, sounds very cool, but ends up a white elephant.
Does it play nicely with others?
Is it able to talk to and connect with your current data warehouse, databases and data marts? You need to make sure that it can take datasets created by whatever your organisation has (e.g. SAS, Oracle) as well as common external sources such as Excel, CSV, and Access. Can you incorporate the output, such as model scoring, back into your current systems? You also need to think about security within your organisation. Is the software compatible with your security requirements? Will the organisation block open-source options?
Show me the money!
As with all things IT, take care to really know what it is costing you. Are you paying a one-off or an annual licence fee? Is there a per-user cost? What hardware is also required? (That can add a few dollars!) What sort of support does the vendor offer, and how much do they charge for a callout? (It’s also a good idea to know whereabouts the support is located!) Open-source doesn’t always equate to free. Look at the not-so-obvious costs such as the price of specialist skills: do you need high-cost advanced coders with a PhD? It may be that the slightly higher priced software that is a little more user-friendly is cheaper in the long run!
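To make the "cheaper in the long run" point concrete, here is a minimal sketch of a total cost comparison over a few years. Every figure, field name and product in it is made up purely for illustration; plug in your own quotes and salary estimates. It is not a claim that either kind of product is generally cheaper.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """Cost inputs for one candidate product (all figures hypothetical)."""
    name: str
    upfront_licence: float          # one-off licence fee
    annual_licence: float           # recurring licence fee per year
    per_user_fee: float             # fee per user, per year
    hardware: float                 # one-off hardware spend
    annual_support: float           # support contract per year
    annual_specialist_staff: float  # extra yearly cost of specialist skills


def total_cost(opt: Option, users: int, years: int) -> float:
    """Sum one-off costs plus recurring costs over the given number of years."""
    one_off = opt.upfront_licence + opt.hardware
    recurring = (
        opt.annual_licence
        + opt.per_user_fee * users
        + opt.annual_support
        + opt.annual_specialist_staff
    )
    return one_off + recurring * years


# Invented numbers: a "free" tool that needs specialist coders versus
# a licensed tool that generalist analysts can use.
open_source = Option("open-source", 0, 0, 0, 20_000, 0, 60_000)
commercial = Option("commercial", 10_000, 30_000, 1_000, 20_000, 5_000, 0)

for opt in (open_source, commercial):
    print(opt.name, total_cost(opt, users=5, years=3))
```

The useful part is not the answer but the list of inputs: if you cannot fill in one of the fields for a product you are evaluating, that is a question to take back to the vendor.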
It’s not rocket science … or is it?
You need to take into account the skill set of the people who will be using it. How user-friendly is it? If you have people adept at coding then you may not care so much whether the software is point-and-click or not. Do you have hard-core statisticians/data scientists or more generalist analysts? You may need more wizard-orientated software with prebuilt automated modelling tasks for the generalists, whereas the hard-core statisticians may need something they have more control over. It may be that you need a combination of both.
What sort of training is needed, how often is it offered, and are there a variety of vendors that can do the training?
A picture tells a thousand words.
For any piece of analysis, from reporting to dashboarding to predicting, you need to be able to explore the data: upside down, inside out. To do this you need good visuals. Does it include the core charts such as box plots, scatter graphs, time series, and bar charts? Consider whether you really need the whizzy animation-type charting. If you are only giving static reports and have small datasets, you may not need to pay the extra! Before you get carried away with the ability to display on tablets, consider whether enough people have work tablets to actually view the charts and whether the software works with all tablets (not just iPads).
Also think about how much data you have. If you are trying to visualise millions of rows of data then you need software that can actually display the data in a meaningful way (such as the use of heat maps within a scatter graph) within a reasonable time (seconds vs. hours).
What is the quality of the outputs like? If you are going to use them in presentations and reports then you will want high-quality graphics. If, however, you are just after visuals for exploratory work, then fast, lower-quality output may be more desirable.
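As a rough idea of what "heat maps within a scatter graph" means under the hood, here is a minimal sketch that counts points per grid cell instead of drawing every point. The function name, the grid size and the randomly generated data are all my own assumptions for illustration; real packages will do this their own way and far faster.

```python
import random
from collections import Counter


def bin_points(xs, ys, bins, x_range, y_range):
    """Count points per grid cell; returns a Counter keyed by (col, row)."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    x_width = (x_max - x_min) / bins
    y_width = (y_max - y_min) / bins
    counts = Counter()
    for x, y in zip(xs, ys):
        # Skip anything outside the plotting area.
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue
        # Points exactly on the upper edge go into the last cell.
        col = min(int((x - x_min) / x_width), bins - 1)
        row = min(int((y - y_min) / y_width), bins - 1)
        counts[(col, row)] += 1
    return counts


# Made-up data: 200,000 random points stand in for a large dataset.
rng = random.Random(0)
n = 200_000
xs = [rng.random() for _ in range(n)]
ys = [rng.random() for _ in range(n)]

counts = bin_points(xs, ys, bins=50, x_range=(0.0, 1.0), y_range=(0.0, 1.0))
print(len(counts), "cells to colour instead of", n, "points to draw")
```

The chart then only has to colour at most 50 x 50 cells by their counts, however many rows sit behind them, which is why this kind of display stays readable and quick on large data.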
Lego blocks
Is the software modular so you can add on/take off as your business needs change? Or do you get the whole kit and caboodle even if you only want some of the capability? Are the components within a block logical and complete? Do the modules contain almost all of what you need, but you have to get another module (at extra cost of course) for one more technique that sits within a bunch of other stuff you don’t require? Really knowing your business and what you want to be able to do with analytics helps here.
The Analytics
Above are some of the main statistical/data mining techniques you want to think about. If you need predictive analytics, check out what model fit diagnostics the software includes (at a bare minimum you want to be able to see the misclassification rate, ROC curve, lift, average squared error, and how the residuals are behaving). Are they displayed in a way that makes them easy to understand and makes it easy to compare multiple models? You’ll also want the ability to over-sample as well as being able to easily divide your data into train/validate/test datasets. Consider also how easy it is to cleanse, transform, and create new variables and to add in new data. Can you do it within the software or do you need to do it in another application and then import/re-connect to the data?
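If some of those diagnostics are unfamiliar, here is a minimal sketch of three of the items mentioned: a train/validate/test split, the misclassification rate, and average squared error. The function names, the 60/20/20 split, the 0.5 cutoff and the tiny example data are all assumptions of mine for illustration, not how any particular product does it.

```python
import random


def split_data(rows, train=0.6, validate=0.2, seed=42):
    """Shuffle rows and split them into train/validate/test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_validate = round(len(shuffled) * validate)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_validate],
        shuffled[n_train + n_validate:],  # whatever is left is the test set
    )


def misclassification_rate(actual, predicted_prob, cutoff=0.5):
    """Share of cases where the predicted class differs from the actual 0/1 class."""
    wrong = sum(
        1 for a, p in zip(actual, predicted_prob) if (p >= cutoff) != bool(a)
    )
    return wrong / len(actual)


def average_squared_error(actual, predicted_prob):
    """Mean of the squared gaps between the actual 0/1 outcome and the predicted probability."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted_prob)) / len(actual)


# Made-up outcomes and model scores, just to exercise the functions.
actual = [1, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.6]

train_set, validate_set, test_set = split_data(range(100))
print(len(train_set), len(validate_set), len(test_set))
print(misclassification_rate(actual, scores))
print(average_squared_error(actual, scores))
```

When you trial a product, it is worth checking that you can get at numbers like these for each of the train, validate and test datasets separately, and side by side for every model you are comparing.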
Whatever you do, keep your head and stay focused on what you actually need. Try not to be blindsided by the shininess of the software! Most vendors offer a trial. You will get bombarded with follow-up sales pitches, but this is a small price to pay to have a play with the software using your data and your analysts. Be wary of those that only ever show you pre-determined demos; it’s very easy to make something look sleek and sexy with only a few thousand very nicely behaved data records!
Michelle