I came across this gem when researching the Bayesian versus Frequentist debate (the Frequentist approach being the typical stuff you get taught at high school and in most university statistics courses) for my Bayesian statistics course.
Essentially the difference is that for Bayesians it’s all about capturing degrees of belief in your model rather than examining long-run expected frequencies. Not surprisingly, Frequentists find the Bayesian approach far too subjective for their liking [I’ve popped links to a number of different blogs/articles on this age-old debate at the end of this blog].
Anyway, the “Hammers are not superior to Screwdrivers” quote above is a good mantra not only for the Bayesian versus Frequentist debate but for all the various methodologies/techniques within both of these, machine learning techniques and even what software to use.
When it comes to analytics you really need a range of techniques/methodologies to choose from and apply appropriately. The objective of the analytical problem you are trying to solve, the type of data including how it behaves and what software you have all interact together and a good analyst will be able to select the approach that will enable the best possible outcome. A great analyst will have more than one option for each problem in many cases, enabling a more comprehensive decision to be made.
Just like the purpose of a hammer is very different to that of a screwdriver, so too each analytical methodology has its own purpose, and each should be evaluated against what you are trying to achieve within the business constraints, the type of data you have and how it’s behaving, and the tools available in your software.
For example, linear regression is often a good choice when you want to predict a continuous variable from explanatory variables. However, it requires the spread of the errors to stay the same no matter the value of the explanatory variables (or, if you want to sound very cool, linear regression requires homoscedasticity!). [The explanatory variables also shouldn’t be related to each other, and the errors need to be independently and identically distributed.] So if your data doesn’t follow these assumptions your model may not be too chipper – but it’s not the technique that’s the issue, it’s simply the wrong one for the data. It’s like trying to use a hammer to screw in a screw – the hammer isn’t useless, it just wasn’t designed to screw in screws!
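To make the homoscedasticity point concrete, here is a minimal sketch (with made-up, simulated data) that fits a least-squares line and then compares the residual spread at low versus high values of the explanatory variable. The variable names and numbers are purely illustrative:

```python
import random

# Illustrative only: simulate data whose noise grows with x, i.e. data
# that deliberately violates the constant-variance (homoscedasticity)
# assumption of linear regression.
random.seed(42)
xs = [i / 10 for i in range(1, 201)]
ys = [2 + 3 * x + random.gauss(0, 0.5 * x) for x in xs]

# Closed-form simple least-squares fit.
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x

residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

def variance(vals):
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

low_var = variance(residuals[: n // 2])   # residual spread at small x
high_var = variance(residuals[n // 2:])   # residual spread at large x
print(f"residual variance (low x):  {low_var:.2f}")
print(f"residual variance (high x): {high_var:.2f}")
# If the two variances are wildly different, the spread of the errors
# depends on x and the homoscedasticity assumption is in trouble.
```

If the residual spread changes dramatically across the range of x (as it does here by construction), the fitted line can still be drawn, but the usual standard errors and intervals can’t be trusted – the hammer is hitting the wrong fastener.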
Back to the Bayesian versus Frequentist minefield – it all really depends on what you are trying to do, and with what. If you are after a way to summarise uncertainty, Bayesian is a reasonable way to go. If you have sparse data but a lot of expert knowledge that you can capture within a prior distribution, then again Bayesian is a good option. However, if your data has come from a repeated sampling setting then you’ll want to use frequentist methods, as they automatically take survey design into account. If you have millions of records in your data and no prior knowledge then, depending on your data and objective, frequentist methods and machine learning methods should be considered.
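The “sparse data plus expert knowledge” case can be sketched in a few lines with a conjugate Beta-Binomial model. All the numbers here are hypothetical – the point is just to show prior belief and sparse data being combined:

```python
# Hypothetical example: an expert believes a success rate is around 30%,
# which we encode as a Beta(3, 7) prior (prior mean = 3 / (3 + 7) = 0.30).
prior_a, prior_b = 3, 7

# Sparse data: only 10 trials observed, 6 of them successes.
successes, trials = 6, 10

# Conjugacy makes the update trivial:
# posterior = Beta(prior_a + successes, prior_b + failures).
post_a = prior_a + successes
post_b = prior_b + (trials - successes)

prior_mean = prior_a / (prior_a + prior_b)  # 0.30
data_rate = successes / trials              # 0.60
post_mean = post_a / (post_a + post_b)      # 0.45

print(f"prior mean:     {prior_mean:.2f}")
print(f"observed rate:  {data_rate:.2f}")
print(f"posterior mean: {post_mean:.2f}")
# The posterior mean lands between the expert's prior belief and the
# observed rate, weighted by how much data there is relative to the prior.
```

With only ten observations the prior still pulls the estimate noticeably; as the data grows, the posterior would be dominated by the observed frequencies – which is exactly why the Bayesian option shines when data is sparse but expertise is plentiful.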
Just like a joiner’s tool-belt, you want at least the bare necessities and, over time, to augment it with more sophisticated, specialised tools. No matter what, you really need to ensure you are not trying to hammer in a nail with a screwdriver.
And if you want to get hammered here’s a great recipe for a screwdriver!
Examples of the frequentist/Bayesian debate: