"If the only tool you have is a hammer, it's hard to eat spaghetti."
David Allen - Author of Getting Things Done
The market is crowded with analytical methods and buzzwords. Here you will find the fundamental methods, simply explained.
Once you know what each tool can do, you will know which one to reach for when you come across a business problem.
Slicing and Dicing
Slicing and dicing is the foundation of data mining. To slice and dice is to segment a body of information into smaller parts, or to examine it from different viewpoints, so that you can understand it better. The goal is to iterate through as many segments and pieces of data as possible until you have the level of information you need or you have uncovered something valuable. No mathematical or scientific models are applied; it is an analytical way of thinking about, visualizing, and restructuring the data. Patience is key. Many business questions can be answered through the fundamentals of slicing and dicing.
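As a rough illustration, the sketch below totals a handful of sales records along different dimensions. The records, the field names, and the `slice_by` helper are all hypothetical, invented only to show the idea of cutting the same data several ways.

```python
from collections import defaultdict

# Hypothetical sales records; fields and values are invented for illustration.
sales = [
    {"region": "East", "product": "Widget", "quarter": "Q1", "revenue": 120},
    {"region": "East", "product": "Gadget", "quarter": "Q1", "revenue": 80},
    {"region": "West", "product": "Widget", "quarter": "Q1", "revenue": 200},
    {"region": "West", "product": "Gadget", "quarter": "Q2", "revenue": 50},
    {"region": "East", "product": "Widget", "quarter": "Q2", "revenue": 150},
]


def slice_by(records, *dimensions, measure="revenue"):
    """Total `measure` for every combination of the given dimensions."""
    totals = defaultdict(int)
    for row in records:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# Dice: the same data viewed along one dimension, then along two.
by_region = slice_by(sales, "region")
by_region_product = slice_by(sales, "region", "product")

# Slice: keep only one region, then look at it by quarter.
east_only = [row for row in sales if row["region"] == "East"]
east_by_quarter = slice_by(east_only, "quarter")
```

Each call answers a different business question from the same records, which is the whole point: you keep regrouping until a pattern worth following up appears.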
Decision Trees
Decision trees are one of the most underutilized yet powerful machine learning methods available. In a business context, one big advantage of the decision tree model is its transparency. Unlike other decision-making models, the decision tree makes all possible alternatives explicit and traces each alternative to its conclusion in a single view, allowing easy comparison among the alternatives. Using separate nodes to denote user-defined decisions, uncertainties, and the end of the process lends further clarity and transparency to the decision-making process.
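To make the three node types concrete, here is a minimal sketch of a tree with decision, chance (uncertainty), and end nodes, evaluated by expected value. The dictionary layout, the `evaluate` and `compare_options` helpers, and the payoffs and probabilities are assumptions made up for this example, not a standard format.

```python
def evaluate(node):
    """Expected value of a node, working back from the end nodes."""
    kind = node["type"]
    if kind == "end":
        # End of process: a known payoff.
        return node["value"]
    if kind == "chance":
        # Uncertainty: probability-weighted average of the outcomes.
        return sum(p * evaluate(child) for p, child in node["branches"])
    if kind == "decision":
        # User-defined decision: assume the best alternative is chosen.
        return max(evaluate(child) for child in node["options"].values())
    raise ValueError(f"unknown node type: {kind!r}")


def compare_options(decision_node):
    """Expected value of each alternative, side by side."""
    return {label: evaluate(child)
            for label, child in decision_node["options"].items()}


# Hypothetical product decision; all numbers are invented.
tree = {
    "type": "decision",
    "options": {
        "launch": {
            "type": "chance",
            "branches": [
                (0.6, {"type": "end", "value": 100}),   # strong demand
                (0.4, {"type": "end", "value": -40}),   # weak demand
            ],
        },
        "license": {"type": "end", "value": 30},
        "do nothing": {"type": "end", "value": 0},
    },
}

comparison = compare_options(tree)
best_option = max(comparison, key=comparison.get)
```

`comparison` holds every alternative traced to its conclusion, so the options can be compared in one place before `best_option` is picked.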
Regression Analysis
Regression analysis is a statistical tool for analyzing relationships between variables. It usually seeks the causal effect of one variable on another: the effect of a price increase on demand, for example, or the effect of changes in the money supply on the inflation rate. One of the oldest statistical tools used in data mining and machine learning, it is also one of the simplest and most informative. In a business context, it lets you understand the impact on your business when you account for multiple influences.
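The price-and-demand example can be sketched with ordinary least squares on a single predictor. The data points and the `fit_line` and `predict` helpers below are made up for illustration; accounting for several influences at once would need multiple regression, which this sketch leaves out for brevity.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Estimated y for a given x on the fitted line."""
    return intercept + slope * x


# Hypothetical observations: unit price and units sold.
prices = [10, 12, 14, 16, 18]
demand = [81, 75, 72, 69, 63]

slope, intercept = fit_line(prices, demand)
estimated_demand_at_15 = predict(15, slope, intercept)
```

A negative `slope` here reads as "each extra unit of price is associated with that many fewer units sold" in this toy data set.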
Logistic Regression
Like linear regression, logistic regression is a statistical method for analyzing the relationship between variables. The difference is that the output is the probability of a binary outcome (1 / 0, Yes / No, True / False) given a set of independent variables. It has become one of the most influential machine learning models as the volume of data continues to grow. Companies often use it to predict the likelihood of a customer purchasing a specific product, clicking on an ad, watching a type of movie, returning as a customer, and so on.
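The sketch below fits a one-variable logistic model with plain gradient descent, to show how a yes/no outcome turns into a probability. The data (site visits versus whether the customer purchased), the learning rate, and the number of passes are assumptions chosen for this toy example; a real project would normally use a statistics or machine learning library instead.

```python
import math


def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by gradient descent on log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


def purchase_probability(x, w, b):
    """Estimated probability of a purchase for a customer with x visits."""
    return sigmoid(w * x + b)


# Hypothetical data: number of site visits, and whether the customer bought (1) or not (0).
visits = [1, 2, 3, 4, 5, 6, 7, 8]
bought = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(visits, bought)
p_few_visits = purchase_probability(2, w, b)
p_many_visits = purchase_probability(7, w, b)
```

The output is a probability rather than a hard yes or no, so a business can rank customers by likelihood and choose its own cutoff.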
Association Rules (Market Basket Analysis)
Market Basket Analysis is one of the key techniques large retailers use to uncover associations between items. It works by looking for combinations of items that frequently occur together in transactions; in other words, it lets retailers identify relationships between the items that people buy. Association rules are widely used to analyze retail basket or transaction data, and are intended to identify strong rules in that data using measures of interestingness such as support, confidence, and lift.
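Three common measures of interestingness are support, confidence, and lift. The sketch below computes them for a single rule over a few invented baskets; the transactions and the helper names are hypothetical, and a real analysis would search many candidate rules rather than check one by hand.

```python
# Hypothetical baskets; each set is one transaction.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "jam"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


def lift(antecedent, consequent, baskets):
    """Confidence relative to how common the consequent is anyway (above 1 suggests a positive association)."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)


# Rule under test: customers who buy bread also buy butter.
rule_support = support({"bread", "butter"}, transactions)
rule_confidence = confidence({"bread"}, {"butter"}, transactions)
rule_lift = lift({"bread"}, {"butter"}, transactions)
```

A rule is usually called strong when it clears minimum thresholds on these measures; the thresholds themselves are a business choice.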