10 Kinds of Analysis That Complement Data Mining
You don’t have to be an expert in every analytic technique, of course, but a little knowledge about tools and approaches beyond data mining can prepare you well for new challenges. This list introduces you to ten such approaches.
Business analysis
Business analysis is the study of business systems and processes with the aim of improving them. Business analysis can help organizations run more efficiently, comply with the law and other standards for good practices, and avoid costly missteps. Business analysts facilitate organizational change by identifying stakeholder needs and evaluating the feasibility of alternative solutions to business problems. Many are experts in information technology and organizational structure.
As a data miner, your first encounter with a business analyst might come when your organization decides to explore data mining. The business analyst might take the lead in identifying how data mining can be applied in your organization, how to integrate data mining with information technology functions, and how to ensure that data mining does not interfere with everyday operations.
Conjoint analysis
Shoppers make choices, balancing preferences for particular features with the limitations of the products available and their shopping budget. Think about the other side of that process. If you are a product manager or marketer, to attract customers, you need information about the features they find most appealing and the prices the market will bear.
This is the role of conjoint analysis, a technique for obtaining information about consumer preferences. In conjoint analysis, data is collected from individuals who are asked to evaluate a variety of hypothetical product options. These studies can vary from simple (such as the ones that ask respondents merely to rate or rank each option) to complex (such as studies that use special adaptive software that modifies options as the interview progresses).
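The simple end of that range, where a respondent rates every option, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the attributes, levels, and ratings are made up, every combination of levels is rated exactly once, and part_worths is a hypothetical helper rather than part of any conjoint package.

```python
from itertools import product
from statistics import mean

# Hypothetical attributes and levels for an imaginary laptop study.
ATTRIBUTES = {
    "screen": ["13-inch", "15-inch"],
    "battery": ["8 hours", "12 hours"],
    "price": ["$800", "$1,000"],
}


def part_worths(ratings):
    """Estimate part-worths from ratings of every possible product profile.

    ratings maps each profile (a tuple with one level per attribute) to a
    rating. Because every combination is rated once, the design is balanced,
    so the part-worth of a level is its mean rating minus the overall mean.
    """
    grand_mean = mean(ratings.values())
    worths = {}
    for position, (attribute, levels) in enumerate(ATTRIBUTES.items()):
        for level in levels:
            level_ratings = [
                rating
                for profile, rating in ratings.items()
                if profile[position] == level
            ]
            worths[(attribute, level)] = mean(level_ratings) - grand_mean
    return worths


# All eight profiles, paired with made-up ratings from one imaginary respondent.
profiles = list(product(*ATTRIBUTES.values()))
made_up_ratings = dict(zip(profiles, [7, 5, 9, 7, 6, 4, 8, 6]))

for (attribute, level), worth in part_worths(made_up_ratings).items():
    print(f"{attribute:8} {level:9} {worth:+.2f}")
```

A positive part-worth means the level pulls ratings up and a negative one means it pulls them down. Real studies usually show each respondent only a subset of profiles and estimate the part-worths with regression or more specialized models.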
Design of experiments
If you’re a data miner, when it comes to data, you take what you can get. Your data may be collected in the course of routine business or through another preexisting channel, but that’s not always sufficient. Sometimes you need specific kinds of data, or data that fulfills certain conditions, and that’s where experiments come in.
If, like most data miners, you aren’t trained in the design of experiments or in strict statistical methods for analysis of the results, this is the time to bring in a statistician. A poor design can easily sink an experiment — by introducing error, for example, or by altering the meaning of the results so drastically that the experiment fails to say anything of consequence about your theory.
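To make the idea of a design concrete, here is a minimal sketch of one common building block: randomly assigning units to every combination of factor levels in equal numbers. The factor names, the customer labels, and the randomized_factorial helper are all hypothetical, and a statistician would still need to settle questions this sketch ignores, such as sample size and blocking.

```python
import random
from itertools import product


def randomized_factorial(units, factors, seed=0):
    """Randomly assign units to every combination of factor levels.

    Each combination receives the same number of units. The seed only makes
    the shuffle repeatable so the plan can be reproduced later.
    """
    conditions = list(product(*factors.values()))
    if len(units) % len(conditions) != 0:
        raise ValueError("units must divide evenly among the conditions")
    shuffled = list(units)
    random.Random(seed).shuffle(shuffled)
    names = list(factors)
    plan = {}
    for index, unit in enumerate(shuffled):
        # Deal the shuffled units out to the conditions in turn.
        condition = conditions[index % len(conditions)]
        plan[unit] = dict(zip(names, condition))
    return plan


# Hypothetical email experiment: two subject lines crossed with two offers.
factors = {"subject_line": ["A", "B"], "discount": ["none", "10%"]}
customers = [f"customer_{i}" for i in range(1, 13)]
plan = randomized_factorial(customers, factors)

for customer in customers:
    print(customer, plan[customer])
```

Random assignment is what lets you attribute differences in results to the factors rather than to whoever happened to land in each group.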
Marketing mix modeling
Because so many advertising options are available — TV, radio, print, online, and more — it’s not always easy to figure out which combination of media provides the best value for your needs. With this in mind, marketers use marketing mix modeling to gain an understanding of what’s working and how best to allocate their spending.
Marketing mix modeling uses statistical analysis on sales and marketing data to evaluate different marketing approaches and to optimize a company’s advertising choices.
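At its simplest, that statistical analysis is a regression of sales on spending by channel. The sketch below fits such a model with ordinary least squares using only the Python standard library. The weekly figures are made up (and built so the fit is exact), fit_linear is a hypothetical helper, and real marketing mix models typically add refinements such as carryover effects and diminishing returns that are not shown here.

```python
def fit_linear(rows, targets):
    """Fit y = b0 + b1*x1 + ... by ordinary least squares.

    Solves the normal equations (X'X) b = X'y with Gauss-Jordan elimination.
    Returns the coefficients as [intercept, b1, b2, ...].
    """
    X = [[1.0] + list(row) for row in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Made-up weekly data: (TV spend, online spend) and the sales that followed.
spend = [(10, 2), (20, 1), (15, 4), (30, 3), (25, 5), (5, 6)]
sales = [130, 145, 150, 175, 175, 140]

baseline, tv_effect, online_effect = fit_linear(spend, sales)
print(f"baseline sales: {baseline:.1f}")
print(f"sales per unit of TV spend: {tv_effect:.2f}")
print(f"sales per unit of online spend: {online_effect:.2f}")
```

Comparing the per-unit effects is what lets a marketer ask where the next dollar of spending is likely to do the most good.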
Operations research
Suppose you have 3,000 products in 12 warehouses and 800 orders to deliver by Thursday to customers in 14 states, using a mix of your own trucks and any of 22 supplemental delivery services. You must find the most cost-effective way to get everything where it needs to go, on time. For a complex problem like that, your best approach is operations research.
Operations research applies mathematical optimization, simulation, and other methods to identify ways to obtain maximum value from available resources. It’s widely used in industries that have complex logistical challenges, such as transportation and the military. It differs from data mining in that the work often starts from a mathematical model of the problem rather than from patterns discovered in data.
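A toy version of the delivery problem shows the flavor of the optimization involved. The sketch below assigns three carriers to three routes by trying every possible pairing and keeping the cheapest. The carriers, routes, and costs are invented, cheapest_assignment is a hypothetical helper, and brute-force search like this works only for tiny problems; practitioners rely on linear and integer programming solvers for anything realistic.

```python
from itertools import permutations

# Hypothetical cost of sending each carrier on each route.
COST = {
    "own_truck": {"north": 40, "south": 70, "west": 60},
    "carrier_a": {"north": 50, "south": 45, "west": 80},
    "carrier_b": {"north": 65, "south": 55, "west": 50},
}


def cheapest_assignment(cost):
    """Return (total_cost, plan) for the cheapest one-to-one assignment.

    Tries every way of pairing carriers with routes, which is feasible only
    because the example is tiny.
    """
    carriers = list(cost)
    routes = list(next(iter(cost.values())))
    best_total, best_plan = None, None
    for ordering in permutations(routes):
        total = sum(cost[c][r] for c, r in zip(carriers, ordering))
        if best_total is None or total < best_total:
            best_total, best_plan = total, dict(zip(carriers, ordering))
    return best_total, best_plan


total, plan = cheapest_assignment(COST)
print(total, plan)
```

With 3 carriers there are only 6 pairings to check, but the count grows factorially, which is why larger problems need the specialized methods of operations research.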
Reliability analysis
Here’s a little-known fact: Two completely different kinds of analysis exist, each called reliability analysis:
Engineering: In engineering, reliability analysis is exactly what the name suggests: the study of making products and their parts consistently perform as expected. It draws on mathematical modeling methods such as probabilistic risk analysis, finite element analysis, and simulation to predict how systems will function in a variety of conditions.
Psychometrics: In psychometrics, reliability analysis refers to consistency in a measurement. A measurement is said to be reliable if it produces the same result time after time. This type of reliability analysis is most often used in the development and evaluation of standardized tests.
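For the psychometric sense of the term, one widely used statistic is Cronbach’s alpha, which measures internal consistency: how closely the items on a test move together. It is only one of several reliability measures and is not the same thing as repeating a test over time. The sketch below computes it from made-up survey scores; cronbach_alpha is a hypothetical helper name.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Compute Cronbach's alpha for a table of item scores.

    scores is a list of respondents, each a list with one score per item.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_variances = [variance(column) for column in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up answers from five respondents to a three-item questionnaire.
made_up_scores = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]
print(f"alpha = {cronbach_alpha(made_up_scores):.2f}")
```

Values near 1 suggest the items are measuring the same underlying thing; values near 0 suggest they are not.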
Statistical process control
It’s commonly understood that the first step to better quality is to make your processes predictable and consistent. It’s a little like learning to cook something new. First you learn to make the recipe properly, and then you make small changes and see whether you can improve it.
Statistical process control formalizes that approach, using statistical measures developed for the purpose and special graphs called control charts. It is a longtime staple of manufacturing industries, and is coming into widespread use in healthcare. Although it’s also applicable to many service industry applications, it’s not often used there.
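One of the simplest control charts, the individuals chart, can be sketched as follows. The center line is the mean of a baseline period, and the control limits sit 2.66 average moving ranges above and below it (2.66 is the standard constant for this chart type). The measurements are made up and both function names are hypothetical.

```python
from statistics import mean


def individuals_chart_limits(baseline):
    """Return (lower limit, center line, upper limit) for an individuals chart."""
    center = mean(baseline)
    # Moving range: absolute difference between consecutive measurements.
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center, center + spread


def out_of_control(values, baseline):
    """Return the positions of values that fall outside the control limits."""
    lower, _, upper = individuals_chart_limits(baseline)
    return [i for i, value in enumerate(values) if value < lower or value > upper]


# Made-up measurements: a stable baseline period, then four new readings.
baseline = [10, 12, 11, 13, 12, 10, 11, 12, 11, 12]
new_readings = [11, 12, 16, 10]

print(individuals_chart_limits(baseline))
print(out_of_control(new_readings, baseline))
```

A point outside the limits is a signal to look for a specific cause rather than proof that something is wrong; full control chart practice also uses additional rules for runs and trends that this sketch leaves out.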
Social network analysis
Many people refer to Facebook, Pinterest, and other, similar institutions as social networks, but in fact, these are platforms — that is, communication tools designed to facilitate interaction among people. The social network is the people! So you, your best friend, and all your old school pals form a social network, a group of people connected by interaction, acquaintances, or other means.
Social network analysis, then, is the branch of mathematics that aims to understand the behavior of these interconnected groups of people.
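A basic measure in social network analysis is degree centrality: the share of other people in the network that each person is directly connected to. The sketch below computes it for a tiny made-up network of friends; the names and the degree_centrality helper are hypothetical.

```python
from collections import defaultdict


def degree_centrality(connections):
    """Return each person's number of direct contacts divided by (n - 1).

    connections is a list of (person, person) pairs; each pair is treated as
    a two-way connection.
    """
    neighbors = defaultdict(set)
    for a, b in connections:
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {person: 0.0 for person in neighbors}
    return {person: len(friends) / (n - 1) for person, friends in neighbors.items()}


# Made-up network: you, your best friend, and two old school pals.
connections = [
    ("you", "best_friend"),
    ("you", "pal_1"),
    ("you", "pal_2"),
    ("best_friend", "pal_1"),
]

for person, score in degree_centrality(connections).items():
    print(f"{person:12} {score:.2f}")
```

Here the person connected to everyone else scores 1.0, which is one simple way of identifying who sits at the center of a group.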
Structural equation modeling
Human behavior is complex, involving many elements, including some that can’t be directly measured. Consider the process that establishes a consumer’s satisfaction level with a store. Many factors come into play: the consumer’s perceived need for the store’s product, the consumer’s attitude toward the store’s atmosphere, memories of past experiences in this store and others, the weather, and so on.
If you could develop a model of that process, you could understand which factors cause consumers to be satisfied or dissatisfied and see how you might influence them to improve customer satisfaction. That’s the role of structural equation modeling (sometimes called path modeling or causal modeling).
Web analytics
Data mining and other techniques designed to explore relationships among variables enable you to discover a wealth of useful information from Internet activity data.
But you may need only some basic reports that summarize activity at a very simple level, such as tabulations of total downloads for various types of content, graphs of activity by time of day, or maybe a little bit of A/B testing (a test you can use to compare different versions of marketing materials and find out which works better). This is the common meaning of web analytics.
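An A/B test usually comes down to comparing two conversion rates. One common way to do that, sketched below, is a two-proportion z-test. The visitor counts are made up, ab_test is a hypothetical helper, and the test assumes each visitor saw only one version and that the samples are reasonably large.

```python
from math import sqrt
from statistics import NormalDist


def ab_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Two-proportion z-test for the difference between two conversion rates.

    Returns (z, p_value), where p_value is two-sided.
    """
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Pooled rate under the assumption that both versions perform the same.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up results: version A converted 120 of 2,400 visitors, version B 156 of 2,400.
z, p_value = ab_test(120, 2400, 156, 2400)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value suggests the difference between the two versions is unlikely to be due to chance alone; it says nothing about whether the difference is large enough to matter to the business.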