Variables in Customer Analytics

By Jeff Sauro

A variable is a characteristic of a product or service that varies, which can often be manipulated. For example, price, delivery time, and color are product variables. Customer variables can include gender, income, geography, new customer versus existing customer, and type of industry, to name a few.

When you look at product and customer variables, you can understand how different product attributes attract more or less sales and how different customers respond to different products and feature combinations.

There are two types of variables:

  • Dependent variables are usually the things you care about but can’t affect directly, such as profitability, customer satisfaction, and customer loyalty. You can influence dependent variables by changing the independent variables. An example of this relationship is shown in the figure.

  • Independent variables can be directly controlled or manipulated.

  • For example, independent variables include price, features, advertising, and usability.


Often, independent variables correlate with dependent variables, but the correlation doesn’t equal causation. Other variables that you’re not measuring can “mediate” or be responsible for the relationship. For example, higher sales (dependent variable) might be attributed to a new marketing campaign (independent variable) but the increase is actually just due to a growing economy that’s helped all businesses (mediator variable).

You can think of independent variables as the ingredients you use to cook a stew. The soup is the dependent variable (what you care about), but adjusting the ingredients and their combinations is what you can control.

Variables often come in the form of words instead of numbers — for example, new customer or existing customer, male or female, high income or low income. To make analysis of these qualitative values easier, you can code them into dummy variables by assigning them a number (for example, new customers get coded a 1 and existing customers get coded a 0; men get coded as 1 and women coded as 0 [or vice versa]).

With your variables coded as 1s and 0s, you can compute the percentage of customers with each variable.