How To Standardize/Normalize Variables When Creating Segments

When creating segments using Numeric Questions, in some situations it can be useful to standardize (normalize) the variables prior to doing the analysis. For example, if one question is on a 10 point scale and another is on a 5 point scale, in cluster analysis, the data on the 10 point scale will usually dominate the analysis, all else being equal.
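The effect can be illustrated with a minimal Python sketch. It assumes the cluster analysis compares respondents using squared Euclidean distance (a common choice, though not necessarily what Q uses internally), and the ratings below are hypothetical values made up for illustration.

```python
# Hypothetical ratings from five respondents (illustrative only).
ten_point = [2, 9, 5, 10, 1]   # question answered on a 1-10 scale
five_point = [1, 5, 3, 5, 1]   # question answered on a 1-5 scale


def mean_squared_difference(values):
    """Average squared difference over all pairs of respondents."""
    pairs = [(a, b) for i, a in enumerate(values) for b in values[i + 1:]]
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)


# Each question's average contribution to the squared Euclidean
# distance between two respondents.
raw_ten = mean_squared_difference(ten_point)
raw_five = mean_squared_difference(five_point)

# Share of the total distance that comes from the 10-point question.
share_ten = raw_ten / (raw_ten + raw_five)
print(f"10-point question supplies {share_ten:.0%} of the distance")
```

With these made-up numbers, most of the distance between respondents comes from the 10-point question, so it has more say in which respondents end up grouped together.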

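The two main ways of creating segments in Q are latent class analysis and cluster analysis. Standardization works differently for each, as described below.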

Standardizing data with latent class analysis

By default, Q's latent class algorithms automatically normalize data between questions. For example, if you have one question with a 5-point scale and another with a 10-point scale, the mathematics of latent class analysis implicitly treats both questions as if they were on the same scale. You can change how much influence a particular question has on the analysis using Question weights.

However, within a particular question the data is not automatically normalized, so variables with higher standard deviations will, all else being equal, be more influential. You can also get Q to standardize within questions: from within Segments, select Advanced and change the Distribution of segments to Multivariate Normal - Diagonal.

Standardizing data with cluster analysis

To standardize data prior to cluster analysis, standardize each input variable yourself and use the standardized variables in the analysis. See How to Create a Variable with Standardized Values.
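As a rough sketch of what standardizing a variable means, the Python below converts each value to a z-score (subtract the mean, divide by the standard deviation). This is not Q's own procedure, which is covered on the page referenced above. The function name `standardize`, the sample data, and the use of the population standard deviation are all assumptions made for illustration.

```python
import statistics


def standardize(values):
    """Return z-scores: (value - mean) / standard deviation."""
    mean = statistics.fmean(values)
    # Population SD is an assumption; the sample SD is a common alternative.
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardize a variable with no variation")
    return [(v - mean) / sd for v in values]


# Hypothetical ratings on a 10-point and a 5-point scale.
ten_point = [2, 9, 5, 10, 1]
five_point = [1, 5, 3, 5, 1]

z_ten = standardize(ten_point)
z_five = standardize(five_point)
```

After this transformation both variables have a mean of 0 and a standard deviation of 1, so neither dominates the analysis simply because of the scale it was measured on.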

Further reading: Market Segmentation Software