How to Create a Variable with Standardized Values

Sometimes it is necessary to transform the values of a variable so that the mean of the data is 0 and the standard deviation is 1. Such a variable is said to be normalized or standardized. This can be achieved by creating a JavaScript Variable that performs the transformation, using the method described below.

Note that if you want to standardize the data from several variables by case, you should instead use the QScript Create New Variables - Scale Variable(s) - Standardize Within Case.
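To illustrate the difference between the two approaches, here is a minimal sketch in plain JavaScript that runs outside Q. The data, the zScores helper, and the use of the sample standard deviation are all assumptions made for this illustration. It is not the implementation used by the QScript.

```javascript
// Illustration only: hypothetical data and helper, not the QScript's implementation.
// Each inner array is one case (respondent); each position is one variable.
var cases = [
    [1, 2, 3],
    [2, 4, 9]
];

// Standardize an array using the sample standard deviation (an assumption for this sketch)
function zScores(values) {
    var n = values.length;
    var mean = values.reduce(function (a, b) { return a + b; }, 0) / n;
    var sumSq = values.reduce(function (a, v) { return a + (v - mean) * (v - mean); }, 0);
    var sd = Math.sqrt(sumSq / (n - 1));
    return values.map(function (v) { return (v - mean) / sd; });
}

// Within case: each respondent's own values are rescaled across the variables
var withinCase = cases.map(zScores);

// By variable (the method on this page): one variable is rescaled across all respondents
var firstVariable = cases.map(function (row) { return row[0]; });
var byVariable = zScores(firstVariable);
```

In the within-case result, every row has a mean of 0 and a standard deviation of 1. In the by-variable result, it is the column of values for a single variable that has those properties.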

JavaScript Variable

To make a variable with standardized values, you can use the following method:

  1. Go to the Variables and Questions tab.
  2. Right-click and select Insert Variables > JavaScript Formula > Numeric.
  3. Select the option Access all data rows (advanced).
  4. Copy and paste the code below into the Expression.
  5. Change the target variable Q9_A_5 to be the Variable Name of the variable that you want to standardize.
  6. Set an appropriate Label and click OK.

// Get the values from the target variable
var _starting_vals = Q9_A_5;

// Remove any missing values (these appear as NaN)
var _vals = _starting_vals.filter(function (_v) { return !isNaN(_v); });

// Get mean and standard deviation
var _sd = jStat.stdev(_vals, true);
var _mean = jStat.mean(_vals);

// Generate the standardized value for each of the original values and return the results
var _results = _starting_vals.map(function (_v) { return (_v - _mean) / _sd; } );
_results

Notes:

  1. The formula calculates the mean and standard deviation of the values in the variable, and then produces a new set of values by subtracting the mean from each value and dividing by the standard deviation.
  2. The formula above uses the Sample Standard Deviation.
  3. If you wish to use the Population Standard Deviation instead, change the line var _sd = jStat.stdev(_vals, true); to var _sd = jStat.stdev(_vals, false);.
  4. The Access all data rows (advanced) option used above allows the formula to look at all of the respondent data for this variable instead of processing each value individually.
  5. If you want to repeat this process for many variables, then create the first one as above, and then apply Use as Template for Replication.
  6. If the values of the source variable change, for example by recoding or adding new data to the project, then the new standardized variable will update automatically.
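
To check the arithmetic described in notes 1 to 3, the calculation can be sketched in plain JavaScript without jStat. This version runs outside Q (for example in Node.js). The function name standardize, its useSample argument, and the sample data are hypothetical and used only for illustration.

```javascript
// Standalone sketch of the calculation; runs in plain JavaScript, not inside Q.
// standardize() and its arguments are hypothetical names used only for this illustration.
function standardize(values, useSample) {
    // Drop missing values (NaN) before computing the summary statistics
    var valid = values.filter(function (v) { return !isNaN(v); });
    var n = valid.length;
    var mean = valid.reduce(function (a, b) { return a + b; }, 0) / n;
    var sumSq = valid.reduce(function (a, v) { return a + (v - mean) * (v - mean); }, 0);
    // Sample SD divides by n - 1; population SD divides by n
    var sd = Math.sqrt(sumSq / (useSample ? n - 1 : n));
    // Missing inputs stay missing because (NaN - mean) / sd is NaN
    return values.map(function (v) { return (v - mean) / sd; });
}

var raw = [2, 4, NaN, 6];
var sampleZ = standardize(raw, true);       // [-1, 0, NaN, 1]
var populationZ = standardize(raw, false);  // first and last values are about -1.2247 and 1.2247
```

The valid values 2, 4 and 6 have a mean of 4. The sample standard deviation is 2, while the population standard deviation is about 1.633, so the population version produces standardized values that are slightly further from 0.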