How To Create a Unique ID from an ID Variable That Contains Duplicates

From Q

This page explains how to use JavaScript to create a unique ID variable from an ID variable that contains repeated values. A data file may lack a variable that identifies each respondent uniquely because some respondents have data in more than one row of the file, yet a variable that identifies each case uniquely is still needed (for example, to set the Case IDs in the Data tab). The approach here is to create a new variable in which each occurrence of an ID value after the first has an underscore and a number appended. For example, if the ID 100355 appears twice, the new variable contains 100355 and 100355_1 in the corresponding rows; a third occurrence would become 100355_2. This process requires you to make a new copy of your SPSS data file.

  1. Select File > New Project.
  2. Select File > Data Sets > Add to Project > From File.
  3. In the Data Import Window:
    1. Select Use original data file structure.
    2. Untick Tidy Up Variable Labels and Strip HTML from Labels.
    3. Click OK.
  4. Go to the Variables and Questions tab.
  5. Right-click the ID variable and select Insert Variables > JavaScript Formula > Text.
  6. Tick the box Access all data rows (advanced).
  7. Paste the code below into the Expression.
  8. Change CustomerID in the first line to the Variable Name of the ID variable you are using.
  9. Change the Name to NewID and Label to New ID, and click OK.
  10. Use Tools > Save as SPSS/CSV File and save a new SPSS file. This saves the data with the new variable included.
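var _old_ids = CustomerID;
// Convert every ID to text so that a suffix can be appended.
var _new_ids = _old_ids.map(function (x) { return x.toString(); });
_new_ids.forEach(function (_i, _ind) {
    // An earlier row already holds this ID, so this row is a repeat.
    if (_new_ids.indexOf(_i) < _ind) {
        var _counter = 1;
        // Increase the counter while the suffixed ID is already used by an earlier row.
        while(_new_ids.indexOf(_i + "_" + _counter.toString()) < _ind && _new_ids.indexOf(_i + "_" + _counter.toString()) > -1)
            _counter ++;
        _new_ids[_ind] = _i + "_" + _counter;
    }
});
// The array of new IDs, one per row, is the result of the expression.
_new_ids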
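To see what the expression produces before applying it to a real data file, the same logic can be tried outside Q. The following is a minimal sketch, assuming a stand-alone JavaScript environment such as Node.js; the function name makeUniqueIds and the sample ID values are hypothetical and are not part of Q.

```javascript
// Stand-alone sketch of the expression, wrapped in a function so that it can
// be run outside Q. makeUniqueIds is a hypothetical name used only here.
function makeUniqueIds(oldIds) {
    // Convert every ID to text so that a suffix can be appended.
    var newIds = oldIds.map(function (x) { return x.toString(); });
    newIds.forEach(function (id, ind) {
        // indexOf returns the first position of id. If that position is an
        // earlier row, this row is a repeat and needs a suffix.
        if (newIds.indexOf(id) < ind) {
            var counter = 1;
            // Skip suffixes that are already used by an earlier row.
            while (newIds.indexOf(id + "_" + counter) > -1 &&
                   newIds.indexOf(id + "_" + counter) < ind)
                counter++;
            newIds[ind] = id + "_" + counter;
        }
    });
    return newIds;
}

// Made-up sample data: 100355 appears three times and 100410 twice.
var sample = [100355, 100410, 100355, 100999, 100355, 100410];
var result = makeUniqueIds(sample);
console.log(result);

// Check that every value in the result is distinct.
var allUnique = new Set(result).size === result.length;
console.log(allUnique);
```

The first occurrence of each ID is left unchanged, so rows that were already unique keep their original value. Because indexOf is called for every row, the work grows roughly with the square of the number of rows, which may be slow on very large files.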
