How to Randomly Select a Sub-Sample

From Q
  • Create a new JavaScript Variable.
  • In the dialogue that appears, select Access all data rows (advanced) and paste the code below into the Expression field.
  • In the code, on the first line, change the value assigned to required_sample_size to your required sample size.
  • Click OK to create the variable.
  • On the Data tab, ensure that your unique ID variable has been selected in the Case IDs drop-down in the top-left.
  • On the Variables and Questions tab, right-click on the new variable and select Copy and Paste Variable(s) > As Values.... This locks in the cases selected by the random sampling formula, so the sample does not change if the JavaScript variable is recalculated.
  • Hide the original JavaScript variable by selecting the yellow H in the Tags column.
  • Use the new variable as a Filter by selecting the yellow F in the Tags column.
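var required_sample_size = 200;

// Generate one random number per case, keeping a copy in the original case order
var rnd = new Array(N);
var orig_rnd = new Array(N);
for (var i = 0; i < N; i++) {
    var r = Math.random();
    rnd[i] = r;
    orig_rnd[i] = r;
}
// Find the cut-off: sort numerically (the default sort compares values as strings)
rnd.sort(function (a, b) { return a - b; });
// If the required sample size is N or more, select every case
var cutoff = required_sample_size < N ? rnd[required_sample_size] : Infinity;
// Create the filter variable: true for cases whose random number is below the cut-off
var result = new Array(N);
for (var i = 0; i < N; i++)
    result[i] = orig_rnd[i] < cutoff;
result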

Note that this code samples without replacement: each case can be selected at most once.

In the above code, N is a reserved variable which gives the number of cases in the data file.
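
To try the cut-off approach outside Q, it can be wrapped in a function. The following is only a sketch, not part of Q: the names randomSubsampleFilter and countSelected are hypothetical, and the parameter n stands in for the reserved variable N, which is not defined outside Q. It assumes no two random numbers are exactly equal, which is extremely unlikely in practice.

```javascript
// Hypothetical helper (not part of Q): returns an array of n true/false values
// with exactly k values set to true, chosen at random without replacement.
// n stands in for Q's reserved variable N.
function randomSubsampleFilter(n, k) {
    var i;
    var filter = new Array(n);
    if (k >= n) {
        // Asking for at least as many cases as exist: select every case.
        for (i = 0; i < n; i++) filter[i] = true;
        return filter;
    }
    // One random number per case, in the original case order.
    var draws = new Array(n);
    for (i = 0; i < n; i++) draws[i] = Math.random();
    // Sort a copy numerically; the default sort compares values as strings.
    var sorted = draws.slice().sort(function (a, b) { return a - b; });
    // The (k+1)-th smallest value is the cut-off, so exactly k values lie below it.
    var cutoff = sorted[k];
    for (i = 0; i < n; i++) filter[i] = draws[i] < cutoff;
    return filter;
}

// Hypothetical helper: counts how many cases a filter selects.
function countSelected(filter) {
    var count = 0;
    for (var i = 0; i < filter.length; i++) {
        if (filter[i]) count++;
    }
    return count;
}
```

Calling countSelected(randomSubsampleFilter(1000, 200)) should give 200 on every run, although which cases are selected changes each time. This is why the steps above paste the variable as values before using it as a filter.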

See Also