Data - Data Set - Stack

This page is currently under construction, or it refers to features which are under development and not yet available for use.

Stacks a data set stored in the Displayr cloud drive. To use this feature, first upload the data set to be stacked to the cloud drive (accessed via the user icon button > Displayr cloud drive). The stacked data set is also written to the cloud drive.

Stacking is often easiest to specify with common labels, which are words in variable labels that identify which variables to stack together. It is often possible to stack an entire data set with one or more sets of common labels. Common labels can be manually specified, automatically deduced from the input data set, or deduced from a specified set of variables.
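As a rough illustration of the idea, the sketch below groups variables using one set of common labels. The variable names, the labels and the groupByCommonLabels helper are hypothetical, and the matching rule (a simple substring match, with the rest of the label identifying the stacked variable) is an assumption made for illustration; the matching actually performed when stacking may differ.

```javascript
// Hypothetical variable names and labels from a wide data set.
const variableLabels = {
  Q1_A: "Awareness: Coke",
  Q1_B: "Awareness: Pepsi",
  Q2_A: "Preference: Coke",
  Q2_B: "Preference: Pepsi",
  Age: "Age",
};

// Group variables using one set of common labels (assumed matching rule:
// the label must contain a common label; the remainder of the label is
// used as the key that identifies the stacked variable).
function groupByCommonLabels(labels, commonLabels) {
  const groups = {};
  for (const [name, label] of Object.entries(labels)) {
    const common = commonLabels.find((c) => label.includes(c));
    if (common === undefined) continue; // not stacked by this set
    const key = label.replace(common, "").trim();
    if (!(key in groups)) groups[key] = {};
    groups[key][common] = name;
  }
  return groups;
}

const groups = groupByCommonLabels(variableLabels, ["Coke", "Pepsi"]);
console.log(groups);
```

In this sketch Q1_A and Q1_B end up in one group and Q2_A and Q2_B in another, while Age is left unstacked because its label contains none of the common labels.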

Variables that cannot be stacked using common labels can be stacked manually, either by specifying the names of the variables that make up each stacked variable, or the names of the variables in each stacking observation. A set of consecutive variables can be specified as a range consisting of the names of the first and last variables separated by a dash (-). For example, variables Q1_A, Q1_B, Q1_C, Q1_D can be specified as Q1_A-Q1_D. A set of variables with a common prefix and/or suffix can be specified using a wildcard character (*). For example, variables Q1_A, Q1_B, Q1_C, Q1_D can be specified as Q1_*.
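The range and wildcard notation can be pictured with the following minimal sketch. The expandVariableSpec helper and the list of variable names are hypothetical; it assumes that variable names contain no dashes and that ranges follow the order of variables in the data set. It is not the parsing code used by the feature itself.

```javascript
// Expand a comma-separated specification against the ordered list of
// variable names in the data set (hypothetical helper for illustration).
function expandVariableSpec(spec, allNames) {
  const result = [];
  const parts = spec.split(",").map((s) => s.trim()).filter((s) => s !== "");
  for (const part of parts) {
    if (part.includes("*")) {
      // Wildcard: escape regex metacharacters, then let * match anything.
      const pattern = part
        .split("*")
        .map((piece) => piece.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
        .join(".*");
      const regex = new RegExp("^" + pattern + "$");
      result.push(...allNames.filter((name) => regex.test(name)));
    } else if (part.includes("-")) {
      // Range: every variable from the first name to the last, in file order.
      const [first, last] = part.split("-").map((s) => s.trim());
      const start = allNames.indexOf(first);
      const end = allNames.indexOf(last);
      if (start === -1 || end === -1 || start > end)
        throw new Error("Invalid range: " + part);
      result.push(...allNames.slice(start, end + 1));
    } else {
      result.push(part); // a single variable name
    }
  }
  return result;
}

const names = ["ID", "Q1_A", "Q1_B", "Q1_C", "Q1_D", "Q2_A"];
console.log(expandVariableSpec("Q1_A-Q1_D", names));
console.log(expandVariableSpec("Q1_*", names));
console.log(expandVariableSpec("Q1_A, Q1_C", names));
```

With these hypothetical names, the range and the wildcard both select Q1_A, Q1_B, Q1_C and Q1_D.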

Once the stacked data file has been created in the cloud drive, it can be added to a document via Data Sets > Add > Displayr Cloud Drive, and analyses can be performed as with any other data file in Displayr.

Example

The output below shows the variables of a data set that have been stacked using common labels (blue) and stacked manually (pink):

Options

Input data set The name of the SPSS .sav data file in the Displayr cloud drive that is to be stacked.

Stacked data set The name of the stacked SPSS data file to be saved to the Displayr cloud drive. This is optional; if no name is supplied, one is generated from the input data file name.

Stack with common labels A choice between Automatically, Using a set of variables to stack as reference, Using manually input common labels and Disabled. If Automatically is chosen, a set of common labels is automatically chosen based on the variable labels in the input data set and variables with these common labels are stacked together. For Using a set of variables to stack as reference, see option Reference variables to stack below. For Using manually input common labels, see option Common label below. If Disabled is chosen, no stacking is performed using common labels.

Reference variables to stack These are text input controls shown when Using a set of variables to stack as reference is selected for Stack with common labels. Each text input should contain the comma-separated names of the reference variables to be used to determine a set of common labels which are used for stacking. Variable ranges can be specified with a dash (-) between variable names and wildcards may be specified in names using an asterisk (*). Multiple sets of reference variables can be specified for multiple sets of common labels.

Common label These are text input controls shown when Using manually input common labels is selected for Stack with common labels. These should contain the common labels to be used for stacking. Multiple sets of common labels can be specified.

Manually specify stacking by A choice between Variable (see Manually stacked variable below) and Observation (see Manual stacking observation below). Depending on the variables to be stacked, one method can be much easier to use than the other.

Manually stacked variable These are text input controls shown when Variable is selected for Manually specify stacking by. Each text input should contain the comma-separated names of the variables to be stacked together into one variable. Variable ranges can be specified with a dash (-) between variable names and wildcards may be specified in names using an asterisk (*).

Manual stacking observation These are text input controls shown when Observation is selected for Manually specify stacking by. Each text input should contain the comma-separated names of the variables to be stacked in an observation. Variable ranges can be specified with a dash (-) between variable names and wildcards may be specified in names using an asterisk (*).
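The two ways of specifying manual stacking describe the same layout from different directions, as the sketch below suggests. The variable names and the transposeSpec helper are hypothetical, and the sketch assumes every stacked variable has the same number of input variables.

```javascript
// Convert a by-variable specification into the equivalent by-observation
// one (hypothetical helper; assumes all groups have the same length).
function transposeSpec(groups) {
  const n = groups[0].length;
  if (!groups.every((g) => g.length === n))
    throw new Error("All groups must have the same length");
  return groups[0].map((_, k) => groups.map((g) => g[k]));
}

// By variable: each entry lists the variables stacked into one variable.
const byVariable = [
  ["Q1_A", "Q1_B", "Q1_C"],
  ["Q2_A", "Q2_B", "Q2_C"],
];

// By observation: each entry lists the variables in one stacking observation.
const byObservation = transposeSpec(byVariable);
console.log(byObservation);
```

Here two by-variable entries (Q1_A, Q1_B, Q1_C and Q2_A, Q2_B, Q2_C) correspond to three by-observation entries (Q1_A, Q2_A; Q1_B, Q2_B; Q1_C, Q2_C), so specifying by observation is shorter when there are many stacked variables and few observations, and vice versa.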

Non-stacked variables to include These are text input controls which should contain the names of the non-stacked variables to be included in the final output (they would otherwise be excluded). Variable ranges can be specified with a dash (-) between variable names and wildcards may be specified in names using an asterisk (*).

Include original case variable in stacked data set Whether to include a variable containing the original case numbers.

Include observation variable in stacked data set Whether to include a variable containing the observation numbers.
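To show what the original case and observation variables record, the sketch below stacks a small in-memory table. The stackRows helper, the data and the output variable names original_case and observation are all hypothetical; the names used in the actual stacked data set may differ.

```javascript
// Stack wide rows into long rows (hypothetical helper for illustration).
// stackSpec maps each stacked variable name to its input variables, one per
// observation; keep lists the non-stacked variables to carry over.
function stackRows(rows, stackSpec, keep = []) {
  const nObs = Math.max(...Object.values(stackSpec).map((v) => v.length));
  const out = [];
  rows.forEach((row, caseIndex) => {
    for (let obs = 0; obs < nObs; obs++) {
      // Record where each stacked row came from.
      const stackedRow = { original_case: caseIndex + 1, observation: obs + 1 };
      for (const name of keep) stackedRow[name] = row[name];
      for (const [stackedName, wideNames] of Object.entries(stackSpec)) {
        const wideName = wideNames[obs];
        stackedRow[stackedName] = wideName === undefined ? null : row[wideName];
      }
      out.push(stackedRow);
    }
  });
  return out;
}

const wide = [
  { ID: 101, Q1_A: 5, Q1_B: 3 },
  { ID: 102, Q1_A: 4, Q1_B: 2 },
];
const stacked = stackRows(wide, { Q1: ["Q1_A", "Q1_B"] }, ["ID"]);
console.log(stacked);
```

Each input case produces one row per observation, so the two hypothetical cases above become four stacked rows, and ID is repeated on each row because it is listed as a non-stacked variable to include.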

Automatic updating Whether to automatically update the stacked data set. This is used when the input data set is regularly updated.

Update period The time unit for regular updates. Shown when Automatic updating is selected.

Frequency The multiple of the Update period for regular updating. Shown when Automatic updating is selected.

Start date and time The date and time of the first update in the format dd-mm-yyyy hh:mm or mm-dd-yyyy hh:mm. Shown when Automatic updating is selected.

US date format Whether the Start date and time is expressed in US format, i.e. mm-dd-yyyy hh:mm. Shown when Automatic updating is selected and a Start date and time has been entered.

Time zone An optional time zone for the Start date and time. If left blank, UTC is used. The format must be Continent/City, e.g. America/Los_Angeles. See Wikipedia for a list of time zones. Shown when Automatic updating is selected and a Start date and time has been entered.
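The effect of the US date format option can be sketched as follows. The parseStart helper is hypothetical and interprets the text in UTC only (the default when no time zone is given); it is not the parser used when scheduling updates.

```javascript
// Parse "dd-mm-yyyy hh:mm" (or "mm-dd-yyyy hh:mm" when usFormat is true),
// with optional seconds, as a UTC date. Hypothetical helper for illustration.
function parseStart(text, usFormat = false) {
  const m = /^(\d{1,2})-(\d{1,2})-(\d{4}) (\d{1,2}):(\d{2})(?::(\d{2}))?$/.exec(text.trim());
  if (!m) throw new Error("Unrecognized date and time: " + text);
  const [a, b, year, hour, minute] = m.slice(1, 6).map(Number);
  const seconds = m[6] ? Number(m[6]) : 0;
  // The first two fields swap meaning depending on the format.
  const day = usFormat ? b : a;
  const month = usFormat ? a : b;
  if (month < 1 || month > 12 || day < 1 || day > 31)
    throw new Error("Day or month out of range: " + text);
  return new Date(Date.UTC(year, month - 1, day, hour, minute, seconds));
}

const international = parseStart("31-12-2018 18:00");
const us = parseStart("12-31-2018 18:00", true);
console.log(international.toISOString(), us.toISOString());
```

Both calls describe the same moment; without the US format flag the second string would be rejected in this sketch, because 31 is not a valid month.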

Code

form.setObjectInspectorTitle("Stack Data Set", "Stack Data Set");

form.group({label: "Data Sets", expanded: true});
var data_set_name = form.textBox({name: "formDataSet",
              label: "Input data set",
              required: true,
              prompt: "Name of data set in Displayr Cloud Drive"}).getValue();

form.textBox({name: "formStackedDataSet",
              label: "Stacked data set",
              required: false,
              prompt: "Name of stacked data set"});

form.group({label: "Stacking with common labels", expanded: true});
var method = form.comboBox({name: "formCommonLabelMethod",
                          label: "Stack with common labels", 
                          alternatives: ["Automatically",
                                         "Using a set of variables to stack as reference",
                                         "Using manually input common labels",
                                         "Disabled"],
                          default_value: "Automatically"}).getValue();

if (method == "Using a set of variables to stack as reference")
{
    var i = 1;
    while (true)
    {
        let ref_vars = form.textBox({name: "formCommonLabelRefVars" + i,
                                         label: "Reference variables to stack (set " + i + ")",
                                         required: false,
                                         prompt: "E.g., Q1_A-Q1_C or Q1_* or Q1_A, Q1_B, Q1_C"}).getValue();
        if (ref_vars == "")
            break;
        i++;
    }
}
else if (method == "Using manually input common labels")
{
    var i = 1;
    while (true)
    {
        var j = 1;
        while (true)
        {
            let common_label = form.textBox({name: "formCommonLabel" + i + "Set" + j,
                                             label: "Common label " + j + " (of set " + i + ")",
                                             required: false,
                                             prompt: "Common label"}).getValue();
            if (common_label == "")
                break;
            j++;
        }
        if (j == 1)
            break;
        i++;
    }
}

form.group({label: "Manual stacking", expanded: true});
var type = form.comboBox({name: "formManualType",
                          label: "Manually specify stacking by", 
                          alternatives: ["Variable",
                                         "Observation"],
                          default_value: "Variable"}).getValue();

i = 1;
while (true)
{
    let lbl = type == "Variable" ?
                       "Manually stacked variable " + i:
                       "Manual stacking observation " + i;
    let manual_label = form.textBox({name: "formManual" + i,
                                      label: lbl,
                                      prompt: "E.g., Q1_A-Q1_C or Q1_* or Q1_A, Q1_B, Q1_C",
                                      required: false}).getValue();
    i++;
    if (manual_label == "")
        break;
}

form.group({label: "Non-stacked Variables", expanded: true});
i = 1;
while (true)
{
    let included = form.textBox({name: "formInclude" + i,
                                      label: "Non-stacked variables to include",
                                      prompt: "E.g., Q1_A-Q1_C or Q1_* or Q1_A, Q1_B, Q1_C",
                                      required: false}).getValue();
    i++;
    if (included == "")
        break;
}

form.checkBox({label: "Include original case variable in stacked data set",
               name: "formOriginalCaseVar", default_value: true});
form.checkBox({label: "Include observation variable in stacked data set",
               name: "formObservationVar", default_value: true});

// Controls for regular updating
form.group({label: "Automatic Updating", expanded: true});
var updating = form.checkBox({label:"Automatic updating", name:"formUpdating", default_value:false, prompt:"Regularly update the output"}).getValue();
if (updating) {
    var period = form.comboBox({name: "formUpdatePeriod", label: "Update period", 
               alternatives: ["Months", "Weeks", "Days", "Hours", "Minutes", "Seconds"], default_value: "Days", prompt: "The time units for updating"}).getValue();
    var defaultFrequency = 1;
    if (period == "Seconds")
        defaultFrequency = 600;
    else if (period == "Minutes")
        defaultFrequency = 10;
    form.numericUpDown({name: "formFrequency", label: "Frequency", default_value: defaultFrequency,
                        prompt: "The update frequency in units of the update period", increment: 1, minimum: defaultFrequency, maximum: 1000000});
    // A duplicate prompt key was removed here; only the last one took effect.
    var start = form.textBox({name: "formStart", label: "Start date and time",
              required: false, prompt: "Default now, or e.g. 31-12-2018 18:00:00"}).getValue();
    if (start != "") {
        form.checkBox({label:"US date format", name:"formUSDate", default_value:false, prompt: "Specify update start date as mm-dd-yyyy"});
        form.textBox({name: "formTimeZone", label: "Time zone", 
                  required: false, prompt: "Leave blank for UTC or enter e.g. America/New_York"});    
    }
}

# R code: read the form inputs defined above and stack the data set
library(flipData)
library(flipTime)

vars <- ls()
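manual.common.labels <- NULL
# Initialize so that the StackData call works when no reference variables are used
reference.variables.to.stack <- NULL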

if (formCommonLabelMethod == "Using a set of variables to stack as reference") {
    n.sets.ref.vars <- sum(grepl("^formCommonLabelRefVars[[:digit:]]+", vars))
    reference.variables.to.stack <- vapply(seq_len(n.sets.ref.vars - 1),
                                           function(i) get0(paste0("formCommonLabelRefVars", i)),
                                           character(1))
} else if (formCommonLabelMethod == "Using manually input common labels") {
    manual.common.labels <- list()
    i <- 1
    repeat {
        n.lbls <- sum(grepl(paste0("^formCommonLabel", i, "Set[[:digit:]]+"), vars))
        if (n.lbls <= 1) # only the trailing empty control (or none) remains
            break
        manual.common.labels[[i]] <- vapply(seq_len(n.lbls - 1),
                                            function(j) get0(paste0("formCommonLabel", i, "Set", j)),
                                            character(1))
        i <- i + 1
    }
}

n.manual <- max(as.numeric(substr(vars[grepl("^formManual[[:digit:]]+", vars)], 
                                nchar("formManual") + 1,
                                nchar("formManual") + 5)))
manual.stacking <- vapply(seq_len(n.manual), function(i) get0(paste0("formManual", i)),
                    character(1))
manual.stacking <- manual.stacking[-length(manual.stacking)]

n.include <- max(as.numeric(substr(vars[grepl("^formInclude[[:digit:]]+", vars)], 
                                nchar("formInclude") + 1,
                                nchar("formInclude") + 5)))
variables.to.include <- vapply(seq_len(n.include), function(i) get0(paste0("formInclude", i)),
                    character(1))
variables.to.include <- variables.to.include[-length(variables.to.include)]

# Create regular updating message
if (formUpdating) {
    if (formStart != "") {
        if (formTimeZone == "") formTimeZone <- "UTC"
        UpdateAt(formStart, us.format = formUSDate, time.zone = formTimeZone,
                units = tolower(formUpdatePeriod), frequency = formFrequency, options = "wakeup")
    } else
        UpdateEvery(formFrequency, units = tolower(formUpdatePeriod), options = "wakeup")
}

stacked <- StackData(formDataSet,
                     stacked.data.set.name = formStackedDataSet,
                     stack.with.common.labels = formCommonLabelMethod,
                     manual.common.labels = manual.common.labels,
                     reference.variables.to.stack = reference.variables.to.stack,
                     specify.by = formManualType,
                     manual.stacking = manual.stacking,
                     variables.to.include = variables.to.include)