Correlation - Correlation Matrix


Creates a correlation matrix from variables, questions, variable sets, or a table. See What is Correlation? and What is a Correlation Matrix? for more information about correlation and correlation matrices.

Example

Output example: the correlation matrix is displayed as a triangular heatmap.


Options

Input type The type of input to use. A choice between Variables, Questions/Variable sets and Table.

Variables The variables to use in the correlation matrix.

Questions/Variable sets The questions (known as variable sets in Displayr) to use in the correlation matrix.

Table The table to use in the correlation matrix. Correlations are calculated between columns in the table.

Ignore For question inputs, these are the question categories to ignore; for table inputs, these are the columns and rows in the source table to ignore.
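As a rough illustration of how a comma-separated Ignore list could be applied, the sketch below splits the text into labels and drops any matching columns. The helper names parseIgnoreList and dropIgnored are hypothetical, and exact, case-sensitive matching is an assumption; this page does not say how the actual R code matches labels.

```javascript
// Hypothetical helper: split the Ignore text on commas and trim whitespace.
function parseIgnoreList(text) {
    return text.split(",").map(s => s.trim()).filter(s => s.length > 0);
}

// Hypothetical helper: keep only labels that are not in the Ignore list.
// Exact, case-sensitive matching is an assumption made for this sketch.
function dropIgnored(labels, ignoreText) {
    const ignored = new Set(parseIgnoreList(ignoreText));
    return labels.filter(label => !ignored.has(label));
}

// Made-up column labels; "NET, Total, SUM" is the default Ignore value in the form.
const kept = dropIgnored(["Age", "Income", "NET"], "NET, Total, SUM");
console.log(kept); // [ 'Age', 'Income' ]
```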

Missing data See Missing Data Options.
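The form offers three choices here: Error if missing data, Exclude cases with missing data, and Use partial data. The sketch below contrasts the last two on made-up data. It assumes that Use partial data means each pair of variables uses every case where both values are observed (pairwise deletion); see Missing Data Options for the authoritative description. The function names are hypothetical.

```javascript
// null marks a missing value. Each row is a case, each column a variable.
const cases = [
    [1, 2, 3],
    [2, null, 1],
    [3, 5, null],
    [4, 6, 2],
];

// "Exclude cases with missing data": keep only fully observed cases.
function completeCases(rows) {
    return rows.filter(row => row.every(v => v !== null));
}

// "Use partial data" (assumed to be pairwise): for variables a and b,
// keep every case in which both values are observed.
function pairwiseCases(rows, a, b) {
    return rows
        .filter(row => row[a] !== null && row[b] !== null)
        .map(row => [row[a], row[b]]);
}

console.log(completeCases(cases).length);       // 2 cases usable for every pair
console.log(pairwiseCases(cases, 0, 1).length); // 3 cases usable for variables 0 and 1
```

Under this assumption, different cells of the matrix can be based on different numbers of cases.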

Correlation type Choose between the standard Pearson's correlation or Spearman's correlation.
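To make the difference between the two measures concrete, here is a minimal unweighted sketch: Spearman's correlation is Pearson's correlation computed on ranks. This is illustrative only. The output on this page is produced in R and supports filters and weights, which the sketch ignores, and the function names are hypothetical.

```javascript
// Pearson's correlation between two equal-length numeric arrays.
function pearson(x, y) {
    const n = x.length;
    const meanX = x.reduce((a, b) => a + b, 0) / n;
    const meanY = y.reduce((a, b) => a + b, 0) / n;
    let sxy = 0, sxx = 0, syy = 0;
    for (let i = 0; i < n; i++) {
        const dx = x[i] - meanX;
        const dy = y[i] - meanY;
        sxy += dx * dy;
        sxx += dx * dx;
        syy += dy * dy;
    }
    return sxy / Math.sqrt(sxx * syy);
}

// 1-based ranks; tied values share the average of the positions they occupy.
function averageRanks(values) {
    const order = values.map((v, i) => i).sort((a, b) => values[a] - values[b]);
    const ranks = new Array(values.length);
    let start = 0;
    while (start < order.length) {
        let end = start;
        while (end + 1 < order.length && values[order[end + 1]] === values[order[start]])
            end++;
        const rank = (start + end) / 2 + 1;
        for (let k = start; k <= end; k++)
            ranks[order[k]] = rank;
        start = end + 1;
    }
    return ranks;
}

// Spearman's correlation: Pearson's correlation of the ranks.
function spearman(x, y) {
    return pearson(averageRanks(x), averageRanks(y));
}

const x = [1, 2, 3, 4, 5];
const y = [1, 4, 9, 16, 25]; // increases with x, but not linearly
console.log(pearson(x, y));  // roughly 0.98
console.log(spearman(x, y)); // 1, because the rank orders agree exactly
```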

Categorical as binary Represent unordered categorical variables as binary variables. Otherwise, they are represented as sequential integers (i.e., 1 for the first category, 2 for the second, etc.). Note that categorical variables in Number - Multi questions are treated according to their numeric values and not converted to binary.
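The two codings can be sketched as follows on made-up data. The function names are hypothetical, and the sketch creates one 0/1 column per category; whether the actual R code keeps every category or drops one is not stated on this page.

```javascript
// Sequential-integer coding: 1 for the first category, 2 for the second, etc.
function asSequentialIntegers(values, categories) {
    return values.map(v => categories.indexOf(v) + 1);
}

// Binary (dummy) coding: one 0/1 column per category.
function asBinary(values, categories) {
    const columns = {};
    for (const category of categories)
        columns[category] = values.map(v => (v === category ? 1 : 0));
    return columns;
}

// Made-up categories and responses, for illustration only.
const categories = ["Coke", "Pepsi", "Fanta"];
const responses = ["Pepsi", "Coke", "Fanta", "Pepsi"];

console.log(asSequentialIntegers(responses, categories)); // [ 2, 1, 3, 2 ]
console.log(asBinary(responses, categories).Pepsi);       // [ 1, 0, 0, 1 ]
```

With sequential integers, the correlation depends on the arbitrary order of the categories, which is why binary coding is usually preferable for unordered variables.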

Variable names Displays variable names instead of labels in the output. Only available when the input type is Variables.

Color palette Select the colors used in the color scale bar.

Minimum/maximum value Set the lower and upper bounds of the color scale. Values beyond this range are set to NA.
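As a small sketch of what this means for the displayed matrix, the function below replaces any value outside the bounds with null, which stands in for R's NA here. The name maskOutsideRange is hypothetical; the defaults of -1 and 1 match the form's default minimum and maximum values.

```javascript
// Replace values outside [minValue, maxValue] with null (standing in for NA).
function maskOutsideRange(matrix, minValue = -1, maxValue = 1) {
    return matrix.map(row =>
        row.map(v => (v < minValue || v > maxValue ? null : v)));
}

// Made-up correlation matrix.
const correlations = [
    [1.0, 0.8, -0.3],
    [0.8, 1.0, 0.1],
    [-0.3, 0.1, 1.0],
];

// With a minimum of 0, the negative correlations are masked.
const masked = maskOutsideRange(correlations, 0, 1);
console.log(masked[0]); // [ 1, 0.8, null ]
```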

Show cell values Whether to display cell values (Yes or No), or to decide based on the available space (Automatic).

Show row labels Whether to display row labels.

Show column labels Whether to display column labels.

Additional Properties

When using this feature you can obtain additional information that is stored by the R code which produces the output.

  1. To do so, select Create > R Output.
  2. In the R CODE, paste: item = YourReferenceName
  3. Replace YourReferenceName with the reference name of your item. Find this in the Report tree or by selecting the item and then going to Properties > General > Name from the object inspector on the right.
  4. Below the first line of code, you can paste in snippets from below or type in str(item) to see a list of available information.

For a more in-depth discussion on extracting information from objects in R, check out our blog post here.

Properties which may be of interest are:

  • The correlation values themselves:
item$cor # correlation values
  • The p-values:
item$p # p-values (note that these are computed using Taylor series linearization, whereas the standard errors and resulting statistics that appear on standard tables are computed using calibration, so the results can differ)
  • The t-statistics:
item$t # t-statistic


Acknowledgements

Uses the R package survey and the d3heatmap htmlwidget.


Code

var is_displayr = Q.isOnTheWeb();
var allow_control_groups = Q.fileFormatVersion() > 10.9;
var palettes = ["Default", "Blues, light to dark", "Blues, dark to light", "Greys, light to dark", "Greys, dark to light", "Reds, light to dark", "Reds, dark to light", "Greens, light to dark", "Greens, dark to light", "Spectral colors (red, yellow, blue)", "Spectral colors (blue, yellow, red)","Heat colors (yellow, red)", "Terrain colors (green, beige, grey)", "Custom gradient", "Custom palette"];


var heading_text = "Correlation Matrix";
if (!!form.setObjectInspectorTitle)
    form.setObjectInspectorTitle(heading_text, "Correlation Matrices");
else 
    form.setHeading(heading_text);

var data_sources = is_displayr ? ["Variables", "Variable Sets", "Table"] : ["Variables", "Questions", "Table"];
var formInputType = form.comboBox({label: "Data source", 
    alternatives: data_sources,
    name: "formInputType", default_value: "Variables"});

if (formInputType.getValue() == "Variables")
{
    form.dropBox({name: "formVariables", label: "Variables", types: ["V:numeric, categorical, ordered categorical"],
                  multi:true, min_inputs: 2, prompt: "Select at least two Variables", height: 8});
}
else if (formInputType.getValue() == "Questions" || formInputType.getValue() == "Variable Sets")
{
    var questions_label = is_displayr ? "Variable Sets" : "Questions";
    var questions_prompt = is_displayr ? "Select a Variable Set containing at least two Variables" : "Select a Question containing at least two Variables";
    var ignore_prompt = is_displayr ? "Specify Variable Set categories to ignore as a comma-separated list" : "Specify Question categories to ignore as a comma-separated list";
    form.dropBox({name: "formQuestions", label: questions_label, types: ["Question: Pick One, Pick One Multi, Number, Number Multi, Number Grid, Date, Ranking"],
                  multi:true, prompt: questions_prompt});
    form.textBox({label: "Ignore", type: "text", default_value: "NET, Total, SUM", name: "formIgnore", required: false, prompt: ignore_prompt});
}
else if (formInputType.getValue() == "Table")
{
    form.dropBox({ name: "formTable", label: "Table", types:["table", "RItem:matrix,array,data.frame"], multi : false, prompt: "Select a table with numeric values"});
    form.textBox({label: "Ignore", type: "text", default_value: "NET, Total, SUM", name: "formIgnore", required: false, prompt: "Specify table columns and rows to ignore as a comma-separated list"});
}
 
form.comboBox({label: "Missing data", 
    alternatives: ["Error if missing data", "Exclude cases with missing data", "Use partial data"], 
    name: "formMissing", default_value: "Use partial data", prompt: "Options for handling cases with missing data"});

form.comboBox({name: "formCorrelation", label: "Correlation type", default_value: "Pearson",
               alternatives: ["Pearson", "Spearman"],
               prompt: "Specify a correlation measure"});

form.checkBox({ name: "formBinaryCat", label: "Categorical as binary", default_value: false,
                prompt: "Convert categorical variables to dummy binary, else converted to integers"});

if (formInputType.getValue() == "Variables")
{
    form.checkBox({label: "Variable names", name: "formNames", default_value: false,
                   prompt: "Display names instead of labels"});
}

if (allow_control_groups)
    form.group("Appearance");
var qColor = form.comboBox({name: "formPalette", label: "Colors", alternatives: palettes, default_value: palettes[0], required: true});
var colOpt = qColor.getValue();
if (colOpt == "Custom gradient")
{
    if (!allow_control_groups)
        var qGradientLabel = form.newLabel("Gradient start/end");
    var qGradientStart = form.colorPicker({name: "formCustomGradientStart", label: !allow_control_groups ? "" : "Gradient start", default_value: "#5C9AD3"});
    var qGradientEnd = form.colorPicker({name: "formCustomGradientEnd", label: !allow_control_groups ? "" : "Gradient end", default_value: "#ED7D31"});
}
if (colOpt == "Custom palette")
    var qPalette = form.textBox({name: "formCustomPalette", label: "Custom palette", default_value: "#5C9AD3, #ED7D31", prompt: "Enter colors as a comma-separated list. A palette will be constructed by linear interpolation."});

form.textBox({label: "Minimum value", name: "formMinValue", default_value: -1, type: "number", required: false, prompt: "Lower bound of color scale. Values below the minimum will be set to NA"});
form.textBox({label: "Maximum value", name: "formMaxValue", default_value: 1, type: "number", required: false, prompt: "Upper bound of color scale. Values above the maximum will be set to NA"});

form.comboBox({label: "Show cell values", name: "formCell", default_value: "Automatic",
               alternatives: ["Automatic", "Yes", "No"]});
form.comboBox({label: "Show row labels", name: "formRowLabels", default_value: "Yes",
               alternatives: ["Yes", "No"]});
form.comboBox({label: "Show column labels", name: "formColumnLabels", default_value: "Yes",
               alternatives: ["Yes", "No"]});


library(flipStatistics)

WarnIfVariablesSelectedFromMultipleDataSets()

colors <- NULL
if (formPalette != "Default")
    colors <- flipChartBasics::ChartColors(10, given.colors = formPalette,
              custom.color = NULL,
              custom.gradient.start = formCustomGradientStart,
              custom.gradient.end = formCustomGradientEnd,
              custom.palette = formCustomPalette, silent = TRUE)


correlation.matrix <- CorrelationMatrix(input.data = get0("formVariables", ifnotfound = get0("formQuestions", ifnotfound = get0("formTable"))),
    colors = unique(colors),
    colors.min.value = if (formMinValue == "") -1 else as.numeric(formMinValue),
    colors.max.value = if (formMaxValue == "") 1 else as.numeric(formMaxValue),
    use.names = if (exists("formNames")) formNames else FALSE,
    ignore.columns = formIgnore,
    missing.data = formMissing,
    spearman = formCorrelation == "Spearman",
    filter = if (formInputType == "Table") NULL else QFilter,
    weights = if (formInputType == "Table") NULL else QCalibratedWeight,
    show.cell.values = formCell,
    row.labels = formRowLabels,
    column.labels = formColumnLabels,
    categorical.as.binary = formBinaryCat)