Correlation - Distance Matrix
Computes a distance matrix comparing variables or cases. If a case contains missing values, it is omitted from the analysis. Weights are not applicable when comparing cases. See What is a Distance Matrix? for more information about distance matrices.
How to Create a Distance Matrix
- Add the object:
- In Displayr: Anything > Advanced Analysis > More > Correlation > Distance Matrix
- In Q: Create > Correlation > Distance Matrix
- Under Inputs > Compare select whether to compare variables or cases. Note that no more than 100 cases can be compared.
- If cases are compared, select the Case labels to use. These are the labels used to refer to each case. If nothing is selected, the case index is used.
- Under Inputs > Variables select the variables to compare. At least two variables must be selected.
Example
Distance matrices are displayed as triangular heatmaps. The example below is a dissimilarity matrix using data from a cola consumption survey.
Options
The options in the Object Inspector are organized into two tabs: Inputs and Properties.
Inputs
Compare Whether variables or cases are to be compared.
Case labels If cases are compared, these are the labels used to refer to each case. If no labels are selected, the case index is used.
Variables The variables that will be included in this analysis.
Variable names Whether to display Variable Names in the output instead of Variable Labels.
Categorical as binary Represents unordered categorical variables as binary variables. Otherwise, they are represented as sequential integers (i.e., 1 for the first category, 2 for the second, etc.). Numeric - Multi variables are treated according to their numeric values and not converted to binary.
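The two encodings described for Categorical as binary can be sketched in JavaScript as follows. This is only an illustration: encodeCategorical is a hypothetical helper, not part of Displayr, Q, or flipDimensionReduction, and it assumes one indicator column per category, which may differ from what the software does internally.

```javascript
// Hypothetical helper, for illustration only.
// values: the observed category of each case.
// categories: all categories in their defined order.
// asBinary: true to produce 0/1 indicator columns, false for sequential integers.
function encodeCategorical(values, categories, asBinary) {
  if (!asBinary) {
    // 1 for the first category, 2 for the second, and so on.
    return values.map(v => categories.indexOf(v) + 1);
  }
  // Assumption: one indicator column per category.
  return categories.map(category => values.map(v => (v === category ? 1 : 0)));
}

const observed = ["A", "B", "A"];
const allCategories = ["A", "B", "C"];
console.log(encodeCategorical(observed, allCategories, false));
console.log(encodeCategorical(observed, allCategories, true));
```

With sequential integers, the distance between two categories depends on their position in the category order, which is rarely meaningful for unordered categories. The binary encoding avoids this.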
Measure Whether to measure similarities or dissimilarities.
Similarity measure The type of similarity measure to use:
- Correlation Pearson correlation.
- Cosine The cosine of the angle between a pair of vectors which represent a pair of variables or cases.
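As a rough illustration of the two similarity measures, the sketch below computes them for a pair of numeric vectors using their standard textbook definitions. The function names are hypothetical and are not taken from Displayr, Q, or the underlying R package.

```javascript
// Illustrative helpers only; names are assumptions, not a library API.

// Pearson correlation between two equal-length numeric arrays.
function pearsonCorrelation(x, y) {
  const n = x.length;
  const meanX = x.reduce((s, v) => s + v, 0) / n;
  const meanY = y.reduce((s, v) => s + v, 0) / n;
  let sxy = 0, sxx = 0, syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - meanX;
    const dy = y[i] - meanY;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  return sxy / Math.sqrt(sxx * syy);
}

// Cosine of the angle between two equal-length numeric vectors.
function cosineSimilarity(x, y) {
  let dot = 0, normX = 0, normY = 0;
  for (let i = 0; i < x.length; i++) {
    dot += x[i] * y[i];
    normX += x[i] * x[i];
    normY += y[i] * y[i];
  }
  return dot / Math.sqrt(normX * normY);
}

const firstVector = [1, 2, 3, 4];
const secondVector = [2, 4, 6, 9];
console.log(pearsonCorrelation(firstVector, secondVector));
console.log(cosineSimilarity(firstVector, secondVector));
```

The correlation centers each vector on its mean before comparing, whereas the cosine does not, so the two measures coincide only when both vectors already have a mean of zero.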
Distance measure The type of distance/dissimilarity measure to use. The options are listed below (refer to the R function dist for more information):
- Euclidean
- Squared Euclidean The square of the Euclidean distance
- Maximum
- Manhattan
- Minkowski
Minkowski power (p) The power parameter for the Minkowski distance measure.
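The distance measures listed above can be sketched for a single pair of vectors as follows. The definitions are the standard ones; the function names are hypothetical and the code is an illustration rather than the implementation used by the software.

```javascript
// Illustrative helpers only; names are assumptions, not a library API.

// Minkowski distance with power p: (sum of |x_i - y_i|^p)^(1/p).
function minkowskiDistance(x, y, p) {
  let sum = 0;
  for (let i = 0; i < x.length; i++) {
    sum += Math.pow(Math.abs(x[i] - y[i]), p);
  }
  return Math.pow(sum, 1 / p);
}

// Euclidean distance is the Minkowski distance with p = 2.
function euclideanDistance(x, y) {
  return minkowskiDistance(x, y, 2);
}

// Squared Euclidean distance: the sum of squared differences.
function squaredEuclideanDistance(x, y) {
  let sum = 0;
  for (let i = 0; i < x.length; i++) {
    sum += (x[i] - y[i]) * (x[i] - y[i]);
  }
  return sum;
}

// Manhattan distance is the Minkowski distance with p = 1.
function manhattanDistance(x, y) {
  return minkowskiDistance(x, y, 1);
}

// Maximum distance: the largest absolute difference in any coordinate.
function maximumDistance(x, y) {
  let largest = 0;
  for (let i = 0; i < x.length; i++) {
    largest = Math.max(largest, Math.abs(x[i] - y[i]));
  }
  return largest;
}

const origin = [0, 0];
const point = [3, 4];
console.log(euclideanDistance(origin, point), manhattanDistance(origin, point), maximumDistance(origin, point));
```

As p increases, the Minkowski distance is increasingly dominated by the largest coordinate difference and approaches the Maximum distance.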
Data standardization The standardization method. Choices include:
- None No standardization is performed.
- z-scores Values are transformed to have mean zero and a standard deviation of one.
- Range [-1,1] Values are divided by their range.
- Range [0,1] The minimum value is subtracted from the values, which are then divided by their range.
- Mean of 1 Values are divided by their mean. If the mean is zero, the values will be unchanged.
- Standard deviation of 1 Values are divided by their standard deviation.
For methods that require variation in the values (z-scores, Range [-1,1], Range [0,1] and Standard deviation of 1), if there is no variation in the values, they will be set to zero instead.
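The standardization rules above can be sketched for a single variable or case as follows. This is a literal reading of the descriptions, not the software's implementation: standardizeValues is a hypothetical helper, the use of the sample standard deviation (dividing by n - 1) is an assumption, and Range [-1,1] is implemented exactly as worded (division by the range).

```javascript
// Hypothetical helper, for illustration only.
function standardizeValues(values, method) {
  const n = values.length;
  const mean = values.reduce((s, v) => s + v, 0) / n;
  // Assumption: sample standard deviation (n - 1 in the denominator).
  const sd = Math.sqrt(values.reduce((s, v) => s + (v - mean) * (v - mean), 0) / (n - 1));
  const min = Math.min(...values);
  const range = Math.max(...values) - min;
  const noVariation = range === 0;
  const zeros = values.map(() => 0);

  switch (method) {
    case "None":
      return values.slice();
    case "z-scores":
      return noVariation ? zeros : values.map(v => (v - mean) / sd);
    case "Range [-1,1]":
      // Literal reading of the description: divide by the range.
      return noVariation ? zeros : values.map(v => v / range);
    case "Range [0,1]":
      return noVariation ? zeros : values.map(v => (v - min) / range);
    case "Mean of 1":
      // Values are left unchanged when the mean is zero.
      return mean === 0 ? values.slice() : values.map(v => v / mean);
    case "Standard deviation of 1":
      return noVariation ? zeros : values.map(v => v / sd);
    default:
      throw new Error("Unknown standardization: " + method);
  }
}

console.log(standardizeValues([2, 4, 6], "z-scores"));
console.log(standardizeValues([2, 4, 6], "Range [0,1]"));
console.log(standardizeValues([5, 5, 5], "z-scores"));
```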
Standardize by Whether to standardize by variables or cases.
Measure transformation Transformation of the measures. Choices include None, Absolute values, Reverse sign and Range [0,1]. For Range [0,1], measures on the diagonal are ignored in the transformation.
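The measure transformations can be sketched as operations on a square matrix of measures, as below. transformMeasures is a hypothetical helper. For Range [0,1], the sketch assumes that "ignored" means the diagonal is excluded when finding the minimum and maximum and is left unchanged in the result. The actual behavior may differ.

```javascript
// Hypothetical helper, for illustration only.
// matrix: a square array of arrays holding the distance or similarity measures.
function transformMeasures(matrix, method) {
  if (method === "None") {
    return matrix.map(row => row.slice());
  }
  if (method === "Absolute values") {
    return matrix.map(row => row.map(v => Math.abs(v)));
  }
  if (method === "Reverse sign") {
    return matrix.map(row => row.map(v => 0 - v));
  }
  if (method === "Range [0,1]") {
    // Find the minimum and maximum of the off-diagonal measures only.
    let min = Infinity;
    let max = -Infinity;
    matrix.forEach((row, i) => row.forEach((v, j) => {
      if (i !== j) {
        min = Math.min(min, v);
        max = Math.max(max, v);
      }
    }));
    const range = max - min;
    // Assumption: diagonal entries are left unchanged, and identical
    // off-diagonal measures are mapped to zero.
    return matrix.map((row, i) => row.map((v, j) => {
      if (i === j) return v;
      return range === 0 ? 0 : (v - min) / range;
    }));
  }
  throw new Error("Unknown transformation: " + method);
}

const measures = [
  [0, 2, 4],
  [2, 0, 6],
  [4, 6, 0]
];
console.log(transformMeasures(measures, "Range [0,1]"));
```

Excluding the diagonal matters because the diagonal of a dissimilarity matrix is zero by construction, so including it would fix the minimum at zero regardless of how far apart the items actually are.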
Show cell values Whether to display cell values, or if this should be determined based on an estimate of available space (Automatic).
Show row labels Whether to display row labels.
Show column labels Whether to display column labels.
Properties
This tab contains options for formatting the size of the object, as well as the underlying R code used to create the visualization and the JavaScript code used to customize the Object Inspector itself (see Object Inspector for more details about these options). Additional options are available by editing the code.
More Information
Acknowledgements
Uses the R package weights and the d3heatmap htmlwidget.
Code
var heading_text = "Distance Matrix";
if (!!form.setObjectInspectorTitle)
    form.setObjectInspectorTitle(heading_text, "Distance Matrices");
else
    form.setHeading(heading_text);
formCompare = form.comboBox({name: "formCompare",
    label: "Compare",
    alternatives: ["Cases", "Variables"],
    default_value: "Variables",
    prompt: "Whether to compare cases (rows) or variables (columns)"});
if (formCompare.getValue() == "Cases")
    form.dropBox({name: "formCaseLabels",
        label: "Case labels",
        types: ["V:text"],
        required: false,
        prompt: "Labels to use to refer to each case"});
form.dropBox({name: "formVariables",
    label: "Variables",
    types: ["V:numeric, categorical, ordered categorical"],
    multi: true,
    prompt: "Select variables to analyse",
    height: 8});
if (formCompare.getValue() == "Variables")
    form.checkBox({label: "Variable names", name: "formNames", default_value: false, prompt: "Display names instead of labels"});
form.checkBox({name: "binaryCat", label: "Categorical as binary", default_value: false, prompt: "Code categorical variables as dummy variables"});
formMeasure = form.comboBox({name: "formMeasure",
    label: "Measure",
    alternatives: ["Dissimilarities", "Similarities"],
    default_value: "Dissimilarities"});
if (formMeasure.getValue() == "Dissimilarities")
{
    formDistance = form.comboBox({name: "formDistance",
        label: "Distance measure",
        alternatives: ["Euclidean", "Squared Euclidean", "Maximum", "Manhattan", "Minkowski"],
        default_value: "Euclidean"});
    if (formDistance.getValue() == "Minkowski")
        form.numericUpDown({name: "formMinkowski",
            label: "Minkowski power (p)",
            default_value: 2,
            minimum: 1,
            maximum: Number.MAX_SAFE_INTEGER});
}
else
    form.comboBox({name: "formSimilarity",
        label: "Similarity measure",
        alternatives: ["Correlation", "Cosine"],
        default_value: "Correlation"});
formStandardize = form.comboBox({name: "formStandardize",
    label: "Data standardization",
    alternatives: ["None", "z-scores", "Range [-1,1]", "Range [0,1]", "Mean of 1", "Standard deviation of 1"],
    default_value: "None",
    prompt: "Options to standardize the data"});
if (formStandardize.getValue() != "None")
    form.comboBox({name: "formStandardizeBy",
        label: "Standardize by",
        alternatives: ["Variable", "Case"],
        default_value: "Case",
        prompt: "Standardize by case (row) or variable (column)"});
form.comboBox({name: "formMeasureTransform",
    label: "Measure transformation",
    alternatives: ["None", "Absolute values", "Reverse sign", "Range [0,1]"],
    default_value: "None",
    prompt: "Options to transform the measure"});
form.comboBox({label: "Show cell values", name: "formCell", default_value: "Automatic",
    alternatives: ["Automatic", "Yes", "No"]});
form.comboBox({label: "Show row labels", name: "formRowLabels", default_value: "Yes",
    alternatives: ["Yes", "No"]});
form.comboBox({label: "Show column labels", name: "formColumnLabels", default_value: "Yes",
    alternatives: ["Yes", "No"]});
library(flipDimensionReduction)
WarnIfVariablesSelectedFromMultipleDataSets()
distance.matrix <- DistanceMatrix(QDataFrame(formVariables),
                                  compare = formCompare,
                                  case.labels = formCaseLabels,
                                  variable.names = formNames,
                                  binary = binaryCat,
                                  measure = formMeasure,
                                  similarity.measure = formSimilarity,
                                  distance.measure = formDistance,
                                  minkowski = formMinkowski,
                                  standardization = formStandardize,
                                  standardize.by = formStandardizeBy,
                                  measure.transformation = formMeasureTransform,
                                  show.cell.values = formCell,
                                  show.row.labels = formRowLabels,
                                  show.column.labels = formColumnLabels,
                                  subset = QFilter,
                                  weights = QCalibratedWeight)