Analysis of Variance - ANOVA

Analysis of Variance (ANOVA) is a hypothesis testing procedure that tests whether two or more means are significantly different from each other.

How to Create an ANOVA Table

  1. Add the object by selecting from the menu Anything > Advanced Analysis > Analysis of Variance > ANOVA (in Displayr) or Automate > Browse Online Library > Analysis of Variance > ANOVA (in Q).
  2. In Inputs > Outcome, specify the outcome variable.
  3. In Inputs > Predictor(s), specify the predictor variables to compare.

Example

The example below comes from a study on cola drinking perceptions and habits. The image below shows the results for a linear regression.

Each source of variance (predictor variables and residuals) is listed on a separate row. The table shows:

Sum Sq The sum of squared deviations from the mean.
d.f. The number of degrees of freedom of the source of variance.
F value The F statistic.
p The p-value - the probability of the observed data, or data showing a more extreme departure from the null hypothesis, when the null hypothesis is true.
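
The Sum Sq, d.f. and F value columns can be illustrated with a minimal sketch for the simplest case, a linear model with one categorical predictor. The function name oneWayAnova and the data are hypothetical and are not part of Q or Displayr. The p-value is omitted because it requires the F distribution.

```javascript
// One-way ANOVA for a single categorical predictor.
// `groups` holds the outcome values observed in each category (made-up data).
function oneWayAnova(groups) {
  const mean = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;
  const all = groups.flat();
  const grandMean = mean(all);

  let sumSqPredictor = 0; // variation of the group means around the grand mean
  let sumSqResidual = 0;  // variation of the observations around their group mean
  for (const g of groups) {
    const m = mean(g);
    sumSqPredictor += g.length * (m - grandMean) ** 2;
    for (const x of g) sumSqResidual += (x - m) ** 2;
  }

  const dfPredictor = groups.length - 1;
  const dfResidual = all.length - groups.length;
  // F is the ratio of the two mean squares (Sum Sq divided by d.f.).
  const fValue = (sumSqPredictor / dfPredictor) / (sumSqResidual / dfResidual);
  return { sumSqPredictor, dfPredictor, sumSqResidual, dfResidual, fValue };
}

const anovaTable = oneWayAnova([[1, 2, 3], [2, 3, 4], [6, 7, 8]]);
console.log(anovaTable);
// sumSqPredictor = 42 (d.f. 2), sumSqResidual = 6 (d.f. 6), fValue = 21
```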


For regression types other than Linear, the table shows the likelihood ratio chi-square.

Binary Logit ANOVA Results.png
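
As a rough illustration of a likelihood ratio chi-square, the sketch below handles a binary outcome with a single two-level predictor. In that special case the fitted probabilities of a binary logit model equal the group proportions, so no iterative fitting is needed. The function names and counts are hypothetical, and every group is assumed to contain both outcomes (so that no proportion is 0 or 1).

```javascript
// Log-likelihood of `successes` out of `n` binary observations with probability p.
function bernoulliLogLik(successes, n, p) {
  return successes * Math.log(p) + (n - successes) * Math.log(1 - p);
}

// Likelihood ratio chi-square: twice the gain in log-likelihood from
// letting each group have its own probability instead of a shared one.
// Each group is { successes, n }; assumes 0 < successes < n in every group.
function likelihoodRatioChiSquare(groups) {
  const totalSuccesses = groups.reduce((s, g) => s + g.successes, 0);
  const totalN = groups.reduce((s, g) => s + g.n, 0);

  // Null model: one probability shared by all cases.
  const logLikNull = bernoulliLogLik(totalSuccesses, totalN, totalSuccesses / totalN);
  // Full model: one probability per group (the group proportion).
  const logLikFull = groups.reduce(
    (s, g) => s + bernoulliLogLik(g.successes, g.n, g.successes / g.n), 0);

  return { chiSquare: 2 * (logLikFull - logLikNull), df: groups.length - 1 };
}

const lr = likelihoodRatioChiSquare([
  { successes: 30, n: 100 },
  { successes: 50, n: 100 },
]);
console.log(lr.chiSquare.toFixed(2), lr.df); // approximately 8.40 on 1 d.f.
```

The p-value then comes from comparing the statistic with a chi-square distribution on the stated degrees of freedom.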

Options

The options in the Object Inspector are organized into two tabs: Inputs and Properties.

Inputs

Outcome The variable to be predicted by the predictor variables.

Predictor(s) The variable(s) to predict the outcome.

Type The type of regression to perform.

Linear See Regression - Linear Regression.
Binary Logit See Regression - Binary Logit.
Ordered Logit See Regression - Ordered Logit.
Multinomial Logit See Regression - Multinomial Logit.
Poisson See Regression - Poisson Regression.
Quasi-Poisson See Regression - Quasi-Poisson Regression.
NBD See Regression - NBD Regression.

Missing data See Missing Data Options.

Auxiliary variables Variables to be used when imputing missing values (in addition to all the other variables in the model). Only shown when Missing data is Multiple imputation.

Output

ANOVA The ANOVA table as shown in the example above.
Summary The regression coefficients, their standard errors, t-statistics and p-values.
Detail The R output from the regression fitting.

Variable names Displays variable names in the output instead of labels.

Robust standard errors Computes standard errors that are robust to violations of the assumption of constant variance (i.e., heteroscedasticity). See Robust Standard Errors. This is only available when Type is Linear.
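
To show what robustness to heteroscedasticity means in practice, the sketch below compares the classical and a robust standard error for the slope of a simple linear regression. It uses the basic White (HC0) estimator purely for brevity. This page does not state which estimator is applied, so treat that choice, the function name slopeStandardErrors and the data as assumptions.

```javascript
// Classical versus heteroscedasticity-robust (HC0) standard error of the
// slope in y = intercept + slope * x. Illustrative only.
function slopeStandardErrors(x, y) {
  const n = x.length;
  const mean = (v) => v.reduce((a, b) => a + b, 0) / v.length;
  const xBar = mean(x);
  const yBar = mean(y);

  let sxx = 0;
  let sxy = 0;
  for (let i = 0; i < n; i++) {
    sxx += (x[i] - xBar) ** 2;
    sxy += (x[i] - xBar) * (y[i] - yBar);
  }
  const slope = sxy / sxx;
  const intercept = yBar - slope * xBar;

  let residualSumSq = 0; // used by the classical (constant variance) formula
  let weightedSumSq = 0; // each squared residual weighted by its squared leverage on the slope
  for (let i = 0; i < n; i++) {
    const e = y[i] - intercept - slope * x[i];
    residualSumSq += e * e;
    weightedSumSq += (x[i] - xBar) ** 2 * e * e;
  }

  const classicalSE = Math.sqrt(residualSumSq / (n - 2) / sxx);
  const robustSE = Math.sqrt(weightedSumSq) / sxx;
  return { slope, classicalSE, robustSE };
}

console.log(slopeStandardErrors([0, 1, 2], [0, 1, 4]));
// slope = 2, classicalSE ≈ 0.577, robustSE ≈ 0.236
```

The classical formula pools all residuals into one variance estimate, whereas the robust formula lets each observation contribute its own squared residual.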

Filter Any filters set for the R Output are applied to the data automatically before the model is estimated.

Weight Where a weight has been set for the R Output, it is applied automatically when the model is estimated. By default, the weight is assumed to be a sampling weight, and the standard errors are estimated using Taylor series linearization (by contrast, Legacy Regression uses weight calibration). See Weights, Effective Sample Size and Design Effects.
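
The idea behind Taylor series linearization can be sketched for the simplest statistic, a weighted mean, rather than for regression coefficients. The sketch assumes a single-stage design with no strata or clusters, treated as sampled with replacement. The function name weightedMeanWithSE is hypothetical, and nothing here is taken from the software's own implementation.

```javascript
// Weighted mean and its linearization standard error.
// y: observed values; w: sampling weights (same length as y).
function weightedMeanWithSE(y, w) {
  const n = y.length;
  const sumW = w.reduce((a, b) => a + b, 0);
  const mean = y.reduce((acc, yi, i) => acc + w[i] * yi, 0) / sumW;

  // Linearized contribution of each case to the weighted mean.
  let sumSq = 0;
  for (let i = 0; i < n; i++) {
    const z = (w[i] * (y[i] - mean)) / sumW;
    sumSq += z * z;
  }
  const se = Math.sqrt((n / (n - 1)) * sumSq);
  return { mean, se };
}

// With equal weights the result matches the usual s / sqrt(n).
console.log(weightedMeanWithSE([1, 2, 3, 4], [1, 1, 1, 1]));
// Unequal weights shift the mean and change the standard error.
console.log(weightedMeanWithSE([1, 2, 3, 4], [4, 1, 1, 1]));
```

Because the weights enter only through their relative sizes, multiplying every weight by the same constant leaves both the mean and the standard error unchanged.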

Properties

This tab contains options for formatting the size of the object, as well as the underlying R code used to create the visualization and the JavaScript code used to customize the Object Inspector itself (see Object Inspector for more details about these options). Additional options are available by editing the code.

Diagnostics

See Regression Diagnostics.

Acknowledgements

See Regression - Generalized Linear Model.

Code

To access the underlying code in Displayr, go to Properties > R CODE.

// JavaScript: defines the controls shown on the Inputs tab of the Object Inspector.
form.dropBox({label: "Outcome", 
            types:["Variable: Numeric, Date, Money, Categorical, OrderedCategorical"], 
            name: "formOutcomeVariable",
            prompt: "Dependent variable predicted by the regression"})
form.dropBox({label: "Predictor(s)",
            types:["Variable: Numeric, Date, Money, Categorical, OrderedCategorical"], 
            name: "formPredictorVariables", 
            prompt: "Independent variable(s) used to predict the dependent variable",
            multi:true})
var formType = form.comboBox({label: "Type", 
              alternatives: ["Linear", "Binary Logit", "Ordered Logit", "Multinomial Logit", "Poisson", "Quasi-Poisson", "NBD"], 
              name: "formType", default_value: "Linear",
              prompt: "Type of regression"})
var formMissing = form.comboBox({label: "Missing data", 
              alternatives: ["Error if missing data", "Exclude cases with missing data", "Use partial data (pairwise correlations)", "Multiple imputation"], 
              name: "formMissing", default_value: "Exclude cases with missing data",
              prompt: "Treatment of missing data values"})
if(formMissing.getValue() == "Multiple imputation")
    form.dropBox({label: "Auxiliary variables",
            types:["Variable: Numeric, Date, Money, Categorical, OrderedCategorical"], 
            name: "formAuxiliaryVariables", 
            required: false, 
            multi:true,
            prompt: "Additional variable(s) to use when conducting imputation"})
var formOutput = form.comboBox({label: "Output", 
              alternatives: ["ANOVA", "Summary", "Detail"], 
              name: "formOutput", default_value: "ANOVA",
              prompt: "Type of output produced"})
form.checkBox({label: "Variable names", name: "formNames", default_value: false, prompt: "Whether to use variable names instead of labels"})
if((formType.getValue() == "Linear") && formMissing.getValue() != "Use partial data (pairwise correlations)" && formMissing.getValue() != "Multiple imputation")
    form.checkBox({label: "Robust standard errors", name: "formRobustSE", default_value: false,
                   prompt: "Compute standard errors that are robust to violations of the assumption of constant variance"})
var heading_text = (formOutput.getValue() == "ANOVA" ? "ANOVA" : "Generalized Linear Model" ) +": " + formType.getValue();
if (!!form.setObjectInspectorTitle)
    form.setObjectInspectorTitle(heading_text, heading_text)
else 
    form.setHeading(heading_text);

# R code: fits the model with flipRegression using the values chosen in the controls above.
library(flipRegression)
WarnIfVariablesSelectedFromMultipleDataSets()

glm <- Regression(QFormula(formOutcomeVariable ~ formPredictorVariables), weights = QPopulationWeight, subset = QFilter, missing = formMissing, output = formOutput,
                  robust.se = get0("formRobustSE", ifnotfound = FALSE), type = formType, show.labels = !formNames, auxiliary = get0("formAuxiliaryVariables", ifnotfound = NULL))