Marketing - MaxDiff - Hierarchical Bayes



Analyse MaxDiff data with Hierarchical Bayes.

Example

In Displayr, to run MaxDiff - Hierarchical Bayes, select Insert > More > MaxDiff > Hierarchical Bayes.
In Q, select Create > Marketing > MaxDiff > Hierarchical Bayes.

The table below shows the output of an analysis, containing histograms of the estimated parameters of the respondents (blue and red bars correspond to positive and negative parameters respectively):

Data Setup

Two groups of data are needed for the analysis:

  1. The MaxDiff design that outlines which alternatives were shown in which question in which version.
  2. The actual MaxDiff responses to the choices, which contain the selections made by respondents.


The Design input table needs to be in a form similar to the one shown below. The 'Version' column is optional when there is only one version in the design. The 'Question' column is also optional. However, if these columns are included, they must have the names 'Version' and 'Question'. The columns after this contain the indices of the alternatives presented to the respondents. Below is version 1 of a design with 6 questions, 5 options shown per question, and 10 total alternatives in the design.
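
As a rough illustration of the layout described above, the sketch below holds a design as rows with 'Version' and 'Question' columns followed by the alternative columns, and derives the number of options per question and the total number of alternatives. The design values, the column names Alt1 to Alt3 and the helper name summarizeDesign are hypothetical and are not taken from the design shown on this page.

```javascript
// Hypothetical design: 1 version, 3 questions, 3 options per question,
// 4 alternatives in total. Values are illustrative only.
const design = [
  { Version: 1, Question: 1, Alt1: 1, Alt2: 2, Alt3: 3 },
  { Version: 1, Question: 2, Alt1: 2, Alt2: 3, Alt3: 4 },
  { Version: 1, Question: 3, Alt1: 1, Alt2: 3, Alt3: 4 },
];

function summarizeDesign(rows) {
  // 'Version' and 'Question' are the only reserved column names;
  // every other column is treated as an alternative column.
  const reserved = new Set(["Version", "Question"]);
  const altColumns = Object.keys(rows[0]).filter((k) => !reserved.has(k));
  // A missing 'Version' column is treated as a single version.
  const versions = new Set(rows.map((r) => ("Version" in r ? r.Version : 1)));
  const alternatives = new Set();
  for (const row of rows) {
    for (const col of altColumns) alternatives.add(row[col]);
  }
  return {
    nVersions: versions.size,
    // Assumes every version has the same number of questions.
    nQuestions: rows.length / versions.size,
    optionsPerQuestion: altColumns.length,
    nAlternatives: alternatives.size,
  };
}

const designSummary = summarizeDesign(design);
console.log(designSummary);
```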


The respondent MaxDiff choice selections should have a best and a worst variable for each question in the design, and can be stored as either the alternative number or the option number. The alternative number identifies the selected alternative among all of the alternatives in the experiment. For example, in the data below there are 10 possible alternatives in the design, so the values 1-10 are stored in the best/worst variables as shown here:

MaxDiffData1.PNG

When the option number is captured instead, each variable stores the number of the option selected within the question rather than the alternative number. For example, the data below has a design where 5 alternatives were shown to respondents as the options in each question, so the values 1-5 are stored in the best/worst variables as shown below:

MaxDiffData2.PNG

Note that when the data is set up based on option number, the Question Type for the respondent MaxDiff selections should be set to Number - Multi. If no labels are provided in the design, the alternative number can also be used, but it needs to be formatted as a Number or Number - Multi Question Type.
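
The relationship between the two storage formats can be sketched as a lookup into the design row for the question. The example below assumes that an option number is the 1-based position of the alternative within that row; the design row and the helper name optionToAlternative are hypothetical.

```javascript
// One row of a hypothetical design: the alternatives shown in a question,
// in the order they appear in the design.
const shownInQuestion = [7, 2, 9, 4, 1];

// Convert a 1-based option number into the alternative number it refers to.
function optionToAlternative(shown, optionNumber) {
  if (!Number.isInteger(optionNumber) || optionNumber < 1 || optionNumber > shown.length) {
    throw new RangeError(`Option must be between 1 and ${shown.length}`);
  }
  return shown[optionNumber - 1];
}

const bestAlternative = optionToAlternative(shownInQuestion, 3);  // third option shown
const worstAlternative = optionToAlternative(shownInQuestion, 5); // fifth option shown
console.log(bestAlternative, worstAlternative);
```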

Options

The inputs used to generate the Hierarchical Bayes analysis are shown below.

Maxdiffhb.png

Experimental Design

Design source One of:

Use an existing table Select an existing table in the project in Design.
Experimental design R output Select a MaxDiff design created with Marketing - MaxDiff - Experimental Design.
Variables Provide Alternatives (and optionally Design version), where each Alternatives variable contains the alternatives presented by question, analogous to the columns of a tabular design.
Provide a URL Enter a URL in Design URL.

Design The design table. This may contain a 'Version' column and if not present there is assumed to be one version. It may also contain a 'Question' column. All other columns are assumed to contain the alternatives presented, hence the number of other columns defines the number of alternatives per question.

Design URL The URL to a CSV file containing the design.

Alternative labels Labels for the alternatives specified in the first column of the data editor. This can be left out in most cases as the alternatives can usually be extracted from the best and worst selections.

Respondent Data

Version A variable which indicates which version of the design was provided to each respondent. Can be left blank if the design only contains one version.

Best selections The best selections for each question. Can be categorical variables with labels matching the labels in the design; categorical variables containing the selected option value (e.g. "Option 1", "Option 2", "Option 3" for a design with three alternatives shown per question); numeric variables taking values from 1, 2, ..., up to the number of alternatives; or numeric variables taking values from 1, 2, ..., up to the number of options shown per question.

Worst selections The worst selections for each question. Should have the same format as Best selections.

Missing data See Missing Data Options.

Model

Type Switch between MaxDiff models: Multinomial Logit, Latent Class Analysis, Hierarchical Bayes and Varying Coefficients.

Number of classes The number of classes in the analysis.

MaxDiff logit Choose between Tricked Logit and Rank-Ordered Logit with Ties. The former is faster but the latter is used in Segments > Latent Class Analysis for MaxDiff in Q.
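
Tricked logit is commonly described as treating the best choice as a multinomial logit on the utilities and the worst choice as a multinomial logit on the negated utilities. The sketch below follows that general description; it is an assumption about the technique, not the flipMaxDiff implementation, and it keeps every shown option in the worst-choice set. The utilities are hypothetical.

```javascript
// Softmax over the utilities of the alternatives shown in one question.
function softmax(values) {
  const max = Math.max(...values); // subtract the max for numerical stability
  const exps = values.map((v) => Math.exp(v - max));
  const total = exps.reduce((s, x) => s + x, 0);
  return exps.map((x) => x / total);
}

// Hypothetical utilities for the options shown in a single question.
const shownUtilities = [1.0, 0.2, -0.4, -0.8];

// Best choice: ordinary multinomial logit on the utilities.
const pBest = softmax(shownUtilities);
// Worst choice: the same logit applied to the negated utilities.
const pWorst = softmax(shownUtilities.map((u) => -u));

console.log(pBest, pWorst);
```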

Questions left out for cross-validation The number of questions to leave out per respondent to be used for cross-validation.

Seed The random seed used to determine the random initial parameters of the model and also used to determine the random questions to leave out for cross-validation.
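
The idea of a seeded hold-out can be sketched as follows: a fixed seed drives a shuffle of the question numbers, and the first few shuffled questions are left out. This is only an illustration of reproducible splitting. The generator (mulberry32) and the helper name splitQuestions are assumptions for the example, and the questions selected will not match those chosen by the R code for the same seed.

```javascript
// Small seeded generator (mulberry32 style) so the split is reproducible.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Split question numbers 1..nQuestions into left-out and used sets.
function splitQuestions(nQuestions, nLeftOut, seed) {
  const rand = mulberry32(seed);
  const questions = Array.from({ length: nQuestions }, (_, i) => i + 1);
  // Fisher-Yates shuffle driven by the seeded generator.
  for (let i = questions.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [questions[i], questions[j]] = [questions[j], questions[i]];
  }
  return {
    leftOut: questions.slice(0, nLeftOut).sort((x, y) => x - y),
    used: questions.slice(nLeftOut).sort((x, y) => x - y),
  };
}

// In practice a different seed would be derived for each respondent.
const split = splitQuestions(6, 2, 123);
console.log(split);
```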

Iterations The number of iterations used in the hierarchical Bayes analysis.

Chains The number of chains used in the hierarchical Bayes analysis.

Covariates Respondent-specific covariates to be used in the model.

Maximum tree depth The maximum tree depth parameter. Only increase this if warnings about "tree depth" are shown.

Adapt delta The adapt delta parameter. Only increase this if warnings about "low adapt delta" are shown.

DIAGNOSTICS

Parameter Statistics Table Creates a table showing the parameter statistics for the model.

Posterior Intervals Plot Creates a diagnostic plot of the posterior intervals of the hierarchical parameters.

Trace Plots Creates a diagnostic plot of the trace of the parameter estimates.

SAVE VARIABLE(S)

Sawtooth-Style Preference Shares (K Alternatives) Saves variables that contain Sawtooth-style preference shares (K alternatives).

Individual-Level Coefficients Saves variables that contain the individual-level coefficients (utilities).

Preference Shares Saves variables that contain the preference share of each alternative for each respondent.
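
Under a logit model, preference shares are usually obtained by exponentiating a respondent's utilities and normalizing them to sum to one. The sketch below assumes that form; the utilities and the helper name preferenceShares are hypothetical, and the saved variables may be scaled differently (for example, as percentages).

```javascript
// Convert one respondent's utilities into shares that sum to 1.
function preferenceShares(utilities) {
  const max = Math.max(...utilities); // subtract the max for numerical stability
  const exps = utilities.map((u) => Math.exp(u - max));
  const total = exps.reduce((s, x) => s + x, 0);
  return exps.map((x) => x / total);
}

// Hypothetical utilities for four alternatives.
const exampleUtilities = [1.2, 0.3, -0.5, -1.0];
const shares = preferenceShares(exampleUtilities);
console.log(shares);
```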

Proportion of Correct Predictions Saves a variable that contains the proportion of correct 'best' predictions for each respondent.

RLH (Root-Likelihood) Saves a variable that contains the root likelihood for each respondent.
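
Root likelihood is conventionally the geometric mean of the probabilities the model assigned to the choices a respondent actually made. The sketch below assumes that definition; the helper name rootLikelihood and the probabilities are hypothetical.

```javascript
// Geometric mean of the predicted probabilities of the observed choices.
// Computed on the log scale to avoid underflow with many questions.
function rootLikelihood(chosenProbabilities) {
  const meanLog =
    chosenProbabilities.reduce((s, p) => s + Math.log(p), 0) /
    chosenProbabilities.length;
  return Math.exp(meanLog);
}

// Hypothetical predicted probabilities for one respondent's observed choices.
const rlh = rootLikelihood([0.6, 0.4, 0.7, 0.5]);
console.log(rlh);
```

A value near the reciprocal of the number of options shown suggests predictions no better than chance, while values closer to 1 indicate a better fit.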

Zero-Centered Utilities Saves variables that contain the zero-centered utilities.

Technical Details

An R package called flipChoice is used to run the hierarchical Bayes analysis. flipChoice uses rstan, an R interface for Stan, to fit the underlying Bayesian statistical model.

Mean and covariance parameters for all but the last alternative are estimated. The respondent parameters for the last alternative are constrained to be the negative of the sum of the other parameters, so that the parameters for each respondent sum to zero.
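
The sum-to-zero constraint described above can be sketched directly: given the estimated parameters for all but the last alternative, the last one is the negative of their sum. The parameter values and the helper name completeParameters are hypothetical.

```javascript
// Append the constrained parameter for the last alternative so that
// the full set of parameters for a respondent sums to zero.
function completeParameters(freeParameters) {
  const last = -freeParameters.reduce((s, x) => s + x, 0);
  return [...freeParameters, last];
}

// Hypothetical estimates for the first three of four alternatives.
const fullParameters = completeParameters([0.8, -0.3, 0.1]);
console.log(fullParameters);
```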

For further information on hierarchical Bayes modeling, please refer to chapter 5 from Bayesian Statistics and Marketing.

More information

How to Analyze MaxDiff Data in Q
Using Hierarchical Bayes for MaxDiff in Q

Code

var allow_control_groups = Q.fileFormatVersion() > 10.9; // Group controls for Displayr and later versions of Q

if (allow_control_groups)
    form.group({label: "Experimental design", expanded: true})

var dt = form.comboBox({name: "formDesignLocation", label: "Design source", alternatives: ["Use an existing table", "Experimental design R output", "Variables", "Provide a URL"],
                        default_value: "Use an existing table", prompt: "Select design source"}).getValue();
if (dt == "Use an existing table")
    form.dropBox({name: "formDesign", label: "Design", types: ["Table", "RItem:matrix,array,data.frame,table"], required: true, prompt: "Select a design from a table"});
else if (dt == "Experimental design R output")
    form.dropBox({name: "formDesign", label: "Design", types: ["RItem:MaxDiffDesign,data.frame"], required: true, prompt: "Select a MaxDiff design object"});
else if (dt == "Variables")
{
    form.dropBox({label: "Alternatives",
                  types:["Variable: Numeric, Categorical, OrderedCategorical"],
                  name: "formAlternatives", required: true, multi: true,
                  prompt: "Select design alternative variables"});
    form.dropBox({label: allow_control_groups ? "Design version" : "Experimental design version",
                  types:["Variable: Numeric, Categorical, OrderedCategorical"],
                  name: "formDesignVersion", required: false,
                  prompt: "Select the design version variable"});
}
else
    form.textBox({name: "formURL", label: "Design URL", required: true, prompt: "Specify a URL to the design file"});

if (dt != "Experimental design R output")
    form.dataEntry({name: "formLabels",
                    prompt: "Enter alternative labels (optional)",
                    required: false,
                    default_value: [["Enter alternative labels below:"]], label: "Add alternative labels",
                    edit_label: "Edit alternative labels",
                    large_data_error: "The data entered is too large. You may only enter data with up to 1000 rows and up to 100 columns."});

if (allow_control_groups)
    form.group({label: "Respondent data", expanded: true})

form.dropBox({label: allow_control_groups ? "Version" : "Version variable",
            types:["Variable: Numeric, Categorical, OrderedCategorical"],
            name: "formVersion", required: false, prompt: "Select the respondent version variable"});
form.dropBox({label: "Best selections", 
            types:["Variable: Numeric, Categorical, OrderedCategorical"], 
            name: "formBest", multi: true, prompt: "Select the best choice variables"});
form.dropBox({label: "Worst selections", 
            types:["Variable: Numeric, Categorical, OrderedCategorical"], 
            name: "formWorst", multi: true, prompt: "Select the worst choice variables"});
form.comboBox({label: "Missing data", name: "formMissing",
               alternatives: ["Error if missing data", "Exclude cases with missing data", "Use partial data"],
               default_value: "Use partial data", prompt: "Options for handling cases with missing data"});

if (allow_control_groups)
    form.group("Model")
var type = form.comboBox({name: "formType", label: "Type",
                          alternatives: ["Multinomial Logit",
                                         "Latent Class Analysis",
                                         "Hierarchical Bayes",
                                         "Varying Coefficients"],
                          default_value: "Hierarchical Bayes", prompt: "Select the MaxDiff analysis type"});
var is_mnl = type.getValue() == "Multinomial Logit";
var is_lc = type.getValue() == "Latent Class Analysis";
var is_hb = type.getValue() == "Hierarchical Bayes";
var is_vc = type.getValue() == "Varying Coefficients";

var web_mode = (!!Q.isOnTheWeb && Q.isOnTheWeb());
if (is_hb && Q.fileFormatVersion() < 12.31 && !web_mode)
    alert("A newer version of Q (version 5.3) is required to run Hierarchical Bayes. Please contact support@q-researchsoftware.com to upgrade.");

var heading_text = "MaxDiff - ";
if (is_mnl)
   heading_text = heading_text + "Multinomial Logit";
else if (is_lc)
   heading_text = heading_text + "Latent Class Analysis";
else if (is_vc)
   heading_text = heading_text + "Varying Coefficients";
else if (is_hb)
   heading_text = heading_text + "Hierarchical Bayes";

if (!!form.setObjectInspectorTitle)
    form.setObjectInspectorTitle(heading_text, heading_text);
else 
    form.setHeading(heading_text);



if (is_vc || is_hb)
    form.dropBox({label: "Covariates",
            types:["Variable: Numeric, Categorical, OrderedCategorical"], 
            name: "formCovariates", multi: true, required: is_vc, prompt: "Select covariate variables"});
if (is_vc)
    var lc = form.checkBox({label: "Additional latent class analysis", name: "formLC", default_value: true, prompt: "Run a final latent class analysis"});

if (is_lc || (is_vc && lc.getValue()))
    form.numericUpDown({name: "formClassesLC", label: "Number of classes", default_value: 2, increment: 1, maximum:100, minimum: 1,
                        prompt: "Specify the number of classes in the model"});
if (is_hb) // separate controls for HB and LC so classes are not carried over
    form.numericUpDown({name: "formClassesHB", label: "Number of classes", default_value: 1, increment: 1, maximum:100, minimum: 1,
                        prompt: "Specify the number of classes in the model"});
form.comboBox({label: "MaxDiff logit", alternatives: ["Tricked Logit", "Rank-Ordered Logit with Ties"], name: "formModel",
               default_value: "Tricked Logit", prompt: "Select MaxDiff logit treatment"});
form.numericUpDown({name: "formCV", label: "Questions left out for cross-validation", default_value: 0, increment: 1,
                    maximum:100, minimum: 0, prompt: "Specify number of questions to leave out"});
form.numericUpDown({name: "formSeed", label: "Seed", default_value: 123, 
                        prompt: "The random seed", minimum: -999999999, maximum: 999999999,
                              increment: 1});
if (is_hb)
{
    form.numericUpDown({name: "formIterations", label: "Iterations", default_value: 100, increment: 10,
                        maximum:1000000, minimum: 1, prompt: "Specify number of hierarchical Bayes iterations"});
    form.numericUpDown({name: "formChains", label: "Chains", default_value: 8, increment: 1,
                        maximum:1000, minimum: 1, prompt: "Specify number of chains"});
    form.numericUpDown({name: "formMaxTreeDepth", label: "Maximum tree depth", default_value: 10, increment: 1,
                        maximum:1000, minimum: 1, prompt: "Specify maximum tree depth (only change if necessary)"});
    form.numericUpDown({name: "formAdaptDelta", label: "Adapt delta", default_value: 0.8, increment: 0.001,
                        maximum:0.999, minimum: 0.001, prompt: "Specify adapt delta (only change if necessary)"});
}
if (is_lc)
{
    form.numericUpDown({name: "formNStarts", label: "Number of starts", default_value: 1, 
                        prompt: "Number of times to start LCA", minimum: 1, maximum: 1000000, 
                        increment: 1});
}

library(flipMaxDiff)
df <- if (formType == "Varying Coefficients" && exists("formCovariates")) {
    names(formCovariates) <- sapply(formCovariates, function(x) attr(x, "label"))
    formCovariates
}else
    NULL

if (formType == "Hierarchical Bayes") {
    frml <- QFormula(~formCovariates)
    dat <- QDataFrame(formCovariates)
    if (!length(dat))
        frml <- dat <- NULL
}else
    frml <- dat <- NULL

# Read the design from the URL once, failing with a clear message if it
# cannot be parsed, then drop rows that are entirely missing.
if (exists("formURL")) {
    md.design <- try(read.csv(formURL))
    if (inherits(md.design, "try-error") || NCOL(md.design) <= 1L)
        stop("The provided Design URL does not contain a valid design file. ",
             "The URL needs to be a downloadable link containing a file that ",
             "can be read in CSV format. Please edit the link and try again.")
    keep <- apply(md.design, 1, function(x) !all(is.na(x)))
    md.design <- md.design[keep, ]
}

max.diff <- FitMaxDiff(design = if(exists("formDesign")) formDesign else if (exists("formURL")) md.design,
    version = if (is.null(formVersion))  rep(1, length(formBest[[1]])) else formVersion,
    best = if (exists("formBest")) formBest,
    worst = if (exists("formWorst")) formWorst,
    design.alternatives = if (exists("formAlternatives")) formAlternatives,
    design.version = if (exists("formDesignVersion")) formDesignVersion,
    alternative.names = if (exists("formLabels") && !any(dim(formLabels) == c(0, 0)) && !all(dim(formLabels) == c(1, 1))) formLabels else NULL,
    n.classes = get0("formClassesLC", ifnotfound = 1) * get0("formClassesHB", ifnotfound = 1),
    subset = if (all(QFilter)) NULL else QFilter,
    weights = QPopulationWeight,
    missing = formMissing,
    characteristics = df,
    cov.formula = frml, cov.data = dat,
    lc = if (exists("formLC")) formLC else TRUE,
    output = "Parameters",
    tasks.left.out = formCV,
    is.tricked = if(exists("formModel")) formModel == "Tricked Logit" else TRUE,
    algorithm = if (formType == "Hierarchical Bayes") "HB-Stan" else "Default",
    hb.iterations = if(exists("formIterations")) formIterations else 0,
    hb.chains = if(exists("formChains")) formChains else 0,
    hb.max.tree.depth = if(exists("formMaxTreeDepth")) formMaxTreeDepth else 0,
    hb.sigma.prior.rate = 1,
    hb.sigma.prior.shape = 1,
    hb.adapt.delta = if(exists("formAdaptDelta")) formAdaptDelta else 0,
    seed = formSeed,
    lc.n.starts = if (exists("formNStarts")) formNStarts else 1)

See Also

See MaxDiff Software or What is MaxDiff? for an overview of key MaxDiff concepts and resources.