Creating Binary Variables
Binary variables are variables that take only two values, for example Male or Female, True or False, and Yes or No. While many variables and questions are naturally binary, it is often useful to construct binary variables from other types of data, for example by turning age into two groups: less than 35, and 35 or more. Constructing binary variables is also known as quantizing or dichotomizing.
The uses of binary variables
When a binary variable is used as a filter, a 1 indicates inclusion of a respondent in the filter and a 0 indicates exclusion.
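The 1/0 convention can be illustrated outside of Q with a short Python sketch. The data, the variable names, and the age cut-off are all hypothetical, and this only mimics the idea of a filter; it is not how Q applies filters internally.

```python
# Hypothetical respondent data: ages and a satisfaction score (1 to 5).
ages = [22, 41, 35, 29, 58]
scores = [4, 2, 5, 3, 1]

# Binary filter variable: 1 includes the respondent, 0 excludes them.
age_35_plus = [1 if age >= 35 else 0 for age in ages]

# Apply the filter: keep only the scores of respondents whose filter value is 1.
filtered_scores = [s for s, keep in zip(scores, age_35_plus) if keep == 1]
filtered_mean = sum(filtered_scores) / len(filtered_scores)

print(age_35_plus)
print(filtered_scores)
```

Here three of the five respondents are included, so any statistic computed on filtered_scores is based on those three only.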
As intermediate variables
Often it is useful to construct binary variables for use in creating other variables. For example, if creating a new segmentation variable, the first step may be to create multiple binary variables, each representing a single segment, and then convert these into a Pick One question using Insert Ready-Made Formula(s) Menu > Pick Any -> Pick One.
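As a rough illustration of the idea of combining several binary segment variables into one categorical variable, the following Python sketch may help. The segment names, the data, and the to_single_category helper are hypothetical, and the logic (take the first segment flagged 1) is an assumption made for the sketch rather than a description of what Q's Pick Any -> Pick One formula does.

```python
# Hypothetical binary segment variables: one list per segment,
# one entry per respondent (1 = in the segment, 0 = not in it).
segments = {
    "Young families": [1, 0, 0, 0],
    "Empty nesters": [0, 1, 0, 0],
    "Singles": [0, 0, 1, 0],
}


def to_single_category(binaries, fallback=None):
    """Collapse binary variables into one categorical variable.

    Assumes the segments are mutually exclusive; if a respondent is
    flagged in several, the first segment in dictionary order wins.
    Respondents flagged in no segment receive the fallback value.
    """
    names = list(binaries)
    n_respondents = len(binaries[names[0]])
    result = []
    for i in range(n_respondents):
        chosen = next(
            (name for name in names if binaries[name][i] == 1), fallback
        )
        result.append(chosen)
    return result


segment = to_single_category(segments)
print(segment)
```

The fourth respondent belongs to no segment and so receives the fallback value, which is a reminder to check that the binary variables cover the whole sample.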
For use in statistical analysis
To most people, averages and percentages are quite different concepts: averages apply to numeric data (e.g., number of pizzas eaten in a week) and percentages relate to categories (e.g., favorite brand of pizza). From a computational perspective, however, averages and proportions are very closely related, and this relationship can be exploited in Q to save time. If you have used SPSS, you likely already understand the basic principles demonstrated in this section; if not, they may seem a bit strange at first.
If you construct binary variables by recoding or constructing numeric values so that they only take values of 0, 1 and NaN, any computed averages will also be proportions. If, for example, you have a sample with 56% males, and you recode the gender variable so that males have a value of 1 and females a value of 0, and set its Question Type to either Number or Number – Multi, the average will be 0.56. The main benefit of binary variable “maths” is that, whereas a variable that Q knows is binary will always have a NET, a numeric variable instead has a SUM. If, for example, the question measures the brands that a consumer would consider buying, the SUM measures the size of the consideration set (whereas with a traditional binary variable, the NET indicates the proportion of the sample that considers one or more brands).
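The arithmetic can be checked with a small Python sketch. The data are hypothetical: 25 respondents, 14 of them male, and a separate four-respondent brand consideration example. The names used here are illustrative and are not Q identifiers.

```python
# Hypothetical gender variable recoded so that 1 = male and 0 = female:
# 14 males out of 25 respondents, i.e. 56% male.
gender = [1] * 14 + [0] * 11

# The average of a 0/1 variable is the proportion of 1s.
proportion_male = sum(gender) / len(gender)

# Hypothetical brand consideration data: one row per respondent,
# one 0/1 column per brand (1 = would consider buying).
consider = [
    [1, 0, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]

# SUM-style statistic: the number of brands each respondent considers.
set_sizes = [sum(row) for row in consider]
average_set_size = sum(set_sizes) / len(set_sizes)

# NET-style statistic: proportion of respondents considering 1 or more brands.
net = sum(1 for row in consider if any(row)) / len(consider)

print(proportion_male, set_sizes, average_set_size, net)
```

The same 0/1 data therefore answers two different questions: how many brands people consider on average, and what share of people consider any brand at all.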
Why does this work? Because any binary variable is, by definition, also a Numeric Variable: Numeric Variables can take any numeric value, and binary variables take only the values 0 and 1, so they are a special case of Numeric Variables.
Ways of creating binary variables
Any of the ways for creating numeric variables can be used to create binary variables. However, the main approaches are:
Creating a Binary - Complicated filter
Any Filters that are created (from a cell in a table, for example) are binary variables.
Creating a binary question by changing Question Type
Binary variables can be constructed by editing the Values in the Value Attributes so that the variable takes only two values (generally, 0 and 1 are most appropriate).
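Conceptually, editing the values in this way is a recode: each original category is mapped to one of two values. A minimal Python sketch of that mapping, using hypothetical age-band codes and labels (they are not taken from any Q project), might look like this:

```python
# Hypothetical original category codes and their labels.
age_band_labels = {1: "18 to 24", 2: "25 to 34", 3: "35 to 44", 4: "45 or more"}

# Recode table: bands under 35 become 0, bands of 35 or more become 1.
recode = {1: 0, 2: 0, 3: 1, 4: 1}

# Hypothetical raw data: one age-band code per respondent.
raw_age_band = [2, 4, 1, 3, 3]

# Apply the recode to obtain a binary (0/1) variable.
age_35_or_more = [recode[code] for code in raw_age_band]

print(age_35_or_more)
```

Every original code needs an entry in the recode table; a code that is left out raises a KeyError here, which is a useful check that no category has been forgotten.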
Missing values in binary variables
In theory, binary variables should have only two values. In practice, it is often useful for them to also have missing values; in that case, the Binary - Complicated Filter and the other methods based on creating Filters tend not to be useful.
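The following Python sketch shows why the distinction matters when computing a proportion. The data are hypothetical, and the assumption that a filter-style variable codes missing respondents as 0 is made purely for illustration; it is not a statement about how Q stores filters.

```python
import math

# Hypothetical binary variable with missing values stored as NaN:
# 1 = yes, 0 = no, NaN = respondent was not asked the question.
responses = [1, 0, math.nan, 1, math.nan, 0, 1]

# Numeric treatment: drop the missing values before averaging,
# so the base is only the respondents who were asked.
valid = [r for r in responses if not math.isnan(r)]
proportion_of_valid = sum(valid) / len(valid)

# Filter-style treatment (an assumption for this sketch): anything
# that is not a 1 becomes a 0, so missing respondents stay in the base.
as_filter = [1 if r == 1 else 0 for r in responses]
proportion_of_all = sum(as_filter) / len(as_filter)

print(proportion_of_valid, proportion_of_all)
```

With missing values retained, the proportion is three out of the five respondents who answered; with missing values forced to 0, it falls to three out of seven.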