How to Control Which Categories Are Used When Computing Percentages and Averages on Tables
The categories used when computing statistics on tables are controlled via Value Attributes (see the example below). These can be edited in a variety of ways, including by:
- Right-clicking on the blue or brown row/column name on the table and selecting Values.
- Pressing the ... button in the Variables and Questions tab.
- Right-clicking on a category that you wish to exclude from a table and selecting Remove.
- Automatically when using Set Question.
- Via QScript.
Individual categories are included or excluded as follows:
- If a category is selected in the Missing Data column, it will be excluded from all calculations. (Note that if you can also see a Missing Data row, this refers to observations that are marked as missing values in the original data file.)
- If a category has its Value shown as NaN, it will be excluded from all numeric calculations (e.g., Average, Median), but will be included when computing percentages. Note that you will only see the Value column on data that can be represented numerically (e.g., it will not appear for Pick Any questions).
- If it is a Date question, there is a completely different set of options (see Setting Time Periods for Date Questions).
- If you have Pick Any or Pick Any - Grid questions, there is a column called Count This Value, which dictates the numerator when computing percentages. In the example below, there are six unique categories in the data file: Like, Love, Neither like nor dislike, Hate, Dislike and Missing data, and the settings shown tell us that:
- Anybody with Missing data is excluded from any calculations.
- The analysis will count up the number of people that have selected either Like or Love. That is, this number will be the numerator in any calculations of percentages (i.e., the bit that goes above the line in a fraction).
- The base used in calculating percentages consists of everybody except those people that have Missing data. Thus, this particular example will compute Top 2 Box percentage scores (i.e., the proportion of people that said Like or Love from amongst all those people that selected one of the five categories).
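The arithmetic behind the Top 2 Box example can be sketched in Python. This is only an illustration of the calculation described above, not how the software itself is implemented; the response data, the function name top_box_percentage, and the two sets standing in for the Missing Data and Count This Value columns are all hypothetical.

```python
# Hypothetical responses; "Missing data" is the category ticked in Missing Data.
responses = [
    "Like", "Love", "Hate", "Missing data", "Dislike",
    "Love", "Neither like nor dislike", "Missing data", "Like", "Hate",
]

missing = {"Missing data"}   # stands in for the Missing Data column
counted = {"Like", "Love"}   # stands in for the Count This Value column


def top_box_percentage(responses, counted, missing):
    """Percentage of non-missing respondents who chose a counted category."""
    # The base is everybody except respondents in a missing category.
    base = [r for r in responses if r not in missing]
    if not base:
        return float("nan")
    # The numerator is the number of respondents in a counted category.
    numerator = sum(1 for r in base if r in counted)
    return 100.0 * numerator / len(base)


print(top_box_percentage(responses, counted, missing))
```

With these made-up responses, the two Missing data respondents are dropped, leaving a base of eight, of whom four selected Like or Love.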
Excluding categories when computing percentages
Pick One and Pick One - Multi questions
Right-click on the category you wish to exclude and select Remove. This causes the table to be recomputed with this category removed. You can see which categories have been removed by right-clicking on a category, selecting Values, and reviewing or editing the selections in the Missing Data column.
Alternatively, to remove a category from the table without affecting the calculations on a table, you can right-click the category and select Hide. This removes the category label from the table but does not change any of the Missing Data selections.
To undo the removal of categories from a table you can right-click the table and select Revert, or select Values and make changes to the selections in the Missing Data column.
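The difference between Remove and Hide can be illustrated with a small Python sketch. It assumes a simple Pick One question with made-up data; the function percentages and its removed and hidden parameters are hypothetical names used only to mimic the behavior described above.

```python
from collections import Counter

# Hypothetical Pick One responses.
responses = ["Yes", "Yes", "Yes", "No", "Don't Know"]


def percentages(responses, removed=(), hidden=()):
    """Category percentages, mimicking Remove (changes base) and Hide (display only)."""
    # Removed categories are dropped from the base, so percentages are recomputed.
    base = [r for r in responses if r not in removed]
    counts = Counter(base)
    # Hidden categories stay in the base but are not shown in the result.
    return {c: 100.0 * n / len(base) for c, n in counts.items() if c not in hidden}


print(percentages(responses, removed={"Don't Know"}))  # base of 4 respondents
print(percentages(responses, hidden={"Don't Know"}))   # base of 5 respondents
```

In this sketch, removing Don't Know changes the percentages for Yes and No because the base shrinks, whereas hiding it leaves them as they were.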
Excluding categories from averages
Number, Number - Multi, and Number - Grid questions, and Statistics - Right and Statistics - Below
In some cases you may wish to keep a category showing in the table but remove its contribution to the Average, Sum, or other numerical statistics that are displayed in Statistics - Right or Statistics - Below. For example, a rating scale question may include a Don't Know category, and you may want to see the number of respondents who selected this category without those respondents contributing to the calculation of the average score for the question.
To achieve this, right-click on your table and select Values, and enter a value of NaN in the Value column for that category. NaN stands for Not a Number, and a category with this value will not be used in the calculation of the average.
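As a rough illustration of the effect of a NaN value, the following Python sketch computes an average that skips the Don't Know category while still counting it in the percentages. The rating values, the responses, and the function names are assumptions made for the example, not part of the software.

```python
import math

# Hypothetical Value column; Don't Know has been given the value NaN.
values = {
    "Hate": 1, "Dislike": 2, "Neither like nor dislike": 3,
    "Like": 4, "Love": 5, "Don't Know": math.nan,
}

responses = ["Love", "Like", "Don't Know", "Hate",
             "Like", "Don't Know", "Neither like nor dislike", "Like"]


def average_excluding_nan(responses, values):
    """Average of the category values, skipping categories valued NaN."""
    numeric = [values[r] for r in responses if not math.isnan(values[r])]
    return sum(numeric) / len(numeric) if numeric else math.nan


def percentage(responses, category):
    """Percentage of all respondents in a category (NaN categories still count)."""
    return 100.0 * responses.count(category) / len(responses)


print(average_excluding_nan(responses, values))  # based on the 6 numeric responses
print(percentage(responses, "Don't Know"))       # based on all 8 responses
```

Here the two Don't Know respondents still appear in the percentages, but the average is computed from the remaining six ratings only.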