Statistical Assumptions
Related Online Training modules | |
---|---|
Type 1 Error | |
Generally it is best to access online training from within Q by selecting Help > Online Training |
Statistical Assumptions can be set for an entire project in Edit > Project Options > Customize and for selected tables and charts in Edit > Table Options. The Table Options Statistical Assumptions show as blank. This is because unless you specifically set these options, each table or chart will automatically have the assumptions that have been set for the entire project in the Project Options.
Default Statistical Assumptions
The default statistical assumption settings are outlined below. More commonly adjusted settings are highlighted in orange.
Multiple comparison correction: False Discovery Rate is applied by default, across the entire table, to help reduce the number of false positives.
Column comparisons: these settings take effect only if Column Comparisons are selected.
More detail on these settings can be found below.
Show significance
Show higher or lower significance with arrows, font colors (tables only), or using symbols to show differences between columns. Some outputs and export formats do not support all options. Show significance is new in Q 4.7. For more information see Ways of Showing Statistical Significance.
Overall significance level
The Overall significance level is used throughout Q when determining which results to show as being statistically significant. By default, this is set at 0.05. It is applied by Q when determining which results to highlight as being significant on tables and charts, whether or not to show symbols for Column Comparisons and when using Smart Tables. The precise meaning of the Overall significance level is determined by other Statistical Assumptions settings (see Interpretation of the Overall Significance Level by Q).
Minimal sample size for testing
Where cells have sample sizes less than this value, no test is conducted during automated tests of statistical significance between cells (i.e., Cell Comparisons and Column Comparisons). By default this is set to 2.
When Weights and significance is set to Kish approximation or has been specified by the user, the Effective Sample Size is used instead of the actual sample size.
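As an illustration of the effective sample size mentioned above, the following is a minimal sketch of Kish's formula, (sum of weights)² / (sum of squared weights). The function names and the `is_testable` helper are hypothetical and are not part of Q; the sketch only shows how uneven weights reduce the sample size that would be compared with the minimum.

```python
def kish_effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    if total_sq == 0:
        return 0.0
    return total * total / total_sq


def is_testable(weights, minimum=2):
    """Hypothetical helper: a cell is tested only if its effective
    sample size is not less than the minimum (2 by default)."""
    return kish_effective_sample_size(weights) >= minimum


equal = [1.0, 1.0, 1.0, 1.0]
uneven = [3.0, 1.0, 1.0, 1.0]
print(kish_effective_sample_size(equal))   # 16 / 4 = 4.0
print(kish_effective_sample_size(uneven))  # 36 / 12 = 3.0
```

With equal weights the effective sample size equals the actual sample size; the more the weights vary, the smaller it becomes.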
Statistical tests for categorical and numeric data
You can control the type of test conducted by Q when testing proportions, means and correlations. The options for controlling these tests are described in this section.
Proportions
Related Online Training modules | |
---|---|
Population Weights | |
Non-Proportional Sampling Weights | |
The consequence of this setting depends upon the data being viewed. See:
- One Sample Tests - Proportions
- Independent Sample Tests - Comparing Two Proportions
- Related Samples Tests - Comparing Two Proportions
- ANOVA-Type Tests - Comparing Three or More Groups
- Testing the Complement of a Cell.
Proportions Bessel's correction
Applies Bessel's correction when computing the variance.
Means
This setting applies to tables involving means (i.e., tables that, by default, show the Average statistic).
The consequence of this setting depends upon the data being viewed. See:
- One Sample Tests - Means
- Independent Sample Tests - Comparing Two Means
- Related Samples Tests - Comparing Two Means
- ANOVA-Type Tests - Comparing Three or More Groups
- Multivariate Tests
Means Bessel's correction
Applies Bessel's correction when computing the variance.
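For reference, Bessel's correction divides the sum of squared deviations by n - 1 rather than n. The sketch below shows the difference on a small set of values; the `variance` function and its `bessel` flag are illustrative names, not Q settings or Q's internal code.

```python
def variance(values, bessel=True):
    """Variance of values, dividing by n - 1 (Bessel's correction) or by n."""
    n = len(values)
    if n < 2 and bessel:
        raise ValueError("Bessel's correction needs at least two values")
    mean = sum(values) / n
    sum_sq = sum((x - mean) ** 2 for x in values)
    return sum_sq / (n - 1 if bessel else n)


scores = [2.0, 4.0, 4.0, 6.0]
print(variance(scores, bessel=False))  # 8 / 4 = 2.0
print(variance(scores, bessel=True))   # 8 / 3, slightly larger
```

The corrected variance is always the larger of the two, and the gap shrinks as the sample size grows.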
Correlations
This setting determines how the correlations are computed. See Correlations - Comparing Two Numeric Variables.
Equal variance in tests when sample size is less than
This setting determines whether to assume variances are equal or unequal when conducting t-tests and z-tests of means from independent samples involving unweighted data.
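The practical difference between the two assumptions can be sketched with the standard textbook formulas: a pooled-variance t-test when variances are assumed equal, and Welch's approximation when they are not. This is an illustration under those assumptions only. The function name is hypothetical, and the rule Q uses to choose between the two based on sample size is not reproduced here.

```python
import math


def two_sample_t(mean1, var1, n1, mean2, var2, n2, equal_variance):
    """Return (t statistic, degrees of freedom) for two independent samples."""
    if equal_variance:
        # Pooled variance: one shared estimate, df = n1 + n2 - 2.
        df = n1 + n2 - 2
        pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
        se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    else:
        # Welch: separate variances, Welch-Satterthwaite degrees of freedom.
        a, b = var1 / n1, var2 / n2
        se = math.sqrt(a + b)
        df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return (mean1 - mean2) / se, df


# Illustrative numbers: the second group is smaller and more variable.
print(two_sample_t(5.0, 4.0, 10, 3.0, 16.0, 5, equal_variance=True))
print(two_sample_t(5.0, 4.0, 10, 3.0, 16.0, 5, equal_variance=False))
```

When the two groups have the same size and variance, both versions give the same t statistic and degrees of freedom; they diverge as the groups become more unbalanced.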
Cell comparisons
Multiple comparison correction
Selects the multiple comparison correction employed when determining which cells in a table are or are not shown as significant.
See Multiple Comparisons (Post Hoc Testing) for more information. By default, Q applies these corrections to the entire table simultaneously, but by checking Within row and span they are applied within each span within each row (e.g., if you have a table showing brand preference in the rows and age and gender in the columns, the corrections will be applied within age in each row and within gender in each row).
Within row and span
By default, Q applies these corrections to the entire table simultaneously, but by checking this box, they are applied within each span within each row. See the diagram beneath ANOVA-Type Test for an understanding of how having this option checked results in the comparisons being grouped.
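To make the difference between the two scopes concrete, the sketch below applies a False Discovery Rate correction either to every test in a table at once or separately within each span. It assumes the Benjamini-Hochberg procedure for the correction; the function names and the p-values are illustrative, and this is not Q's implementation.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def correct_whole_table(groups):
    """Correct all tests together (the default scope described above)."""
    names = list(groups)
    flat = [p for name in names for p in groups[name]]
    adjusted = iter(benjamini_hochberg(flat))
    return {name: [next(adjusted) for _ in groups[name]] for name in names}


def correct_within_groups(groups):
    """Correct each span separately (Within row and span checked)."""
    return {name: benjamini_hochberg(ps) for name, ps in groups.items()}


# One row of a table with an age span (3 tests) and a gender span (2 tests).
row = {"age": [0.01, 0.04, 0.03], "gender": [0.02, 0.20]}
print(correct_whole_table(row))
print(correct_within_groups(row))
```

Because each span contains fewer tests than the whole table, correcting within spans generally penalizes each p-value less.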
Weights and significance
Determines how Q deals with weights when computing significance.
- Automatic: A mixture of Taylor Series Linearization and Kish's Effective Sample Size Formula. Information about how the design effect is taken into account for specific tests can be found in the description of the actual tests (see Tests Of Statistical Significance).
- Taylor series linearization
- Kish approximation: See Kish's Effective Sample Size Formula.
- Set to: Enter a known design effect.
- deff=1. Prior to Q4.10, this was called Set to.
- deff = Sample size / sum of weights. Introduced in Q4.10.
- Unweighted sample size in tests. Introduced in Q4.10.
These are discussed in more detail in Weights, Effective Sample Size and Design Effects.
Date questions
When charting data from Date questions, you can specify whether significance tests compare to the previous period (Compare to previous period) or to the rest of the data (Compare to rest of data).
Significance levels and appearance
The symbols used to denote different levels of statistical significance.
By default, Q specifies ten different levels of significance on a chart. You can add or remove levels by pressing the corresponding symbols.
Only those levels that are less than or equal to the nominated Overall significance level are used. For example, by default the 0.5, 0.2 and 0.1 levels are not shown (as the Overall significance level is set at 0.05).
How the specific rules are applied depends upon whether Cell Comparisons (e.g., arrows and font color) or Column Comparisons (e.g., letters) are used to signify significance. With Cell Comparisons, the length of the arrows is determined by the Corrected p. With Column Comparisons the uncorrected p-value (p) is used.
You can modify the length of arrows, the font sizes, and the colors used to highlight fonts and arrows. Whether or not arrows, font sizes and colors appear on a particular chart or slide is determined by the Table Styles settings (see Ways of Showing Statistical Significance and How Q Highlights Results as Being Significant).
The column entitled Column letters displays the symbols used to indicate whether there is a significant difference between columns (to be seen, you need to right-click on a table and select Statistics - Cells and Column Comparisons). By default, Q uses lowercase letters to show significant results where the p-value is more than 0.001 and uppercase letters where the p-value is less than or equal to 0.001. Other characters can be entered here, and should be separated by commas. For example, the default characters are entered as "a,b,c,d,e,...,z". When there are more columns in the table than the number of characters that are entered here, Q will repeat the characters where necessary and add a number for each repetition.
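The repetition rule can be sketched as follows. The exact label format Q produces when it runs out of characters is not specified here, so the sketch assumes the repetition number is appended directly to the character (e.g., a1); the function names are hypothetical.

```python
import string


def parse_symbols(text):
    """Split a comma-separated entry such as 'a,b,c' into symbols."""
    return [s.strip() for s in text.split(",") if s.strip()]


def column_labels(n_columns, symbols=None):
    """Assign one symbol per column, reusing symbols with a numeric suffix
    once they run out (assumed format: a, b, ..., z, a1, b1, ...)."""
    symbols = symbols or list(string.ascii_lowercase)
    labels = []
    for i in range(n_columns):
        repeat, pos = divmod(i, len(symbols))
        labels.append(symbols[pos] if repeat == 0 else f"{symbols[pos]}{repeat}")
    return labels


print(column_labels(5, parse_symbols("a,b,c")))  # ['a', 'b', 'c', 'a1', 'b1']
```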
How the table works
For each statistical test that is conducted in your tables, Q will use these settings to work out which colors, arrow lengths, or letters to show for that test.
If the result for the cell (the Corrected p statistic) is smaller than the Overall Significance Level, Q will look down the rows of the table and check if the Corrected p is smaller than the Cutoff p-value. It will stop at the last row where the Corrected p is smaller than the Cutoff p-value. Q will then use the settings in that row to display the result. For example, if the Corrected p in your cell is 0.04, Q will look down the table and use the settings for the row 0.05, because 0.04 is smaller than 0.05, but it is not smaller than the next row of the table which is 0.01.
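The lookup described above can be sketched as a small function. The cutoffs and style names in `ROWS` are illustrative placeholders rather than Q's actual defaults, and the function name is hypothetical; the comparison is strict because the text describes the Corrected p as being smaller than each cutoff.

```python
def pick_display_row(corrected_p, overall_level, rows):
    """Return the (cutoff, style) row used to display a result, or None.

    rows is a list of (cutoff p-value, style) pairs. Rows with a cutoff
    above the overall significance level are never used.
    """
    if corrected_p >= overall_level:
        return None  # not significant, so nothing is highlighted
    chosen = None
    # Look down the table from the largest cutoff to the smallest and
    # keep the last row whose cutoff is still larger than the p-value.
    for cutoff, style in sorted(rows, key=lambda r: r[0], reverse=True):
        if cutoff > overall_level:
            continue
        if corrected_p < cutoff:
            chosen = (cutoff, style)
    return chosen


# Illustrative table; the styles are made-up labels.
ROWS = [
    (0.5, "unused"), (0.2, "unused"), (0.1, "unused"),
    (0.05, "short arrow"), (0.01, "medium arrow"), (0.001, "long arrow"),
]

print(pick_display_row(0.04, 0.05, ROWS))  # (0.05, 'short arrow')
```

A Corrected p of 0.04 selects the 0.05 row because it is smaller than 0.05 but not smaller than 0.01, matching the example in the text.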
Column comparisons
Multiple comparison correction
Selects the multiple comparison correction applied when conducting post hoc tests for Column Comparisons.
See Multiple Comparisons (Post Hoc Testing) for more information. By default, Q applies these corrections to the entire table simultaneously, but by checking Within row and span they are applied within each span within each row (e.g., if you have a table showing brand preference in the rows and age and gender in the columns, the corrections will be applied within age in each row and within gender in each row).
Overlaps
Deals with the treatment of overlapping columns. This option is intended for use when replicating results from other programs. This modification only has an effect on a limited number of the available tests (and, in particular, it cannot, in general, be used to switch on and off dependent tests).
This option is applicable to crosstabs containing numeric or categorical data in the rows and categorical data in the columns (i.e., not to grid questions). By default, when conducting column comparisons with such data, Q excludes any overlapping sample. For example, if a table compares Coke buyers with Pepsi buyers (in the columns), the tests automatically filter out people who buy both brands, and thus compare Coke buyers who do not buy Pepsi with Pepsi buyers who do not buy Coke. This occurs when either Default or Exclude is selected (except for Quantum or Survey Reporter Means/Proportions). If you change the setting to Independent, Q treats the samples as entirely independent and makes no adjustment for the overlap. When Dependent is selected, Q conducts a dependent test. Note that for Quantum or Survey Reporter Means/Proportions, dependent tests are used by default (i.e., when Default is selected for the overlaps setting).
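The effect of excluding the overlap in the Coke and Pepsi example can be sketched with sets of respondent IDs. The IDs and the function name are made up for illustration; the sketch shows only which respondents would remain in each column before a test is run, not the test itself.

```python
def exclude_overlap(column_a, column_b):
    """Drop respondents who appear in both columns, keeping the rest."""
    a, b = set(column_a), set(column_b)
    return a - b, b - a


coke_buyers = {1, 2, 3, 4}
pepsi_buyers = {3, 4, 5}

coke_only, pepsi_only = exclude_overlap(coke_buyers, pepsi_buyers)
print(sorted(coke_only))   # [1, 2]
print(sorted(pepsi_only))  # [5]
```

Respondents 3 and 4 buy both brands, so they are removed from both groups and the comparison is between Coke-only and Pepsi-only buyers.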
This option should generally only be modified in Table Options for specific tables, and is only provided for use when replicating results from other programs (e.g., when the tests for Proportions and/or Means are set to Quantum Proportions, Quantum Means, Survey Reporter Proportions or Survey Reporter Means).
The table below shows the tests used for Quantum and Survey Reporter overlaps settings:
Setting | Independent Samples | Dependent Samples |
---|---|---|
Quantum Proportions | Independent Samples - Quantum Column Proportions Test | Dependent Samples - Quantum Column Proportions Test |
Quantum Means | Independent Samples - Quantum Column Means Test | Dependent Samples - Quantum Column Means Test |
Survey Reporter Proportions | Independent Samples - Survey Reporter Column Proportions Test | Dependent Samples - Survey Reporter Column Proportions Test |
Survey Reporter Means | Independent Samples - Survey Reporter Column Means Test | Dependent Samples - Survey Reporter Column Means Test |
Within row and span
By default, Q applies these corrections within a row and span, but by un-checking this box, they are applied to the entire table (where statistical tests can be computed). See the diagram beneath ANOVA-Type Test for an understanding of how having this option checked results in the comparisons being grouped.
ANOVA-Type Test
When this is checked, a test is conducted within the span in the row of the table prior to performing the multiple comparison correction. For example, the dashed boxes below show the groups of cells that are tested with an ANOVA-Type test. If that test is not significant, then all the valid comparisons in that span are also shown as not significant (i.e., no letters are shown). This option can only be checked when Within row and span in Column comparisons is checked.
See ANOVA-Type Tests for more information on how Q conducts such tests.
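The gating behavior can be sketched as follows. This is a simplified illustration, not Q's implementation: it assumes the ANOVA-Type test is judged against a single significance level (0.05 here), and the function name and inputs are hypothetical.

```python
def gate_by_omnibus(omnibus_p, pairwise, level=0.05):
    """Suppress all pairwise results in a span when the span-level
    ANOVA-Type test is not significant.

    pairwise maps (column_a, column_b) to True/False significance flags.
    """
    if omnibus_p < level:
        return dict(pairwise)  # keep the pairwise results as they are
    return {pair: False for pair in pairwise}  # no letters are shown


span = {("A", "B"): True, ("A", "C"): False, ("B", "C"): True}
print(gate_by_omnibus(0.30, span))  # every comparison becomes False
print(gate_by_omnibus(0.01, span))  # pairwise results are unchanged
```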
Show redundant tests
Causes the symbols indicating significance to be shown for both columns. If not checked, then only the column containing the higher value is marked as significant. Note that the higher value is the higher value used in the actual test and this may differ from the number shown on the table in situations where there is missing data or overlapping columns.
Show as groups
Causes the symbols to indicate groups of columns that are not statistically different (as opposed to highlighting differences). This is sometimes referred to as common lettering.
See Planned ANOVA-Type Tests.
Recycle column letters
Re-uses the same column letters within each span (e.g., so the first column within a span is always A, etc.).
No test symbol
This symbol is used when a test was not performed, either because the setting for Comparisons did not request a test or because a test would not be appropriate (e.g., due to the sample sizes being too small or due to the cell containing column totals). By default a dash is shown.
Symbol for non-significant test
This symbol is used when a test could not be performed (e.g., because one of the groups had no data). By default nothing is shown.