Data File Setup for Studies with Multiple Versions of the Questionnaire


It is common for tracking studies to contain slightly different versions of a questionnaire. For example, an option in a question previously labeled “New York” may be relabeled as “New York – New York”, either because the questionnaire itself changed or because of instructions about how data from the study should be prepared. Alternatively, a new brand may be added. The following principles will save a lot of time in the analysis of a study with different versions of the questionnaire:

  1. Where at all possible, any changes should be made retrospectively to the raw data file. That is, even if respondents were shown “New York”, the system should export the data as if they had been shown “New York – New York”. Q has tools that make changing labels quite straightforward. However, if the data file contains different response options for different waves of respondents, the workload required to analyse the data will increase massively.
  2. Variable names must not change. That is, if variable name Q2a means “Awareness – Coca Cola” in the initial data file, then this name should be retained forever. The variable name is used by Q to work out which data means what. Any change to a variable name will cause all analyses of the data to “break”, and fixing the Q project file to address such changes will generally be extraordinarily difficult.
  3. Where a response option is removed from a questionnaire, the treatment depends on the question type. If it belongs to a multiple response question, the variable should remain in the data file but be assigned missing value codes in waves where it did not appear. If it belongs to a single response question, the value and label should be left in the data file.
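
The principles above can be checked before a new wave is loaded. The following is a minimal sketch in Python, not a feature of Q: it assumes each wave's data file has been summarized as a dictionary mapping variable names to their value labels, and the function name `find_schema_breaks` and the sample variables are hypothetical.

```python
def find_schema_breaks(baseline, new_wave):
    """Compare a new wave's variables and value labels against the baseline.

    Both arguments are assumed to be dicts of the form
    {variable_name: {code: label}}. Returns a list of problem descriptions.
    """
    problems = []
    for name, old_labels in baseline.items():
        # Principles 2 and 3: an existing variable must keep its name
        # and must stay in the file.
        if name not in new_wave:
            problems.append(f"variable {name} is missing from the new wave")
            continue
        new_labels = new_wave[name]
        for code, old_label in old_labels.items():
            if code not in new_labels:
                # Principle 3: values and labels of removed options are retained.
                problems.append(f"{name}: code {code} ({old_label!r}) was dropped")
            elif new_labels[code] != old_label:
                # Principle 1: relabeling should be applied to all waves at once.
                problems.append(
                    f"{name}: code {code} relabeled "
                    f"{old_label!r} -> {new_labels[code]!r}"
                )
    return problems


# Hypothetical example data: Q1 is single response, Q2a is one
# variable of a multiple response question.
baseline = {
    "Q1": {1: "New York", 2: "Boston"},
    "Q2a": {0: "No", 1: "Yes"},
}
new_wave = {
    "Q1": {1: "New York – New York", 2: "Boston", 3: "Chicago"},
}

for problem in find_schema_breaks(baseline, new_wave):
    print(problem)
```

Codes that appear only in the new wave (code 3 in this example) are deliberately not reported, because adding codes is an acceptable change.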

Where a response option is added to a question (e.g., a new brand), this should involve adding a new code (value) if the question is single response and a new variable if it is multiple response. A variable from a previously deleted response option should not be re-used. Where a new variable is created, respondents from earlier waves of the study need to be assigned missing values.
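
As an illustration of the last point, the sketch below stacks two waves and fills a newly added variable with missing values for earlier respondents. It is an assumption-laden example rather than a description of how Q or any data collection system works: each wave is assumed to be a list of per-respondent dictionaries, `None` stands in for the missing value code, and `stack_waves`, `Q2a` and `Q2b` are hypothetical names.

```python
MISSING = None  # assumed stand-in for the data file's missing value code


def stack_waves(waves):
    """Combine waves into one list of rows that all share the same variables.

    `waves` is a list of waves, each a list of dicts {variable_name: value}.
    Variables absent from a respondent's wave are set to MISSING.
    """
    # Collect every variable name, in order of first appearance.
    all_vars = []
    for rows in waves:
        for row in rows:
            for name in row:
                if name not in all_vars:
                    all_vars.append(name)

    # Rebuild each row with the full variable list.
    stacked = []
    for rows in waves:
        for row in rows:
            stacked.append({name: row.get(name, MISSING) for name in all_vars})
    return stacked


# Wave 2 adds a new brand as a new multiple response variable, Q2b.
wave1 = [{"id": 1, "Q2a": 1}]
wave2 = [{"id": 2, "Q2a": 0, "Q2b": 1}]

stacked = stack_waves([wave1, wave2])
for row in stacked:
    print(row)
```

The wave 1 respondent receives a missing value for `Q2b` rather than a 0, since that respondent was never asked about the new brand.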