Using R in Q


Analyses can be conducted in Q using the R Language. When you use R from within Q, you are using "pure" R: all the functions are written in R. Any R code is automatically sent over the internet to a server with a normal version of R installed on it. The results are then sent back and presented to you in Q. While we have attempted to make it feel like Q and R are one and the same, in reality they are completely different programs which "talk" to each other.

How to use R from within Q

There are a number of ways of using R from within Q:

  1. Entering R code directly into Q in R Outputs.
  2. Creating R Variables.
  3. Creating new Data Sets using R.
  4. Accessing the R functions using menus and forms. This is how most advanced analyses are conducted in Q (e.g., regression, principal components analysis). This is referred to as Standard R.
  5. Including R code in QScripts. QScript is Q's automation language.
  6. Automatic updating. Any R code that is created via R Outputs, Standard R, or QScripts can be set to automatically update when the inputs change (e.g., if the input data changes, if a new data file is created, or if other options are changed).

R Outputs

An R Output is an item in the Report Tree that contains both some R code and the result of the R code.

Creating an R Output

  1. Select Create > R Output (or right-click on an item in the Report Tree).
  2. Enter instructions in the R Code box within Properties, on the right-hand side of the screen (in the Object Inspector). These instructions need to be written in the R Language.
  3. Press the Calculate button. This sends the instructions over a secure internet connection to a computer in the cloud. The result is sent back to Q and shown on your screen (i.e., the result is the R Output). The output will typically be a table, chart, text string, or an error message.

In the example below, a histogram of 9 numbers is created. (If you are not familiar with the R Language, refer to Learning the R language.)

RinQExample.png
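
As a rough text equivalent of the screenshot, the following is a minimal sketch of the kind of code that could be typed into the R Code box. The nine values, the title, and the axis label are made up for illustration and are not taken from the screenshot.

```r
# Nine illustrative numbers (hypothetical values chosen for this sketch).
x <- c(2, 3, 3, 4, 5, 5, 5, 6, 8)

# hist() is the base R function for drawing a histogram.
hist(x, main = "Histogram of 9 numbers", xlab = "Value")
```

The code uses only base R, so it should behave the same way in any normal R installation.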

References to variables and questions

R code can refer to both variables and questions, by typing either the Variable Name or Question Name into the R Code box. See Data Sets in R for more information.

References to tables

New tables and other types of outputs (strings, charts, variables, data files) can be created by manipulating tables. The example below creates a new table as the ratio of two existing tables, removes a few rows, and creates a chart. This chart will automatically update whenever the inputs change (e.g., if the data file is updated, or the questions are re-coded).

ManipulatingTablesinQ.png
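
The steps described above (ratio of two tables, dropping rows, charting) can be sketched in base R as follows. The two matrices are stand-ins so that the code runs outside Q; in Q, table.1 and table.2 would instead be the Reference Names of existing tables. All names, row labels, and values here are hypothetical.

```r
# Stand-ins for two existing Q tables (hypothetical names and values).
labels <- list(c("Coke", "Pepsi", "Other", "NET"), c("Male", "Female"))
table.1 <- matrix(c(10, 20, 30, 60, 5, 15, 25, 45), nrow = 4, dimnames = labels)
table.2 <- matrix(c(20, 40, 60, 120, 10, 30, 50, 90), nrow = 4, dimnames = labels)

# New table: element-by-element ratio of the two tables.
ratio <- table.1 / table.2

# Remove a few rows by name, keeping the result as a matrix.
ratio <- ratio[!rownames(ratio) %in% c("Other", "NET"), , drop = FALSE]

# Chart the result: one group of bars per remaining row, one bar per column.
barplot(t(ratio), beside = TRUE, legend.text = colnames(ratio),
        main = "Ratio of table.1 to table.2")
```

Removing rows by name rather than by position is a design choice in this sketch: it keeps the code valid if rows are later added to or reordered in the input tables.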

Every table has a Reference Name, which can be viewed and changed by right-clicking on the table and selecting Reference Name.... Most of the time, the Reference Name is the same as the name shown in the Report Tree. Where a table's reference name is the same as the name of a variable, question, or R Output, you can disambiguate using QTables$reference.name (e.g., QTables$table.2).

ReferenceName.PNG

References to other R Outputs

Like a table, an R Output has both a Reference Name and a Name. The Reference Name must be unique within a project. The Reference Name is used to refer to other R Outputs in code. For example, if one R Output has a reference name of x, the code x * 2 in another R Output will show the value of x multiplied by 2.

There are a number of ways of changing the Reference Name of an R Output:

  1. By changing the Reference Name in the Object Inspector (Properties > General).
  2. By right-clicking on an R Output in the Report Tree and selecting Reference Name....
  3. By changing the Name, if the Name and the Reference Name are the same and there are no other R Outputs with the same Reference Name.
  4. By assigning a variable name in the last line of code. For example, the following code creates an R Output with a Reference Name of dog containing the string (or, in R parlance, character) Sherlock:
dog <- "Sherlock"
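
To illustrate how one R Output can refer to another by Reference Name, here is a minimal sketch. Outside Q there are no separate R Outputs, so the first assignment simply stands in for an R Output whose Reference Name is x; the values and the name doubled are hypothetical.

```r
# Stand-in for a first R Output with the Reference Name x.
x <- c(1, 2, 3)

# Code of a second R Output: it refers to the first by its Reference Name.
# Because the last line is an assignment, doubled would be this
# R Output's Reference Name in Q.
doubled <- x * 2
```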

Avoiding ambiguous reference names

There are situations where two things may have the same Reference Name. For example:

  • A table and a variable may both have the Reference Name of Q2.
  • An R Output and a table may both have the reference name brand.health.

Where this occurs, any R code that refers to the non-unique name needs to be disambiguated by using a Fully Qualified Name:

Object type | Syntax | Example
R Outputs | QROutputs$item.name | QROutputs$r.output.3
Tables | QTables$item.name | QTables$age.by.gender.3
Variables | Colas.sav$variable.name or Colas.sav$Variables$variable.name | Colas.sav$d1 or Colas.sav$Variables$d1
Questions | Colas.sav$question.name, Colas.sav$Questions$question.name or Colas.sav$VariableSets$variable.set.name | Colas.sav$Age, Colas.sav$Questions$Age or Colas.sav$VariableSets$Age
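
The following sketch shows why the prefix removes the ambiguity, using the brand.health example from above. In Q, QTables and QROutputs are provided for you; here they are imitated with ordinary R lists so the code can run anywhere, and their contents are made up.

```r
# Stand-ins for the objects Q provides (hypothetical contents).
QTables <- list(brand.health = matrix(1:4, nrow = 2))
QROutputs <- list(brand.health = "Summary text produced by an R Output")

# The same item name resolves to different objects depending on the prefix.
the.table <- QTables$brand.health
the.output <- QROutputs$brand.health

# The first is a table (matrix), the second a string (character).
class(the.table)
class(the.output)
```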

R Variables

An R Variable is a variable in a Data Set, created as follows:

  1. Select Create > Variables and Questions > Variable(s) > R Variable...
  2. Enter code written in the R Language in the R Code box. This code should create a vector, table, or data frame with the same number of observations as in the data file.
  3. Run the code (F5). If you provide variable or column names, these will be the Labels for the variables when they are created.
  4. Enter the Variable Base Name. Where your code only creates a single variable, this will be the name of that variable. Otherwise, the new variable names will be whatever you enter here, followed by an underscore and a number (e.g., dog_2).
  5. Enter a name for the Question.
  6. Press Add R Variable.

RVariable.png
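
As a minimal sketch of code that meets the requirement in step 2, the example below builds both a single new variable (a vector) and several new variables at once (a data frame). The variable age and its six values are a hypothetical stand-in for a variable in a data file with six observations; in Q you would refer to an existing variable by its Variable Name instead.

```r
# Stand-in for an existing variable in a data file with 6 observations.
age <- c(23, 35, 41, 58, 19, 64)

# One new variable: a vector with one value per observation.
age.group <- ifelse(age < 40, "Under 40", "40 or more")

# Several new variables: a data frame with one row per observation.
# As described in step 3, the column names would become the Labels.
new.variables <- data.frame("Age squared" = age^2,
                            "Age centred" = age - mean(age),
                            check.names = FALSE)

# Guard against a result whose length does not match the data file.
stopifnot(length(age.group) == length(age),
          nrow(new.variables) == length(age))

# The final expression is the result of the code.
new.variables
```

Setting check.names = FALSE keeps the spaces in the column names, which is useful if the names are meant to serve as readable labels.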

R Data Sets

Data Sets can be added to a project using R: File > Data Sets > Add to Project > From R. See R Data Sets for more information.

QScript

QScript is Q's macro language, which is used for automation. Many of the menu items in Q are written in QScript. Users can write their own automations using QScript. The key distinctions between QScript and R are:

  • QScript can be used for manipulating the user interface (e.g., creating dialog boxes). R cannot.
  • QScript can be used to automatically create and modify charts, tables, variables, and questions. By contrast, if you wish to create an R Variable, R Output, or R Data Set, you need to either create it manually from the menus or create it via QScript. For an example, see Regression - Diagnostic - Prediction-Accuracy Table. Note that within R Outputs you still have all the R functions for creating R data types, such as variables, vectors, and data frames. The distinction being discussed here relates to the ability to control data as shown in the Variables and Questions tab (i.e., a Data Set).
  • QScript is generally faster than R (e.g., it is better to create lots of variables in QScript than R).
  • It is much easier and faster for users to write R code than QScript. R is specifically designed for data analysis, whereas JavaScript, the language in which QScript is written, is designed to be used for many different applications. As a consequence, it can be quite unwieldy for data analysis (i.e., to use JavaScript you need more advanced coding skills and will generally need to write many more lines of code than when trying to achieve the same thing in R).

Standard R

It is possible to do just about any form of data analysis using R by writing code. Where we think analyses are likely to be used by many of our clients, we have made them available via a graphical user interface (i.e., menus and/or buttons and the like, without needing to write code). We refer to the analyses that we have made available via a graphical user interface as Standard R. The R Logo (i.e., RLogo.png) is used to mark menu items that use Standard R. See Standard R for more information about how Standard R items work and are created.

Updating

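R code is automatically re-run whenever its inputs change (e.g., if the input data changes, if a new data file is created, or if other options are changed).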

An R Item is a block of code written in the R Language.

When multiple R Outputs are selected, a table displaying the status of the selected R Items will be shown:

Multiple r items.png

R Items that require updating will be greyed out. There are two buttons above the table:

  • Update all these items updates all the selected R Items, regardless of whether they require updating.
  • Update grey items updates only the selected R Items that are greyed out.