R Data Sets


An R Data Set is a Data Set that is created using R. Refer to Data Sets for a discussion of other types of data that can be imported into Q.

How to approach using R to add a data set

When writing the R code for an R Data Set you are not able to refer to other outputs or variables that exist in your Q Project. This is different to R Outputs and R Variables, where the underlying technology allows you to refer to variables and outputs throughout your project.

As a result, the R code that you use to create a data set needs to both bring in a source of data and process it into the rows and columns that you want to appear as your cases and variables in Q.

The data source can be:

  1. A file on your local machine. You can use functions like read.spss() (from the foreign package) or read.csv(). Because R treats a single backslash inside a string as an escape character, each backslash in a Windows file path must be written as a double backslash (\\); forward slashes (/) also work. An example of this approach is
    library(foreign)  # provides read.spss()
    location = "C:\\Users\\Chris\\Desktop\\Cola Tracking - January to September 2017.sav"
    datafile = read.spss(location, use.value.labels = FALSE, to.data.frame = TRUE)
  2. A file located at a URL (again using read.spss(), read.csv(), or similar). For example:
    library(foreign)  # provides read.spss()
    location = "https://wiki.q-researchsoftware.com/images/3/35/Technology_2018.sav"
    datafile = read.spss(location, use.value.labels = FALSE, to.data.frame = TRUE)
  3. Data obtained using an API of some kind (e.g. Google Analytics, Twitter, or just about anything else enabled by R).
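
Whatever the source, the code has to both read the data and shape it into cases and variables. The following is a minimal sketch of that pattern rather than a prescribed method. It writes a small CSV to a temporary file so that it runs anywhere; the column names (id, age, brand) are made up for illustration, and it assumes that the data frame left as the final value is what becomes the data set.

```r
# Write a small CSV to a temporary file so the sketch is self-contained.
# In practice, `location` would be your own file path or URL.
location = tempfile(fileext = ".csv")
writeLines(c("id,age,brand",
             "1,34,Coke",
             "2,,Pepsi",
             "3,51,Coke"), location)

# Step 1: bring in the data source.
raw = read.csv(location, stringsAsFactors = FALSE)

# Step 2: process it into the cases (rows) and variables (columns) wanted.
datafile = raw[!is.na(raw$age), ]        # drop cases with a missing age
datafile$brand = factor(datafile$brand)  # treat brand as categorical
rownames(datafile) = NULL                # renumber the remaining cases

# Assumption: the data frame left as the last value is used as the data set.
datafile
```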

How to use R to add data sets

  1. File > Data Sets > Add to Project > From R.
  2. Enter code for creating a Data Set into the R Code box. For example, the code shown earlier on this page reads in an SPSS data file using the R function read.spss from the foreign package.
  3. Press F5. This will generate a preview of the Data Set's contents.
  4. Enter a Name (at the bottom of the screen).
  5. Press Add Data Set. Q will then take you through the normal process for setting up data, as if you had imported a data file.


When to use R Data Sets

In general, R Data Sets should not be used for importing data files that can be imported into Q via File > Data Sets > Add to Project > From File, as doing so will bypass a host of tools designed to check and correct problems in data.

The main use cases for R Data Sets are:

  1. Importing unusual types of data (e.g., web-scraping).
  2. Manipulating data prior to creating a data file (e.g., merging lots of data files).
  3. Having data sets that automatically update in dashboards (see Automatically Updating R DataSets, Variables, and Outputs).
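
As an illustration of the second use case, merging many data files before they reach Q might look like the sketch below. It is one possible approach, not the only one. The folder, file names, and columns are hypothetical and are created in a temporary directory so that the code is self-contained; it also assumes every file has the same columns, which rbind requires.

```r
# Create three small CSV files in a temporary folder (hypothetical monthly waves).
folder = file.path(tempdir(), "waves_example")
dir.create(folder, showWarnings = FALSE)
for (month in c("jan", "feb", "mar")) {
    wave = data.frame(id = 1:2, month = month, score = c(7, 9))
    write.csv(wave, file.path(folder, paste0(month, ".csv")), row.names = FALSE)
}

# List every CSV in the folder, read each one, and stack them into one data frame.
files = list.files(folder, pattern = "\\.csv$", full.names = TRUE)
merged = do.call(rbind, lapply(files, read.csv, stringsAsFactors = FALSE))

# The stacked data frame is the final value.
merged
```

Replacing folder with the path to your own files (and read.csv with read.spss where appropriate) adapts the same pattern to real data.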