Visualization Options

When a visualization is created (e.g., by selecting Create > Charts > Visualization), the visualization appears in the middle of the screen and options for modifying the chart are shown in the Object Inspector. Pressing the Calculate button causes the visualization to be created and updated with any selected data and other options. Checking Automatic sets the visualization so that it automatically updates whenever any of its inputs, including data, update.

The options in the Object Inspector for a visualization are organized into three tabs: Inputs, Properties, and Chart.

Inputs

OUTPUT

Chart type governs the specific type of chart that is created.

When Show as is set to Table, the table that is used in creation of a chart is shown. This table incorporates all the other settings on this page. That is, it shows the selected DATA and the consequences of both the Chart type selection and the selections in DATA MANIPULATION.

DATA

Data source determines which data the visualization is connected to. All chart types can accept one of the following:

  • Output in 'Pages'. This can be a Q table or R output that already exists in the Report tree. This can include R outputs that are generated by analyses such as Dimension Reduction, Regression or Machine Learning.
  • Variables in 'Data'. This option allows users to directly use variables from the Data tab. Different options are available depending upon the selected Chart type. For most of the charts that accept summary data (e.g. Column or Bar charts), when this option is used, users will also be offered the option of selecting a variable for Groups, to create a contingency table (or crosstab).
  • Type or paste in data. This option allows the user to type in or paste some data (e.g., from a spreadsheet). When the data source is Type or paste in data, the following strings are treated as missing values: NA, NaN, Missing, missing, N/A, -, ., Invalid, invalid and blank data.
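
As a rough illustration of the missing-value rule for typed or pasted data, the following Python sketch maps the listed strings to a missing marker. The function name parse_cell, the whitespace trimming, and the exact (case-sensitive) matching are assumptions made for this example; Q's actual parsing may differ.

```python
# Strings listed in the text as missing values. Blank cells are handled
# separately below. Exact, case-sensitive matching is an assumption.
MISSING_STRINGS = {"NA", "NaN", "Missing", "missing", "N/A", "-", ".",
                   "Invalid", "invalid"}

def parse_cell(text):
    """Return None for a missing marker, a float for numeric text, else the text."""
    stripped = text.strip()
    if stripped == "" or stripped in MISSING_STRINGS:
        return None
    try:
        return float(stripped)
    except ValueError:
        return stripped  # non-numeric text such as a label

row = ["12", "N/A", "-", "", "3.5", "Brand A"]
parsed = [parse_cell(cell) for cell in row]
print(parsed)
```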

Additionally, you can write R code to hook up the visualization in other ways using Properties > R CODE.

DATA MANIPULATION

Aggregate the data prior to plotting creates a frequency table if one variable is selected, or computes the average when multiple variables are provided. Other types of tables can be hooked up by creating the table of interest prior to the visualization, and then setting Data source to Link to a table.
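
The two aggregation behaviours can be sketched in Python as follows. This is only an illustration: the function name aggregate is hypothetical, the example assumes that "the average" means one mean per variable, and it does not handle missing values.

```python
from collections import Counter
from statistics import mean

def aggregate(variables):
    """variables: dict mapping a variable name to its list of values."""
    if len(variables) == 1:
        # One variable: frequency table of its values.
        (values,) = variables.values()
        return dict(Counter(values))
    # Several variables: one average per variable (assumed interpretation).
    return {name: mean(values) for name, values in variables.items()}

print(aggregate({"Gender": ["M", "F", "F"]}))
print(aggregate({"Q1": [1, 2, 3], "Q2": [4, 4, 4]}))
```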

Automatically tidy the data performs a variety of cleaning operations, such as removing missing values, and changing the underlying type of the data table in order to convert it into a numeric form useful for charting. It will also simplify 1-row tables so that they are treated in the same way as a 1-column vector (i.e. a single data series).

Automatically detect row and column names Unselect this if you want to manually adjust whether or not row or column names are present in the data.

Tidy labels extracts common prefixes from the row labels or vector names to shorten the labels in the chart.

Switch rows and columns swaps the rows and columns of the underlying table being plotted. This is useful for swapping which categories are shown on the x-axis versus the legend.

Convert to percentages/proportions calculates percentages based on the input data. If your data is already in percentages, you should instead use the number formatting settings for the applicable axes/numbers in the Chart tab. If a crosstab (i.e. a grouping variable) is used, then row percentages are computed, except for heatmap and pie/donut charts, which use percentages out of the total responses.
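
The difference between row percentages and percentages out of the total can be seen in a small Python sketch. The function names are hypothetical, and the example assumes a table of counts with no empty rows.

```python
def row_percentages(table):
    """Each cell as a percentage of its own row total."""
    return [[100.0 * v / sum(row) for v in row] for row in table]

def total_percentages(table):
    """Each cell as a percentage of the grand total."""
    total = sum(sum(row) for row in table)
    return [[100.0 * v / total for v in row] for row in table]

counts = [[30, 10],
          [20, 40]]
print(row_percentages(counts))    # rows sum to 100
print(total_percentages(counts))  # whole table sums to 100
```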

Categorical as binary determines how categorical variables are treated for aggregation. When categorical as binary is selected, we count the occurrences of each category separately; otherwise, the variable is represented as sequential integers (i.e., 1 for the first category, 2 for the second, etc.).
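
A minimal Python sketch of the two encodings, using hypothetical function names and a made-up variable:

```python
def as_binary(values, categories):
    """One 0/1 indicator series per category, so occurrences can be counted."""
    return {c: [1 if v == c else 0 for v in values] for c in categories}

def as_integers(values, categories):
    """Sequential integer codes: 1 for the first category, 2 for the second, ..."""
    codes = {c: i + 1 for i, c in enumerate(categories)}
    return [codes[v] for v in values]

categories = ["Low", "Medium", "High"]
values = ["Low", "High", "High", "Medium"]

binary = as_binary(values, categories)
print({c: sum(flags) for c, flags in binary.items()})  # count per category
print(as_integers(values, categories))
```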

Date format Text in the input data that can be interpreted as dates will automatically be converted into dates when appropriate. To turn this conversion off, select No dates. Sometimes warnings will occur when the date format cannot be determined, e.g. 02/01/2017 could be January 2 2017 (International format) or February 1 2017 (US format, default). In these cases, the user can choose a particular option to clarify the desired format.
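
The ambiguity described above can be reproduced with Python's standard datetime module. The helper name possible_formats and the two format labels are illustrative only; this is not how Q itself detects dates.

```python
from datetime import datetime

def possible_formats(text):
    """Return every interpretation of a slash-separated date that parses."""
    formats = {"US": "%m/%d/%Y", "International": "%d/%m/%Y"}
    result = {}
    for name, fmt in formats.items():
        try:
            result[name] = datetime.strptime(text, fmt).date()
        except ValueError:
            pass  # not a valid date under this format
    return result

print(possible_formats("02/01/2017"))  # two readings, so the format is ambiguous
print(possible_formats("13/01/2017"))  # only the International reading is valid
```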

Hide output with small sample sizes This can only be used if the input is a Q Table containing the 'Base n' statistic. If any cell in the table has 'Base n' smaller than the sample size cut-off, then an error will be given. Otherwise the output is unchanged.

ROW/COLUMN MANIPULATIONS

Hide empty rows/columns removes rows (columns) from the data that contain entirely missing values. In the case when the data are percentages, selecting this option will also remove rows and columns that contain all zeros. One-dimensional vectors are treated as a table with one column and multiple rows.

Hide rows/columns with small sample sizes This can only be used if the input is a Q Table containing the 'Column n' or 'Base n' statistics. Rows/columns with 'Column n' or 'Base n' smaller than the sample size cut-off will be removed.

Sort rows/columns Whether to sort the input table according to the values in one of the columns/rows of the table. Sorting is performed before the last four controls for selecting and hiding rows/columns are applied.

Column/row used for sorting The name or index of the column/row to be used for sorting. If none is specified, the last column/row will be used.

Rows/columns to exclude from sorting A comma-separated list of rows/columns (either by name or indices) that should be included at the end of the table and excluded from sorting.
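
The sorting controls can be sketched in Python as below. The function name sort_rows is hypothetical, and ascending order is an assumption, since the text does not state the sort direction.

```python
def sort_rows(table, sort_column=None, exclude=()):
    """table: dict mapping a row name to its list of column values.

    Sorts rows by one column (the last column if none is given) and keeps
    the excluded rows, in their original order, at the end of the table.
    """
    col = -1 if sort_column is None else sort_column
    sortable = [name for name in table if name not in exclude]
    excluded = [name for name in table if name in exclude]
    sortable.sort(key=lambda name: table[name][col])  # ascending: an assumption
    return {name: table[name] for name in sortable + excluded}

table = {"Coke": [5, 30], "Pepsi": [7, 10], "NET": [12, 40], "Fanta": [2, 20]}
print(list(sort_rows(table, exclude=("NET",))))
```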

Order rows/columns by similarity Row/column similarity is determined by performing correspondence analysis on the input data, and ordering the rows/columns by the coordinates in the first dimension. This option is only available if sorting is not done.

Reverse rows/columns This option can be used with any combination above.

Rows/columns to show This can be either a comma-separated list of rows/columns (either by name or indices); or in Displayr, a List box or Combo box control. If none is specified (default), then all rows/columns of the table will be shown. This control can be used in combination with the two below.

Number of rows/columns from top/left to show This option can be used to specify that a range from the top/left of the input table is shown. These rows/columns will be shown as the first rows/columns in the output table. If this option is used in combination with Rows/columns to show or Number of rows/columns from bottom/right to show, then rows/columns specified multiple times will be shown multiple times.

Number of rows/columns from bottom/right to show This option can be used to specify that a range from the bottom/right of the input table is shown at the end of the output table.
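
How the three selection controls might combine is sketched below in Python. The function name select_rows is hypothetical, and the output order (top range first, then the explicit list, then the bottom range) is an assumption based on the descriptions above. The sketch does show the documented behaviour that a row specified more than once appears more than once.

```python
def select_rows(names, show=(), n_top=0, n_bottom=0):
    """names: row names in table order. Returns the rows to display."""
    if not show and n_top == 0 and n_bottom == 0:
        return list(names)  # default: show everything
    top = list(names[:n_top])
    bottom = list(names[len(names) - n_bottom:]) if n_bottom else []
    return top + list(show) + bottom  # duplicates are kept on purpose

names = ["A", "B", "C", "D", "E"]
print(select_rows(names))
print(select_rows(names, show=["B"], n_top=2, n_bottom=1))
```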

Rows/columns to ignore automatically removes rows (columns) from the table that contain the shown names. By default, it removes NET, Total, and SUM. This is performed after all the operations above.

Row/column labels override the default row or column names. This is particularly useful for adjusting text in the legend, etc.

Properties

This tab contains options for formatting the size of the object, as well as the underlying R code used to create the visualization, and the JavaScript code used to customize the Object Inspector itself (see Object Inspector for more details about these options).

Chart

Options for customizing the appearance of the visualization. Not all visualizations have all the options.

APPEARANCE

This group contains options that are specific to the Chart type chosen. For example, for a column chart, there is an option called Stack series that creates a stacked column chart.

Chart types 'Area', 'Bar', 'Column', 'Line', 'Scatter', 'Radar' and 'Geographic Map' also have an option to 'Show as small multiples (panel chart)'. When this option is selected, each series will be shown in a separate panel. When small multiples are used, the following options will also be shown:

  • Number of rows: controls the layout of the panels
  • Share axes between panels: if not selected, each panel will have its own range determined by the range of the series in the panel (unless Minimum/Maximum values in the Values axis are specified).
  • Show averages of all series: when selected, an additional series is added in each panel, showing the values averaged across all series.
  • Order: controls the order in which the panels are shown. If no order is specified, then the panels will follow the order of the columns in the input data.
  • Padding: controls the space between panels. These controls are particularly useful if the panel titles overlap with the chart (increase top padding), or the data labels in the radar charts are truncated (increase left padding or right padding). The padding is always between 0 and 1, and specifies (approximately) a proportion of the total charting area. For a chart with r rows and c columns of small panels, the left and right padding should be smaller than 1/c and the top and bottom padding should be smaller than 1/r. If the values are too large, then the padding will be set to zero. It is best to start at a small value and slowly increase the padding.
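
The padding limits for small multiples can be checked with a short Python sketch. The function name check_padding is hypothetical, and resetting each side independently is an assumption; the text only says that values that are too large are set to zero.

```python
def check_padding(n_rows, n_cols, left=0.0, right=0.0, top=0.0, bottom=0.0):
    """Return the padding actually used for an n_rows x n_cols panel layout."""
    limits = {"left": 1 / n_cols, "right": 1 / n_cols,
              "top": 1 / n_rows, "bottom": 1 / n_rows}
    requested = {"left": left, "right": right, "top": top, "bottom": bottom}
    # A value outside [0, limit) is treated as too large and reset to zero.
    return {side: (value if 0 <= value < limits[side] else 0.0)
            for side, value in requested.items()}

# 2 rows and 3 columns: left/right must be below 1/3, top/bottom below 1/2.
print(check_padding(2, 3, left=0.1, top=0.6))
```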

DATA SERIES

Formatting options controlling the appearance of the data series.

Multiple colors within a single series This option can be used with Bar and Column charts so that each bar is a different color, similar to a Pyramid, Pie or Bar Pictograph chart. This changes the default behaviour, where each data series (typically a column of a summary table) is shown in a single color. When this option is applied to a data set containing multiple series, Small multiples must be used to display the multiple series.

Color palette allows selection from and customization of the colors of different series. Examples of different palettes can be found here. There are also alternatives that provide greater control than the preset palettes:

  • If Default or template setting is selected, the Color palette from the template (from Visualization - Create Template) is used. If no template has been supplied for the chart, then Default colors will be used instead. If the template used has a Named colors palette, then the categories in the input data will be matched according to the names associated with specific colors. Unmatched data points will be shown in a color specified in the template (default grey).
  • Custom color The user will be provided with a color picker to select a single color. Note that this may not give the desired output if there is more than one series in the chart.
  • Custom color gradient Two color pickers will be provided. A colorscale will be constructed by linearly interpolating between the two colors.
  • Custom palette A textbox allowing the user to specify multiple colors using a comma separated list of hex codes. In most cases this palette is treated as categorical but if Values is specified, a color scale will be created by linear interpolation.
  • Custom palette (color pickers) The user will be provided with 12 color pickers to manually specify a palette. If the input data uses more than 12 colors, the palette will be recycled.
  • Custom palette (R output) A palette can be specified as an R character vector. If the vector is unnamed, then the palette will be recycled if necessary. But when a named vector is used, then a color must be specified for each category in the input data.
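
Two of the palette behaviours above, linear interpolation between two colors and recycling of a short palette, can be sketched in Python. The function names are hypothetical, and interpolating in RGB space is an assumption; Q may interpolate in a different color space.

```python
def hex_to_rgb(color):
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))

def rgb_to_hex(rgb):
    return "#{:02X}{:02X}{:02X}".format(*rgb)

def gradient(start, end, n):
    """n colors linearly interpolated (per RGB channel) from start to end."""
    a, b = hex_to_rgb(start), hex_to_rgb(end)
    if n == 1:
        return [rgb_to_hex(a)]
    return [rgb_to_hex(tuple(round(x + (y - x) * i / (n - 1))
                             for x, y in zip(a, b)))
            for i in range(n)]

def recycle(palette, n):
    """Repeat the palette from the start when more than len(palette) colors are needed."""
    return [palette[i % len(palette)] for i in range(n)]

print(gradient("#000000", "#FFFFFF", 6))
print(recycle(["#111111", "#222222"], 5))
```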

Values A numeric table or R output can be provided if a sequential color palette is used. For a chart using Multiple colors within a single series with Small Multiples, Values can either be a vector with length equal to the number of rows in the input data (bars in each panel will be colored the same); or a table with the same number of rows and columns as the input data (bars in each panel colored separately but using the same color scale). Data points in the chart will be colored according to the corresponding value in the table. Other types of R outputs (e.g. Regression, Machine learning) can potentially also be used (the values associated with these R outputs are the same as the ones given by exporting to Excel).

Opacity determines whether the data series is transparent (0.0) or opaque (1.0). In some cases the default setting may not be visible in the control, so you may have to click on the arrows to set the opacity to 0.

TREND LINES

A line of best fit can be drawn through each series in the chart. The type of line constructed can be one of:

Linear A linear regression is applied to each series.
LOESS A smooth curve is estimated using local polynomial regression (degree = 2) with tricubic weights. The span (proportion of points used) is set to 0.75, which is fairly wide.
Friedman's super smoother A smooth curve is estimated using local regression. The span is determined by cross-validation, so this smoother is usually more appropriate than LOESS if there is a large amount of local variation, but it is less smooth.
Cubic spline A smooth curve is estimated using a cubic spline, estimated using the conventional integrated square second derivative cubic spline penalty with knots spread evenly through the covariate values. It is implemented using mgcv::gam with bs = "cr".
Moving average A smooth curve is estimated by taking the moving average. When the window size is h, the estimated value at i is the average of the data series at (i - (h-1)) to i.
Centered moving average Uses a centered window instead, so the estimated value at i is the average of the data series at (i - ceiling((h-1)/2)) to (i + floor((h-1)/2)).
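
The two moving-average windows defined above can be written out in Python as follows. The window bounds follow the formulas in the text; truncating the window at the ends of the series is an assumption, since the text does not say how the edges are handled.

```python
import math

def moving_average(series, h):
    """Trailing window: average of positions i - (h - 1) to i."""
    out = []
    for i in range(len(series)):
        lo = max(0, i - (h - 1))  # truncate at the start (assumption)
        window = series[lo:i + 1]
        out.append(sum(window) / len(window))
    return out

def centered_moving_average(series, h):
    """Centered window: i - ceiling((h-1)/2) to i + floor((h-1)/2)."""
    back = math.ceil((h - 1) / 2)
    forward = math.floor((h - 1) / 2)
    out = []
    for i in range(len(series)):
        lo = max(0, i - back)                    # truncate at the start
        hi = min(len(series) - 1, i + forward)   # truncate at the end
        window = series[lo:hi + 1]
        out.append(sum(window) / len(window))
    return out

series = [1, 2, 3, 4, 5]
print(moving_average(series, 3))
print(centered_moving_average(series, 3))
```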

Options to customize the trend line:

  • Show 95% confidence interval For Linear, LOESS and Cubic spline smoothers, a 95% confidence interval about the trend line can be added to the chart.
  • Ignore last data point Whether to ignore the last data point when fitting the trend line. For example, for time series when the data for the last period is incomplete.
  • Line type The trend line can be drawn as a solid, dotted or dashed line.
  • Line width The width or thickness of the line.
  • Fit line color palette The default option is Group colors which means that the line of best fit for each series is shown in the same color as the data series. However, the trend lines can be customized with any color palette in the same way as the data series.

FONT

Default formatting options for fonts. They are overridden by the formatting options of specific text elements.

Use default or template settings If selected, an Arial font in #2C2C2C will be used unless a template (from Visualization - Create Template) is present in the document. Unselect the checkbox to customize fonts for the chart. These settings can be overridden at any level (e.g. the global fonts can follow the defaults and data label fonts can be customized).

Global font family Default font family of all text on the chart.

Global font color Default font color of all text on the chart (not including tool tips shown on hover). This option is available for all visualization charts except Palm (currently in progress), and Venn (data labels take on the data series colors).

Font units This can be set to pt (points) or px (pixels). By default, font sizes are specified in pt, which is consistent with font sizes specified in textboxes. However, older visualizations which do not offer this option used px. It is occasionally more convenient to use px because dimensions of other outputs are frequently specified in pixels.

DATA LABELS

Options for controlling the appearance of any data values to be plotted, including number formatting.

Show data labels Whether or not to show data labels.

Data label font family Font family used to show data labels.

Color data labels

Automatically For Stacked and Pyramid charts, data labels will be shown in black or white depending on the darkness of the series color. Note that in Stacked Area charts, the data labels for the last series is outside the shaded area so it is shown in the Global font color. The background color of the chart is not used because it is transparent and the slide background could be black or white. In Scatter charts, Automatically colored data labels means that the labels will be shown in the same color as the markers.
In a single color A color picker is provided for selecting the data label font color.
In different colors for each series This option is only offered for Area, Bar, Column, Line, Pyramid and Radar charts. Users can provide a comma-separated list of colors (in hex code). If the number of colors is smaller than the number of series, then the colors will be recycled.
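
Choosing black or white labels from the darkness of the series color can be illustrated with a common luminance heuristic. The weights (0.299, 0.587, 0.114) and the 0.5 threshold are assumptions made for this Python sketch; the text does not say how Q measures darkness.

```python
def label_color(series_hex, threshold=0.5):
    """Return black for light series colors and white for dark ones."""
    color = series_hex.lstrip("#")
    r, g, b = (int(color[i:i + 2], 16) for i in (0, 2, 4))
    # Perceived brightness on a 0-1 scale (a widely used approximation).
    luminance = (0.299 * r + 0.587 * g + 0.114 * b) / 255
    return "#000000" if luminance > threshold else "#FFFFFF"

print(label_color("#FFFF00"))  # light yellow bar, so a black label
print(label_color("#000080"))  # dark navy bar, so a white label
```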

Data label font size Font size of the data labels.

Number type By default this will be automatically selected depending on the data, but users can manually set this to Number or Percentage. If Percentage is chosen, the input value will be multiplied by 100 and a % sign will be added. If the multiplication by 100 is not appropriate, then use the Custom suffix instead.

Decimal places Number of decimal places to show in the data labels.

Custom prefix Optional text to prepend to the data label.

Custom suffix Optional text to append to the data label.
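
The number-formatting options above can be combined as in the following Python sketch. The function name format_label is hypothetical, and placing the custom suffix after the % sign is an assumption.

```python
def format_label(value, number_type="Number", decimals=0, prefix="", suffix=""):
    """Build a data label from a value and the formatting options."""
    if number_type == "Percentage":
        # Percentage multiplies the input by 100 and adds a % sign.
        return f"{prefix}{value * 100:.{decimals}f}%{suffix}"
    return f"{prefix}{value:.{decimals}f}{suffix}"

print(format_label(0.256, "Percentage", decimals=1))
# Data already in percentage units: keep Number and use a custom suffix.
print(format_label(25.6, "Number", decimals=1, suffix="%"))
print(format_label(1234.56, decimals=1, prefix="$"))
```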

LEGEND

Show legend Depending on the chart type and the version of the chart this is either a checkbox or a drop down menu. If the checkbox is ticked, the legend will be shown if there is more than one data series in the chart. In the newer versions of the chart, most chart types allow the users to choose between Automatic, Show and Hide. Automatic has the same behavior as the ticked checkbox does, but setting the control to Show will always show the legend even if there is only one data series.

TITLE

Options for setting and formatting a Title, Subtitle, and Footer. Not all chart types have all these options.

CATEGORIES (X/Y) AXIS and VALUES (X/Y) AXIS

Options for controlling the appearance of the horizontal and vertical axes, including number formatting.

In Q, the x-axis refers to the horizontal axis and the y-axis refers to the vertical axis. The specific meaning of these axes changes depending on chart type (e.g., for a bar chart the x-axis shows values, whereas for a column chart it shows categories).

Maximum or minimum value Where available, these controls allow users to specify the range shown on the chart. If the axis is numeric, a numeric value (without prefix or suffix) should be entered; if the axis is a date, then a date string should be entered. If the axis is categorical, then a 0-based index can be used (e.g., setting the Minimum value to -0.5 will add extra space before the first category).

Grid line width Width of grid lines in pixels. By default, the grid line width on the categories axis is set to 0 to hide the grid lines.

Grid line color Color of the grid line if the grid line width is greater than 0.

Axis line width Width of the axis line, i.e. a line drawn at the edge of the chart showing tick marks. By default this is set to 0 to hide the axis line.

Axis line color Color of the axis line.

HOVER

Formatting of any hover text used in the charts. Options to control the hover text are not available for Pie (in progress), Bar Pictograph (no hover text) and Geographic Maps using the leaflet package (options not available).

Font family Font family for hover text. This usually defaults to the Global font family.

Font size Size of the hover text in either pixels or points (depending on font units). This is not available for the time series charts.

Font color This option is only available for stream and time series charts. This is because the hover text in these two charts appears directly on the chart background (with no text box).

BACKGROUND

Transparent background By default all of the visualization outputs have a transparent background. For most of the charts, the background setting can be changed if this box is unchecked.

Background color Color of the chart background.

Background opacity Alpha transparency of the chart background which should be a number between 0 (transparent) and 1 (opaque).