Creates a Sankey diagram showing the flows between different values of variables. It is generally advisable to view only a small number of variables. Please see the Sankey articles on our blog for examples of how to set up data for Sankey diagrams.
Example
Object Inspector Options
The following is an explanation of the options available in the Object Inspector for this specific visualization.
Refer to Visualization Options for general chart formatting options.
Inputs
DATA SOURCE
There are three options for supplying data to a Sankey diagram:
Input table A table with each row describing a set of linked categories.
Variables Categorical variables from a Data set.
Paste or type data Enter a table with each row describing a set of linked categories.
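Max. categories The maximum number of categories to display for each variable. Variables with more categories than this will have some of their categories merged, on the basis of similar linkage patterns.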
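To illustrate what "each row describing a set of linked categories" means, the sketch below counts how often each pair of adjacent categories occurs across the rows of a small table. It is only an illustration of the data layout: the helper `rowsToLinks` and the sample rows are hypothetical and are not part of the visualization's code.

```javascript
// Hypothetical helper (not part of the product): turn rows of linked
// categories into links between adjacent columns, counting one unit per row.
function rowsToLinks(rows) {
  const totals = new Map();
  for (const row of rows) {
    for (let col = 0; col < row.length - 1; col++) {
      // Key each link by its column position, source and target category.
      const key = JSON.stringify([col, row[col], row[col + 1]]);
      totals.set(key, (totals.get(key) || 0) + 1);
    }
  }
  return Array.from(totals, ([key, value]) => {
    const [column, source, target] = JSON.parse(key);
    return { column, source, target, value };
  });
}

// Made-up example table: one row per observation, one column per variable.
const rows = [
  ["Male", "Under 30", "Yes"],
  ["Male", "Under 30", "No"],
  ["Female", "30 or over", "Yes"],
  ["Male", "Under 30", "Yes"],
];
const links = rowsToLinks(rows);
console.log(links);
```

In this made-up table, three rows link "Male" to "Under 30", so that link would be drawn three times as thick as the single "Female" to "30 or over" link.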
FILTERS & WEIGHT
Weight A dropdown that takes a numeric variable to control the size of each link. This option is only available if the Variables data source is used. Otherwise, use the Last column contains weights checkbox.
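When the Last column contains weights checkbox is used, the final column of the table is treated as the weight of each row rather than as a category. A minimal sketch of that idea follows; the helpers `splitWeights` and `weightedLinks` and the sample table are hypothetical, chosen only to show how weights change the link sizes.

```javascript
// Hypothetical helper: separate the last column (weights) from the categories.
function splitWeights(table) {
  const rows = table.map((row) => row.slice(0, -1));
  const weights = table.map((row) => Number(row[row.length - 1]));
  return { rows, weights };
}

// Hypothetical helper: sum the row weights for each link between adjacent columns.
function weightedLinks(rows, weights) {
  const totals = {};
  rows.forEach((row, i) => {
    for (let col = 0; col < row.length - 1; col++) {
      const key = `${col}: ${row[col]} -> ${row[col + 1]}`;
      totals[key] = (totals[key] || 0) + weights[i];
    }
  });
  return totals;
}

// Made-up table: two category columns followed by a weight column.
const table = [
  ["Coke", "Pepsi", 2.5],
  ["Coke", "Coke", 1],
  ["Coke", "Pepsi", 0.5],
];
const { rows, weights } = splitWeights(table);
const totals = weightedLinks(rows, weights);
console.log(totals);
```

Here the two "Coke" to "Pepsi" rows contribute 2.5 + 0.5 = 3 to that link, instead of a count of 2.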
Chart
APPEARANCE
Links colored by
None: all links are shown in grey.
Source: links are shown in the same color as the source node (left).
Target: links are shown in the same color as the target node (right).
First variable: similar to Source but nodes will also be the same color as nodes they are linked to on the left. If there are multiple such nodes, then the color will be taken from the node which is linked with the largest weight.
Last variable: similar to First variable, but using the color of the Target node, and looking at downstream links.
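The First variable rule can be sketched as a left-to-right pass over the columns of nodes: nodes in the first column take their own color, and every later node inherits the color of the left-hand node it is linked to with the largest weight. The function `colorByFirstVariable`, the data layout, and the sample values below are assumptions made for illustration; they are not the implementation used by the visualization.

```javascript
// Hypothetical sketch of the "First variable" coloring rule.
// columns: node names per variable, left to right (names assumed unique).
// links:   { source, target, value } between adjacent columns.
function colorByFirstVariable(columns, links, palette) {
  const color = {};
  // Nodes of the first variable get their own palette colors.
  columns[0].forEach((node, i) => {
    color[node] = palette[i % palette.length];
  });
  // Later nodes inherit the color of the heaviest incoming link's source.
  for (let c = 1; c < columns.length; c++) {
    for (const node of columns[c]) {
      const incoming = links.filter((l) => l.target === node && l.source in color);
      if (incoming.length === 0) continue; // no colored upstream node
      const heaviest = incoming.reduce((a, b) => (b.value > a.value ? b : a));
      color[node] = color[heaviest.source];
    }
  }
  return color;
}

// Made-up example with three variables.
const columns = [["A", "B"], ["X", "Y"], ["P"]];
const links = [
  { source: "A", target: "X", value: 5 },
  { source: "B", target: "X", value: 2 },
  { source: "B", target: "Y", value: 4 },
  { source: "X", target: "P", value: 1 },
  { source: "Y", target: "P", value: 3 },
];
const color = colorByFirstVariable(columns, links, ["red", "blue"]);
console.log(color);
```

In this example, X is linked to both A and B but takes A's color because that link is heavier. Last variable would run the same kind of pass from right to left, using downstream links.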
Variables share common values Select this if the same colors should be used for each variable in the Sankey diagram. This option is only shown when Links colored by is set to Source or Target.
Node colors / Node and link colors Customize colors of the nodes.
Node width Controls width of the nodes.
Vertical spacing between nodes Controls padding between nodes of the same variable.
Order nodes to reduce overlap The vertical positions of the nodes are automatically adjusted to reduce the overlap between links. When this is turned off, nodes are positioned in the order they occur in the data.
Place right-most nodes at the edge Forces the nodes to fill up the right edge of the widget. The node labels in the last column will then be placed to the left of the node.
LABELS
Font family Font family of node labels.
Font size Font size of node labels.
Include variable in node label Prefix node label with the variable name or label.
Include counts in node label Append node label with the number of observations in each category.
Include percentages in node label Append node label with the percentages of each category.
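Variable names Displays variable names instead of variable labels in the node labels. This option is only available if the Variables data source is used and Include variable in node label is selected.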
Tidy labels Extract common prefixes from the node labels.
Label maximum length Number of characters in the node label before it is truncated. Truncated labels will be indicated with an ellipsis. No truncation is applied to numeric variables.
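The truncation rule for Label maximum length can be sketched as follows. The helper `truncateLabel` is hypothetical, and details such as whether the ellipsis counts toward the maximum length are assumptions, not documented behavior.

```javascript
// Hypothetical sketch of label truncation.
// Assumption: the ellipsis is appended after maxLength characters.
function truncateLabel(label, maxLength) {
  // Values from numeric variables are never truncated.
  if (typeof label === "number") return String(label);
  return label.length > maxLength ? label.slice(0, maxLength) + "..." : label;
}

console.log(truncateLabel("Strongly agree", 10)); // "Strongly a..."
console.log(truncateLabel("Agree", 10));          // "Agree"
console.log(truncateLabel(12345678901234, 10));   // "12345678901234"
```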
HOVERTEXT
Show percentages instead of counts Show percentages instead of counts in the hovertext (tooltips) for nodes and links.
Acknowledgements
Uses a variant of the networkD3 htmlwidget, created by Kent Russell.
Technical details
An error will occur if more than 20 variables are selected. It is generally advisable to show a relatively small number (e.g., 4 or 5).
Although the Sankey diagram in this example shows flows between different values of variables, Sankey diagrams can be used to show many other types of flows, e.g., migration patterns, regression trees, and energy flows (see https://christophergandrud.github.io/networkD3/).
Code
// JavaScript: defines the controls shown in the Object Inspector.
form.setHeading("Sankey Diagram");
var allow_control_groups = Q.fileFormatVersion() > 10.9; // Group controls for Displayr and later versions of Q
var displayr = Q.isOnTheWeb();
var template_prompt = "Create a template to control settings for all visualizations in the document by inserting 'Visualization > Template'";

function isEmpty(x) {
    return (x == undefined || x.getValue() == null && (x.getValues() == null || x.getValues().length == 0));
}

function isBlankSheet(x) {
    return (x.getValue() == null || x.getValue().length == 0);
}

var controls = [];
if (allow_control_groups)
    form.group("DATA SOURCE");
var tableInput = form.dropBox({label: "Input table", types: ["table", "RItem"], name: "formTable", multi: false, required: false});
var varInput = form.dropBox({label: "Variables", name: "formVariables", multi: true, min_inputs: 2, max_inputs: 20, required: false,
    types: ["Variable: Numeric, Date, Money, Categorical, OrderedCategorical, Text"],
    prompt: "Choose variables from the same data set"});
var pastedInput = form.dataEntry({name: "formEnteredData", label: "Paste or type table",
    prompt: "Opens a spreadsheet into which you can paste data.", required: true,
    large_data_error: "The data entered is too large. The best alternative is to add your data as a Data Set, use Table > Raw Data > Variable(s), and connect that table to this analysis."});

// Show each data source unless one of the other sources is already in use
if (!allow_control_groups || !isEmpty(tableInput) || (isBlankSheet(pastedInput) && isEmpty(varInput)))
    controls.push(tableInput);
if (!allow_control_groups || !isEmpty(varInput) || (isEmpty(tableInput) && isBlankSheet(pastedInput)))
    controls.push(varInput);
if (!allow_control_groups || !isBlankSheet(pastedInput) || (isEmpty(tableInput) && isEmpty(varInput)))
    controls.push(pastedInput);
if (!isEmpty(tableInput) || !isBlankSheet(pastedInput)) {
    var qContainsWgt = form.checkBox({label: "Last column contains weights", name: "formContainsWeights", default_value: false,
        prompt: "Use the last column of the input table as the weights variable"});
    controls.push(qContainsWgt);
}
var maxCat = form.numericUpDown({label: "Maximum number of categories", name: "formMaxCategories",
    increment: 1, minimum: 2, default_value: 10, maximum: 100,
    prompt: "Variables with more categories than this will have a number of categories merged. The nodes are merged on the basis of similar linkage patterns"});
controls.push(maxCat);

if (allow_control_groups)
    form.page("Chart");
if (allow_control_groups)
    form.group("APPEARANCE");
var qTemplate = form.dropBox({name: "formTemplate", label: "Use template", types: ["RItem:AppearanceTemplate"], required: false, prompt: template_prompt});
controls.push(qTemplate);
var use_default_fonts = !isEmpty(qTemplate);

var linkCol = form.comboBox({label: "Links colored by", name: "formLinkColors",
    alternatives: ['Target', 'Source', 'None', 'First variable', 'Last variable'], default_value: 'Source',
    prompt: "Choose color scheme for nodes and links"});
controls.push(linkCol);
if (linkCol.getValue() == "Target" || linkCol.getValue() == "Source") {
    var qShared = form.checkBox({label: "Variables share common values", name: "formSharedValues", default_value: false});
    controls.push(qShared);
}
var colorLabel = "Node and link colors";
if (linkCol.getValue() == "None")
    colorLabel = "Node colors";
palettes = ["Default or template settings", "Legacy colors", "Office colors", "Colorblind safe colors", "Rainbow",
    "Light pastels", "Strong colors", "Spectral colors (red, yellow, blue)", "Spectral colors (blue, yellow, red)",
    "Reds, dark to light", "Reds, light to dark", "Greens, dark to light", "Greens, light to dark",
    "Blues, dark to light", "Blues, light to dark", "Greys, dark to light", "Greys, light to dark",
    "Heat colors (yellow, red)", "Terrain colors (green, beige, grey)", "Custom color", "Custom gradient", "Custom palette"];
qColor = form.comboBox({name: "formPalette", label: colorLabel, alternatives: palettes, default_value: palettes[0], required: true});
controls.push(qColor);
if (qColor.getValue() == "Custom color") {
    var qCustCol = form.colorPicker({name: "formCustomColor", label: "Custom color", default_value: "#5C9AD3"});
    controls.push(qCustCol);
}
if (qColor.getValue() == "Custom gradient") {
    var qCustGrad1 = form.colorPicker({name: "formCustomGradientStart", label: "Gradient start", default_value: "#5C9AD3"});
    var qCustGrad2 = form.colorPicker({name: "formCustomGradientEnd", label: "Gradient end", default_value: "#ED7D31"});
    controls.push(qCustGrad1);
    controls.push(qCustGrad2);
}
if (qColor.getValue() == "Custom palette") {
    var qCustPalette = form.textBox({name: "formCustomPalette", label: "Custom palette", default_value: "#5C9AD3, #ED7D31",
        prompt: "Enter color as a string. Multiple values should be separated by commas."});
    controls.push(qCustPalette);
}
var qNodeWidth = form.numericUpDown({label: "Node width", name: "formNodeWidth", minimum: 0, maximum: 100, default_value: 30});
controls.push(qNodeWidth);
var qNodePad = form.numericUpDown({label: "Vertical spacing between nodes", name: "formNodePad", minimum: 0, maximum: 100, default_value: 10});
controls.push(qNodePad);
controls.push(form.checkBox({name: "formNodeOrderAuto", label: "Order nodes to reduce overlap", default_value: true}));
var qNodeRight = form.checkBox({name: "formNodeRight", label: "Place right-most nodes at the edge", default_value: false});
controls.push(qNodeRight);

if (allow_control_groups)
    form.group("LABELS");
font_families = !!Q.GetAvailableFontNames ? Q.GetAvailableFontNames() :
    ["Arial", "Arial Black", "Comic Sans MS", "Courier New", "Georgia", "Impact", "Open Sans", "Tahoma", "Times New Roman", "Trebuchet MS", "Verdana"];
var qFontDefault = form.checkBox({name: "formFontDefault", label: "Use default or template font settings (values axis title)",
    default_value: use_default_fonts, prompt: template_prompt});
controls.push(qFontDefault);
if (!qFontDefault.getValue()) {
    var qFontFamily = form.comboBox({label: "Font family", name: "formFontFamily", alternatives: font_families,
        default_value: "Open Sans", editable: true,
        prompt: "Select the font to use. You can also type the name of a font directly (including custom fonts)."});
    var qFontSize = form.numericUpDown({label: "Font size", name: "formFontSize", default_value: 9, minimum: 4});
    var qFontUnits = form.comboBox({name: "formFontUnit", label: "Font units", alternatives: ["pt", "px"], default_value: "pt",
        prompt: "Are font sizes specified in terms of points or pixels?"});
    controls.push(qFontFamily);
    controls.push(qFontSize);
    controls.push(qFontUnits);
}
var qVarShow = form.checkBox({label: "Include variable in node labels", name: "formShowVar", default_value: true,
    prompt: "Node labels are prefixed with the variable name or label"});
controls.push(qVarShow);
var qCountsShow = form.checkBox({label: "Include counts in node labels", name: "formShowCounts", default_value: false,
    prompt: "Append node labels with the number of observations in each category"});
controls.push(qCountsShow);
var qPercentagesShow = form.checkBox({label: "Include percentages in node labels", name: "formShowPercentages", default_value: false,
    prompt: "Append node labels with the percentages of each category"});
controls.push(qPercentagesShow);
if (!isEmpty(varInput) && qVarShow.getValue()) {
    var qVarNames = form.checkBox({label: "Variable names", name: "formNames", default_value: false,
        prompt: "Show variable names instead of variable labels"});
    controls.push(qVarNames);
}
var qTidyLabels = form.checkBox({label: "Tidy labels", name: "formTidyLabels", default_value: true,
    prompt: "Extract common prefixes to simplify labels"});
controls.push(qTidyLabels);
var qLabelMaxLen = form.numericUpDown({label: "Label maximum length", name: "formLabelMaxLen",
    default_value: 100, minimum: 10, maximum: 500, increment: 5,
    prompt: "Maximum number of characters before labels are truncated. Truncated labels will be indicated with an ellipsis"});
controls.push(qLabelMaxLen);

if (allow_control_groups)
    form.group("HOVERTEXT");
var qHoverPercentages = form.checkBox({label: "Show percentages instead of counts", name: "formHoverPercentages", default_value: false});
controls.push(qHoverPercentages);
form.setInputControls(controls);
# R: builds the Sankey diagram from the form inputs.
library(flipPlots)
library(flipData)
library(flipFormat)
library(flipChartBasics)

weights <- NULL
dat <- NULL
dat <- get0("formTable")
if (is.null(dat)) {
    if (exists("formEnteredData") && sum(dim(formEnteredData)) > 0)
        dat <- flipTransformations::ParseEnteredData(formEnteredData)
}
if (is.null(dat)) {
    dat <- as.data.frame(get0("formVariables"))
    if (is.null(dat) || sum(dim(dat)) == 0)
        StopForUserError("No data has been provided.")
    weights <- QPopulationWeight
    names(dat) <- if (!isTRUE(get0("formNames")))
        Labels(formVariables)
    else
        Names(formVariables)
}
if (isTRUE(get0("formContainsWeights"))) {
    weights <- dat[, ncol(dat)]
    dat <- dat[, -ncol(dat)]
}
if (formTidyLabels)
    names(dat) <- ExtractCommonPrefix(names(dat))$shortened.labels
dat <- TidyRawData(dat, weights = weights, subset = QFilter,
    missing = "Use partial data", error.if.insufficient.obs = FALSE)

# Create sankey data so we know the categories in the chart
sankey.dat <- SankeyDiagram(dat,
    max.categories = formMaxCategories,
    link.color = formLinkColors,
    subset = TRUE,
    weights = attr(dat, "weights"),
    variables.share.values = get0("formSharedValues", ifnotfound = FALSE),
    hovertext.show.percentages = formHoverPercentages,
    label.show.counts = formShowCounts,
    label.show.percentages = formShowPercentages,
    label.show.varname = formShowVar,
    label.max.length = formLabelMaxLen,
    output.data.only = TRUE)
num.colors <- length(unique(sankey.dat$nodes$group))
u.ind <- which(!duplicated(sankey.dat$nodes$group,
    fromLast = formLinkColors %in% c("Target", "Last variable")))
u.names <- sankey.dat$nodes$name[u.ind]

# Different from 'Named colors' in other charts because nodes can be merged
# And questions names/counts/percentages may be included in node name
# we use partial matching.
template <- get0("formTemplate")
if (!is.null(template)) {
    bcol <- rep(NA, num.colors)
    names(bcol) <- u.names
    col.list <- if (!is.null(template$brand.colors))
        template$brand.colors
    else
        template$colors
    col.names <- names(col.list)
    col.ord <- order(nchar(col.names))
    for (i in col.ord) {
        ind <- grep(paste0("\\Q", col.names[i], "\\E"), u.names)
        if (length(ind) > 0)
            bcol[ind] <- col.list[i]
    }
    template$brand.colors <- bcol
} else {
    default.font <- QAppearance$Font$Family
    default.color <- "#444444"
    template <- list(
        global.font = list(family = default.font, color = default.color, size = 8, units = "pt"),
        fonts = list(`Values axis title` = list(family = default.font, color = default.color, size = 10)),
        colors = QSettings$ColorPalette)
}

colors <- NULL
if (formPalette != "Legacy colors")
    colors <- ChartColors(num.colors,
        given.colors = GetPalette(formPalette, template),
        custom.color = formCustomColor,
        custom.gradient.start = formCustomGradientStart,
        custom.gradient.end = formCustomGradientEnd,
        custom.palette = formCustomPalette,
        silent = TRUE)

# Plot chart
viz <- SankeyDiagram(links.and.nodes = sankey.dat,
    colors = colors,
    sinks.right = formNodeRight,
    font.family = get0("formFontFamily", ifnotfound = template$fonts$`Values axis title`$family),
    font.size = get0("formFontSize", ifnotfound = template$fonts$`Values axis title`$size),
    font.unit = get0("formFontUnit", ifnotfound = template$global.font$units),
    node.position.automatic = get0("formNodeOrderAuto", ifnotfound = TRUE),
    node.width = formNodeWidth,
    node.padding = formNodePad)