Sort/Reorder Rows or Columns - Sort Rows (Automatically Updates When Data Changes)

From Q
Jump to navigation Jump to search

This rule sorts the rows of table from the highest to lowest (or lowest to highest) according to the values shown (i.e., ranks the data, so that the order of the rows reveals a ranking). When the table has more than one column you can choose which column should be used to do the sorting. Some rows, like NET, Don't Know, Other/Specify, All of these, or None of these will automatically be left at the bottom of the table. The rule allows you to control which rows should be left at the bottom.

If you are applying the rule to other tables from the Rules tab, the sorting settings flow through to all tables it is applied to. So if you are sorting based on column name, the exact same column name must be in all of the tables the rule is applied to. If you want to change the sorting settings for a particular table, you will need to create a "fresh" version of the rule from the Automate menu to adjust the settings for that table.

While you can already sort individual tables by right-clicking on one of the columns and selecting an option form the Sort By menu, the table may no longer remain sorted if the values shown in the table change. This rule allows for the table to remain sorted even when the values in the table change. See also Sorting and Reordering - Sort from Highest to Lowest (Does Not Update When Data Changes).

When the table has spans in the rows, the rows of the table will be sorted within the spans.

Examples

In this example, the rule has been used to sort the table in descending order according to the values of the Column % in the NET column. The NET column has been chosen because it is the column with the largest sample size (Column n). To specify to sort by a particular row, use the Choose a column option under Column Selection. Four rows have not been sorted, and have been left in their original order at the bottom of the table: Don't Know, Nothing, Other, and NET. The rule automatically leaves rows like this fixed.

SortRuleDefault1.PNG

The next two examples describe how to control which rows are fixed at the bottom of the table.

Excluding a row from the sort

In this example, the Exclude from sort option has been used to make sure that any row label containing the text don't is not sorted with the other rows and is kept at the bottom of the table. This allows the row They don't drink diet or sugar-free cola to remain below the four sorted rows. Note that the row Don't know/Not applicable is automatically kept at the bottom because the rule can identify it as a Don't know option.

SortingRuleWithFixed2.png

Preventing a row from being excluded

In this example, the Never exclude from sort option has been used to make sure that any row containing the text Others is always included in the sorted rows. If this is not done then the row Others perceive me as wholesome may be mistaken for an Other/Specify option and left at the bottom of the table.

SortingRuleWithNotFixed2.png

Technical details

  • The table will be sorted using the first Statistic shown on the table.
  • You can choose whether the rows of the table will be sorted in descending or ascending order (from top to bottom).
  • By default, if there are multiple columns in the table, the rule will sort according to the largest column (according to the Column n in the Statistics - Below). Using this option:
    • This script will have no effect when Column n is not available from the Statistics - Below.
    • If two or more columns have the same value for Column n then the table will be sorted according to the right-most column with that value of Column n.
  • You can instead choose to sort by specific column by changing Column Selection to Choose a column, and then selecting a column from the list of column labels.
  • Rows that look like they are Don't Know, Other/Specify, All of these, or None of these responses will be left at the bottom of the table, as will NET and SUM rows.
  • The Exclude from sort option is used to exclude rows from the sort, keeping them at the bottom of the table. This text is not case sensitive.
  • The Never exclude from sort option is used to ensure that rows with labels containing certain text are always included among the sorted rows. This is used to override the rule's automatic detection of Don't Know, Other/Specify, All of these, or None of these options. This text is not case sensitive.
  • Values of NaN are treated as the largest possible negative number for the purpose of sorting, which means that they will always be placed below other rows when sorting from highest to lowest (excepting those rows that have been excluded from the sort).

How to apply this rule

For the first time in a project

  • Select the table(s)/chart(s) that you wish to apply the rule to.
  • Start typing the name of the Rule into the Search features and data box in the top right of the Q window.
  • Click on the Rule when it appears in the QScripts and Rules section of the search results.

OR

  • Select Automate > Browse Online Library.
  • Choose this rule from the list.

Additional applications of the rule

  • Select a table or chart that has the rule and any table(s)/chart(s) that you wish to apply the rule to.
  • Click on the Rules tab (bottom-left of the table/chart).
  • Select the rule that you wish to apply.
  • Click on the Apply drop-down and choose your desired option.
  • Check New items to have it automatically applied to new items that you create. Use Edit > Project Options > Save as Template to create a new project template that automatically uses this rule.

Removing the rule

  • Select the table(s)/chart(s) that you wish to remove the rule from.
  • Press the Rules tab (bottom-right corner).
  • Press Apply next to the rule you wish to remove and choose the appropriate option.

How to modify the rule

  • Click on the Rules tab (bottom-left of the table/chart).
  • Select the rule that you wish to modify.
  • Click Edit Rule and make the desired changes. Alternatively, you can use the JavaScript below to make your own rule (see Customizing Rules).

JavaScript

includeWeb('Table JavaScript Utility Functions');
includeWeb('JavaScript Text Analysis Functions');
includeWeb('JavaScript Array Functions');

// This rule does not work with R tables where columns are not recognised,
// so we disable the rule where there is more than one statistic (column).
if (table.statistics.length > 1) {
    excludeRTablesWithoutColumns();
}

Array.prototype.max = function() {
  return Math.max.apply(null, this);
};

Array.prototype.min = function() {
  return Math.min.apply(null, this);
};

// Prevent operating on empty tables
if (table.numberColumns < 1)
    form.ruleNotApplicable('too few columns');

// Prevent operating on tables containing text
let primary_statistic = table.availableStatistics[0];
if (primary_statistic == 'Text')
    form.ruleNotApplicable('this rule does not work on text tables');

let sort_within_spans = table.rowSpans.length > 0

// Check for duplicate labels within spans (if any) or within the table as a whole.
if (sort_within_spans) {
    let row_labels = table.rowLabels;
    let spans = getLowestLevelSpans(table.rowSpans);
    let found_indices = getAllIndicesInSpans(spans);
    let leftovers = getLeftoverIndices(found_indices, table.numberRows);
    spans.forEach(function (span) {
        let labels = span.indices.map(function (x) { return row_labels[x]; });
        if (arrayHasDuplicateElements(labels))
            form.ruleNotApplicable('there are duplicate row labels in span ' + span.label);
    });
    let leftover_labels = leftovers.map(function (x) { return row_labels[x]; });
    if (arrayHasDuplicateElements(leftover_labels))
        form.ruleNotApplicable('there are duplicate row labels in the table');
} else {
    if (arrayHasDuplicateElements(table.rowLabels))
        form.ruleNotApplicable('there are duplicate row labels in the table')
}

// User controls
let ascending_label = form.newLabel('Sort:');
let ascending_menu = form.newComboBox('am', ['Descending', 'Ascending']);
ascending_menu.setDefault('Descending');
ascending_menu.lineBreakAfter = true;
let column_selection_label = form.newLabel('Column selection:');
let column_selection_menu = form.newComboBox('csm', ['Largest column', 'Last column', 'Choose a column']);
column_selection_menu.setDefault('Largest column');
column_selection_menu.lineBreakAfter = true;
let description = form.newLabel('Sorts the rows of table from the highest to lowest (or lowest to highest) according to the values shown');
description.lineBreakAfter = true;
let control_array = [description, ascending_label, ascending_menu, column_selection_label, column_selection_menu];

let column_list = [];
if (column_selection_menu.getValue() == 'Choose a column' && table.columnLabels != null)
{
    column_list = enumerateDuplicatesInStringArray(table.columnLabels, '(', ')');
    let column_choice_label = form.newLabel('Column:')
    var column_choice_menu = form.newComboBox('ccm', column_list);
    column_choice_menu.lineBreakAfter = true;
    control_array.push(column_choice_label);
    control_array.push(column_choice_menu);
}
form.setInputControls(control_array);

let ascending = ascending_menu.getValue() == 'Ascending';
let low_high_label = ascending ? 'Lowest to Highest' : 'Highest to Lowest';
const rule_name = 'Sort Rows (Automatically Updates When Data Changes)';
form.setHeading(rule_name);
form.setSummary('Sort rows from ' + low_high_label + ' (automatically updates when data changes)');

// User specified labels to always exclude from the sort
// or never exclude from the sort.
let exclude_labels = addExpandingTextBoxes(control_array, 'Exclude from sort (in addition to NET/SUM):', 'el').map(function (label) { return label.toLowerCase(); } );
let do_not_exclude_labels = addExpandingTextBoxes(control_array, 'Never exclude from sort:', 'dnel').map(function (label) { return label.toLowerCase(); } );

// Sort according to the first stat selected in the table
let stat = table.statistics[0];
let target_column_index;

if (table.numberColumns == 1) {
    target_column_index = 0;
} else if (column_selection_menu.getValue() == 'Largest column') {
    // Obtain Column n from the Statistics - Below and make sure that the resulting array is the correct shape
    if (!belowTableExists() || below_table.availableStatistics.indexOf('Column n') == -1) {
        const stat_name = table.getTranslation('Column n');
        form.ruleNotApplicable(stat_name + " statistic does not exist for this table. Change Column Selection to 'Choose a column'");
    }
    let column_ns = below_table.get('Column n');
    if (column_ns.length == 1)
        column_ns = column_ns[0];
    else
        column_ns = column_ns.map(function (a) { return a[0]; });
    let num_cols = table.numberColumns;


    // Find largest column n and the index of the column

    if (num_cols != 1) {
        let max_col_n = Number.NEGATIVE_INFINITY;
        for (let j = 0; j < num_cols; j++)
            if (!isNaN(column_ns[j]) && column_ns[j] >= max_col_n) {
                max_col_n = column_ns[j];
                target_column_index = j;
            }
        let columns_with_max = column_ns.filter(function (x) { return !isNaN(x) && x >= max_col_n} );
        // When there is more than one column with the same max value of column n then
        // default to the NET or SUM if found.
        if (columns_with_max.length > 1) {
            let net_columns = table.netColumns;
            if (net_columns.length > 0)
                target_column_index = net_columns[0];
        }
    }
} else if (column_selection_menu.getValue() == 'Last column') {
    target_column_index = (table.numberColumns - 1);
} else {
    target_column_index = column_list.indexOf(column_choice_menu.getValue());
}

// Work out what to do with spans.
var spans;
if (table.rowSpans.length == 0){
    let ind = [];
    for (let j = 0; j < table.numberRows; j++)
        ind.push(j);
    spans = [{ indices: ind, label: 'all rows' }];
} else {
    spans = getLowestLevelSpans(table.rowSpans);
    found_indices = getAllIndicesInSpans(spans);
    leftovers = getLeftoverIndices(found_indices, table.numberRows);
    // Any indices not in spans get moved to the bottom of the table
    if (leftovers.length > 0) {
        let last = found_indices.max();
        let modifier = 0;
        leftovers.forEach(function (x) {
            if ( (x-modifier) < last) {
                table.moveRowAfter(x-modifier, last);
                modifier ++;
            }
        });
        spans = getLowestLevelSpans(table.rowSpans);
        found_indices = getAllIndicesInSpans(spans);
        leftovers = getLeftoverIndices(found_indices, table.numberRows);
        spans.push( {indices: leftovers, label: 'Rows outside of spans'});
    }
}

// Get values to be used by the sorting and determine if each row should be sorted (based on its label)
var values = table.get(stat);


spans.forEach(function (span) {
    let row_labels = table.rowLabels;
    let labels = span.indices.map(function (x) { return row_labels[x]; });
    let row_objects = [];
    let net_rows = table.netRows;
    let sgn = (ascending ? 1 : -1);
    for (let j = 0; j < labels.length; j++) {
        let current_row_index = span.indices[j];
        let is_net = net_rows.indexOf(current_row_index) != -1;
        // Obtain a value for the current row to reflect it's sort order.
        // Replace NaN with -Infinity so that NaNs don't get mixed through the table,
        // as NaN does not play nicely with sort();
        // Add or subtract a small value based on the row's current position
        // so that in case of a tie, the current relative ordering of tied
        // rows will be maintained.
        let this_value = isNaN(values[current_row_index][target_column_index]) ? Number.NEGATIVE_INFINITY : values[current_row_index][target_column_index] + sgn * (j / 10000) 
        row_objects.push({ label: labels[j],
                           value: this_value,
                           sort: shouldSortRow(labels[j], exclude_labels, do_not_exclude_labels, is_net) });
    }

    // Sort the row objects based on their values
    row_objects.sort(function (row_x, row_y) {
        // Get the values for the two rows we are comparing.
        let value_x = row_x.value;
        let value_y = row_y.value;
        if (value_x == Number.NEGATIVE_INFINITY && value_y == Number.NEGATIVE_INFINITY)
            return row_y.label.charCodeAt(0) - row_x.label.charCodeAt(0);
        return value_x - value_y;
    });

    // Perform the sorting of the table rows, ignoring any rows that have been marked with sort = false
    let top_index_in_span = span.indices.min();
    let top_row = top_index_in_span == 0 ? null : top_index_in_span - 1;
    if (ascending) {
        for (let j = row_objects.length - 1; j > -1; j--)
            if (row_objects[j].sort)
                moveRowAfterComplete(findRowInSpan(row_objects[j].label, span.indices), top_row);
    } else {
        for (let j = 0; j < row_objects.length; j++)
            if (row_objects[j].sort)
                moveRowAfterComplete(findRowInSpan(row_objects[j].label, span.indices), top_row);
    }
})


// Should we include this row in the sorting?
function shouldSortRow(label, exclude_labels, do_not_exclude_labels, is_net) {
    if (containsSubstring(label, do_not_exclude_labels))
        return true;
    if (containsSubstring(label, exclude_labels))
        return false;
    if (isNoneOfThese(label))
        return false;
    if (isAllOfThese(label))
        return false;
    if (isDontKnow(label))
        return false;
    if (isOther(label))
        return false;
    if (is_net)
        return false;
    return true;
}

// Add an expanding list of text boxes to control_array and return the
// values entered by the user as an array. "id" is required to uniquely
// identify the controls in this set - do not use the same value of id
// in different calls to this function.
function addExpandingTextBoxes(control_array, label, id) {
    let last_selection = ' ';
    let entries = [];
    let control_label = form.newLabel(label);
    control_label.lineBreakAfter = true;
    control_array.push(control_label);
    let control_counter = 0;
    while (last_selection != '') {
        let text_box = form.newTextBox(id + control_counter);
        text_box.lineBreakAfter = true;
        text_box.setDefault('');
        control_array.push(text_box);
        form.setInputControls(control_array);
        last_selection = text_box.getValue();
        if (last_selection != '')
            entries.push(last_selection);
        control_counter ++;
    }
    return entries;
}

// Work out which spans are lowest in the table.
// These are the spans that are tested within.
function getLowestLevelSpans(spans) {
    let lowest_spans = [];
    let found_indices = [];
    spans.forEach(function (span) {
        if (span.indices.every(function (x) { return found_indices.indexOf(x) == -1; })) {
            found_indices = found_indices.concat(span.indices);
            lowest_spans.push(span);
        }
    });
    return lowest_spans;
}

function getAllIndicesInSpans(spans) {
    let found_indices = [];
    spans.forEach(function (span) {
        found_indices = found_indices.concat(span.indices);
    });
    return found_indices;
}

function getLeftoverIndices(found_indices, num_rows) {
    let leftovers = [];
    for (let j = 0; j < num_rows; j++) {
        if (found_indices.indexOf(j) == -1)
            leftovers.push(j);
    }
    return leftovers
}

function findRowInSpan(row_label, span_indices) {
    let row_labels = table.rowLabels;
    let labels = span_indices.map(function (x) { return row_labels[x]; });
    return span_indices[labels.indexOf(row_label)];
}

Prior to the 15th of December, 2015, this page was known as Sort - Sort Rows (Automatically Updates when Data Changes)

See also