Kendall's Tau-b


The correlation between two variables, [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math], is:

[math]\displaystyle{ \tau_b = \frac{n_c-n_d}{\sqrt{(n_t-n_x)(n_t-n_y)}} }[/math]


[math]\displaystyle{ n_c = \sum^n_{i=1} \sum^n_{j=1} w_i (I_{x_i\lt x_j,y_i\lt y_j}+I_{x_i\gt x_j,y_i\gt y_j} ) }[/math],
[math]\displaystyle{ n_d = \sum^n_{i=1} \sum^n_{j=1} w_i ( I_{x_i\lt x_j,y_i\gt y_j}+I_{x_i\gt x_j,y_i\lt y_j}) }[/math],
[math]\displaystyle{ n_w = \sum^n_{i=1} w_i }[/math],
[math]\displaystyle{ n_t =\frac{n_w(n_w-1)}{2} }[/math],
[math]\displaystyle{ n_x = \sum^t_{j=1} \sum^n_{i=1} w_i I_{x_i=j} }[/math],
[math]\displaystyle{ n_y = \sum^r_{j=1} \sum^n_{i=1} w_i I_{y_i=j} }[/math],
[math]\displaystyle{ w_i }[/math] is the Calibrated Weight for the [math]\displaystyle{ i }[/math]th of [math]\displaystyle{ n }[/math] observations,
[math]\displaystyle{ x }[/math] is a variable with [math]\displaystyle{ t }[/math] unique values, categorised in the range [math]\displaystyle{ \{1,2,\ldots,t\} }[/math],
[math]\displaystyle{ y }[/math] is a variable with [math]\displaystyle{ r }[/math] unique values, categorised in the range [math]\displaystyle{ \{1,2,\ldots,r\} }[/math].
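
As a rough illustration of the ratio above, the following Python sketch computes the textbook, unweighted form of Kendall's tau-b (every [math]\displaystyle{ w_i = 1 }[/math]). It is an assumption-laden sketch, not Q's implementation: each unordered pair is counted once, and the tie corrections are taken to be the number of pairs tied on [math]\displaystyle{ x }[/math] and on [math]\displaystyle{ y }[/math], which may differ from the weighted definitions listed above. The function name `kendall_tau_b` is hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Unweighted Kendall's tau-b (assumes every weight w_i equals 1)."""
    n = len(x)
    n_c = n_d = 0
    # Visit each unordered pair of observations exactly once.
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        s = (xi - xj) * (yi - yj)
        if s > 0:
            n_c += 1  # concordant: x and y move in the same direction
        elif s < 0:
            n_d += 1  # discordant: x and y move in opposite directions
        # s == 0 means the pair is tied on x or y and counts as neither
    n_t = n * (n - 1) // 2  # total number of pairs
    # Assumption: tie corrections are the numbers of pairs tied on x and on y.
    n_x = sum(t * (t - 1) // 2 for t in Counter(x).values())
    n_y = sum(t * (t - 1) // 2 for t in Counter(y).values())
    # Raises ZeroDivisionError if either variable is constant.
    return (n_c - n_d) / sqrt((n_t - n_x) * (n_t - n_y))


if __name__ == "__main__":
    print(kendall_tau_b([1, 2, 2, 3], [1, 2, 3, 3]))
```

With no ties the denominator reduces to [math]\displaystyle{ n_t }[/math], so perfectly ordered data give 1 and perfectly reversed data give -1.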

The test statistic is:

[math]\displaystyle{ z = {n_c - n_d \over \sqrt{ v } } }[/math]


[math]\displaystyle{ v = (v_0 - v_x - v_y)/18 + v_3 + v_6 }[/math],
[math]\displaystyle{ v_0 = n (n-1) (2n+5) }[/math],
[math]\displaystyle{ v_x = \sum_j t_{xj} (t_{xj}-1) (2 t_{xj} + 5) }[/math],
[math]\displaystyle{ v_y = \sum_j t_{yj} (t_{yj}-1) (2 t_{yj} + 5) }[/math],
[math]\displaystyle{ v_1 = \sum^t_{j=1} t_{xj}(t_{xj}-1)(t_{xj}-2) }[/math],
[math]\displaystyle{ v_2 = \sum^r_{j=1} t_{yj}(t_{yj}-1)(t_{yj}-2) }[/math],
[math]\displaystyle{ v_3 = (v_1 v_2) / (9 n_w (n_w - 1) (n_w - 2)) }[/math],
[math]\displaystyle{ v_4 = \sum^t_{j=1} t_{xj}(t_{xj}-1) }[/math],
[math]\displaystyle{ v_5 = \sum^r_{j=1} t_{yj}(t_{yj}-1) }[/math],
[math]\displaystyle{ v_6 = (v_4 v_5) / (2 n_w (n_w - 1)) }[/math],
[math]\displaystyle{ \hat{\sigma} = \sqrt{(v_0 - v_x - v_y) / 18 + v_3 + v_6} }[/math],
[math]\displaystyle{ z = \frac{n_c - n_d}{\hat{\sigma}} }[/math],
[math]\displaystyle{ p \approx 2(1-\Phi(|z|)) }[/math]
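
The test statistic and p-value can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than Q's implementation: the data are unweighted (so [math]\displaystyle{ n_w = n }[/math]), [math]\displaystyle{ t_{xj} }[/math] and [math]\displaystyle{ t_{yj} }[/math] are taken to be the number of observations in category [math]\displaystyle{ j }[/math] of [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math], [math]\displaystyle{ n_c }[/math] and [math]\displaystyle{ n_d }[/math] count each unordered pair once, and the denominator of [math]\displaystyle{ z }[/math] is the square root of the variance built from the [math]\displaystyle{ v_3 }[/math] and [math]\displaystyle{ v_6 }[/math] terms. The names `tie_sums` and `kendall_z_test` are hypothetical.

```python
from collections import Counter
from math import erf, sqrt


def tie_sums(values):
    """Return the three tie sums for one variable, based on category counts."""
    counts = list(Counter(values).values())
    a = sum(t * (t - 1) * (2 * t + 5) for t in counts)  # v_x or v_y
    b = sum(t * (t - 1) * (t - 2) for t in counts)      # v_1 or v_2
    c = sum(t * (t - 1) for t in counts)                # v_4 or v_5
    return a, b, c


def kendall_z_test(n_c, n_d, x, y):
    """Normal-approximation test for Kendall's tau-b (unweighted, n >= 3)."""
    n = len(x)
    v_0 = n * (n - 1) * (2 * n + 5)
    v_x, v_1, v_4 = tie_sums(x)
    v_y, v_2, v_5 = tie_sums(y)
    v_3 = (v_1 * v_2) / (9 * n * (n - 1) * (n - 2))
    v_6 = (v_4 * v_5) / (2 * n * (n - 1))
    v = (v_0 - v_x - v_y) / 18 + v_3 + v_6
    z = (n_c - n_d) / sqrt(v)
    # Standard normal CDF evaluated at |z|, written with the error function.
    phi = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p = 2 * (1 - phi)  # two-sided p-value
    return z, p


if __name__ == "__main__":
    # Five untied, perfectly ordered observations: 10 concordant pairs.
    print(kendall_z_test(10, 0, [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]))
```

When neither variable has ties, every tie sum is zero and the variance reduces to [math]\displaystyle{ v_0/18 }[/math].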

See also

Correlations - Comparing Two Numeric Variables