Kendall's Tau-b

From Q

The correlation between two variables, x and y, is:

\tau_b = \frac{n_c-n_d}{\sqrt{(n_t-n_x)(n_t-n_y)}}

where

n_c = \sum^n_{i=1} \sum^n_{j=1} w_i (I_{x_i<x_j,y_i<y_j}+I_{x_i>x_j,y_i>y_j} ),
n_d = \sum^n_{i=1} \sum^n_{j=1} w_i ( I_{x_i<x_j,y_i>y_j}+I_{x_i>x_j,y_i<y_j}) ,
n_w = \sum^n_{i=1} w_i  ,
n_t =\frac{n_w(n_w-1)}{2} ,
t_{xj} = \sum^n_{i=1} w_i  I_{x_i=j} , \quad n_x = \sum^t_{j=1} \frac{t_{xj}(t_{xj}-1)}{2} ,
t_{yj} = \sum^n_{i=1} w_i  I_{y_i=j} , \quad n_y = \sum^r_{j=1} \frac{t_{yj}(t_{yj}-1)}{2} ,
I_A equals 1 when condition A holds and 0 otherwise,
w_i is the Calibrated Weight for the ith observation and n is the number of observations,
x is a variable with t unique values, categorised in the range \{1,2,\ldots,t\},
y is a variable with r unique values, categorised in the range \{1,2,\ldots,r\}.
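
The calculation of \tau_b can be illustrated with a minimal Python sketch. It is not Q's implementation: it assumes every weight w_i is 1, counts each pair of observations once, and takes n_x and n_y to be the number of pairs tied on x and on y (the standard tau-b convention). The function name `kendall_tau_b` is hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Unweighted Kendall's tau-b (assumes all w_i = 1, each pair counted once)."""
    n = len(x)
    n_c = n_d = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        s = (xi - xj) * (yi - yj)
        if s > 0:      # concordant: x and y move in the same direction
            n_c += 1
        elif s < 0:    # discordant: x and y move in opposite directions
            n_d += 1
        # s == 0 means the pair is tied on x or y and counts for neither
    n_t = n * (n - 1) / 2  # total number of pairs
    # Pairs tied on x and on y, from the size of each group of equal values
    n_x = sum(t * (t - 1) / 2 for t in Counter(x).values())
    n_y = sum(t * (t - 1) / 2 for t in Counter(y).values())
    # Note: raises ZeroDivisionError if x or y is constant
    return (n_c - n_d) / sqrt((n_t - n_x) * (n_t - n_y))
```

For example, `kendall_tau_b([1, 2, 2, 3], [1, 2, 3, 3])` has n_c = 4, n_d = 0, n_t = 6 and n_x = n_y = 1, giving 4 / 5 = 0.8.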

The test statistic is:

z = {n_c - n_d \over \sqrt{ v } }

where

v  =  (v_0 - v_x - v_y)/18 + v_3 + v_6 ,
v_0  =  n (n-1) (2n+5) ,
v_x = \sum^t_{j=1} t_{xj} (t_{xj}-1) (2 t_{xj} + 5) ,
v_y = \sum^r_{j=1} t_{yj} (t_{yj}-1) (2 t_{yj} + 5) ,
v_1 = \sum^t_{j=1} t_{xj}(t_{xj}-1)(t_{xj}-2),
v_2 = \sum^r_{j=1} t_{yj}(t_{yj}-1)(t_{yj}-2),
v_3 = (v_1 v_2) / (9 n_w (n_w - 1) (n_w - 2)) ,
v_4 = \sum^t_{j=1} t_{xj}(t_{xj}-1),
v_5 = \sum^r_{j=1} t_{yj}(t_{yj}-1),
v_6 = (v_4 v_5) / (2 n_w (n_w - 1)),
t_{xj} and t_{yj} are the numbers of observations with x = j and y = j,
\Phi is the standard normal cumulative distribution function, and the two-sided p-value is
p \approx 2(1-\Phi(|z|))
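
The test statistic can be sketched in Python in the same spirit. This is an illustration under stated assumptions rather than Q's implementation: all weights are 1 (so n_w = n), each pair of observations is counted once, the variance is taken as (v_0 - v_x - v_y)/18 + v_3 + v_6, and at least three observations are supplied. The function name `kendall_z_test` is hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import erf, sqrt


def kendall_z_test(x, y):
    """Return (z, p) for Kendall's tau-b, assuming all w_i = 1 and n >= 3."""
    n = len(x)
    # n_c - n_d: +1 for each concordant pair, -1 for each discordant pair
    s = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        d = (xi - xj) * (yi - yj)
        s += (d > 0) - (d < 0)
    tx = list(Counter(x).values())  # t_xj: observations sharing each x value
    ty = list(Counter(y).values())  # t_yj: observations sharing each y value
    v0 = n * (n - 1) * (2 * n + 5)
    vx = sum(t * (t - 1) * (2 * t + 5) for t in tx)
    vy = sum(t * (t - 1) * (2 * t + 5) for t in ty)
    v1 = sum(t * (t - 1) * (t - 2) for t in tx)
    v2 = sum(t * (t - 1) * (t - 2) for t in ty)
    v3 = v1 * v2 / (9 * n * (n - 1) * (n - 2))
    v4 = sum(t * (t - 1) for t in tx)
    v5 = sum(t * (t - 1) for t in ty)
    v6 = v4 * v5 / (2 * n * (n - 1))
    v = (v0 - vx - vy) / 18 + v3 + v6
    z = s / sqrt(v)
    # Standard normal CDF evaluated at |z|, written with the error function
    phi = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - phi)
```

With no ties, v_x, v_y, v_3 and v_6 are all zero, so the variance reduces to n(n-1)(2n+5)/18.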

See also

Correlations - Comparing Two Numeric Variables
