Department of Statistics, Virginia Tech

Kendall’s \(\tau\) measures correlation by counting the pairs of observations \((X_i, Y_i)\) and \((X_j, Y_j)\) whose coordinates agree in their comparison (both lesser or both greater), versus the pairs whose coordinates disagree. The former are called *concordant* pairs, and the latter *discordant*. I.e.,

- A pair \((X_i, Y_i)\) and \((X_j, Y_j)\) are **concordant** when \(X_i < X_j\) and \(Y_i < Y_j\), or \(X_i > X_j\) and \(Y_i > Y_j\), i.e., when

\[ \frac{Y_j - Y_i}{X_j - X_i} > 0 \quad \mbox{ as long as } X_i \ne X_j. \]

- Otherwise they are **discordant**, i.e., when

\[ \frac{Y_j - Y_i}{X_j - X_i} < 0 \quad \mbox{ as long as } X_i \ne X_j. \]

Kendall’s \(\tau\) is the difference between the number of concordant pairs \(N_c\) in the sample and the number of discordant pairs \(N_d\), divided by the total number of pairs \({n \choose 2} = n (n-1) /2\):

\[ \tau = \frac{N_c - N_d}{n(n-1)/2}. \]

In the case of **ties** in the \(Y\)-values, we make the following adjustment. When

\[ \frac{Y_j - Y_i}{X_j - X_i} = 0 \quad \mbox{ as long as } X_i \ne X_j \]

we count it as half concordant and half discordant, i.e., adding \(1/2\) to both \(N_c\) and \(N_d\). Whenever \(X_i = X_j\), the comparison is void and no augmentation to \(N_c\) or \(N_d\) is made for that case. The definition of \(\tau\) is adjusted to follow

\[ \tau = \frac{N_c - N_d}{N_c + N_d}. \]

However, notice that in the case of no ties, \(N_c + N_d = n (n-1)/2\), so the two \(\tau\) definitions are actually the same.
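
To see this numerically, here is a quick check on made-up data with no ties, counting the pairs by brute force with `combn`.

```
# with no ties, every pair is either concordant or discordant, so
# Nc + Nd = choose(n, 2) and the two definitions of tau coincide
x <- c(2, 5, 1, 4, 3)
y <- c(1, 4, 3, 5, 2)
n <- length(x)
p <- combn(n, 2)                                  # all pairs of indices
s <- sign((x[p[1,]] - x[p[2,]]) * (y[p[1,]] - y[p[2,]]))
Nc <- sum(s > 0)
Nd <- sum(s < 0)
c(Nc + Nd, choose(n, 2))                          # both 10
(Nc - Nd)/(n*(n-1)/2) == (Nc - Nd)/(Nc + Nd)      # TRUE
```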

Observe that the above specifications apply equally to the ranks of \(X_i\) and \(Y_i\) in lieu of the original values, since only relative comparisons are being made.
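
A quick check of that invariance on some made-up values:

```
# tau depends only on relative order, so ranks give the same answer
x <- c(1.2, 3.4, 2.2, 5.0, 4.1)
y <- c(2.0, 3.1, 2.9, 4.0, 4.8)
cor(x, y, method="kendall") == cor(rank(x), rank(y), method="kendall")  # TRUE
```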

Another description of how to calculate \(\tau\) comes from the documentation for `pKendall` in R:

- There are two categories with \(n\) treatments each. The treatments are ranked within each category, and then sorted according to the ranks for the first (\(X\)) category. This produces an \(n \times 2\) array in which the numbers in the first column increase from 1 to \(n\). The array is scanned, and every time two adjacent ranks in the second column are not in order, they are exchanged. The scanning is repeated until the second column is in increasing order. Let \(s\) denote the number of exchanges; then Kendall’s \(\tau\) is given by

\[ \tau = 1 - \frac{4s}{n(n-1)}. \]
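
That exchange-counting recipe is easy to sketch in base R. (The function name `tau_by_exchanges` is made up here, and no ties are assumed.)

```
# count bubble-sort exchanges needed to order the y-ranks after
# sorting by x, then apply tau = 1 - 4s/(n(n-1)); assumes no ties
tau_by_exchanges <- function(x, y) {
  n <- length(x)
  ry <- rank(y)[order(x)]          # y-ranks, sorted by the x-ranks
  s <- 0                           # number of exchanges
  repeat {
    swapped <- FALSE
    for(i in 1:(n-1)) {
      if(ry[i] > ry[i+1]) {
        ry[c(i, i+1)] <- ry[c(i+1, i)]
        s <- s + 1
        swapped <- TRUE
      }
    }
    if(!swapped) break
  }
  1 - 4*s/(n*(n-1))
}
x <- c(3, 1, 4, 2, 5)
y <- c(2, 1, 5, 3, 4)
tau_by_exchanges(x, y)             # 0.6, matching cor(x, y, method="kendall")
```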

In R we may simply calculate `cor(x, y, method="kendall")`.

- Kendall’s \(\tau\) checks all of the traditional boxes for desirable qualities of a correlation.

In order to turn a measure of \(\tau\) into a test for the strength of the association we need the null distribution. This is calculated in the same spirit as our other rank-based tests (e.g., Wilcoxon and squared ranks).

- I.e., the mass function counts the number of permutations of the \(\{1, \dots, n\}\) ranks that yield a particular \(\tau\) value, divided by the total number of such permutations.

Again, we know how to do that now, but we won’t reinvent the wheel.
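
Still, for tiny \(n\) the wheel is small enough to reinvent as a sanity check: enumerate every permutation of the ranks and tabulate \(\tau\). (Sketched here with `expand.grid` to avoid extra packages.)

```
# exact null distribution of tau for n = 4: under H0 every
# permutation of the y-ranks is equally likely
n <- 4
tuples <- expand.grid(rep(list(1:n), n))     # all n^n rank tuples ...
perms <- tuples[apply(tuples, 1, function(p) length(unique(p)) == n),]  # ... filtered to the n! permutations
taus <- apply(perms, 1, function(p) cor(1:n, p, method="kendall"))
table(taus)/length(taus)                     # exact pmf of tau under H0
```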

- An add-on package `SuppDists` contains a `pKendall` function that works directly on \(\tau\).
- When there are ties in the ranks and/or when \(n\) is large, a CLT method may be applied.

\[ \begin{aligned} && Z &= \tau \frac{3 \sqrt{n(n-1)}}{\sqrt{2(2n+5)}} \sim \mathcal{N}(0,1), \\ \mbox{or equivalently (and slightly more simply) } && Z &= (N_c - N_d) \frac{\sqrt{18}}{\sqrt{n(n-1)(2n+5)}} \sim \mathcal{N}(0,1). \end{aligned} \]
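
The equivalence of the two forms is easy to verify numerically; the counts below are made up, with \(N_c + N_d = {n \choose 2}\) (no ties).

```
# the two Z statistics are algebraically identical when
# tau = (Nc - Nd)/(n(n-1)/2)
n <- 10
Nc <- 36
Nd <- 9                                           # Nc + Nd = choose(10, 2) = 45
tau <- (Nc - Nd)/(n*(n-1)/2)
z1 <- tau * 3*sqrt(n*(n-1))/sqrt(2*(2*n+5))
z2 <- (Nc - Nd) * sqrt(18)/sqrt(n*(n-1)*(2*n+5))
all.equal(z1, z2)                                 # TRUE
```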

The **hypotheses** are specified as follows, and the specific verbiage here is due to the insensitivity of \(\tau\) to some types of dependence.

For a two-tailed test we consider

\[ \begin{aligned} \mathcal{H}_0 &: \; \mbox{The $X_i$ and $Y_i$ are independent.} \\ \mathcal{H}_1 &: \; \mbox{Pairs of observations either tend to be concordant, or tend to be discordant.} \end{aligned} \]

The \(p\)-value is calculated in the usual way as

\[ \varphi = 2 \min\{\mathbb{P}(T \leq \tau), \mathbb{P}(T > \tau) \} \quad \mbox{ or } \quad \varphi \approx 2\min\left\{\mathbb{P}\left(Z \leq \frac{(N_c - N_d+1)\sqrt{18}}{\sqrt{n(n-1)(2n+5)}}\right), \mathbb{P}\left(Z \geq \frac{(N_c - N_d-1)\sqrt{18}}{\sqrt{n(n-1)(2n+5)}}\right)\right\} \]

- depending on whether the exact or CLT approximated null distribution is preferred, respectively.
- The \(\pm 1\) above constitutes a continuity correction, which is negligible for larger \(n\).

For a lower-tailed test (for negative correlation), we have

\[ \begin{aligned} \mathcal{H}_0 &: \; \mbox{The $X_i$ and $Y_i$ are independent.} \\ \mathcal{H}_1 &: \; \mbox{Pairs of observations tend to be discordant.} \end{aligned} \]

The \(p\)-value is calculated as

\[ \varphi = \mathbb{P}(T \leq \tau) \quad \mbox{ or } \quad \varphi \approx\mathbb{P}\left(Z \leq \frac{(N_c - N_d+1)\sqrt{18}}{\sqrt{n(n-1)(2n+5)}}\right). \]

For an upper-tailed test (for positive correlation), we consider

\[ \begin{aligned} \mathcal{H}_0 &: \; \mbox{The $X_i$ and $Y_i$ are mutually independent.} \\ \mathcal{H}_1 &: \; \mbox{Pairs of observations tend to be concordant.} \end{aligned} \]

The \(p\)-value is calculated as

\[ \varphi = \mathbb{P}(T > \tau) \quad \mbox{ or } \quad \varphi \approx \mathbb{P}\left(Z \geq \frac{(N_c - N_d-1)\sqrt{18}}{\sqrt{n(n-1)(2n+5)}}\right). \]

Metro officials gathered aggregate data summarizing the number of employees that receive transit benefits and the average price that the employer spends per employee for them to use the transit benefits. Is there evidence to suggest that the average price per employee drops as the number of employees using the service increases (i.e., an employer discount)?

```
emp <- data.frame(number=c(173, 149, 124, 64, 88, 113, 142, 27, 39, 51),
price=c(2.14, 2.39, 2.19, 2.56, 2.44, 2.29, 2.18, 2.55, 2.32, 2.27))
```

First, let’s present the data in a way that makes it easy to eyeball the number of concordant and discordant pairs.

```
o <- order(emp$number)
emp[o,]
```

```
## number price
## 8 27 2.55
## 9 39 2.32
## 10 51 2.27
## 4 64 2.56
## 5 88 2.44
## 6 113 2.29
## 3 124 2.19
## 7 142 2.18
## 2 149 2.39
## 1 173 2.14
```

Now, an easy way to count the concordant and discordant pairs is, for each value in the second (\(Y\)) column, to count how many of the values beneath it are greater (concordant) or smaller (discordant).

```
n <- nrow(emp)
yo <- emp$price[o]
concord <- discord <- rep(NA, n)
for(i in 1:(n-1)) {
  yties <- sum(yo[i] == yo[(i+1):n])              # Y-ties split half-and-half
  concord[i] <- sum(yo[i] < yo[(i+1):n]) + yties/2
  discord[i] <- sum(yo[i] > yo[(i+1):n]) + yties/2
}
cbind(emp[o, ], concord, discord)
```

```
## number price concord discord
## 8 27 2.55 1 8
## 9 39 2.32 3 5
## 10 51 2.27 4 3
## 4 64 2.56 0 6
## 5 88 2.44 0 5
## 6 113 2.29 1 3
## 3 124 2.19 1 2
## 7 142 2.18 1 1
## 2 149 2.39 0 1
## 1 173 2.14 NA NA
```

- Notice that there are no \(X\)-ties in the data, otherwise we would have had to count more carefully.

Translating those numbers into \(N_c\) and \(N_d\) proceeds as follows, and then our estimate of \(\tau\) …

```
Nc <- sum(concord, na.rm=TRUE)
Nd <- sum(discord, na.rm=TRUE)
tau <- (Nc - Nd) / (Nc + Nd)
tau
```

`## [1] -0.5111111`

Let’s check that against R’s `cor` command.

`cor(emp$number, emp$price, method="kendall")`

`## [1] -0.5111111`

- Nice! (Which one is easier?)
- So the estimate of \(\tau\) is negative, which suggests there is indeed some kind of employer discount.

Let’s see if that result is statistically significant by performing a lower-tailed test.

```
library(SuppDists)
pKendall(tau, nrow(emp))
```

`## [1] 0.02331129`

- We reject the null hypothesis at the 5% level and conclude that there is indeed an employer discount.

R has a built-in function, `cor.test`, which can perform a test on Kendall’s \(\tau\).

`cor.test(emp$number, emp$price, alternative="less", method="kendall")`

```
##
## Kendall's rank correlation tau
##
## data: emp$number and emp$price
## T = 11, p-value = 0.02331
## alternative hypothesis: true tau is less than 0
## sample estimates:
## tau
## -0.5111111
```

- Bingo.

From a Metro survey, officials analyzed the overall rating that passengers gave to the transit system and their most popular time of commute. Below is sample data from passengers. Note that the higher the rating, the more satisfied that the customer was with the transit system. Is there a significant correlation in the two variables at the five percent level of significance?

```
survey <- data.frame(
time=c("7:00", "7:20", "8:00", "7:30", "8:00", "8:30", "8:45", "7:25", "9:00", "8:00", "7:55",
"7:15", "7:45", "9:00"),
rating=c(5.5, 8.5, 10, 8.5, 7.0, 6.5, 7.0, 9.0, 4.5, 7.5, 8.5, 6.5, 8.0, 10))
```

Again, the times are in a “string” format, so we will need to make a little conversion into something “`rank`able”.

`ptime <- as.POSIXct(strptime(survey$time, "%H:%M"))`

Since we have \(X\)-ties this time around, let’s see how counting by hand works in that case.

```
o <- order(ptime)
n <- length(ptime)
yo <- survey$rating[o]
xo <- ptime[o]
concord <- discord <- rep(NA, n)
for(i in 1:(n-1)) {
  xu <- xo[i] != xo[(i+1):n]                      # only compare pairs with distinct X
  yties <- sum((yo[i] == yo[(i+1):n]) * xu)
  concord[i] <- sum((yo[i] < yo[(i+1):n])*xu) + yties/2
  discord[i] <- sum((yo[i] > yo[(i+1):n])*xu) + yties/2
}
cbind(survey[o,], concord, discord)
```

```
## time rating concord discord
## 1 7:00 5.5 12.0 1.0
## 12 7:15 6.5 10.5 1.5
## 2 7:20 8.5 4.0 7.0
## 8 7:25 9.0 2.0 8.0
## 4 7:30 8.5 2.5 6.5
## 13 7:45 8.0 3.0 5.0
## 11 7:55 8.5 2.0 5.0
## 3 8:00 10.0 0.5 3.5
## 5 8:00 7.0 1.5 2.5
## 10 8:00 7.5 1.0 3.0
## 6 8:30 6.5 2.0 1.0
## 7 8:45 7.0 1.0 1.0
## 9 9:00 4.5 0.0 0.0
## 14 9:00 10.0 NA NA
```

Translating those numbers into \(N_c\) and \(N_d\) proceeds as follows, and then our estimate of \(\tau\) …

```
Nc <- sum(concord, na.rm=TRUE)
Nd <- sum(discord, na.rm=TRUE)
tau <- (Nc - Nd) / (Nc + Nd)
tau
```

`## [1] -0.03448276`

Now we can calculate our \(\tau\) (in the simple way).

`cor(as.numeric(ptime), survey$rating, method="kendall")`

`## [1] -0.03488608`

- Not identical, but very close. (The small discrepancy comes from how ties enter the denominator.)
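
For the record, R’s `cor` returns the “tau-b” variant, which replaces \(N_c + N_d\) in the denominator with a symmetric tie correction. A sketch of that formula (`tau_b` is a made-up name, and the data are too):

```
# tau-b divides Nc - Nd by sqrt((n0 - n1)(n0 - n2)), where n0 is the
# total number of pairs and n1, n2 count pairs tied in x and in y
tau_b <- function(x, y) {
  n <- length(x)
  n0 <- n*(n-1)/2
  n1 <- sum(choose(table(x), 2))   # pairs tied in x
  n2 <- sum(choose(table(y), 2))   # pairs tied in y
  p <- combn(n, 2)
  s <- sign((x[p[1,]] - x[p[2,]]) * (y[p[1,]] - y[p[2,]]))
  sum(s)/sqrt((n0 - n1) * (n0 - n2))
}
x <- c(1, 1, 2, 3, 4)
y <- c(2, 3, 3, 1, 4)
all.equal(tau_b(x, y), cor(x, y, method="kendall"))  # TRUE
```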

Looks like a pretty small correlation, but we can test whether it is small enough to be considered zero.

The \(p\)-value of the two-tailed test may be calculated as follows.

```
n <- nrow(survey)
2*min(pKendall(tau, n), 1-pKendall(tau, n))
```

`## [1] 0.9144913`

- Not enough evidence to reject the null hypothesis that time and rating are mutually independent at the 5% level.

Since there were ties in the ranks it is better to use the normal approximation.

```
z1 <- (Nc - Nd + 1) * sqrt(18)/sqrt(n*(n-1)*(2*n+5))
z2 <- (Nc - Nd - 1) * sqrt(18)/sqrt(n*(n-1)*(2*n+5))
2 * min(pnorm(z1), pnorm(z2, lower.tail=FALSE))
```

`## [1] 0.912814`

- In either case, there is not enough evidence to reject the null hypothesis that time and rating are mutually independent.

Here is a library version.

`cor.test(as.numeric(ptime), survey$rating, method="kendall", exact=FALSE, continuity=TRUE)`

```
##
## Kendall's rank correlation tau
##
## data: as.numeric(ptime) and survey$rating
## z = -0.11135, p-value = 0.9113
## alternative hypothesis: true tau is not equal to 0
## sample estimates:
## tau
## -0.03488608
```

- Very close. Notice that we used `continuity=TRUE`.
- Observe that we would get (nearly) the same result as `continuity=FALSE` if we did the following.

```
z <- (Nc-Nd) * sqrt(18)/sqrt(n*(n-1)*(2*n+5))
2 * pnorm(-abs(z))
```

`## [1] 0.8695464`

`cor.test(as.numeric(ptime), survey$rating, method="kendall", exact=FALSE)$p.value`

`## [1] 0.8673479`

- Very close.

One application of Kendall’s \(\tau\) applies to a single sample \(Y_1, \dots, Y_n\) where the other set of coordinates, the \(X_i\)’s, are derived from the order (or time) at which the \(X_i\) samples were collected. Then a calculation of \(\tau\), and an associated test can be interpreted as a test for trend, similar to the Cox & Stuart test.

- In fact, tests for trend based on ranks are generally more powerful than Cox & Stuart style tests.
- Nevertheless, rank-based tests for trend are not as widely applicable as Cox & Stuart tests.

The following values record annual precipitation over the span of 1950–1968 inclusive.

```
precip <- c(45.25, 45.83, 41.77, 36.26, 45.27, 52.25, 35.37, 57.16, 35.38, 58.32, 41.05, 22.72,
45.73, 37.90, 41.72, 36.07, 49.83, 36.24, 39.90)
```

So the idea is to create an \(X\) to go with these \(Y\)s. Quite simply …

`year <- 1950:1968`

Skipping the plot to look for trend, and jumping right to the Kendall’s \(\tau\) calculation.

```
tau <- cor(year, precip, method="kendall")
tau
```

`## [1] -0.1345029`

- With a number like \(\tau = -0.13\) above we might be tempted to conclude a (modestly strong) negative correlation, although notice that this is smaller in magnitude than our Spearman’s \(\rho=-0.25\). But both measurements are without a benchmark unless we consider their sampling distribution.

The \(p\)-value associated with a two-tailed test may be calculated as follows.

`2*min(pKendall(tau, length(precip)), 1-pKendall(tau, length(precip)))`

`## [1] 0.4467731`

How about using the library?

`cor.test(year, precip, method="kendall")`

```
##
## Kendall's rank correlation tau
##
## data: year and precip
## T = 74, p-value = 0.4467
## alternative hypothesis: true tau is not equal to 0
## sample estimates:
## tau
## -0.1345029
```

- A nearly identical result.