Testing Equality in Ordinal Data with Repeated Measurements: A Model-Free Approach

Kung-Jong Lui

doi:10.1515/ijb-2015-0075

Published/Copyright: January 20, 2016

From the journal The International Journal of Biostatistics Volume 12 Issue 2

Abstract

In randomized clinical trials, we often encounter ordinal categorical responses with repeated measurements. We propose a model-free approach that uses the generalized odds ratio (GOR) to measure the relative treatment effect. We develop procedures for testing equality of treatment effects and derive interval estimators for the GOR. We further develop a simple procedure for testing the treatment-by-period interaction. To illustrate the use of the test procedures and interval estimators developed here, we consider two real-life data sets, one studying the gender effect on pain scores on an ordinal scale after hip joint resurfacing surgeries, and the other investigating the effect of an active hypnotic drug in insomnia patients on ordinal categories of time to falling asleep.

Keywords: testing equality; interval estimators; generalized odds ratio; ordinal data; repeated measurements; model-free

1 Introduction

In clinical trials or health-related studies, we often encounter the patient response on an ordinal scale, for example, worse, same or better. Arbitrarily assigning scores (such as –1, 0, 1) to these ordinal categories for arithmetic operation can be inappropriate due to the fact that the relative distance between any two successive ordinal categories is not really equal or even comparable. One may also have difficulty in interpreting the mean of these arbitrary scores in terms of practical meaning. Grouping multiple categories into a single category to reduce ordinal outcomes into dichotomous responses can cause the loss of efficiency.

To reduce the number of patients needed in clinical trials, investigators frequently take more than one measurement on each patient. Research on repeated ordinal responses has been intensive [1–9]. Francom et al. [2] applied a family of structural log-linear models that require investigators to assign scores representing the relative distance between ordinal categories. Agresti [1] addressed the use of generalized linear models with different link functions and linear predictors. Ware et al. [9] as well as Kenward and Jones [4] discussed various approaches to analyzing repeated categorical measurements. Parsons et al. [7] considered proportional odds logistic regression with a range of working correlation models. All these publications focused on model-based methods. By contrast, the methods proposed here are model-free and do not assume any parametric form for the data structure. Furthermore, there is no need to assume or specify any particular dependence structure between repeated measurements. Other publications [10, 11], rather than concentrating on testing equality between treatments in ordinal data with repeated measurements, derived procedures for testing positive (quadrant) dependence under various parameter constraints and marginal modeling in two-way tables with ordinal categories. Agresti and Coull [12] also focused on a model-based approach and addressed testing hypotheses against order-restricted alternatives. The purposes of these papers differ from ours, which is to derive model-free procedures for testing equality of treatments while treating the dependence between measurements taken within patients as a nuisance effect.

Using the generalized odds ratio (GOR) [13], we develop in this paper model-free procedures for testing equality of treatments and derive interval estimators for the relative treatment effect in ordinal data with repeated measurements. We further develop a simple procedure for testing the treatment-by-period interaction. We discuss the usefulness and limitations of test procedures developed here. To illustrate the use of these procedures, we consider two real-life data sets, one taken from a trial studying the gender effect on pain scores on an ordinal scale after hip joint resurfacing surgeries [7] and the other taken from a double-blind randomized trial comparing an active hypnotic drug with a placebo in insomnia patients with respect to the ordinal category of time to falling asleep [2].

2 Notation and Methods

Consider comparing two treatments in a randomized clinical trial in which we randomly assign $n_g$ patients to group $g$ ($g = 1, 2$). For a randomly selected patient $i$ ($i = 1, 2, \ldots, n_g$) from group $g$, we let $Y_{iz}^{(g)}$ denote the patient response at period $z$ ($z = 1, 2$), where $Y_{iz}^{(g)}$ takes one of $L$ ordinal labels $C_j$ ($j = 1, 2, \ldots, L$) with $C_1 < C_2 < C_3 < \cdots < C_L$. Let $n_{rs}^{(g)}$ denote the number of patients with $(Y_{i1}^{(g)} = C_r, Y_{i2}^{(g)} = C_s)$ among the $n_g$ patients in group $g$. The random vector $(n_{11}^{(g)}, n_{12}^{(g)}, \ldots, n_{1L}^{(g)}, n_{21}^{(g)}, n_{22}^{(g)}, \ldots, n_{2L}^{(g)}, \ldots, n_{L1}^{(g)}, n_{L2}^{(g)}, \ldots, n_{LL}^{(g)})'$ then follows the multinomial distribution with parameters $n_g$ and $(\pi_{11}^{(g)}, \pi_{12}^{(g)}, \ldots, \pi_{1L}^{(g)}, \pi_{21}^{(g)}, \pi_{22}^{(g)}, \ldots, \pi_{2L}^{(g)}, \ldots, \pi_{L1}^{(g)}, \pi_{L2}^{(g)}, \ldots, \pi_{LL}^{(g)})'$, where $\pi_{rs}^{(g)}$ denotes the cell probability that a randomly selected patient from group $g$ has the vector of responses $(Y_{i1}^{(g)} = C_r, Y_{i2}^{(g)} = C_s)$. We let "+" denote summation over that particular subscript. For example, $\pi_{r+}^{(g)}$ ($= \sum_s \pi_{rs}^{(g)}$) denotes the probability $P(Y_{i1}^{(g)} = C_r)$ that a randomly selected patient from group $g$ ($g = 1, 2$) has the response $Y_{i1}^{(g)} = C_r$ at period 1. Following Agresti [13], we define

$$\Pi_{C1} = \sum_{r=1}^{L-1}\sum_{r'=r+1}^{L}\pi_{r+}^{(1)}\pi_{r'+}^{(2)},$$

the probability at period 1 that the response of a randomly selected patient from group 2 is larger than that of a randomly selected patient from group 1. We further define

$$\Pi_{D1} = \sum_{r=2}^{L}\sum_{r'=1}^{r-1}\pi_{r+}^{(1)}\pi_{r'+}^{(2)},$$

the probability at period 1 that the response of a randomly selected patient from group 1 is larger than that of a randomly selected patient from group 2. The GOR of responses between groups 2 and 1 at period 1 is simply defined as $G_1 = \Pi_{C1}/\Pi_{D1}$. When there is no difference in treatment effects at period 1 between the two groups, $G_1 = 1$. If the treatment at period 1 in group 2 tends to increase the response as compared with that in group 1, $G_1 > 1$. If the former tends to decrease the response as compared with the latter, $G_1 < 1$. Similarly, we define the GOR of responses between groups 2 and 1 at period 2 as $G_2 = \Pi_{C2}/\Pi_{D2}$, where

$$\Pi_{C2} = \sum_{s=1}^{L-1}\sum_{s'=s+1}^{L}\pi_{+s}^{(1)}\pi_{+s'}^{(2)} \quad\text{and}\quad \Pi_{D2} = \sum_{s=2}^{L}\sum_{s'=1}^{s-1}\pi_{+s}^{(1)}\pi_{+s'}^{(2)}.$$

Note that when $L = 2$, $G_z$ ($z = 1, 2$) reduces to the regular odds ratio (OR) of responses in dichotomous data.
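As an illustration of these definitions, the following Python sketch computes the concordance and discordance probabilities and the GOR from two marginal distributions. The function name `gor` and the example probabilities are hypothetical choices made here for illustration; they do not come from the article.

```python
from typing import Sequence, Tuple


def gor(p1: Sequence[float], p2: Sequence[float]) -> Tuple[float, float, float]:
    """Return (Pi_C, Pi_D, G) for two marginal distributions.

    p1[r] and p2[r] are the probabilities of the (r+1)-th ordered category
    in groups 1 and 2 at a single period.
    """
    L = len(p1)
    if len(p2) != L:
        raise ValueError("both groups must use the same L categories")
    # Pi_C: the group-2 response falls in a higher category than the group-1 one.
    pi_c = sum(p1[r] * p2[k] for r in range(L) for k in range(r + 1, L))
    # Pi_D: the group-1 response falls in a higher category than the group-2 one.
    pi_d = sum(p1[r] * p2[k] for r in range(L) for k in range(r))
    return pi_c, pi_d, pi_c / pi_d


# Hypothetical dichotomous example (L = 2): the GOR equals the ordinary odds ratio.
p1 = [0.6, 0.4]   # group 1: P(C1), P(C2)
p2 = [0.3, 0.7]   # group 2: P(C1), P(C2)
_, _, g = gor(p1, p2)
odds_ratio = (0.7 / 0.3) / (0.4 / 0.6)
print(round(g, 4), round(odds_ratio, 4))

# Hypothetical three-category example with identical groups: the GOR is 1.
same = [0.5, 0.3, 0.2]
print(gor(same, same)[2])
```

Ties (both patients in the same category) enter neither sum, which is why the GOR is interpreted conditionally on no ties.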

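Note that we can estimate $\pi_{rs}^{(g)}$ by the unbiased and consistent sample proportion $\hat{\pi}_{rs}^{(g)} = n_{rs}^{(g)}/n_g$, and thereby estimate $G_1$, the GOR between groups 2 and 1 at period $z = 1$, by

$$\hat{G}_1 = \hat{\Pi}_{C1}/\hat{\Pi}_{D1}, \tag{1}$$

where $\hat{\Pi}_{C1} = \sum_{r=1}^{L-1}\sum_{r'=r+1}^{L}\hat{\pi}_{r+}^{(1)}\hat{\pi}_{r'+}^{(2)}$ and $\hat{\Pi}_{D1} = \sum_{r=2}^{L}\sum_{r'=1}^{r-1}\hat{\pi}_{r+}^{(1)}\hat{\pi}_{r'+}^{(2)}$. Using the delta method [14], we can show that an estimated asymptotic variance of $\hat{G}_1$ in (1) on the logarithmic scale is given by [13, 15]

$$\widehat{\mathrm{Var}}(\log(\hat{G}_1)) = \frac{\sum_{r=1}^{L}\left[\sum_{r'=r+1}^{L}\hat{\pi}_{r'+}^{(2)} - \hat{G}_1\sum_{r'=1}^{r-1}\hat{\pi}_{r'+}^{(2)}\right]^2\hat{\pi}_{r+}^{(1)}}{n_1\hat{\Pi}_{C1}^2} + \frac{\sum_{r'=1}^{L}\left[\sum_{r=1}^{r'-1}\hat{\pi}_{r+}^{(1)} - \hat{G}_1\sum_{r=r'+1}^{L}\hat{\pi}_{r+}^{(1)}\right]^2\hat{\pi}_{r'+}^{(2)}}{n_2\hat{\Pi}_{C1}^2}. \tag{2}$$

Empty sums in (2) are defined to be zero: $\sum_{r=L+1}^{L}\hat{\pi}_{r+}^{(1)} = \sum_{r'=L+1}^{L}\hat{\pi}_{r'+}^{(2)} = 0$ and, similarly, $\sum_{r=1}^{0}\hat{\pi}_{r+}^{(1)} = \sum_{r'=1}^{0}\hat{\pi}_{r'+}^{(2)} = 0$.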
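The delta-method step behind (2) can be sketched as follows. This is one way to reach the stated formula under the multinomial model, not a quotation from [13, 15]; the shorthand $A_r$, $B_r$ and $\phi_r$ is introduced here only for illustration.

$$
\phi_r = \frac{\partial \log G_1}{\partial \pi_{r+}^{(1)}} = \frac{A_r}{\Pi_{C1}} - \frac{B_r}{\Pi_{D1}} = \frac{A_r - G_1 B_r}{\Pi_{C1}},
\qquad A_r = \sum_{r'=r+1}^{L}\pi_{r'+}^{(2)}, \quad B_r = \sum_{r'=1}^{r-1}\pi_{r'+}^{(2)}.
$$

Because the period 1 marginal counts of group 1 are multinomial, the group 1 contribution to the asymptotic variance of $\log(\hat{G}_1)$ is

$$
\frac{1}{n_1}\left[\sum_{r=1}^{L}\phi_r^2\,\pi_{r+}^{(1)} - \Big(\sum_{r=1}^{L}\phi_r\,\pi_{r+}^{(1)}\Big)^2\right],
\qquad \sum_{r=1}^{L}\phi_r\,\pi_{r+}^{(1)} = \frac{\Pi_{C1} - G_1\Pi_{D1}}{\Pi_{C1}} = 0,
$$

so only the first sum survives, which gives the first term of (2) once parameters are replaced by their estimates. The group 2 term follows in the same way by differentiating with respect to $\pi_{r'+}^{(2)}$, and the two terms add because the groups are independent.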

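Following the same arguments as for deriving $\hat{G}_1$ in (1) and $\widehat{\mathrm{Var}}(\log(\hat{G}_1))$ in (2), we obtain a consistent estimator for $G_2$ between groups 2 and 1 at period 2 as

$$\hat{G}_2 = \hat{\Pi}_{C2}/\hat{\Pi}_{D2}, \tag{3}$$

where $\hat{\Pi}_{C2} = \sum_{s=1}^{L-1}\sum_{s'=s+1}^{L}\hat{\pi}_{+s}^{(1)}\hat{\pi}_{+s'}^{(2)}$ and $\hat{\Pi}_{D2} = \sum_{s=2}^{L}\sum_{s'=1}^{s-1}\hat{\pi}_{+s}^{(1)}\hat{\pi}_{+s'}^{(2)}$. Again, using the delta method, we can show that an estimated asymptotic variance of $\hat{G}_2$ in (3) on the logarithmic scale is given by

$$\widehat{\mathrm{Var}}(\log(\hat{G}_2)) = \frac{\sum_{s=1}^{L}\left[\sum_{s'=s+1}^{L}\hat{\pi}_{+s'}^{(2)} - \hat{G}_2\sum_{s'=1}^{s-1}\hat{\pi}_{+s'}^{(2)}\right]^2\hat{\pi}_{+s}^{(1)}}{n_1\hat{\Pi}_{C2}^2} + \frac{\sum_{s'=1}^{L}\left[\sum_{s=1}^{s'-1}\hat{\pi}_{+s}^{(1)} - \hat{G}_2\sum_{s=s'+1}^{L}\hat{\pi}_{+s}^{(1)}\right]^2\hat{\pi}_{+s'}^{(2)}}{n_2\hat{\Pi}_{C2}^2}. \tag{4}$$

As before, empty sums in (4) are defined to be zero: $\sum_{s=L+1}^{L}\hat{\pi}_{+s}^{(1)} = \sum_{s'=L+1}^{L}\hat{\pi}_{+s'}^{(2)} = 0$ and $\sum_{s=1}^{0}\hat{\pi}_{+s}^{(1)} = \sum_{s'=1}^{0}\hat{\pi}_{+s'}^{(2)} = 0$.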

Note that $\hat{G}_1$ in (1) and $\hat{G}_2$ in (3) are correlated, because both are computed from the same patients. For clarity, we give the details of the derivation of $\widehat{\mathrm{Cov}}(\log(\hat{G}_1), \log(\hat{G}_2))$ in Appendix I.

2.1 Test non-equality between treatments in the absence of interactions

When the treatments received at the two periods in group $g$ are the same and there is no treatment-by-period interaction (i.e., $G_1 = G_2$), we may use a weighted average $w\log(\hat{G}_1) + (1-w)\log(\hat{G}_2)$ to test $H_0: G_1 = G_2 = 1$ versus $H_a: G_1 \neq 1$ or $G_2 \neq 1$, where $0 < w < 1$ is a weight reflecting the relative importance of responses at period 1 to those at period 2, which clinicians can assign on the basis of their subjective knowledge. If we have no prior preference, we may simply set $w = 0.50$, in which case the weighted average is proportional to $\log(\hat{G}_1) + \log(\hat{G}_2)$. This leads us to consider the following summary test procedure over the two periods. We will reject $H_0: G_1 = G_2 = 1$ at the $\alpha$-level if

$$\frac{|\log(\hat{G}_1) + \log(\hat{G}_2)|}{\sqrt{\widehat{\mathrm{Var}}(\log(\hat{G}_1) + \log(\hat{G}_2))}} > Z_{\alpha/2}, \tag{5}$$

where $\widehat{\mathrm{Var}}(\log(\hat{G}_1) + \log(\hat{G}_2)) = \widehat{\mathrm{Var}}(\log(\hat{G}_1)) + \widehat{\mathrm{Var}}(\log(\hat{G}_2)) + 2\,\widehat{\mathrm{Cov}}(\log(\hat{G}_1), \log(\hat{G}_2))$, $\widehat{\mathrm{Cov}}(\log(\hat{G}_1), \log(\hat{G}_2))$ is given by (A.3) with parameters replaced by their corresponding estimators (Appendix I), and $Z_{\alpha}$ is the upper $100(\alpha)$th percentile of the standard normal distribution. Because the sampling distribution of $\hat{G}_z$ ($z = 1, 2$), which is a ratio of two random variables, can be skewed, we use the logarithmic transformation to improve the normal approximation.

2.2 Test non-equality between treatments in the presence of interactions

When the treatments received at the two periods in a group are not the same or there is a treatment-by-period interaction (i.e., $G_1 \neq G_2$), the test procedure (5) can be meaningless or lose power. When there is a treatment-by-period interaction, the estimates $\hat{G}_1$ and $\hat{G}_2$ may fall in opposite directions relative to 1. For example, consider the situation $G_1 = 1$ and $G_2 > 1$, in which $\log(\hat{G}_1)$ can be negative with non-negligible probability while $\log(\hat{G}_2)$ tends to be positive. In this case, $|\log(\hat{G}_1) + \log(\hat{G}_2)|$ can be small because the values of $\log(\hat{G}_1)$ and $\log(\hat{G}_2)$ cancel each other, so the summary test procedure (5) can lack power. To alleviate this concern, we may consider the following bivariate test procedure. We will reject $H_0: G_1 = G_2 = 1$ at the $\alpha$-level if

$$\left(\log(\hat{G}_1), \log(\hat{G}_2)\right)\hat{\Sigma}^{-1}\left(\log(\hat{G}_1), \log(\hat{G}_2)\right)' > \chi^2_{\alpha}(2), \tag{6}$$

where $\hat{\Sigma}$ is the estimated covariance matrix with diagonal elements equal to $\widehat{\mathrm{Var}}(\log(\hat{G}_1))$ and $\widehat{\mathrm{Var}}(\log(\hat{G}_2))$ and off-diagonal element equal to $\widehat{\mathrm{Cov}}(\log(\hat{G}_1), \log(\hat{G}_2))$, and $\chi^2_{\alpha}(2)$ is the upper $100(\alpha)$th percentile of the central chi-squared distribution with 2 degrees of freedom.

2.3 Procedure for testing the group-by-period interaction

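To study whether there is a group-by-period interaction, we may consider testing $H_0: G_1 = G_2$ versus $H_a: G_1 \neq G_2$. We will reject $H_0: G_1 = G_2$ at the $\alpha$-level if

$$\frac{|\log(\hat{G}_1) - \log(\hat{G}_2)|}{\sqrt{\widehat{\mathrm{Var}}(\log(\hat{G}_1) - \log(\hat{G}_2))}} > Z_{\alpha/2}, \tag{7}$$

where $\widehat{\mathrm{Var}}(\log(\hat{G}_1) - \log(\hat{G}_2)) = \widehat{\mathrm{Var}}(\log(\hat{G}_1)) + \widehat{\mathrm{Var}}(\log(\hat{G}_2)) - 2\,\widehat{\mathrm{Cov}}(\log(\hat{G}_1), \log(\hat{G}_2))$. Note that when the treatments received at the two periods in a group are the same, the group-by-period interaction may actually represent the treatment-by-period interaction.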
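The three test procedures (5), (6) and (7) share the same ingredients, so they can be sketched in one small Python function. The function names and the numerical inputs below are hypothetical, chosen only so that the results are easy to check by hand; the sketch assumes the two log GOR estimates, their estimated variances and their estimated covariance have already been computed.

```python
import math
from typing import Dict


def normal_two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def gor_tests(log_g1: float, log_g2: float,
              var1: float, var2: float, cov12: float) -> Dict[str, float]:
    """P-values for the summary (5), bivariate (6) and interaction (7) tests."""
    # (5): sum of the log GORs divided by its estimated standard error.
    z_sum = (log_g1 + log_g2) / math.sqrt(var1 + var2 + 2.0 * cov12)
    # (7): difference of the log GORs divided by its estimated standard error.
    z_diff = (log_g1 - log_g2) / math.sqrt(var1 + var2 - 2.0 * cov12)
    # (6): quadratic form x' Sigma^{-1} x, with the 2 x 2 inverse written out.
    det = var1 * var2 - cov12 ** 2
    q = (var2 * log_g1 ** 2 - 2.0 * cov12 * log_g1 * log_g2
         + var1 * log_g2 ** 2) / det
    return {
        "summary": normal_two_sided_p(z_sum),
        "bivariate": math.exp(-q / 2.0),  # upper tail of chi-squared with 2 df
        "interaction": normal_two_sided_p(z_diff),
    }


# Hypothetical inputs: equal log GORs, unit variances, zero covariance.
p = gor_tests(log_g1=1.0, log_g2=1.0, var1=1.0, var2=1.0, cov12=0.0)
print({name: round(value, 4) for name, value in p.items()})
```

With equal log GORs the interaction statistic is exactly zero, so its p-value is 1, while the summary and bivariate tests both respond to the common departure from 1.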

2.4 Interval estimation of the relative treatment effect

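When the treatments received at the two periods in a group are the same and there is no treatment-by-period interaction, we let $G_0$ denote the common value of $G_1$ and $G_2$. On the basis of $\hat{G}_1$ in (1), $\widehat{\mathrm{Var}}(\log(\hat{G}_1))$ in (2), $\hat{G}_2$ in (3), $\widehat{\mathrm{Var}}(\log(\hat{G}_2))$ in (4) and $\widehat{\mathrm{Cov}}(\log(\hat{G}_1), \log(\hat{G}_2))$ in (A.3), we may obtain a $100(1-\alpha)\%$ confidence interval for $G_0$ as

$$[\exp(LG_l), \exp(LG_u)], \tag{8}$$

where $LG_l = (\log(\hat{G}_1) + \log(\hat{G}_2))/2 - Z_{\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\log(\hat{G}_1) + \log(\hat{G}_2))}/2$ and $LG_u = (\log(\hat{G}_1) + \log(\hat{G}_2))/2 + Z_{\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\log(\hat{G}_1) + \log(\hat{G}_2))}/2$.

When the treatments received at the two periods are not the same or there is a treatment-by-period interaction, we may wish to obtain an interval estimator for each $G_z$ ($z = 1, 2$) separately. On the basis of $\hat{G}_z$ and $\widehat{\mathrm{Var}}(\log(\hat{G}_z))$ for $z = 1, 2$, we obtain a $100(1-\alpha)\%$ confidence interval for $G_z$ ($z = 1, 2$) as

$$\left[\hat{G}_z\exp\left(-Z_{\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\log(\hat{G}_z))}\right),\ \hat{G}_z\exp\left(Z_{\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\log(\hat{G}_z))}\right)\right]. \tag{9}$$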
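To make the interval (9) concrete, the sketch below evaluates the point estimate, the variance formula of the form (2) or (4), and the resulting 95 % interval from marginal counts. The function names are hypothetical and the constant 1.959964 for $Z_{0.025}$ is an assumption of this sketch; the counts are the marginal totals of Table 1 in the Examples section.

```python
import math
from typing import Sequence, Tuple

Z_975 = 1.959964  # upper 2.5 % point of the standard normal distribution


def log_gor_and_var(c1: Sequence[int], c2: Sequence[int]) -> Tuple[float, float]:
    """Return (log G-hat, estimated Var(log G-hat)) from marginal counts.

    c1 and c2 are the marginal counts of groups 1 and 2 over the L ordered
    categories at a single period.
    """
    n1, n2 = sum(c1), sum(c2)
    p1 = [x / n1 for x in c1]
    p2 = [x / n2 for x in c2]
    L = len(p1)
    pi_c = sum(p1[r] * sum(p2[r + 1:]) for r in range(L))
    pi_d = sum(p1[r] * sum(p2[:r]) for r in range(L))
    g = pi_c / pi_d
    # Group-1 term: [sum over higher p2 - G * sum over lower p2]^2 * p1[r].
    t1 = sum((sum(p2[r + 1:]) - g * sum(p2[:r])) ** 2 * p1[r] for r in range(L))
    # Group-2 term: [sum over lower p1 - G * sum over higher p1]^2 * p2[k].
    t2 = sum((sum(p1[:k]) - g * sum(p1[k + 1:])) ** 2 * p2[k] for k in range(L))
    var = t1 / (n1 * pi_c ** 2) + t2 / (n2 * pi_c ** 2)
    return math.log(g), var


def gor_ci(c1: Sequence[int], c2: Sequence[int]) -> Tuple[float, float, float]:
    """Return (G-hat, lower limit, upper limit) of the 95 % interval (9)."""
    log_g, var = log_gor_and_var(c1, c2)
    half = Z_975 * math.sqrt(var)
    return math.exp(log_g), math.exp(log_g - half), math.exp(log_g + half)


# Marginal totals from Table 1 (females = group 1, males = group 2).
two_years = gor_ci([10, 7, 4], [28, 5, 4])     # row margins
five_years = gor_ci([10, 3, 8], [20, 10, 7])   # column margins
print([round(v, 3) for v in two_years])
print([round(v, 3) for v in five_years])
```

Up to rounding, the two printed triples should agree with the estimates and 95 % confidence intervals reported for the hip resurfacing data in the Examples section.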

3 Examples

To illustrate the use of the point estimators $\hat{G}_1$ and $\hat{G}_2$, the test procedures and the interval estimators developed here, we first consider the trial studying hip joint resurfacing surgeries [7]. Hip replacement surgery is a widely used procedure to alleviate pain and debilitation caused by osteoarthritis, rheumatoid arthritis, fractures and other hip-related problems. To assess the failure rate and prognosis after hip joint resurfacing, we may use Harris scores [16], of which one important component is the pain score coded on an ordinal scale: none (1), slight (2), mild (3) and moderate or marked (4) pain in the hip joint. We summarize in Table 1 the frequency distribution of pain scores taken at two and five years after hip joint resurfacing for 58 patients by gender, published elsewhere [7]. Because there were very few patients with moderate or marked pain (4), we grouped these patients and patients with mild pain (3) into one category without much loss of information. It is of interest to study whether there is a difference in pain scores between genders over time after surgery. In terms of our notation, we define "females" and "males" as groups ($g =$) 1 and 2, and "two years" and "five years" after the surgery as periods ($z =$) 1 and 2. From Table 1, the numbers of patients are $n_1 = 21$ and $n_2 = 37$ for the two comparison groups. We can calculate, for example, the estimated cell proportion $\hat{\pi}_{11}^{(1)} = n_{11}^{(1)}/n_1 = 7/21$. The marginal percentage for each row (or column) is simply the corresponding marginal total divided by the total number of patients $n_g$ ($g = 1, 2$). For example, the marginal percentage $\hat{\pi}_{1+}^{(1)}$ in the first row for females is 0.476 (= 10/21). Using the marginal percentages for the rows of the two sub-tables (Table 1), we can then calculate the GOR estimate $\hat{G}_1$ in (1) of pain scores at two years post-surgery as

Table 1: Frequency distribution of patients with pain scores coded as none (1), slight (2), or mild, moderate or marked pain (3), taken at two and five years post-surgery, for females and males. Rows give the pain score at two years and columns the pain score at five years.

Gender	Pain at Two Years	Five Years: None	Five Years: Slight	Five Years: Mild, Moderate or Marked Pain	Marginal Total	Marginal Percentage
Female	None	7	1	2	10	0.476
	Slight	3	1	3	7	0.333
	Mild, Moderate or Marked Pain	0	1	3	4	0.190
	Marginal Total	10	3	8	21	
	Marginal Percentage	0.476	0.143	0.381		1.000
Male	None	19	7	2	28	0.757
	Slight	1	3	1	5	0.135
	Mild, Moderate or Marked Pain	0	0	4	4	0.108
	Marginal Total	20	10	7	37	
	Marginal Percentage	0.541	0.270	0.189		1.000

$$\hat{G}_1 = \frac{0.476\times(0.135+0.108) + 0.333\times 0.108}{0.333\times 0.757 + 0.190\times(0.757+0.135)} = 0.360.$$

Similarly, using the marginal percentages for the columns of the two sub-tables, we obtain $\hat{G}_2 = 0.637$ at five years post-surgery. As an example, we can interpret $\hat{G}_1 = 0.360$ as follows: given no ties in pain scores between genders, the odds that a randomly selected male has a higher pain score than a randomly selected female at two years post-surgery are 0.360. Furthermore, when employing the interval estimator (9), we obtain 95 % confidence intervals [0.133, 0.974] and [0.250, 1.623] for $G_1$ and $G_2$, respectively. Since the upper limit of the 95 % confidence interval for $G_1$ falls below 1, there is significant evidence at the 5 % level that males tend to have lower pain scores than females at two years after surgery. However, since $\hat{G}_2$ (= 0.637) is closer to 1 than $\hat{G}_1$ (= 0.360), this difference in pain scores between genders seems to decrease as the time since surgery increases from two years to five years. Note that the 95 % confidence interval for $G_2$ covers 1. Thus, the difference in pain scores between genders is no longer significant at the 5 % level at five years post-surgery, although males are still likely to have lower pain scores than females (because $\hat{G}_2 < 1$). When using test procedures (5) and (6) for testing the overall equality of pain scores between genders across the two periods, we obtain p-values of 0.086 and 0.131, respectively. On the basis of these results, we may conclude that although males tend, as compared with females, to fall in categories with low pain scores, especially at two years post-surgery, the difference in pain scores between genders seems to become smaller in the long term.

When assuming a normal random effects proportional odds model [17], we may employ Proc Glimmix in SAS [18] to study the difference in pain scores between genders using a model-based approach. Using the data in Table 1, we obtained a parameter estimate for the relative gender effect of males versus females of 0.961, with estimated standard error (SE) 0.6148. This yields a p-value of 0.124 for testing the equality of pain scores between genders across the two periods, which is similar to the p-values obtained with test procedures (5) and (6). Also, note that the parameter estimate 0.961 is larger than 0. Thus, males tend to fall in categories with lower pain scores than females. This inference is identical to that obtained on the basis of the GOR.

We may sometimes encounter a trial in which the patient response taken at period ($z =$) 1 actually represents the baseline response. In this case, we may apply the test procedure (7), developed for testing the group-by-period interaction, to study whether there is a relative treatment effect. To illustrate this point, we consider the data (Table 2) taken from a double-blind randomized clinical trial comparing an active hypnotic drug ($g = 1$) with a placebo ($g = 2$) in patients with insomnia [1, 2, 14]. The outcome of interest is the response to the question "How quickly did you fall asleep after going to bed?", recorded on a four-point ordinal scale (< 20, 20–30, 30–60, > 60 minutes). Each participating subject was asked this question twice, once after a one-week placebo washout period for both groups and once at the conclusion of a two-week treatment period. Using the data in Table 2, we obtain $\hat{G}_1 = 1.031$ and $\hat{G}_2 = 1.883$. Because subjects in both groups received placebo during the one-week washout period, the GOR estimate at that period is expected to be around 1 due to randomization, as $\hat{G}_1 = 1.031$ shows. At the end of the two-week treatment period, the GOR estimate becomes $\hat{G}_2 = 1.883$, with 95 % confidence interval [1.270, 2.792] from (9). Because the lower confidence limit falls above 1, one may conclude that the hypnotic drug changes the GOR over periods and tends to reduce the time to falling asleep. When using test procedures (6) and (7), we obtain p-values of 0.002 and 0.004; these small p-values also suggest that the hypnotic drug can significantly reduce the time to falling asleep. We note that all these test results are essentially similar to those based on the proportional odds model with cumulative logits found elsewhere [1]. When applying the summary test procedure (5) to the data in Table 2, we obtain a p-value of 0.055. Note that since the GOR at the initial period is around 1, there is a non-negligible probability that the time to falling asleep for patients in the placebo group is shorter than that for patients in the active group at the initial washout period. As noted previously, use of the summary test procedure (5) over the two periods in this case may not only be senseless (because different treatments are received at the two periods) but also lack power.

Table 2: Frequency distribution of patients with time to falling asleep (in minutes) taken at the end of the one-week washout period and at the conclusion of the two-week treatment period. Rows give the response at the end of the washout period and columns the response at the end of the treatment period.

Treatment	After One-week Washout	Two-week Treatment: <20	Two-week Treatment: 20–30	Two-week Treatment: 30–60	Two-week Treatment: >60	Total
Active	<20	7	4	1	0	12
	20–30	11	5	2	2	20
	30–60	13	23	3	1	40
	>60	9	17	13	8	47
	Total	40	49	19	11	119
Placebo	<20	7	4	2	1	14
	20–30	14	5	1	0	20
	30–60	6	9	18	2	35
	>60	4	11	14	22	51
	Total	31	29	35	25	120

4 Discussion

As demonstrated here, we can employ the GOR to measure the relative treatment effect without assuming any specific parametric model. Because the GOR has a simple interpretation and is easily understood, it is useful for ordinal data. In fact, the GOR is closely related to the gamma correlation [19], a commonly used measure of the strength of association between two ordinal variables. We refer readers to publications on estimation of the GOR and its applications in other situations [20–24]. When repeated measurements are taken at the same time (or there are no period effects), one may employ the Dirichlet-multinomial distribution to model the intraclass correlation between repeated measurements [25]. As in the above two examples, however, repeated measurements on patients are often taken at different times in clinical trials. A period effect is then likely to exist and must be incorporated to avoid bias in data analysis [26, 27]. Methods based on the Dirichlet-multinomial model for cluster sampling that do not account for the period effect would not be appropriate in the situations considered here.

The proportional odds model is probably one of the most commonly used models for analyzing ordinal data. As with all model-based approaches, we can easily extend the proportional odds model to account for confounders (if any) or to accommodate other general situations. However, the implicit assumption of the proportional odds model can be badly violated by many bivariate distributions [13, 28, 29]. Furthermore, when applying Proc Glimmix in SAS [18] based on the random effects proportional odds model to ordinal data with repeated measurements, we need to assume that the random effects due to patients (accounting for the intraclass correlation between repeated measurements) follow a normal distribution. This normality assumption for the random effects can be difficult to justify. By contrast, the proposed method is model-free. It does not require the random effects due to patients to follow the normal distribution, nor does it assume any parametric model for the data structure. Thus, our methods are applicable regardless of the parametric model for the underlying data structure and the distributional assumptions for the patient random effects. Furthermore, the point estimators, test procedures and interval estimators developed here can all be expressed in closed form. Readers may apply these test procedures and estimators with a pocket calculator, even without knowledge of any statistical software. The interpretation of the GOR is, as illustrated in the examples, easily understood. When there are confounders in a large trial, we may extend the methods proposed here by use of stratified analysis with strata determined by the combined levels of the confounders [23]. We note, however, that the model-based approach can be preferable to the model-free approach proposed here if there are many covariates to adjust for in a trial of small or moderate size.

Finally, we note that, using arguments similar to those above, it is straightforward to extend the results to accommodate three or more periods. We outline the extension of the results presented here to three periods in Appendix II.

In summary, we have developed model-free test procedures for testing equality of treatments in ordinal data with repeated measurements. We have further derived interval estimators for the relative treatment effect measured by the GOR. We recommend use of the summary test procedure to improve power when the treatments received at the two periods in a group are the same and there is no treatment-by-period interaction. However, we should employ this summary test procedure cautiously when there is a treatment-by-period interaction; the bivariate test procedure can be of use in this case. We further outline the extension of the results to accommodate three periods. The results, findings and discussions should be of use to biostatisticians and clinicians when they encounter ordinal responses with repeated measurements.

Funding statement: The research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Acknowledgements

The author wishes to thank the associate editor and two reviewers for many valuable comments and suggestions to improve the clarity and contents of this article.

Appendix I

Suppose that the random vectors $(X_1, X_2)'$ and $(Y_1, Y_2)'$ have the joint probability density functions $f(X_1, X_2)$ and $f(Y_1, Y_2)$, respectively. Suppose further that $(X_1, X_2)'$ and $(Y_1, Y_2)'$ are mutually independent. We can show that the covariance

$$\mathrm{Cov}(X_1Y_1, X_2Y_2) = \mathrm{Cov}(X_1, X_2)\,\mathrm{Cov}(Y_1, Y_2) + \mathrm{Cov}(X_1, X_2)\,E(Y_1)E(Y_2) + \mathrm{Cov}(Y_1, Y_2)\,E(X_1)E(X_2). \tag{A.1}$$
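The identity (A.1) follows in two lines from the independence of the two vectors; the intermediate steps below are supplied here for the reader and are not spelled out in the article.

$$
\begin{aligned}
\mathrm{Cov}(X_1Y_1, X_2Y_2)
&= E(X_1X_2)\,E(Y_1Y_2) - E(X_1)E(Y_1)\,E(X_2)E(Y_2) \\
&= \left[\mathrm{Cov}(X_1, X_2) + E(X_1)E(X_2)\right]\left[\mathrm{Cov}(Y_1, Y_2) + E(Y_1)E(Y_2)\right] - E(X_1)E(X_2)E(Y_1)E(Y_2).
\end{aligned}
$$

Expanding the product cancels the last term and leaves the three terms on the right-hand side of (A.1).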

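On the basis of (A.1), we have

$$\mathrm{Cov}(\hat{\pi}_{r+}^{(1)}\hat{\pi}_{r'+}^{(2)}, \hat{\pi}_{+s}^{(1)}\hat{\pi}_{+s'}^{(2)}) = \frac{\pi_{rs}^{(1)} - \pi_{r+}^{(1)}\pi_{+s}^{(1)}}{n_1}\cdot\frac{\pi_{r's'}^{(2)} - \pi_{r'+}^{(2)}\pi_{+s'}^{(2)}}{n_2} + \frac{\pi_{rs}^{(1)} - \pi_{r+}^{(1)}\pi_{+s}^{(1)}}{n_1}\,\pi_{r'+}^{(2)}\pi_{+s'}^{(2)} + \pi_{r+}^{(1)}\pi_{+s}^{(1)}\,\frac{\pi_{r's'}^{(2)} - \pi_{r'+}^{(2)}\pi_{+s'}^{(2)}}{n_2}. \tag{A.2}$$

Using the delta method, we may obtain the asymptotic covariance

$$\mathrm{Cov}(\log(\hat{G}_1), \log(\hat{G}_2)) = \mathrm{Cov}(\log(\hat{\Pi}_{C1}), \log(\hat{\Pi}_{C2})) - \mathrm{Cov}(\log(\hat{\Pi}_{C1}), \log(\hat{\Pi}_{D2})) - \mathrm{Cov}(\log(\hat{\Pi}_{D1}), \log(\hat{\Pi}_{C2})) + \mathrm{Cov}(\log(\hat{\Pi}_{D1}), \log(\hat{\Pi}_{D2})), \tag{A.3}$$

where

$$\mathrm{Cov}(\log(\hat{\Pi}_{C1}), \log(\hat{\Pi}_{C2})) = \Big[\sum_{r=1}^{L-1}\sum_{s=1}^{L-1}\sum_{r'=r+1}^{L}\sum_{s'=s+1}^{L}\mathrm{Cov}(\hat{\pi}_{r+}^{(1)}\hat{\pi}_{r'+}^{(2)}, \hat{\pi}_{+s}^{(1)}\hat{\pi}_{+s'}^{(2)})\Big]\Big/(\Pi_{C1}\Pi_{C2}),$$

$$\mathrm{Cov}(\log(\hat{\Pi}_{C1}), \log(\hat{\Pi}_{D2})) = \Big[\sum_{r=1}^{L-1}\sum_{s=2}^{L}\sum_{r'=r+1}^{L}\sum_{s'=1}^{s-1}\mathrm{Cov}(\hat{\pi}_{r+}^{(1)}\hat{\pi}_{r'+}^{(2)}, \hat{\pi}_{+s}^{(1)}\hat{\pi}_{+s'}^{(2)})\Big]\Big/(\Pi_{C1}\Pi_{D2}),$$

$$\mathrm{Cov}(\log(\hat{\Pi}_{D1}), \log(\hat{\Pi}_{C2})) = \Big[\sum_{r=2}^{L}\sum_{s=1}^{L-1}\sum_{r'=1}^{r-1}\sum_{s'=s+1}^{L}\mathrm{Cov}(\hat{\pi}_{r+}^{(1)}\hat{\pi}_{r'+}^{(2)}, \hat{\pi}_{+s}^{(1)}\hat{\pi}_{+s'}^{(2)})\Big]\Big/(\Pi_{D1}\Pi_{C2}),$$

and

$$\mathrm{Cov}(\log(\hat{\Pi}_{D1}), \log(\hat{\Pi}_{D2})) = \Big[\sum_{r=2}^{L}\sum_{s=2}^{L}\sum_{r'=1}^{r-1}\sum_{s'=1}^{s-1}\mathrm{Cov}(\hat{\pi}_{r+}^{(1)}\hat{\pi}_{r'+}^{(2)}, \hat{\pi}_{+s}^{(1)}\hat{\pi}_{+s'}^{(2)})\Big]\Big/(\Pi_{D1}\Pi_{D2}).$$

We can obtain an estimated asymptotic covariance $\widehat{\mathrm{Cov}}(\log(\hat{G}_1), \log(\hat{G}_2))$ by replacing the parameters in (A.3) with their corresponding estimators.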
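The covariance formulas (A.2) and (A.3) can be evaluated directly with nested loops. The Python sketch below does this for the Table 1 counts and then applies test procedures (5) and (6). The function names are hypothetical, and the code is one possible reading of the formulas rather than the author's own implementation.

```python
import math
from typing import List, Sequence, Tuple

Table = Sequence[Sequence[int]]


def margins(t: Table) -> Tuple[List[float], List[float], List[List[float]], int]:
    """Row margins, column margins and cell proportions of an L x L count table."""
    n = sum(map(sum, t))
    cells = [[x / n for x in row] for row in t]
    rows = [sum(row) for row in cells]
    cols = [sum(col) for col in zip(*cells)]
    return rows, cols, cells, n


def log_gor_var(p1: List[float], p2: List[float],
                n1: int, n2: int) -> Tuple[float, float, float, float]:
    """log G-hat, its estimated variance as in (2) or (4), Pi_C-hat and Pi_D-hat."""
    L = len(p1)
    pc = sum(p1[r] * sum(p2[r + 1:]) for r in range(L))
    pd = sum(p1[r] * sum(p2[:r]) for r in range(L))
    g = pc / pd
    t1 = sum((sum(p2[r + 1:]) - g * sum(p2[:r])) ** 2 * p1[r] for r in range(L))
    t2 = sum((sum(p1[:k]) - g * sum(p1[k + 1:])) ** 2 * p2[k] for k in range(L))
    return math.log(g), (t1 / n1 + t2 / n2) / pc ** 2, pc, pd


def cov_log_gor(t1: Table, t2: Table) -> float:
    """Estimated Cov(log G1-hat, log G2-hat) following (A.2) and (A.3)."""
    r1, c1, cell1, n1 = margins(t1)
    r2, c2, cell2, n2 = margins(t2)
    L = len(r1)
    _, _, pc1, pd1 = log_gor_var(r1, r2, n1, n2)
    _, _, pc2, pd2 = log_gor_var(c1, c2, n1, n2)

    def weight(i: int, j: int, pc: float, pd: float) -> float:
        # +1/Pi_C for a concordant pair (j > i), -1/Pi_D for a discordant one.
        return 1.0 / pc if j > i else (-1.0 / pd if j < i else 0.0)

    total = 0.0
    for r in range(L):
        for s in range(L):
            k1 = (cell1[r][s] - r1[r] * c1[s]) / n1  # covariance within group 1
            for rp in range(L):
                for sp in range(L):
                    w = weight(r, rp, pc1, pd1) * weight(s, sp, pc2, pd2)
                    if w == 0.0:
                        continue
                    k2 = (cell2[rp][sp] - r2[rp] * c2[sp]) / n2  # group 2
                    # Three terms of (A.2), signed and scaled as in (A.3).
                    total += w * (k1 * k2 + k1 * r2[rp] * c2[sp]
                                  + r1[r] * c1[s] * k2)
    return total


# Table 1 counts: rows = pain at two years, columns = pain at five years.
females = [[7, 1, 2], [3, 1, 3], [0, 1, 3]]
males = [[19, 7, 2], [1, 3, 1], [0, 0, 4]]

fr, fc, _, n1 = margins(females)
mr, mc, _, n2 = margins(males)
l1, v1, _, _ = log_gor_var(fr, mr, n1, n2)
l2, v2, _, _ = log_gor_var(fc, mc, n1, n2)
c12 = cov_log_gor(females, males)

z = (l1 + l2) / math.sqrt(v1 + v2 + 2 * c12)
p_summary = math.erfc(abs(z) / math.sqrt(2))                  # test (5)
q = (v2 * l1 ** 2 - 2 * c12 * l1 * l2 + v1 * l2 ** 2) / (v1 * v2 - c12 ** 2)
p_bivariate = math.exp(-q / 2)                                # test (6)
print(round(p_summary, 3), round(p_bivariate, 3))
```

The two printed values should land close to the p-values reported for test procedures (5) and (6) in the hip resurfacing example. The first term of (A.2) is of order $1/(n_1 n_2)$ and contributes very little at these sample sizes.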

Appendix II

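For a randomly selected patient $i$ ($i = 1, 2, \ldots, n_g$) from group $g$, we let $Y_{iz}^{(g)}$ denote the patient response at period $z$ ($z = 1, 2, 3$). Let $n_{rst}^{(g)}$ denote the number of patients with $(Y_{i1}^{(g)} = C_r, Y_{i2}^{(g)} = C_s, Y_{i3}^{(g)} = C_t)$ among the $n_g$ patients in group $g$. The random vector $(n_{111}^{(g)}, n_{112}^{(g)}, \ldots, n_{11L}^{(g)}, n_{121}^{(g)}, n_{122}^{(g)}, \ldots, n_{12L}^{(g)}, \ldots, n_{L11}^{(g)}, n_{L12}^{(g)}, \ldots, n_{L1L}^{(g)}, \ldots, n_{LL1}^{(g)}, \ldots, n_{LLL}^{(g)})'$ then follows the multinomial distribution with parameters $n_g$ ($= \sum_r\sum_s\sum_t n_{rst}^{(g)}$) and $(\pi_{111}^{(g)}, \pi_{112}^{(g)}, \ldots, \pi_{11L}^{(g)}, \pi_{121}^{(g)}, \pi_{122}^{(g)}, \ldots, \pi_{12L}^{(g)}, \ldots, \pi_{L11}^{(g)}, \pi_{L12}^{(g)}, \ldots, \pi_{L1L}^{(g)}, \ldots, \pi_{LL1}^{(g)}, \ldots, \pi_{LLL}^{(g)})'$, where $\pi_{rst}^{(g)}$ denotes the cell probability that a randomly selected patient from group $g$ has the vector of responses $(Y_{i1}^{(g)} = C_r, Y_{i2}^{(g)} = C_s, Y_{i3}^{(g)} = C_t)$. Note that we can estimate $\pi_{rst}^{(g)}$ by $\hat{\pi}_{rst}^{(g)} = n_{rst}^{(g)}/n_g$. Thus, we can estimate the GOR of responses between groups 2 and 1 at period $z$ ($z = 1, 2, 3$) by

$$\hat{G}_z = \hat{\Pi}_{Cz}/\hat{\Pi}_{Dz}, \tag{A.4}$$

where $\hat{\Pi}_{C1} = \sum_{r=1}^{L-1}\sum_{r'=r+1}^{L}\hat{\pi}_{r++}^{(1)}\hat{\pi}_{r'++}^{(2)}$ and $\hat{\Pi}_{D1} = \sum_{r=2}^{L}\sum_{r'=1}^{r-1}\hat{\pi}_{r++}^{(1)}\hat{\pi}_{r'++}^{(2)}$; $\hat{\Pi}_{C2} = \sum_{s=1}^{L-1}\sum_{s'=s+1}^{L}\hat{\pi}_{+s+}^{(1)}\hat{\pi}_{+s'+}^{(2)}$ and $\hat{\Pi}_{D2} = \sum_{s=2}^{L}\sum_{s'=1}^{s-1}\hat{\pi}_{+s+}^{(1)}\hat{\pi}_{+s'+}^{(2)}$; and $\hat{\Pi}_{C3} = \sum_{t=1}^{L-1}\sum_{t'=t+1}^{L}\hat{\pi}_{++t}^{(1)}\hat{\pi}_{++t'}^{(2)}$ and $\hat{\Pi}_{D3} = \sum_{t=2}^{L}\sum_{t'=1}^{t-1}\hat{\pi}_{++t}^{(1)}\hat{\pi}_{++t'}^{(2)}$.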

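Using the delta method [14], we obtain an estimated asymptotic variance for $\log(\hat{G}_1)$ as

$$\widehat{\mathrm{Var}}(\log(\hat{G}_1)) = \frac{\sum_{r=1}^{L}\left[\sum_{r'=r+1}^{L}\hat{\pi}_{r'++}^{(2)} - \hat{G}_1\sum_{r'=1}^{r-1}\hat{\pi}_{r'++}^{(2)}\right]^2\hat{\pi}_{r++}^{(1)}}{n_1\hat{\Pi}_{C1}^2} + \frac{\sum_{r'=1}^{L}\left[\sum_{r=1}^{r'-1}\hat{\pi}_{r++}^{(1)} - \hat{G}_1\sum_{r=r'+1}^{L}\hat{\pi}_{r++}^{(1)}\right]^2\hat{\pi}_{r'++}^{(2)}}{n_2\hat{\Pi}_{C1}^2}, \tag{A.5}$$

where $\sum_{r=L+1}^{L}\hat{\pi}_{r++}^{(1)} = \sum_{r'=L+1}^{L}\hat{\pi}_{r'++}^{(2)} = \sum_{r=1}^{0}\hat{\pi}_{r++}^{(1)} = \sum_{r'=1}^{0}\hat{\pi}_{r'++}^{(2)} = 0$. Similarly, we obtain the estimated asymptotic variance for $\log(\hat{G}_2)$ as

$$\widehat{\mathrm{Var}}(\log(\hat{G}_2)) = \frac{\sum_{s=1}^{L}\left[\sum_{s'=s+1}^{L}\hat{\pi}_{+s'+}^{(2)} - \hat{G}_2\sum_{s'=1}^{s-1}\hat{\pi}_{+s'+}^{(2)}\right]^2\hat{\pi}_{+s+}^{(1)}}{n_1\hat{\Pi}_{C2}^2} + \frac{\sum_{s'=1}^{L}\left[\sum_{s=1}^{s'-1}\hat{\pi}_{+s+}^{(1)} - \hat{G}_2\sum_{s=s'+1}^{L}\hat{\pi}_{+s+}^{(1)}\right]^2\hat{\pi}_{+s'+}^{(2)}}{n_2\hat{\Pi}_{C2}^2}, \tag{A.6}$$

where $\sum_{s=L+1}^{L}\hat{\pi}_{+s+}^{(1)} = \sum_{s'=L+1}^{L}\hat{\pi}_{+s'+}^{(2)} = \sum_{s=1}^{0}\hat{\pi}_{+s+}^{(1)} = \sum_{s'=1}^{0}\hat{\pi}_{+s'+}^{(2)} = 0$. Also, we obtain an estimated asymptotic variance for $\log(\hat{G}_3)$ as

$$\widehat{\mathrm{Var}}(\log(\hat{G}_3)) = \frac{\sum_{t=1}^{L}\left[\sum_{t'=t+1}^{L}\hat{\pi}_{++t'}^{(2)} - \hat{G}_3\sum_{t'=1}^{t-1}\hat{\pi}_{++t'}^{(2)}\right]^2\hat{\pi}_{++t}^{(1)}}{n_1\hat{\Pi}_{C3}^2} + \frac{\sum_{t'=1}^{L}\left[\sum_{t=1}^{t'-1}\hat{\pi}_{++t}^{(1)} - \hat{G}_3\sum_{t=t'+1}^{L}\hat{\pi}_{++t}^{(1)}\right]^2\hat{\pi}_{++t'}^{(2)}}{n_2\hat{\Pi}_{C3}^2}, \tag{A.7}$$

where $\sum_{t=L+1}^{L}\hat{\pi}_{++t}^{(1)} = \sum_{t'=L+1}^{L}\hat{\pi}_{++t'}^{(2)} = \sum_{t=1}^{0}\hat{\pi}_{++t}^{(1)} = \sum_{t'=1}^{0}\hat{\pi}_{++t'}^{(2)} = 0$.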

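Note that the covariance

$$\begin{aligned}\mathrm{Cov}(\hat{\pi}_{r++}^{(g)}, \hat{\pi}_{+s+}^{(g)}) &= \sum_k\sum_{k'}\mathrm{Cov}(\hat{\pi}_{r+k}^{(g)}, \hat{\pi}_{+sk'}^{(g)}) = \sum_k\mathrm{Cov}(\hat{\pi}_{r+k}^{(g)}, \hat{\pi}_{+sk}^{(g)}) + \sum_k\sum_{k'\neq k}\mathrm{Cov}(\hat{\pi}_{r+k}^{(g)}, \hat{\pi}_{+sk'}^{(g)}) \\ &= \frac{1}{n_g}\Big[\sum_k\big(\pi_{rsk}^{(g)} - \pi_{r+k}^{(g)}\pi_{+sk}^{(g)}\big) - \sum_k\sum_{k'\neq k}\pi_{r+k}^{(g)}\pi_{+sk'}^{(g)}\Big] = \frac{\pi_{rs+}^{(g)} - \pi_{r++}^{(g)}\pi_{+s+}^{(g)}}{n_g}.\end{aligned} \tag{A.8}$$

Thus, from (A.1) we have

$$\mathrm{Cov}(\hat{\pi}_{r++}^{(1)}\hat{\pi}_{r'++}^{(2)}, \hat{\pi}_{+s+}^{(1)}\hat{\pi}_{+s'+}^{(2)}) = \frac{\pi_{rs+}^{(1)} - \pi_{r++}^{(1)}\pi_{+s+}^{(1)}}{n_1}\cdot\frac{\pi_{r's'+}^{(2)} - \pi_{r'++}^{(2)}\pi_{+s'+}^{(2)}}{n_2} + \frac{\pi_{rs+}^{(1)} - \pi_{r++}^{(1)}\pi_{+s+}^{(1)}}{n_1}\,\pi_{r'++}^{(2)}\pi_{+s'+}^{(2)} + \pi_{r++}^{(1)}\pi_{+s+}^{(1)}\,\frac{\pi_{r's'+}^{(2)} - \pi_{r'++}^{(2)}\pi_{+s'+}^{(2)}}{n_2}. \tag{A.9}$$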

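Using the delta method, we obtain the asymptotic covariance

$$\mathrm{Cov}(\log(\hat{G}_1), \log(\hat{G}_2)) = \mathrm{Cov}(\log(\hat{\Pi}_{C1}), \log(\hat{\Pi}_{C2})) - \mathrm{Cov}(\log(\hat{\Pi}_{C1}), \log(\hat{\Pi}_{D2})) - \mathrm{Cov}(\log(\hat{\Pi}_{D1}), \log(\hat{\Pi}_{C2})) + \mathrm{Cov}(\log(\hat{\Pi}_{D1}), \log(\hat{\Pi}_{D2})), \tag{A.10}$$

where

$$\mathrm{Cov}(\log(\hat{\Pi}_{C1}), \log(\hat{\Pi}_{C2})) = \Big[\sum_{r=1}^{L-1}\sum_{s=1}^{L-1}\sum_{r'=r+1}^{L}\sum_{s'=s+1}^{L}\mathrm{Cov}(\hat{\pi}_{r++}^{(1)}\hat{\pi}_{r'++}^{(2)}, \hat{\pi}_{+s+}^{(1)}\hat{\pi}_{+s'+}^{(2)})\Big]\Big/(\Pi_{C1}\Pi_{C2}),$$

$$\mathrm{Cov}(\log(\hat{\Pi}_{C1}), \log(\hat{\Pi}_{D2})) = \Big[\sum_{r=1}^{L-1}\sum_{s=2}^{L}\sum_{r'=r+1}^{L}\sum_{s'=1}^{s-1}\mathrm{Cov}(\hat{\pi}_{r++}^{(1)}\hat{\pi}_{r'++}^{(2)}, \hat{\pi}_{+s+}^{(1)}\hat{\pi}_{+s'+}^{(2)})\Big]\Big/(\Pi_{C1}\Pi_{D2}),$$

$$\mathrm{Cov}(\log(\hat{\Pi}_{D1}), \log(\hat{\Pi}_{C2})) = \Big[\sum_{r=2}^{L}\sum_{s=1}^{L-1}\sum_{r'=1}^{r-1}\sum_{s'=s+1}^{L}\mathrm{Cov}(\hat{\pi}_{r++}^{(1)}\hat{\pi}_{r'++}^{(2)}, \hat{\pi}_{+s+}^{(1)}\hat{\pi}_{+s'+}^{(2)})\Big]\Big/(\Pi_{D1}\Pi_{C2}),$$

and

$$\mathrm{Cov}(\log(\hat{\Pi}_{D1}), \log(\hat{\Pi}_{D2})) = \Big[\sum_{r=2}^{L}\sum_{s=2}^{L}\sum_{r'=1}^{r-1}\sum_{s'=1}^{s-1}\mathrm{Cov}(\hat{\pi}_{r++}^{(1)}\hat{\pi}_{r'++}^{(2)}, \hat{\pi}_{+s+}^{(1)}\hat{\pi}_{+s'+}^{(2)})\Big]\Big/(\Pi_{D1}\Pi_{D2}).$$

We obtain the estimated asymptotic covariance $\widehat{\mathrm{Cov}}(\log(\hat{G}_1), \log(\hat{G}_2))$ by substituting parameter estimators for their corresponding parameters in (A.10). Using the same arguments, we can also obtain $\widehat{\mathrm{Cov}}(\log(\hat{G}_1), \log(\hat{G}_3))$ and $\widehat{\mathrm{Cov}}(\log(\hat{G}_2), \log(\hat{G}_3))$. As for ordinal data with two repeated measurements, these lead to the summary test procedure, the trivariate test procedure, the test procedure for the treatment-by-period interaction, and interval estimators for the GOR in ordinal data with three repeated measurements. For example, consider the trivariate test procedure for testing $H_0: G_1 = G_2 = G_3 = 1$. We will reject $H_0$ at the $\alpha$-level if

$$\left(\log(\hat{G}_1), \log(\hat{G}_2), \log(\hat{G}_3)\right)\hat{\Sigma}^{-1}\left(\log(\hat{G}_1), \log(\hat{G}_2), \log(\hat{G}_3)\right)' > \chi^2_{\alpha}(3), \tag{A.11}$$

where $\hat{\Sigma}$ is the estimated covariance matrix with diagonal elements equal to $\widehat{\mathrm{Var}}(\log(\hat{G}_1))$, $\widehat{\mathrm{Var}}(\log(\hat{G}_2))$ and $\widehat{\mathrm{Var}}(\log(\hat{G}_3))$, and off-diagonal elements equal to $\widehat{\mathrm{Cov}}(\log(\hat{G}_1), \log(\hat{G}_2))$, $\widehat{\mathrm{Cov}}(\log(\hat{G}_1), \log(\hat{G}_3))$ and $\widehat{\mathrm{Cov}}(\log(\hat{G}_2), \log(\hat{G}_3))$, and $\chi^2_{\alpha}(3)$ is the upper $100(\alpha)$th percentile of the central chi-squared distribution with 3 degrees of freedom.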

References

1. Agresti A. A survey of models for repeated ordered categorical response data. Stat Med 1989;8:1209–24. doi:10.1002/sim.4780081005

2. Francom SF, Chuang-Stein C, Landis JR. A log-linear model for ordinal data to characterize differential change among treatments. Stat Med 1989;8:571–82. doi:10.1002/sim.4780080506

3. Kenward MG, Jones B. The analysis of categorical data from cross-over trials using a latent variable model. Stat Med 1991;10:1607–19. doi:10.1002/sim.4780101012

4. Kenward MG, Jones B. Alternative approaches to the analysis of binary and categorical repeated measurements. J Biopharm Stat 1992;2:137–70. doi:10.1080/10543409208835036

5. Koch GG, Landis JR, Freeman JL, Freeman DH Jr, Lehnen RG. A general methodology for the analysis of experiments with repeated measurement of categorical data. Biometrics 1977;33:133–58. doi:10.2307/2529309

6. McCullagh P. A logistic model for paired comparisons with ordered categorical data. Biometrika 1977;64:449–53. doi:10.2307/2345320

7. Parsons NR, Costa ML, Achten J, Stallard N. Repeated measures proportional odds logistic regression analysis of ordinal score data in the statistical software package R. Comput Stat Data Anal 2009;53:632–41. doi:10.1016/j.csda.2008.08.004

8. Stram DO, Wei LJ, Ware JH. The analysis of repeated ordinal categorical outcomes with possible missing observations and time-dependent covariates. J Am Stat Assoc 1988;83:631–7. doi:10.1080/01621459.1988.10478642

9. Ware JH, Lipsitz S, Speizer FE. Issues in the analysis of repeated categorical outcomes. Stat Med 1988;7:95–107. doi:10.1002/sim.4780070113

10. Bartolucci F, Forcina A, Dardanoni V. Positive quadrant dependence and marginal modeling in two-way tables with ordered margins. J Am Stat Assoc 2001;96:1497–505. doi:10.1198/016214501753382390

11. Bartolucci F, Forcina A. Extended RC association models allowing for order restrictions and marginal modeling. J Am Stat Assoc 2002;97:1192–9. doi:10.1198/016214502388618988

12. Agresti A, Coull BA. The analysis of contingency tables under inequality constraints. J Stat Plan Inference 2002;107:45–73. doi:10.1016/S0378-3758(02)00243-4

13. Agresti A. Generalized odds ratios for ordinal data. Biometrics 1980;36:59–67. doi:10.2307/2530495

14. Agresti A. Categorical Data Analysis. New York: Wiley, 1990.

15. Lui K-J. Statistical Estimation of Epidemiology Risk. New York: Wiley, 2004. doi:10.1002/0470094087

16. Harris WH. Traumatic arthritis of the hip after dislocation and acetabular fractures: Treatment by mold arthroplasty. An end-result study using a new method of result evaluation. J Bone Joint Surg 1969;51-A:737–55. doi:10.2106/00004623-196951040-00012

17. Ezzet F, Whitehead J. A random effects model for ordinal responses from a crossover trial. Stat Med 1991;10:901–7. doi:10.1002/sim.4780100611

18. SAS Institute, Inc. User’s Guide, 2nd ed. Cary, NC: SAS Institute, 2009.

19. Goodman LA, Kruskal WH. Measure of association for cross classification. J Am Stat Assoc 1954;49:732–64. doi:10.1080/01621459.1954.10501231

20. Edwardes MD, Baltzan M. The generalization of the odds ratio, risk ratio and risk difference to r × k tables. Stat Med 2000;19:1901–14. doi:10.1002/1097-0258(20000730)19:14<1901::AID-SIM514>3.0.CO;2-V

21. Lui K-J. Notes on estimation of the general odds ratio and the general risk difference for paired-sample data. Biometric J 2002a;44:957–68. doi:10.1002/bimj.200290007

22. Lui K-J, Chang K-C. Hypothesis testing and estimation in ordinal data under a simple crossover design. J Biopharm Stat 2012;22:1137–47. doi:10.1080/10543406.2011.574326

23. Lui K-J, Chang K-C. Notes on interval estimation of the generalized odds ratio under stratified random sampling. J Biopharm Stat 2013a;23:513–25. doi:10.1080/10543406.2011.616977

24. Lui K-J, Chang K-C. Notes on testing noninferiority in ordinal data under the parallel groups design. J Biopharm Stat 2013b;23:1294–307. doi:10.1080/10543406.2013.834923

25. Lui K-J. Interval estimation of generalized odds ratio in data with repeated measurements. Stat Med 2002b;21:3107–17. doi:10.1002/sim.1239

26. Gart JJ. An exact test for comparing matched proportions in crossover designs. Biometrika 1969;56:75–80. doi:10.1093/biomet/56.1.75

27. Senn S. Cross-over Trials in Clinical Research, 2nd ed. Chichester: Wiley, 2002. doi:10.1002/0470854596

28. Fleiss JL. On the asserted invariance of the odds ratio. Br J Preventive Soc Med 1970;24:45–6. doi:10.1136/jech.24.1.45

29. Mosteller F. Association and estimation in contingency tables. J Am Stat Assoc 1968;63:1–28. doi:10.1080/01621459.1968.11009219

Published Online: 2016-1-20

Published in Print: 2016-11-1

https://doi.org/10.1515/ijb-2015-0075
