Abstract
Randomized controlled trials have been the mainstay of high-level evidence for or against cancer screening strategies. However, simple proof of reduction in cause-specific mortality is not enough for policymaking, particularly in the last 10 years. Today’s clinical and public health guidelines take into account the balance of risks to benefits from screening, costs, utilities, the political risk of inaction, the societal tolerance to risk, healthcare providers’ preferences, client choices, and other imponderables or subjective variables that cannot be captured or addressed via epidemiologic studies. This commentary uses the above perspective in discussing Miettinen’s arguments concerning the science of breast cancer screening.
1 Introduction: ranking evidence and the paradigm of cervical cancer screening
In the totem pole of medical and public health decision-making, randomized controlled trials (RCTs) represent the carvings that are closest to the top, above all other study designs and lines of evidence. The US National Cancer Institute’s respected Physician Data Query (known more commonly by its acronym PDQ) program ranks in descending order of strength: (1) evidence obtained from RCTs; (2) evidence obtained from nonrandomized controlled trials; (3) evidence obtained from cohort or case-control studies; (4) evidence from ecologic and descriptive studies; and (5) opinions of respected authorities based on clinical experience, descriptive studies, or reports of expert committees (http://www.cancer.gov/publications/pdq/levels-evidence/screening-prevention, Franco et al. 2002). In one form or another, the same ranking is observed by health technology agencies around the world. In the absence of RCTs, cancer screening policy must rely only on observational studies (i.e., #2 or #3 above) or ecologic-level evidence (i.e., #4), and, if expert panels have mulled over the findings from these lines of inquiry, it may also be able to rely on level-5 evidence.
For instance, policymaking on cervical cancer screening has never relied on RCTs or even non-randomized trials; not a single trial was ever conducted on the value of Papanicolaou (Pap) cytology in reducing cervical cancer mortality. [1] There is plentiful evidence from case-control and cohort studies, as well as from ecologic within-country and between-country comparisons of cervical cancer mortality contrasting screening with no screening (Franco et al. 2002). Policy recommendations and professional guidelines everywhere are unequivocal; Pap testing and its consequent diagnostic and treatment actions represent a valuable cancer prevention activity. [2] The science on cervical cancer prevention has been an embarrassment of riches. The uterine cervix is easily accessible and observable without invasive procedures. Screening tests rely on a simple and safe exfoliation of the endo- and ecto-cervical epithelium; one needs no x-ray-based imaging. [3] The precancerous stages of cervical cancer are easily identifiable and measurable with reasonable precision [4] and risks of progression and regression are well-known (Kim et al. 2007).
2 Sovereignty of randomized controlled trials
History was not as kind and policy decisions were not as easy with screening for cancers in anatomical sites other than the uterine cervix. There is no shortage of RCTs of screening for cancers of the breast, lung, prostate, and of the colon (and rectum). Even for cancers that can be detected at the pre-invasive stage, such as colorectal cancer, the benefits have been difficult to estimate and seem to depend on the screening technology (Ransohoff 2009). Yet, policy discussions on screening for colorectal cancer are akin to a Gregorian chant when compared to the cacophony of the debates on prostate, breast and lung cancer screening. [5] Miettinen’s confessed disillusion with how science has informed (or according to him “misinformed”) the cancer screening debate (Miettinen this issue) is long-standing. Just over 20 years ago, as the senior epidemiologist in the then nascent “Early Lung Cancer Action Project” (ELCAP), he had outlined his “proposed paradigm for requisite research” based on a critique of the RCT as an inappropriate design to assess the change in case-fatality afforded by the clinical intervention (Henschke 1994). His point then and subsequently (Miettinen 2000, Miettinen et al. 2002) was that the study should be concerned with measuring the reduction in disease-specific mortality among those with disease detected by screening and subsequently treated by one or more clinical interventions, which could be compared directly as part of the study. He also insisted that assessment of screening benefit be focused on a time period that is relevant to the disease’s natural history and that it capture the effects of the repeated rounds of screening and treatment in the protocol under study.
The time period should begin after all prevalent lesions are detected and managed and should continue long enough to permit the full clinical course of all lesions that would have progressed to a lethal conclusion for the patient had she/he not benefitted from having the disease discovered and treated as per the screening protocol. Hanley, a McGill colleague of Miettinen, [6] formally reviewed the critical importance of this time window (Hanley 2011).
Miettinen’s expressed disappointment with the timidity of policy recommendations stems mostly from the tyrannical [7] superiority of RCTs relative to all other study designs. Swimming against the current has surely frustrated him. It is hard to argue against the need for RCTs for medical or public health interventions in which the therapeutic or preventive maneuver is associated with the subjects’ (or patients’) or their physicians’ preferences. [8] Miettinen himself has argued on that basis (Miettinen 1983). What he must acknowledge, however, is that his squeaky-clean ideation of what a study design should be to fit the object of the research is never feasible in practice. I believe most trialists would find comfort in the clarity of the formulae he proposes as the underpinnings for what trials should be measuring. In practice, however, life for cancer screening trialists is messy. Screening interventions cannot be blinded, either singly or doubly. In most of the Western world, it is ethically untenable to conduct a placebo-controlled RCT for any given cancer screening intervention. Physicians have clear preferences for their patients. Clinical guidelines and professional best practices advocate for specific forms of cancer screening as part of periodic health examinations in many countries. [9] We have hammered on with the mantra of “early detection saves lives” for much too long; screening study participants may have a clear preference for the arm to which they want to be assigned. Today’s ethical conduct requires disclosure of anticipated benefits and harms before a subject is enrolled. Those who have a clear preference may decline to participate, or may wait to see to which arm they are randomized and then decide whether to complement the assigned regimen with visits to their doctors to obtain the benefits of the study arm to which they had hoped to be assigned.
Lack of subject blinding also leads to quantitative and qualitative differences in dropout rates between study arms. Subjects who perceive being vulnerable because of their trial assignment may eschew returning to follow-up visits, especially if they found the means to be screened outside of the trial. General practitioners or primary care providers, the gatekeepers of subject enrolment in many cancer screening trials, are also not blinded and may confide their preferences to patients, further aggravating the imbalance between arms.
Last, but not least, randomization, the sacred design feature that is intended to bring balance between arms regarding confounders and prognostic factors, may fail to be properly followed, whether by trial participants, their care providers, or even by study personnel. In North America, policy or clinical guideline decisions concerning prostate cancer screening have relied on findings from the US National Cancer Institute’s Prostate, Lung, Colorectal, and Ovarian Cancer Screening (PLCO) trial, which did not find a benefit even after long-term follow-up (Andriole 2012), whereas the European Randomised Study of Screening for Prostate Cancer (ERSPC) observed a clear mortality reduction that was sustained after 13 years (Schröder 2014). The PLCO trial was conceived in the early 1990s, soon after the wave of enthusiasm brought by the belief that prostate-specific antigen (PSA) screening could save the lives of men with prostate cancer. In consequence, the PLCO did not have the ideal counterfactual for a control group. There was a high level of screening by PSA testing and digital rectal examination in the control arm and suboptimal compliance with screening in the intervention arm. Both of these conditions severely blunted the trial’s ability to discriminate between arms. The ERSPC trial, being conducted in European populations that enjoy universal and mostly centralized health care systems, did not suffer from these protocol deviations; thus, the measure of screening benefit (Schröder 2014) was not blurred to the point of disappearance, as happened in the PLCO trial (Andriole 2012).
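The blunting effect of control-arm screening and intervention-arm non-compliance can be illustrated with a simple dilution calculation. The sketch below is not taken from either trial: it assumes that screening multiplies cause-specific mortality by the same factor in everyone who is actually screened and that uptake is unrelated to underlying risk, and the function name and all numbers are hypothetical.

```python
def observed_rate_ratio(true_rate_ratio, uptake_intervention, uptake_control):
    """Intention-to-treat mortality rate ratio expected under a simple dilution model.

    Assumptions (illustrative only): screening multiplies cause-specific
    mortality by true_rate_ratio in everyone actually screened, and the
    decision to be screened is unrelated to a participant's baseline risk.
    """
    for uptake in (uptake_intervention, uptake_control):
        if not 0.0 <= uptake <= 1.0:
            raise ValueError("uptake must be a proportion between 0 and 1")
    if true_rate_ratio <= 0.0:
        raise ValueError("true_rate_ratio must be positive")

    reduction = 1.0 - true_rate_ratio
    # Mortality in each arm, relative to a population in which nobody is screened.
    intervention_arm = 1.0 - uptake_intervention * reduction
    control_arm = 1.0 - uptake_control * reduction
    return intervention_arm / control_arm


# Hypothetical figures, not the actual PLCO or ERSPC uptake rates.
ideal = observed_rate_ratio(0.80, uptake_intervention=1.00, uptake_control=0.00)
diluted = observed_rate_ratio(0.80, uptake_intervention=0.85, uptake_control=0.50)
print(f"ideal trial: {ideal:.2f}, contaminated trial: {diluted:.2f}")
```

Under these hypothetical inputs, a true 20% mortality reduction would appear as a reduction of less than 10% in the intention-to-treat comparison, and it vanishes entirely when uptake is equal in both arms.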
3 Cause-specific mortality: an inadequate outcome?
If the foregoing was not complicated enough, cancer screening trialists must also contend with issues in defining endpoints. For a long time it was felt that death, the ultimate endpoint, was unassailable as an outcome. Incidence of a cancer precursor, stage shift, cancer incidence, and survival benefit were all considered surrogate endpoints prone to biases or errors in measurement. In the past 15 years, however, some trials that relied on independent ascertainment of mortality data [10] were shown to be affected by biased adjudication of conditions leading to death among participants. If the participant dies from a condition that is totally unrelated to the screening or treatment process or to the disease following its clinical course, that death will not be assigned as cause-specific, and the participant’s person-time already contributed to the study becomes censored at the time of death. This decision is critical because death can come as a direct consequence of the cancer being targeted by screening, from harm caused by the screening procedure, [11] or as an early and unintended result of cancer treatment, [12] all of which must be assigned as cause-specific deaths. Thus, the need to distinguish the screening-induced effects (hopefully beneficial) from harm caused by overdetection, overdiagnosis, and overtreatment dictated that each death in a screening study be properly adjudicated as to whether it is disease- or treatment-related. [13] If not, oddly named biases, such as sticky-diagnosis and slippery-linkage, may ensue and invalidate the results (Black 2002). The obvious corollary is that expert personnel tasked with adjudication of the causes of death be completely blinded to the study arm. However, for the sake of conservatism, a more fail-proof design feature was introduced, i.e., the recommendation that trials of screening attempt to demonstrate an impact on all-cause mortality (Black 2002; Saquib 2015).
Although it eliminates the hand-wringing that comes from interpreting and defending decisions concerning death attribution, the use of all-cause mortality as endpoint brings the need for trials to be much larger (and thus more expensive) to be able to detect what would be arguably smaller reductions (or increases) in mortality as a consequence of the net effect of screening.
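A rough sense of how much larger such trials would need to be can be had from the textbook normal-approximation formula for comparing two proportions. The sketch below is only an illustration: the cumulative risks are hypothetical, the function name is an assumption, and real screening trials would also account for follow-up time, non-compliance, and contamination. It assumes screening prevents the same absolute number of deaths whichever endpoint is counted.

```python
from math import ceil
from statistics import NormalDist


def participants_per_arm(risk_control, risk_screened, alpha=0.05, power=0.80):
    """Approximate participants needed per arm to compare two cumulative risks.

    Uses the standard normal-approximation formula for two independent
    proportions with a two-sided test; an illustrative sketch only.
    """
    if not (0.0 < risk_control < 1.0 and 0.0 < risk_screened < 1.0):
        raise ValueError("risks must be strictly between 0 and 1")
    if risk_control == risk_screened:
        raise ValueError("the two risks must differ")

    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    variance = (risk_control * (1.0 - risk_control)
                + risk_screened * (1.0 - risk_screened))
    return ceil((z_alpha + z_beta) ** 2 * variance
                / (risk_control - risk_screened) ** 2)


# Hypothetical cumulative risks over the follow-up period.
# Cause-specific death: 0.5% without screening, 0.4% with screening.
n_cause_specific = participants_per_arm(0.005, 0.004)
# All-cause death: 10% without screening; the same 0.1-point absolute reduction.
n_all_cause = participants_per_arm(0.100, 0.099)
print(n_cause_specific, n_all_cause, round(n_all_cause / n_cause_specific, 1))
```

With these hypothetical inputs, the all-cause endpoint requires about 20 times as many participants per arm, because the same absolute difference has to be detected against a much larger background of deaths that screening cannot affect.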
4 Pragmatism trumps idealism
Life in the trenches of RCTs of cancer screening is arduous and lonely. The critics are many, judging from the ever-growing category of systematic reviews among papers appearing in medical journals. Miettinen does recognize the tribulations suffered by trialists (Miettinen this issue), by examining their dilemma. His clarity of logic [14] in proposing the ideal scenario:
“one might first contemplate the theoretically ideal trial: Suitably-informed volunteers from the domain of the study (of freedom from the cancer’s clinical manifestations, etc.) would be enrolled to application of the diagnostic protocol, and those diagnosed with the cancer (preclinically) – with a genuine, not overdiagnosed case of it – would be randomly assigned to the defined early (undelayed) treatment or to its late (overt-stage) alternative.”
comes with an admission that this proposal is untenable in practice:
“But the actual conduct of such a trial would be marred by (unavoidable) overdiagnoses, and the feasibility of its execution (with whatever flaws in the assurance of validity) would be negated by ethical considerations if not by unavailability of volunteers.”
Trialists have merely executed the best research that money could buy based on era-specific standards of study design and ethical boundaries. Miettinen has few (but sharp) criticisms for those involved with systematic reviews, meta-analyses, and policy decisions. I agree with him that review panels tasked with these activities frequently lack sufficient expertise in substantive (disease) and methodologic (e.g., epidemiologic) areas. On the other hand, he (and I) should not be surprised that syntheses and decisions that stem from these panels take into account more than the epidemiologic principles that he has developed and taught to so many in public health, directly or indirectly (Franco et al. 2014). His utopian logic has no place in policy decisions, which take into account the balance of risks to benefits from screening, costs, utilities, the political risk of inaction, the societal tolerance to risk, healthcare providers’ preferences, patient choices, and other imponderables or subjective variables. The Swiss report that was the focus of Miettinen’s commentary (Miettinen this issue) cannot be faulted for lack of clarity and sincerity of purpose (Biller-Andorno and Jüni 2014). Much like the Swiss cool-headed neutrality that dictated their country’s destiny in critical moments in history, their choice with breast cancer screening is a resolute interpretation of the evidence base reconciled with their perception that women should be better informed about the risk-benefit balance (Biller-Andorno and Jüni 2014).
The financial requirements of RCTs of cancer screening are larger by a factor of 10–100 relative to those of observational studies. A typical trial with 20,000 participants will cost $10M. Granting agencies usually combine intramural and extramural resources to be able to fund such trials. The ability to run trials across multiple centres and countries helps disperse costs to more funding agencies than those linked to the principal investigator but undeniably, they represent very costly research activities. Since it began in the 1960s, the pursuit of RCT research on the value of cancer screening has likely passed the one or two billion-dollar mark. Yet, not even one of the RCTs for screening, irrespective of anatomical site of cancer, can be deemed as a paradigm; they are all more or less flawed in conduct or in interpretation (or in both).
Future research on screening would be more cost-effective if RCTs focused on understanding the value (and harms) of screening in specific population subsets defined by risk stratification. This would decrease costs because of reduced sample size requirements, while increasing efficiency by focusing on the segment of the population with greater clinical risk. [15] Finally, it would also help if the cancer control community used more of the knowledge base than simply the upper carvings of the totem pole. Many useful insights can be had from non-randomized investigations or pooled analyses of individual-level records from these studies. It is easy to dismiss non-RCTs to simplify the task of evidence review. RCTs can be collapsed and fitted neatly into summary tables. Case-control, cohort, and other studies are each one-of-a-kind research investigations that require from the evidence reviewer an individualized interpretation and more attention to the gems and perils that lurk beneath the surface. As I have argued, and Miettinen would agree, we need a more eclectic approach to assessing the evidence for or against cancer prevention and control strategies (Franco 2012).
Acknowledgments
The author has served as occasional consultant to pharmaceutical (GSK, Merck) and biotechnology (Roche, Gen-Probe, BD, Qiagen, Ikonisys) companies involved with HPV vaccination, HPV diagnostics, and cervical cytology screening.
Funding: The author’s research on cancer screening has been funded by the Canadian Institutes of Health Research (grants MOP-64454, MOP-49396, MCT-54063, CRN-83320), the National Institutes of Health (grant CA70269), and Cancer Research Society.
Disclosure: The author declares that he has no conflict of interest.
References
Andriole GL, Crawford ED, Grubb RL, Buys SS, Chia D, Church TR, et al. Prostate cancer screening in the randomized Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial: mortality results after 13 years of follow-up. Journal of the National Cancer Institute 2012 Jan 18;104:125–32. doi:10.1093/jnci/djr500
Black WC, Haggstrom DA, Welch HG. All-cause mortality in randomized trials of cancer screening. Journal of the National Cancer Institute 2002;94:167–73. doi:10.1093/jnci/94.3.167
Biller-Andorno N, Jüni P. Abolishing mammography screening programs? A view from the Swiss Medical Board. New England Journal of Medicine 2014;370:1965–7. doi:10.1056/NEJMp1401875
Franco EL. Towards more eclectic evidence-based medicine in cancer prevention and control. Preventive Medicine 2012;55:552–3. doi:10.1016/j.ypmed.2012.09.017
Franco EL, Duarte-Franco E, Rohan TE. Evidence-based policy recommendations on cancer screening and prevention. Cancer Detection and Prevention 2002;26:350–61. doi:10.1016/S0361-090X(02)00118-6
Franco EL, Shinder GA, Tota JE, Isidean SD. Striving for excellence while adapting to change: redefining our mission of serving the preventive medicine community. Preventive Medicine 2014;67:311–12. doi:10.1016/j.ypmed.2014.07.021
Hanley JA. Measuring mortality reductions in cancer screening trials. Epidemiologic Reviews 2011;33:36–45. doi:10.1093/epirev/mxq021
Henschke CI, Miettinen OS, Yankelevitz DF, Libby DM, Smith JP. Radiographic screening for cancer. Proposed paradigm for requisite research. Clinical Imaging 1994;18:16–20. doi:10.1016/0899-7071(94)90140-6
http://www.cancer.gov/publications/pdq/levels-evidence/screening-prevention. Accessed 31 May 2015.
Kim JJ, Kuntz KM, Stout NK, Mahmud S, Villa LL, Franco EL, et al. Multiparameter calibration of a natural history model of cervical cancer. American Journal of Epidemiology 2007 Jul 15;166:137–50. doi:10.1093/aje/kwm086
Miettinen OS. The need for randomization in the study of intended effects. Statistics in Medicine 1983;2:267–71. doi:10.1002/sim.4780020222
Miettinen OS. Screening for lung cancer: do we need randomized trials? Cancer 2000 Dec 1;89:2449–52. doi:10.1002/1097-0142(20001201)89:11+<2449::AID-CNCR20>3.0.CO;2-8
Miettinen OS. ‘Screening’ for breast cancer: misguided research misinforming public policies. Epidemiological Methods (this issue) 2015. doi:10.1515/em-2015-0020
Miettinen OS, Henschke CI, Pasmantier MW, Smith JP, Libby DM, Yankelevitz DF. Mammographic screening: no reliable supporting evidence? Lancet 2002;359:404–5. doi:10.1016/S0140-6736(02)07592-X
Ransohoff DF. How much does colonoscopy reduce colon cancer mortality? Annals of Internal Medicine 2009;150:50–2. doi:10.7326/0003-4819-150-1-200901060-00308
Saquib N, Saquib J, Ioannidis JP. Does screening for disease save lives in asymptomatic adults? Systematic review of meta-analyses and randomized trials. International Journal of Epidemiology 2015;44:264–77. doi:10.1093/ije/dyu140
Schröder FH, Hugosson J, Roobol MJ, Tammela TL, Zappa M, Nelen V, et al. Screening and prostate cancer mortality: results of the European Randomised Study of Screening for Prostate Cancer (ERSPC) at 13 years of follow-up. Lancet 2014;384:2027–35. doi:10.1016/S0140-6736(14)60525-0
©2015 by De Gruyter