Abstract
Advanced artificial intelligence (AI) techniques such as ‘deep learning’ hold promise in healthcare but introduce novel legal problems. Complex machine learning algorithms are intrinsically opaque, and the autonomous nature of these systems can produce unexpected harms, which leaves open questions around responsibility for error at the clinician/AI interface. This raises concerns for compensation systems based in negligence because claimants must establish that a duty exists and demonstrate the specific fault that caused harm. This paper argues that clinicians should not ordinarily be negligent for following AI recommendations, and that developers are unlikely to hold a duty of care to patients. The healthcare provider is likely to be the duty holder for AI systems. There are practical and conceptual problems with comparing AI errors to human performance or to other AI systems in order to determine negligence. This could leave claimants with insurmountable technical and legal challenges to obtaining compensation. Res ipsa loquitur could solve these problems by allowing the courts to draw an inference of negligence when unexpected harm occurs that would not ordinarily happen without negligence. This legal framework is potentially well-suited to addressing the challenges of AI systems. However, I argue that res ipsa loquitur is primarily an instrument of discretion, which may perpetuate legal uncertainty and still leave some claimants without a remedy.
I Introduction
The use of advanced artificial intelligence techniques such as ‘deep learning’ may have a profound impact in healthcare. These complex machine learning models have the potential to improve patient outcomes in diagnostics and treatment planning, which could save lives and help alleviate resource constraints. For example, image recognition algorithms have demonstrated potential in the interpretation of head CT scans,[1] as well as in the diagnosis of malignant tumours in breasts,[2] lungs,[3] skin[4] and the brain.[5] There are claims that decision-making with support from AI systems has the potential to improve the performance of even experienced radiologists in diagnosing lung cancers.[6] Avoiding common errors may become a ‘core competency’ of AI, as it can putatively avoid inattention, fatigue and cognitive biases.[7] Numerous publications from public and private research institutions highlight the many potential benefits and anticipate that half of the clinical workforce could be using artificial intelligence within a decade.[8]
The implications for clinical negligence should be addressed because AI systems are not perfectly accurate and will still lead to errors in diagnosis and treatment recommendations. There are some concerns that AI is caught up in hype and may not deliver on the promise because there is still a lack of good quality evidence, and a need for more comprehensive randomised controlled trials,[9] but the most recent evidence suggests that the diagnostic performance of deep-learning models is now equivalent to that of healthcare professionals.[10] Therefore, where AI systems make errors at the same rate as human clinicians, many patients may still be entitled to compensation for the harm that they suffer. However, errors at the interface of machine learning systems and human clinicians raise important legal issues around where responsibility lies and how a patient should be able to make a claim. It is axiomatic that not all errors are negligent and that a coherent legal test is required to determine when compensation should be awarded.
This paper examines machine learning specifically, which is a type of AI that allows software applications to become more accurate at predicting outcomes without being explicitly programmed. Systems learn from historical data, identify patterns, and make decisions with minimal human intervention. Machine learning systems are intrinsically opaque and often referred to as ‘black boxes’.[11] The black box analogy describes the lack of interpretability or transparency in the way that these highly complex mathematical models develop their knowledge: they are difficult to understand, even for the experts who build them. Since the inner workings of the algorithm cannot feasibly be understood, it can be difficult to identify and correct the causes of errors and harmful outcomes. This inevitably raises concerns in compensation systems based in negligence, where a claimant generally needs to establish the nature and extent of the specific fault that caused their harm. It is theoretically possible that claims of systemic negligence could be targeted at healthcare providers for failures to operate safe systems, but this is uncommon, and litigation is mostly directed at the actions of individuals.[12] There are longstanding criticisms that fault-based liability is unfairly distorted by heuristics such as hindsight bias and outcome bias, which can make errors appear more predictable in hindsight, with the corollary that the individuals involved were careless.[13]
The lack of explicability is not the only unresolved legal issue: if the system is making predictions and recommendations that cannot reasonably be anticipated by the developers, then it presents a novel form of agency, introducing a legal conundrum around who can or should take responsibility for errors. The opacity and agency of machine learning systems also present an additional complication because they have been shown to be biased and could recommend morally unjustifiable differences in treatment to patients within different groups. For example, cancer detection algorithms have been shown to be less effective on dark skin.[14] The development of AI is a complex sociotechnical endeavour with many processes and actors, suggesting that some errors may not be caused by a single fault at all; rather, responsibility may be distributed along long chains of agency across the ‘AI lifecycle’. The claim is not that, in the type of AI errors anticipated, no single act can be at fault, or that all AI faults will generally be distributed, but that the question of contribution and causation is complex, which can stretch the coherence of fault-based systems.
The question examined in this paper is whether clinical negligence regimes can still function within this new technological paradigm, or whether this technology is advancing beyond the capacity of the law to provide a remedy. It is not the first time that novel technologies have required the law to adapt to new technological realities: after all, the law of negligence had to respond to out-of-control stagecoaches long before driverless cars.[15] However, the literature suggests that AI systems present a genuinely novel challenge to the law and that clinical negligence regimes could soon outlive their utility, meaning that the appropriate legal responses to patient harm may have to be found in strict liability regimes,[16] no-fault systems,[17] shared insurance schemes,[18] or in theorising novel legal solutions for the age of AI, such as ascribing legal personality to AI itself.[19] The aim of this paper is not to categorically solve the problem of who is liable for AI errors. Instead, the more modest focus is on whether, in certain prescribed circumstances, the law can still allow negligence claims to succeed when relevant facts cannot be established, by drawing an inference of negligence. The regulatory and legal landscape within healthcare is complex, and this analysis of the potential scope and application of negligence law is relevant to broader legal and policy considerations around how to respond to emerging accountability gaps within AI healthcare systems. The analysis of healthcare specifically in this paper is a useful case study, which could inform discussions around the role of tort in responding to AI harms more generally.
The use of inferences in determining liability for negligence requires an examination of the putative legal doctrine of res ipsa loquitur. Res ipsa loquitur is a Latin maxim which literally means ‘the thing speaks for itself’ and can allow a negligence claim to succeed when the exact nature of the breach cannot feasibly be demonstrated by a claimant. It is not an ancient legal doctrine, and the first use of the phrase is attributed to the 1863 case of Byrne v Boadle.[20] The case involved an unsuspecting pedestrian who was seriously injured when a barrel of flour inexplicably fell from a warehouse window above the street. The defendant claimed that the lack of any direct evidence of what occurred precluded liability. Pollock CB, presiding, disagreed, stating that ‘there are certain cases where it might be said res ipsa loquitur,’ and that ‘the mere fact that the accident has occurred is evidence of negligence.’[21]
This capacity of the law to rely on inferences when certain information cannot be determined is a useful avenue of enquiry in relation to solving the legal intractability of machine learning systems, but the doctrine demands deeper consideration because there is significant judicial inconsistency in the way that res ipsa loquitur has been described and applied. There is an enduring scepticism around whether res ipsa loquitur should be considered a legal doctrine at all, or whether it is a principle of law, a rule of evidence, or something of a ‘myth’[22] and an unnecessary Latinism that should be removed from the vocabulary of lawyers. As Lord Shaw comments in Ballard, ‘if the phrase had not been in Latin nobody would have called it a principle’.[23] The inconsistent use of terminology compounds the confusion, but res ipsa loquitur continues to be part of the negligence law architecture and, to a greater or lesser degree, is used in many common law jurisdictions. The paper considers the English jurisdiction and not other European jurisdictions, but it may be of wider interest for two reasons: firstly, the concept of duty of care is widely shared across jurisdictions; and secondly, the general principles and contours of res ipsa loquitur exist in other jurisdictions, such as the German legal principle of ‘Anscheinsbeweis’, which determines that certain facts or circumstances may give rise to a presumption that a particular event occurred, even if direct evidence is lacking.[24]
This article addresses the issues in four sections. Part II introduces machine learning in healthcare, sets out the novel technical challenges, and describes their legal implications in the healthcare context. Part III then turns to clinical negligence and argues that the present clinical negligence paradigm will not be well-suited to AI errors. In Part IV, an analysis of res ipsa loquitur highlights that there are discordant legal interpretations, but that the case law adequately sets out the principles that are applied to its use; I argue that res ipsa loquitur could present a practical legal framework for AI errors, which merits further consideration. Part V then builds on this analysis to determine whether res ipsa loquitur can rescue clinical negligence regimes in the age of machine learning. I argue that it does not necessarily matter whether res ipsa loquitur is described as a legal doctrine because it offers a theoretical path to justice for claimants and could prevent inequity in the face of doctrinal roadblocks. It is outside the scope of this paper to get drawn into well-rehearsed debates about what qualifies as doctrine, and whether legal certainty or the discretion to remedy injustice is the appropriate lens for drawing normative conclusions about the law.[25] Instead, I acknowledge that res ipsa loquitur has always been situated in a contested legal space and that, as primarily an instrument of discretion, it is likely to offer a partial solution to AI errors, but one that may be restricted to the most serious and unjustifiable errors that invite a principled approach from the judiciary.
II Machine learning in healthcare
Machine learning is not a new AI development: the term was coined by AL Samuel in 1959 and defined as the ‘field of study that gives computers the ability to learn without being explicitly programmed.’[26] There are several different methods, but all require high quality data to perform well. Over the last decade, substantial progress has been made and machine learning systems are now embedded into our social reality, powering our smart assistants, filtering emails, curating social media feeds, and performing writing tasks through popular large language models such as OpenAI’s ChatGPT.[27] Machine learning has already accelerated significant scientific breakthroughs, such as DeepMind’s ‘AlphaFold’ programme, which solved a longstanding scientific problem by predicting protein structures from their amino acid sequences, leading to the discovery of new treatments.[28] Currently, advanced neural networks such as deep-learning techniques promise the most significant breakthroughs in healthcare because of their versatility in being able to learn from data without the need to encode task-specific knowledge. The name ‘neural network’ refers to the metaphorical way that the processes simulate the neurons of the human brain. The ‘deep’ refers to the many layers of functions within the system. Data flow through these layers of mathematical functions, and the system adjusts its parameters to learn from prior outputs and predictions. Explanations of this process can be so abstract and mathematically complex that they cannot feasibly be understood in ordinary language, meaning it is not generally possible to understand exactly why a particular decision has been made. The implication is that the outputs and internal workings of a machine learning algorithm are inconsistent with legal norms, which generally require the processes of professional decision-making to be explained in witness statements and oral evidence that form the factual basis for resolving negligence claims.
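To make the scale of that opacity concrete, the following sketch is purely illustrative: the layer sizes are hypothetical and the parameters are randomly initialised rather than drawn from any real diagnostic model. It shows what a ‘deep’ network amounts to computationally, namely an input flowing through successive layers of parameterised functions; training consists of nothing more interpretable than adjusting those numbers until predictions match historical outcomes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes for a toy diagnostic classifier: 1,024 input
# features (eg flattened image pixels), two hidden layers, one output.
layer_sizes = [1024, 512, 256, 1]

# Each layer is nothing more than a grid of learned numbers (weights) plus a bias.
weights = [rng.normal(scale=0.01, size=(m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def predict(x):
    """Forward pass: the input flows through successive layers of functions."""
    for w, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(0, x @ w + b)              # non-linear activation (ReLU)
    logit = (x @ weights[-1] + biases[-1]).item()
    return 1 / (1 + np.exp(-logit))               # a probability between 0 and 1

n_parameters = sum(w.size for w in weights) + sum(b.size for b in biases)
print(f"Parameters in this toy network: {n_parameters:,}")   # 656,385
print("Predicted probability:", predict(rng.normal(size=1024)))
```

Even this toy version contains over 650,000 numerical parameters, none of which carries an ordinary-language meaning that could be recited in a witness statement; production-scale models are orders of magnitude larger.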
The AI systems examined in this paper are advisory systems. They will not be legally responsible for making decisions. At some point in the future, it may be possible to introduce closed loops where decisions are made independently by machines, but at present an AI system will advise the clinician to take a particular action and the clinician will remain responsible for the implementation of the care. There is arguably some scope to advocate for alternative views about the ‘advisory’ nature of AI decisions within this technological reontologisation of medical care. For example, it is well established through Kahneman’s explication of dual systems of thinking that human decision-making is subject to cognitive biases, overconfidence, emotional influence, and reliance on mental shortcuts, particularly when making decisions quickly,[29] which can particularly affect clinicians.[30] Avoiding these common flaws in human judgement may become a core competency of AI.[31] If an AI system reaches a higher diagnostic accuracy than clinicians, then, ex hypothesi, the error rate will fall, which arguably justifies deferring all decision-making to AI at a system level on consequentialist grounds. However, there are two problems with this assumption. Firstly, in rare circumstances, AI errors can be highly counterintuitive, which may mean that clinicians will likely remain legally responsible whilst this is an open problem. Secondly, the distribution of error tends to disproportionately impact specific demographic cohorts, raising legitimate concerns about the equitable application of such systems in a non-advisory capacity, even though these errors will be undetectable at the clinician/AI interface.[32] Under a paradigm where medical decision-making is always deferred to algorithms, AI could credibly no longer be categorised as ‘advisory’; this would arguably necessitate alternative legal solutions which could not be fault-based. Such a fairness-based justification for compensating victims is advocated by Keren-Paz in the context of innovative treatments, where adverse events can advance medical knowledge which benefits patients more generally.[33]
Under current EU law, allowing AI systems full autonomy over healthcare decisions would not be permitted.[34] However, this interface between man and machine bears further scrutiny. The argument that the systems are ‘advisory’ does not necessarily reflect the way that they will operate in the practical clinical environment. In the following section, I briefly argue that the superior expertise of AI systems, the inscrutable nature of AI recommendations, automation bias, and the predilection for biased and discriminatory outputs mean that it is a legal fiction to present AI systems as merely ‘advisory’.
Firstly, there is much hope that diagnostic and treatment recommendation systems could relieve pressure in frontline emergency care and are likely to be used by junior clinicians working under considerable system pressures. Doctors will be expected to follow the correct professional procedure and use the AI system for diagnostic and treatment advice. The doctor will know that the system has been tested, gained regulatory approval, and has demonstrated diagnostic accuracy equivalent to that of an experienced specialist. In this anticipated clinical paradigm, the AI system could be considered analogous to a specialist clinician offering advice to a generalist,[35] and doctors will phenomenologically experience AI advice like being overruled by a senior colleague.[36] If the doctor then decides to commence the treatment and the predictions made by the system are incorrect, then the patient will not be treated appropriately as a result.
To hold the doctor solely responsible is problematic because the putative negligent act will involve the validation of a treatment recommendation that is uninterpretable to them: AI recommendations come on a ‘take it or leave it’ basis. Therefore, some argue that AI in healthcare should be restricted to more ‘interpretable’ models.[37] There is intense academic interest in creating a form of explainable AI, but explanatory breakthroughs have been ‘few and far between.’[38] There are a range of techniques designed to make machine learning more interpretable, but Lipton argues that the technical descriptions of interpretable models are ‘diverse and occasionally discordant’ and that ‘explainable AI’ is not a single concept.[39] The methods developed to attempt to explain AI outputs are highly technical processes which are generally only useful to developers, and offer no help to clinicians who are unsure why they have been given particular advice. There are methods to make machine learning systems more explainable, but this generally reduces the level of accuracy, which is undesirable in a medical diagnostic context. If AI is restricted to explainable models in healthcare, then it is likely to mean foregoing the benefits of deep-learning techniques altogether. This raises the ethical question of how much prediction accuracy should be sacrificed to gain any form of interpretability. There have been recent attempts to solve this problem by creating separate AI algorithms capable of increasing the interpretability of decisions by revealing the salient features used to make a prediction:[40] one AI guesses what another AI is looking at. These techniques are designed to present AI advice in parameters understood by clinicians, but the underlying problem is recognised by Grote and Berens, who point out that dialectical reasoning between the system and the clinician, which would produce better outcomes, is not possible, and that this presents a novel challenge to the clinician because ‘given the opacity and the overconfidence of machine learning algorithms, assessing their epistemic position is currently not feasible.’[41] This has produced limited success in healthcare to date,[42] and as systems increase in complexity, explainable AI could prove to be fool’s gold or a ‘false hope’;[43] therefore, justice requires that the law adapt to errors where there is no prospect of a satisfactory explanation of how inaccurate recommendations have materialised.
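The flavour of these post hoc techniques can be conveyed with a deliberately simplified, model-agnostic sketch (it is not one of the specific methods cited above, and the model, image and figures are all hypothetical): occlude one region of an input image at a time, re-run the model, and treat the regions whose removal most changes the output as the ‘salient’ features.

```python
import numpy as np

def occlusion_saliency(model, image, patch=16, baseline=0.0):
    """Mask one patch of the image at a time and record how far the
    predicted probability falls: the patches that matter most to the
    output are treated as the 'salient' regions."""
    h, w = image.shape
    reference = model(image)
    saliency = np.zeros_like(image)
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = baseline   # blank out one patch
            saliency[i:i + patch, j:j + patch] = reference - model(occluded)
    return saliency

# Hypothetical stand-in for a trained classifier returning P(malignant):
# it only 'looks at' a small central region of the scan.
toy_model = lambda img: 1 / (1 + np.exp(-img[40:60, 40:60].mean()))

scan = np.random.default_rng(1).normal(size=(128, 128))
heatmap = occlusion_saliency(toy_model, scan)
print("Most salient patch starts at:", np.unravel_index(heatmap.argmax(), heatmap.shape))
```

Even where such a heatmap can be produced, it tells the clinician where the model was ‘looking’, not why the recommendation follows, which is precisely the epistemic gap Grote and Berens identify.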
The inscrutable nature of AI decisions introduces an additional risk at the clinician/AI interface. It is well established that there are cognitive biases such as automation bias (AB) when decision-support systems are used. Goddard describes automation bias as the process:
‘by which users tend to over-accept computer output ‘as a heuristic replacement of vigilant information seeking and processing.’ AB manifests in errors of commission (following incorrect advice) and omission (failing to act because of not being prompted to do so) when using CDSS [Clinical Decision Support Systems]’.[44]
Automation bias is prevalent in medical decision-making generally and cannot be reliably removed. Factors such as task complexity and decision-making under time constraints make automation bias more likely to occur.
Finally, an additional challenge beyond the black box problem relates to the way that AI is designed and created, which introduces a different kind of system-level opacity into the legal puzzle. AI development is often referred to as the ‘AI lifecycle’, which describes the many processes that are required to build machine learning systems, including: the design of the product; the acquisition of the data; creating and evaluating the model; and deployment.[45] There are many points of potential failure in data collection and product development, as well as in the clinical use of AI.[46] The nature of these errors can be highly technical, difficult to detect, and may produce harms that are socially unjustifiable. An example garnering much attention is that AI applications are particularly susceptible to bias.[47] Bias in training data occurs because the data sources themselves may not reflect the true epidemiology within a given demographic.[48] This means that errors are more likely to occur in under-represented groups such as ethnic minorities,[49] women,[50] and those with disabilities.[51] In the US, an algorithm used to allocate healthcare resources had been widely discriminating against African Americans; as a result, they were far less likely to be referred for treatment than white people when equally sick.[52] Structural inequalities in data could be distilled into unsafe recommendations which are implemented by doctors in frontline care. Whilst missing data is an extant problem in medical care, the way that it crystallises into AI outputs exacerbates the problem, and using past data to make predictions risks creating ‘runaway feedback loops’.[53] This makes understanding relative levels of bias between AI and clinicians exceptionally complex. This unresolved challenge presents two potential dimensions of unfairness. Firstly, it may be unreasonable to hold doctors in breach of their duty of care for validating certain AI errors which could plausibly originate from biased outputs. Errors that originate from systemic biases will be particularly difficult to recognise and address at the level of an individual patient and present a ‘new dimension of moral luck’ for clinicians.[54] Secondly, it presents a significant challenge to victims in identifying and evidencing the fault that caused additional harm, and justice demands that patient groups particularly vulnerable to AI decision-making can be assured of fair compensation.
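The evidential difficulty is easy to see in a toy calculation. The sketch below uses entirely synthetic numbers (the prevalence, cohort sizes and miss rates are assumptions chosen for illustration, not data from any real system) to show how a model can report a respectable headline sensitivity while performing markedly worse for an under-represented group; nothing in the aggregate figure would alert an individual patient, or their lawyer, to the disparity.

```python
import numpy as np

rng = np.random.default_rng(42)

# Entirely synthetic cohorts: group B is under-represented in the training
# data, so the hypothetical model misses more of its true positives.
def simulate(label, n_patients, miss_rate):
    truth = rng.random(n_patients) < 0.10                     # 10% true prevalence
    predicted = truth & (rng.random(n_patients) > miss_rate)  # model misses some cases
    return label, truth, predicted

cohorts = [simulate("Group A (well represented)", 9000, miss_rate=0.05),
           simulate("Group B (under-represented)", 1000, miss_rate=0.30)]

all_truth = np.concatenate([t for _, t, _ in cohorts])
all_pred = np.concatenate([p for _, _, p in cohorts])
print(f"Headline sensitivity: {all_pred[all_truth].mean():.1%}")   # looks respectable

for label, truth, predicted in cohorts:
    print(f"  {label}: sensitivity {predicted[truth].mean():.1%}")
```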
When considering these factors, it is reasonable to state that doctors should not ordinarily be negligent for following AI advice. However, it is not possible to assert this as a rule because there is increasing evidence that an inherent vulnerability within AI systems is that they can use spurious correlations to make predictions.[55] These errors may or may not be detectable by clinicians, who will also have to consider the resource costs and effects on system-level diagnostic accuracy when second-guessing AI advice.[56] Consequently, there will be circumstances when heedlessly following AI advice could constitute a negligent act, which means that the AI/clinician interface will remain a potential locus of fault attribution. For example, a much-discussed example in the literature involves a problem with a machine learning algorithm designed to predict the probability of death amongst hospital patients with pneumonia in order to assist the decision-making of the healthcare professional responsible for care.[57] Patients with asthma were systematically classified as presenting a lower risk by the algorithm. This computational determination was fundamentally flawed because asthma patients within the historic data were routinely sent straight to the ICU, where continuous intensive treatment improved their prognosis, thereby making it appear that they had a better chance of survival. If implemented into a triage system, these patients would be treated inappropriately.
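The mechanism behind that example is simple confounding, and it can be reproduced with a few lines of synthetic data (all of the prevalences and risk figures below are assumptions chosen for illustration; they are not taken from the study cited above). Because aggressive ICU treatment in the historical record masks the higher underlying risk, a model that learns only from observed outcomes will treat asthma as ‘protective’.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50_000

# Synthetic, illustrative figures only.
asthma = rng.random(n) < 0.15
# Confounder: asthmatic pneumonia patients were routinely sent straight to ICU.
icu = np.where(asthma, rng.random(n) < 0.95, rng.random(n) < 0.20)
# Underlying risk is HIGHER with asthma, but aggressive ICU care cuts mortality.
base_risk = np.where(asthma, 0.18, 0.12)
death = rng.random(n) < np.where(icu, base_risk * 0.3, base_risk)

# A model trained only on (features -> observed mortality) learns the spurious rule:
print(f"Observed mortality with asthma:    {death[asthma].mean():.1%}")   # ~6%
print(f"Observed mortality without asthma: {death[~asthma].mean():.1%}")  # ~10%
```

A clinician shown only the resulting risk score has no way of seeing that the apparent protection is an artefact of how past patients were treated rather than a feature of the disease.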
The potential sources of AI error are manifold and complex, and it will be difficult for claimants to identify a particular fault without significant expertise and resources to manage a complex claim. If the clinician is not ordinarily in breach of duty for following AI advice and there is no feasible explanation of why an error occurred, how does the claimant obtain compensation under the current clinical negligence paradigm?
III The failure of clinical negligence?
A Duty of care
If a patient suffers harm following a medical error involving AI, the first practical challenge for the claimant seeking to sue in negligence is where to direct the claim. The analysis of res ipsa loquitur is principally concerned with establishing a breach of duty but it makes sense to briefly examine the prior question of who the duty holder is likely to be.
Establishing the healthcare provider as the prospective defendant is legally unproblematic because NHS trusts are vicariously liable for the actions of their employees. NHS indemnity ensures that any liability incurred by a clinician will be met by the employer. Clinicians can be sued directly,[58] but Brazier and Cave suggest that ‘patients and their lawyers should ponder the wisdom of suing a doctor personally. As the employer will meet the cost of compensation in full, there is little to gain from naming the doctor in full.’[59]
The hospital also has a non-delegable duty towards the patients of a hospital for providing safe systems of care. Lord Browne-Wilkinson stated in X (Minors) v Bedfordshire CC: ‘it is established that those conducting a hospital are under a direct duty of care, to those admitted as a patient of the hospital’.[60] This was reaffirmed by the Supreme Court in Woodland v Essex County Council,[61] and it is uncontroversial to state that, in all paradigms of care anticipated in this paper, the hospital or NHS trust will owe a duty of care. In cases where the patient is physically outside of the hospital, either in a virtual ward or ‘hospital at home’, this duty of care still applies, and patients will be part of the system of care until discharged. Darnley v Croydon establishes that the duty of care does not depend on the patient remaining in the building.[62]
The extent to which the AI developer may also owe a duty of care to a patient is less clear. The foundations for establishing a duty of care in negligence lie with the judgment of Lord Atkin and his ‘neighbour principle’ in Donoghue v Stevenson,[63] together with Lord Bridge’s formula for determining duty of care in novel situations laid out in Caparo Industries v Dickman. Further, in Robinson v Chief Constable of West Yorkshire Police, Lord Reed clarified that the Caparo test should be considered in comparison to established precedents so that the law develops ‘incrementally and by analogy to established precedents.’[64] Therefore, the potential duty of the AI developer depends on: (1) whether the harm was reasonably foreseeable; (2) whether the relationship between the parties is sufficiently proximate in law; and (3) whether it is ‘fair, just and reasonable’ to impose the duty on the defendant. As already identified, machine learning errors can be unanticipated by the creators of the system. The autonomy and agency with which machine learning systems both learn and produce outputs may stretch the limits of foreseeability beyond breaking point.[65] On the other hand, a developer should be aware that if a sufficient degree of care and skill is not used, some form of consequential additional harm is likely to occur within a healthcare context, and that harm may not necessarily need to be specific.[66]
The second criterion is that developers must stand in ‘such close and direct relations that the act complained of directly affects a person whom the person alleged to be bound to take care would know would be directly affected by his careless act’. In the foundational case of Donoghue,[67] the duty of care was applied to the manufacturer of ginger beer when a decomposing snail was ultimately poured into the claimant’s ice cream float. The case bears some similarity to AI scenarios because the manufacturer was held to be the duty holder, in part, because of the opacity of the product (albeit a brown bottle rather than a black box) and because the proprietor of the café could not reasonably have averted the harm. The difference is that snails should never be in ginger beer, but machine learning systems may need to be deployed when they are imperfect but reliably outperform clinicians at a relevant classification task. AI developers are likely to frame the system in an advisory role, solely intended to assist the clinician, who will remain a ‘learned intermediary’ primarily responsible for the safety of patients.[68] However, with machine learning, the outputs are technically inscrutable, so the courts may find that the importance of the system in clinical decision-making creates a sufficiently proximate relationship in law, despite the developer being far from the bedside. Developers are also likely to argue that it is the healthcare provider, and not the patient, that is the end user of the system. Nevertheless, there are arguably circumstances in other areas of negligence law where the courts have examined the reasonableness of any exemption clauses when it is clear that third parties will rely on advice.[69]
The final part of the Caparo test considering whether it is fair, just and reasonable to impose a duty is even more problematic because it introduces a much criticised and unpredictable policy paradigm into the equation. Full consideration of the range of policy factors lies outside the scope of this paper but even if the foreseeability and proximity criteria were met, careful consideration would need to be given to the policy arguments in favour of, and against, the recognition of such a duty.
Taking this into account, it seems unlikely that developers will be the duty holders in the near term whilst AI systems remain clinician-facing, ie designed to interact primarily with clinicians rather than directly with patients. It is worth briefly pointing out that, even in the unlikely event that the courts expand the duty of care to include developers, they would remain an unappealing target of litigation. Even if a lack of appropriate care and skill could be identified in development, the self-learning and autonomous nature of AI systems means that it would be impossible to evidence how the system would have acted any differently but for the alleged negligent act. The black box nature of machine learning systems will create a circuit-breaker to causation, leaving developers safely outside the gravity of negligence law for the foreseeable future.[70] This may not be as unreasonable as it appears at first glance because reliable AI systems will, ex hypothesi, decrease the number of medical errors. It would be perverse to argue that healthcare providers should not hold the duty of care for the systems that they use to reduce their overall litigatory burden.
B Breach of duty
Having established that the duty of care is owed, the patient must then demonstrate that this duty has been breached and that it caused the harm. In negligence generally, the doctor does not fail in their duty of care simply because they make a wrong decision that leads to patient harm: some wrong decisions are justifiable. The decision must be one which a reasonable doctor in the same position would not have made. As McNair J. set out in the landmark case of Bolam v Friern Hospital Management Committee, a doctor will not be liable for negligence where they acted:
‘in accordance with a practice accepted as proper by a responsible body of medical men skilled in that particular art .... a man is not negligent, if he is acting in accordance with such a practice, merely because there is a body of opinion who would take a contrary view’.[71]
This test was tweaked in the case of Bolitho v City and Hackney HA, so that the court, rather than the profession, is the final arbiter of responsible practice. As stated in that case: ‘the court has to be satisfied that the exponents of the body of opinion relied upon can demonstrate that such opinion had a logical basis’.[72] This gloss to the Bolam test allows the court to make rare interventions to determine when professional peer opinion is ‘illogical’ and prefer the account of the claimant over the defendant. Mulheron argues that the influence of this gloss is more significant than often perceived, and that, while the test itself offers poor delineation of the ambit of Bolitho, it demonstrates that categories have emerged where peer opinion must be ‘logical,’ ‘responsible,’ and ‘defensible.’[73]
Medical science is constantly evolving, and the courts have taken a pragmatic approach that is generally permissive of new treatments, even where only a few ‘super-specialists’ can carry out the treatment.[74] Therefore, using a system that makes errors is not necessarily negligent per se if it can be justified by a responsible body of medical opinion taking sufficient account of the risks and benefits. However, determining when exactly such an error could be categorised as falling below the legal standard presents a problem. Generally, this is determined by comparing the care received with a hypothetical reasonably competent doctor.
Holding the actions of a doctor negligent simply because they followed erroneous AI advice has already been established as problematic. Following AI is likely to be the correct clinical procedure and the legal standard of care, particularly if systems start to outperform human doctors. The argument of Froomkin et al recognises that it may be negligent not to use AI systems with a higher diagnostic accuracy, or to fail to adhere strictly to their advice.[75] Therefore, a core question is whether the experienced body of medical opinion remains credible when using AI systems that make errors which cause additional harm. This tort-induced imperative presents a particular problem in situations where a doctor disagrees with the AI, which introduces an ethical dilemma,[76] and potentially exposes doctors to legal and moral penalties whatever action they take.[77]
In most cases, the envisaged errors will originate within the inexplicable internal workings of the machine learning system. If the data are accurate and the developers have used due care and skill and the error is not manifestly obvious, then it can only be attributed to the system itself. But this introduces a problem around what these errors should be compared with in order to determine whether the legal standard of care has been met: should they be compared to the standard of a human doctor or that of a similarly operated AI system?
There are many research papers which describe AI performance as ‘comparable’ to clinicians, but they often involve a system performing a classification task alongside a human who is given relatively little time to decide, with the results judged on a one-off decision.[78] This reveals a distinct difference in approach because the doctor and the AI system are both experts but work in fundamentally different ways. For example, machine learning systems classify an image by analysing it pixel by pixel and comparing it with other datasets in a highly mathematical process. Human clinicians use heuristic processes involving their experience and judgement, which may allow them to perform better in uncommon situations.[79] The resulting effect is that AI and humans will make different types of errors. Using human performance as a benchmark is problematic because AI systems can be both superhuman and subhuman in terms of performance. The following section will demonstrate that the nature of AI errors makes it difficult to assess their reasonableness by ascribing a simple error rate under any robust method.
The superhuman aspect of AI systems involves an ability to detect patterns that far exceeds human capacity.[80] AI systems ‘are capable of arriving at dynamic solutions to problems based on patterns in data that humans may not even be able to perceive.’[81] For example, systems can potentially predict blood sugar levels from heart rate signals,[82] or detect atrial fibrillation from monitoring patients with a smart watch.[83] The human body may soon reveal a tapestry of novel patterns that can only be detected by advanced machine learning programmes rendering comparison with human clinicians uninstructive. A system could become less accurate over time for reasons such as a changing demographic, and the subsequent drift could cause additional harm to patients whilst performing a task that is still beyond human capabilities.
However, AI systems can be subhuman in that they are incapable of demonstrating common sense and have been described as ‘digital idiots savants’.[84] Early experiences in automated vehicles highlight some potential problems in the healthcare sector. Machine learning models can struggle to adapt to rare and unanticipated events not contained within training data. For example, a piece of tape on a speed sign can mislead some models of cars and cause a vehicle to accelerate.[85] In the United States, a fatal accident occurred when the clear blue sky was reflected in the mirror finish on a truck, misleading the system.[86] The way that self-learning systems operate does not bear comparison to human cognition. It may be more credible to successfully claim in negligence if decisions are obviously flawed and manifestly below a human standard, but, in healthcare, the errors may not always appear so obvious: an erroneously discharged patient will attract a lot less attention than a crashed truck. There may be little the developers can do to prepare for unforeseen rare events. These accidents may come to be considered the unfortunate cost of generally reliable AI, and it makes sense that patients are compensated, but it is legally incoherent if medical errors cannot be compensated because no specific breach of duty can be identified under the existing legal frameworks. This again presents a fundamental challenge to a fault-based approach and could be an argument that lends support for an enterprise liability based on principles of fairness and restitution. However, in jurisdictions where common law fault-based systems endure, the question of liability will depend on the reasonableness of the clinician in following the advice, or the institution in failing to manage risks and install appropriate guard rails into medical systems.
Finally, when accurate systems become routine, it may not be an optimal use of resources for healthcare workers to continue to perform certain roles. There are many historical examples of deskilling: nobody is employed to physically reset bowling pins anymore, and lifts once needed operators to manually stop the mechanism as it aligned with a particular floor. At some point in the future, when AI systems are commonplace and reach higher diagnostic accuracy, human clinicians may not perform certain functions at all. When these AI systems fail, it will be legally incoherent to compare them against a human standard of performance that has become obsolete.
C The reasonable computer
One solution to resolve the human comparator problem is to introduce the concept of the ‘reasonably competent computer’ as the standard of care for AI errors. The idea is that any error is judged against a comparable AI system performing a similar role, which could be determined through technical assessments and expert witnesses.[87] Whilst holding intuitive appeal, this presents both conceptual and practical problems. Firstly, there is a conceptual problem around what it means for an algorithm to be ‘reasonable’. An inherent presumption in negligence law is that when clinicians make a negligent error, they should have acted differently, with the corollary that they should do things differently if the same scenario arises: they ought to know better. This conceptual framework of ‘reasonableness’ is incompatible with a machine learning system which has been trained on data until sufficiently accurate, following which the model is set. If the system is then locked, it would inevitably make the same mistake again in identical circumstances ad infinitum. The term ‘locked’ in the context of machine learning signifies that a model has completed its training phase and its parameters are fixed for deployment. Locking ensures that the model’s behaviour is stable and predictable when making predictions on new data in real-world applications. For the foreseeable future, systems will be locked because existing legal frameworks do not support ongoing and indefinite access to patient data for training. Further, allowing systems to use real-world data can lead to ‘concept drift’, a dynamic phenomenon that can cause systems to degrade as they deviate from the conditions of the original training data, thereby introducing potential safety risks. The balance between adaptability and stability will likely be met by locking systems, then monitoring and periodically updating the model.[88] Common components of reasonableness, such as taking appropriate care and recognising the importance of actions, are inapplicable to any computer programme, however advanced. In practical terms, although a minor consideration, it is also more difficult to assess the normative requirements of a reasonable computer because AI systems do not have the fiduciary duties and professional norms which play an essential role in shaping the standard of care.
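In engineering terms, the ‘lock then monitor’ arrangement described above can be as simple as the following sketch, in which the model’s parameters never change in clinical use and only its rolling performance is tracked against the accuracy established at validation (the accuracy figure, tolerance and window size are illustrative assumptions, not regulatory requirements).

```python
from collections import deque

class LockedModelMonitor:
    """Minimal sketch of 'lock then monitor': the model's parameters never
    change in clinical use; only its rolling accuracy on confirmed outcomes
    is compared against the figure established at validation."""

    def __init__(self, validated_accuracy, tolerance=0.05, window=500):
        self.validated_accuracy = validated_accuracy
        self.tolerance = tolerance
        self.outcomes = deque(maxlen=window)      # 1 = correct, 0 = incorrect

    def record(self, prediction_was_correct):
        self.outcomes.append(1 if prediction_was_correct else 0)

    def drift_suspected(self):
        if len(self.outcomes) < self.outcomes.maxlen:
            return False                          # not enough confirmed outcomes yet
        rolling_accuracy = sum(self.outcomes) / len(self.outcomes)
        return rolling_accuracy < self.validated_accuracy - self.tolerance

monitor = LockedModelMonitor(validated_accuracy=0.94)
# In deployment, each subsequently confirmed diagnosis would be fed back:
#   monitor.record(prediction_was_correct=True)
#   if monitor.drift_suspected(): escalate for offline retraining and re-validation
```

Any retraining triggered by such an alert happens offline, producing a new locked model to be re-validated before deployment, which is precisely why the deployed system can never ‘learn’ from the mistake it has just made.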
In practical terms, machine learning systems may not have a credible comparator. The problem can be illustrated by considering a hypothetical scenario where two distinct AI systems have identical levels of accuracy in detecting cancer and recommending treatment. Even if both systems reached accuracy levels of 99%, it does not follow that they will make identical errors in the 1% that they misclassify, because distinct AI systems will instantiate different mathematical models and it is likely that they will misclassify different patients. They will classify the core patient with paradigmatic features of the condition in the same way, but in less common and edge cases, they will classify differently. The fact that two equally effective AI systems disagree about a classification would not necessarily show that one of them has made an error which a well-functioning AI system should not make. Both systems are perfectly ‘reasonable’ as AI decision-makers. There may be situations where a healthcare institution uses a suboptimal or obsolete AI system to provide advice when a higher performing, more accurate and affordable system is available. In that specific scenario, it may make sense to claim that a culpable error has been made if the AI system used does not make a diagnosis which the best available AI would have made. However, under those circumstances, that error should be attributable to the organisation that has negligently failed to update its systems. Deep-learning models require significant resources, potentially leaving only a small cohort of commercial entities with the means and expertise to train large models.[89] There may not be a relevant competitor’s system to use as a benchmark: AI systems are not manufactured like laptops. Even if such a system existed, it would have been trained on different data and would produce a different configuration of the billions of virtual nodes, meaning that it would inevitably make different errors. Even in the unlikely event that models were trained on identical data, the interplay of various factors during the model training process can lead to technically different models. Advanced machine learning systems should be considered unique and incomparable. Whilst this claim may also be made about human clinicians, the thousands of human clinicians engaged in healthcare create and reveal standards of care that can be established from aggregated clinical behaviours, and provide a basis for identifying outliers in a way that is not possible with AI systems.
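The point about equally accurate systems disagreeing can be made numerically. In the synthetic simulation below, the cohort size, prevalence, and the assumption that each model’s errors fall on a largely independent, random subset of edge cases are all illustrative assumptions: two ‘99% accurate’ models each misclassify roughly a hundred patients out of ten thousand, yet almost none of those patients are the same.

```python
import numpy as np

rng = np.random.default_rng(3)
n_patients = 10_000
truth = rng.random(n_patients) < 0.08            # ground-truth diagnoses

def hypothetical_model(seed, error_rate=0.01):
    """Each model is ~99% accurate but errs on its own random subset of cases."""
    errors = np.random.default_rng(seed).random(n_patients) < error_rate
    return np.where(errors, ~truth, truth)

model_a, model_b = hypothetical_model(1), hypothetical_model(2)

print(f"Model A accuracy: {(model_a == truth).mean():.2%}")
print(f"Model B accuracy: {(model_b == truth).mean():.2%}")
print(f"Patients misclassified by A: {(model_a != truth).sum()}")
print(f"Patients misclassified by B: {(model_b != truth).sum()}")
print(f"Misclassified by both:       {((model_a != truth) & (model_b != truth)).sum()}")
```

On these assumptions, the fact that model B would have classified a given claimant correctly says nothing about whether model A fell below a ‘reasonable computer’ standard; the reverse is just as likely to be true for a different patient.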
Therefore, in many cases, pointing to the exact nature of the error is potentially impossible, and even if it could be ascertained, demonstrating how this translates into a specific breach of duty is highly problematic, which means that clinical negligence regimes may fail to deliver justice, prompting calls for reform. The next section examines whether res ipsa loquitur can perform any better.
IV Res ipsa loquitur: a principled approach
The legal instrument of res ipsa loquitur or ‘the thing speaks for itself’ is often associated with other areas of personal injury law and has been applied to a wide range of cases including falling flour barrels,[90] collapsing cranes,[91] surgical paraphernalia left inside patients,[92] bricks falling out of bridges,[93] and train crashes.[94]
Two years after Byrne v Boadle, the case of Scott v London and St Katherine’s Docks[95] involved a man who was similarly injured by a falling object, although this time by a sack of sugar falling from a hoist. Erle CJ gave a classic exposition of the principle:
‘Where the thing is shown to be under the management of the defendant or his servants, and the accident is such as in the ordinary course of things does not happen if those who have the management use proper care, it affords reasonable evidence, in the absence of explanation by the defendants, that the accident arose from want of care’.[96]
There are some general restrictions on when res ipsa loquitur can be applied, which are set out in Scott and in Bennett v Chemical Construction (GB) Ltd:[97]
The event is one that would not normally occur in the absence of negligence/fault.
The thing causing the damage must have been under the control of the defendant.
There is no evidence as to why or how the accident occurred.
If the three initial considerations are met, then the court may draw an inference of negligence against the defendant. The application of res ipsa loquitur in medical cases pre-dates the important modern clinical negligence cases of Bolam and Bolitho. Early medical res ipsa loquitur cases often involved egregious surgical errors, such as Mahon v Osborne,[98] where a surgical swab was left inside the patient’s body, or Clarke v Warboys,[99] where a patient suffered severe burns to her buttocks whilst undergoing an operation to remove a breast tumour and the surgical team were unable to give an alternative account of how the injury could have occurred.
The leading authority that declares res ipsa loquitur applicable to medical cases is Cassidy v Ministry of Health[100] where Denning LJ summarised the maxim’s application as ‘I went into hospital to be cured of two stiff fingers. I have come out with four stiff fingers and my hand is useless. That should not happen if due care had been used. Explain if you can?’[101]
Res ipsa loquitur cases still appeared post-Bolam, but it became a more marginal legal phenomenon as judicial reluctance to apply the maxim increased over time, peaking around the turn of the century. The logic of the courts flowed from advancements in record keeping, supervision and regulatory safeguards, which arguably reduced opacity within medical systems. The courts generally viewed res ipsa loquitur as a legal anachronism because it was considered both unnecessary and potentially burdensome to the medical profession. With all presumptive facts available, cases could be assessed by medical experts and battle lines drawn. In Delaney v Southmead Health Authority, Stuart-Smith LJ argued that res ipsa loquitur would not be helpful in medical negligence cases, where unexpected results often occur in the absence of negligence,[102] such as risks inherent in surgery. This reflected parallel developments in some other jurisdictions, such as Canada, where the Supreme Court in Fontaine v British Columbia held that res ipsa loquitur ‘should be treated as expired’,[103] and South Africa, where res ipsa loquitur was restricted in medical cases to reduce the litigatory burden.[104]
In Ratcliffe v Plymouth and Torbay Health Authority, Hobhouse LJ argued that:
‘Res ipsa loquitur is not a principle of law and does not raise any presumption. It is merely a guide to help identify when a prima facie case is being made out. Where expert and factual evidence is being called on both sides at trial its usefulness will normally have long since been exhausted’.[105]
The facts of the case involved a 48-year-old man who was given a spinal anaesthetic to manage the pain following ankle surgery, but who subsequently developed a serious and unexplained neurological defect. In the Court of Appeal judgment, Brooke LJ gives a comprehensive review of the res ipsa loquitur case law and, taking a different line of reasoning from Hobhouse LJ, argues that expert evidence would serve to strengthen a res ipsa loquitur case where an expert confirmed that the result would not normally occur in the absence of negligence, and that explanations offered to rebut negligence would have to be more than merely theoretically or remotely possible.[106]
The judicial observations about the prospective reduction in the opacity of medical systems have not proved apposite, and the notion that all relevant facts will be available in cases does not stand up to scrutiny; advancements in technology have increased medical complexity, and machine learning systems will introduce an epistemic vacuum into the heart of medical decision-making. The argument that res ipsa loquitur could increase the burden on doctors is no longer valid with AI; without res ipsa loquitur, the clinician could be individually blamed for following every potentially negligent recommendation because there would be no other practicable route to a successful claim in negligence.
Despite the judicial scepticism, there have been a growing number of recent cases where a less sceptical and more principled approach to res ipsa loquitur has emerged. The courts have been willing to accept arguments of res ipsa loquitur, not as part of a legal doctrine, but as a more flexible approach to permit negligence claims and ensure that claimants are not inequitably denied a remedy. For example, in Thomas v Curley[107] the patient suffered a bile duct injury during a laparoscopic cholecystectomy. The operation was described as ‘uncomplicated’ and the injury caused to the claimant was nowhere near the site of the operation. The Court of Appeal held that this fact ‘called for an explanation as to how that might have occurred in the absence of negligence’.[108] Lloyd Jones LJ went on to say that this approach has ‘nothing to do with the reversal of the burden of proof and nothing to do with res ipsa loquitur’.[109] Instead, it was held, by applying the same principles described throughout the res ipsa loquitur case law, that negligence had been proved by the claimant.
A similar approach was taken by Jackson LJ in O’Connor v The Pennine Acute Hospitals NHS Trust.[110] In an injury involving complications from a hysterectomy causing incontinence, the court held that the defendant had not put forward any plausible explanation for how the injury could have occurred in the absence of negligence. The court again stated that the facts did not reverse the burden of proof or invoke res ipsa loquitur. Jackson LJ held that the defendant’s failure to provide an explanation was a matter which the trial judge was entitled to take into account, and which supported the finding of negligence against the defendant.
Jackson LJ invoked the general principles of res ipsa loquitur but stopped short of following the logic of Thomas or Cassidy because he did not go so far as to say that the circumstances called for an explanation by the defendant. Like much of the legal wrangling around res ipsa loquitur, the argument is fundamentally semantic, as the practical application of res ipsa loquitur in O’Connor is indistinguishable from Cassidy. Similarly, whilst denying that the decision in any way involved res ipsa loquitur, Jackson LJ’s ‘taking into account’ of the defendant’s lack of a plausible explanation practically amounts to drawing an inference of negligence, even if it is not described as such. Even if res ipsa loquitur is not accepted as a true legal doctrine, there are clear foundational principles that can be applied to cases involving elements of inexplicability which would otherwise leave claimants without a remedy. Therefore, the courts are arguably taking a principled approach, and the application of res ipsa loquitur in recent cases does not amount to a significant development of the law, but simply reflects a semantic reorientation from ‘res ipsa loquitur’ to ‘common sense’, which draws from Megaw LJ’s observation in Lloyde v West Midlands Gas Board that res ipsa loquitur is ‘an exotic phrase to describe what is in essence, no more than a common-sense approach’.[111] The modern ‘common sense’ consensus is largely endorsed within legal textbooks; however, it could and should be recognised as a ‘principled approach’. ‘Common sense’ is highly subjective, whereas principles emerge from goal-orientated ethical frameworks that are generally tethered to defensible jurisprudential aims.
The cases have established a clear framework for how res ipsa loquitur operates but have not resolved ontological problems about whether res ipsa loquitur ‘truly’ exists or what it is. As demonstrated previously, the courts in England and Wales have consistently rejected the notion that res ipsa loquitur is a rule of evidence, but the language has been imprecise, at times fuelling confusion. For example, Lord Griffiths’ assertion that ‘loosely speaking, this may be referred to as a burden on the defendant to show that he was not negligent.’[112] In some other jurisdictions, such as in the Israeli Supreme Court’s interpretation of res ipsa loquitur, the doctrine goes beyond a permissive inference and becomes a requirement that a claim succeed when the defendant does not disprove negligence once the res ipsa loquitur conditions are met. This difference in whether the defendant must bring evidence or persuasion is described by Porat and Stein as the ‘hard’ and ‘soft’ res ipsa loquitur models.[113] In the Anglo-American ‘soft’ model, the historical doctrinal confusion relates more to how res ipsa loquitur should be described than to any meaningful effect. Whether a defendant must give an account to meet the requirements of an evidential rule, or because they may incur liability by an inference of negligence if they do not, is neither here nor there to claimants.
V Applying res ipsa loquitur principles to AI errors
The foundational principles set out by Erle CJ, and the case law synthesised by Brooke LJ, provide the legal framework for applying the doctrine of res ipsa loquitur and the conditions under which defendants can rebut such inferences. This section addresses these considerations in turn. Firstly, to raise an inference under res ipsa loquitur, a claimant must demonstrate that:
The accident was of a kind that would not ordinarily occur in the absence of negligence, and the facts must ‘speak for themselves’.
The defendant had sole control over the thing which caused the injury.
The defendant has no plausible and credible alternative explanation of what caused the accident to occur.
For this case to succeed, the court must find that the allegations of negligence are more probable than the defendant’s explanation. Or, put simply, as the Solicitor General argued in St Katherine’s Docks: ‘The true test is whether the case is more consistent with negligence than care.’[114]
The res ipsa loquitur framework is, in principle, a better fit for AI error than Bolam. It will generally be far easier to demonstrate that the accident would not ordinarily occur in the absence of negligence than to identify and evidence a specific breach of duty. If a patient is inappropriately discharged because of an inexplicable AI error, it would be easy to state that the discharge would not ordinarily happen. The facts would speak for themselves. Many of the putative applications for machine learning systems involve safety-critical work, such as assessing the severity of head injuries,[115] planning treatment for sepsis,[116] or detecting cancers.[117] When an error occurs, the consequences are likely to lead to significant harm, which will be the ‘res’ presenting tangible evidence of negligence. With res ipsa loquitur, there is no need to examine a counterfactual hypothesis about whether a human clinician, or a comparable AI, would have made the error in order to determine negligence. There are obvious cases which may still meet the criteria for liability under Bolam/Bolitho, and there are cases where a misdiagnosis of a rare condition using AI advice is arguably not negligent, where hindsight bias may lead to an increase in litigatory risk. Therefore, the contention is not that res ipsa loquitur is perfect or solves all problems with AI errors, but that it reduces the costs of proving negligence, which is important. More importantly, it prevents cases from being dismissed where patients are unable to obtain evidence, introducing a de facto ‘right to trial’, and gives the courts discretion as to how to allocate the burden of the risks of error between the patient and the healthcare institution.[118]
The series of surgical cases in which res ipsa loquitur succeeded presents a useful comparator to AI errors because both involve an epistemic vacuum. In surgical cases, the patient will usually be under general anaesthetic, and unconsciousness is also a black box, albeit one with the patient on the inside. Where these unexplained surgical accidents occur, there is a justifiable practical limit to the evidence that a claimant can provide, and the courts appear to be more amenable to claims of res ipsa loquitur in such circumstances. For example, in Saunders v Leeds Western Health Authority, a four-year-old girl underwent routine surgery to repair a congenitally displaced hip. Mann J said ‘it is plain from the evidence called on her behalf that the heart of a fit child does not arrest under anaesthesia if proper care is taken in the anaesthetic and surgical processes’.[119] With machine learning errors, as with the unconscious patient, the claimant will not be able to provide evidence of the technical nature of the error or demonstrate how it amounts to a breach of duty; the courts should take this into account and allow the claimant to advance the case that negligence has occurred and that the claim ought to succeed. Res ipsa loquitur is a more patient-centric approach because the misdiagnosis or inappropriate treatment presents the ‘res’, rather than a specific act or omission of the clinician. Defendants can still offer evidence in rebuttal that subsequent harm to patients reflects primarily background or iatrogenic risks. As stated by Hobhouse LJ in Ratcliffe: ‘In pleading his case the plaintiff will only be expected to particularise his allegations of negligence in a way that is appropriate to his state of knowledge of what happened at the time of his pleading. Establishing the facts should not necessarily need expert evidence.’[120] In Ludlow v Swindon Health Authority, Hutchinson J accepted that in anaesthetic awareness cases, where patients remained conscious during surgery, the facts would raise an inference even in the absence of expert evidence that this type of error should not normally happen.[121]
The res ipsa loquitur framework requires that the ‘thing’ be under the ‘sole control’ of the defendant. This arguably raises several more complex legal issues because it is not clear to what extent a machine learning system is under the control of the hospital, or whether it can be considered under the control of anyone at all, owing to its autonomous and self-learning nature. AI systems require a degree of post-market surveillance and testing, and hospitals may argue that such systems remain at least partially under the control of their creators, as clinicians lack the relevant expertise, and that developers should be responsible for managing the downstream effects of their actions. In practical legal terms, systems should be considered under the control of the hospital because the hospital maintains overriding control in the sense that it can always turn them off. If the hospital develops any misgivings about a system’s accuracy, it will have a duty to pull the plug. As hospitals are vicariously liable for expert doctors, it would be perverse to argue that they could not justifiably be held liable in negligence for AI that they have introduced into their systems of care to perform the same tasks. Hospitals are generally liable for their employees across jurisdictions through the principle of respondeat superior but, without legal personality, AI systems are not employees or independent contractors. However, drawing on the existing debate around vicarious liability and the role of control, the principle of ‘let the master answer’ suggests that the hospital should bear responsibility because it has a duty to provide safe systems of care,[122] and the AI has no independence as to where, when, or how it is deployed.
If the claimant has been able to raise an inference of negligence, then the defendant will be invited, rather than required, to rebut the inference. Establishing the components in principle does not necessarily mean that the claim will be successful because, as Megaw LJ stated in Lloyde, ‘the res or thing, which previously spoke for itself, may be silenced or its voice may, on the whole of the evidence become too weak or muted.’[123]
As set out by Brooke LJ in Ratcliffe, the courts must consider several issues when determining if the defendant can rebut an inference of negligence which are broadly that:
The facts are contested, meaning that the res does not speak for itself;
The defendant lacked exclusive control;
The defendant offers a plausible innocent explanation that does not connote any negligence; or
No rational explanation is possible, but the defendant can demonstrate that they used all reasonable care and skill.
With AI systems, the first criterion is likely to present the least difficulty because, if the error involves a misdiagnosis or inappropriate treatment recommendation, there will be clear evidence within the patient records, and in many cases it will be incontrovertible. It is only when the facts are disputed that this will cause a problem, and cases have generally been rejected where expert evidence is disputed by both sides,[124] or where surgical procedures are inherently risky and can go awry with no negligence at all.[125] Therefore, AI systems used to plan or assist in surgical interventions face an additional difficulty in establishing a presumption of negligence that does not arise with errors in diagnosis and treatment planning. However, this is neutral in terms of fairness to patients because it is an existing legal risk whenever harm occurs in surgery.
Whether the ‘thing’ is under sole control could also be disputed by defendants, who may argue that commissioning, using, and having the option to switch off a system does not constitute ‘control’. If this counter-argument is successful, a difficulty could arise because early AI systems within the NHS involved partnerships between large technology companies and hospitals.[126] Res ipsa loquitur cases in construction suggest that complex contractual arrangements involving third parties can undermine res ipsa loquitur claims in negligence because the parties can each dispute being in ‘sole control’.[127] As already demonstrated, res ipsa loquitur requires that the question of the duty holder is resolved, and this may present ongoing legal uncertainty. Certainly, if the courts allow hospitals and developers to mutually deny sole control of the system, it could leave claimants without a remedy, and the position that the courts take will inevitably require a degree of ‘common sense’ to ensure that this does not happen.
The next criterion to examine is whether the defendant has a plausible alternative explanation. This is where the facts may become contentious, as the hospital may assert that the fault lies with negligent actions earlier in the AI lifecycle and attempt to join developers to the case. The hospital may advance the theoretical claim that the developers negligently allowed the system to develop inaccurate, biased, or unexpected outputs and that the result is a direct consequence of their lack of care and skill. But the hospital inevitably runs into the same evidential problems that a prospective claimant would.[128] The alternative explanation must not be merely theoretical. As the editors note in Clerk and Lindsell, defendants cannot throw up any theoretical possibility: ‘his assertion must have some colour of probability about it’.[129] There must be a logical basis for the rebuttal, which requires some evidence. A counterfactual hypothesis about what a machine learning system might have done but for a particular intervention is a highly speculative argument. As stated by Morris J in Lindsay v Midwestern Health Board, where an explanation is ‘entirely theoretical and based on speculation’, it cannot rebut the doctrine, and an inference of negligence will arise.[130]
The courts may also reject res ipsa loquitur if they are satisfied that no rational explanation is possible but all reasonable care and skill was used. This could be open to a range of interpretations. On the one hand, the courts could determine that diligently implementing AI advice from approved systems could never be negligent, even when it is wrong. This will present a tempting argument in rebuttal and could be used to avoid liability. But the counter-argument is that the healthcare provider has deliberately introduced a complex tool with a degree of autonomy into the decision-making process. Serious misdiagnosis and inappropriate treatment would not ordinarily occur without negligence, and res ipsa loquitur allows claimants to put forward this system-level assertion because it does not need to focus on individual comparisons with the reasonably competent doctor. That the machine learning system has inexplicably produced erroneous treatment recommendations which were implemented is still a rational explanation. There is also some logical basis for suggesting that this type of ‘institutional explanation’ for AI systems presents a more feasible approach to how AI use should be justified in healthcare. Theunissen and Browning argue, taking the example of cancer detection being less effective on dark skin, that ‘this information provides an explanation of not just how the machine has been trained and why it is reliable, but also lays out the contexts and for which patients its use is not recommended.’[131] Therefore this contextual justification of institutional liability translates into an analysis that balances the risks of deploying AI systems against alternative courses of action and understands where the distributive risks of AI development fall within population cohorts.
If systems produce unexpected outcomes resulting in serious harm, not unlike falling barrels, innocent explanations may be hard to come by, even if the exact cause remains obscured by technological complexity. But the discretion around how to flexibly apply the principles may mean that the courts will restrict the use of res ipsa loquitur to cases that have an element of the indefensible about them, only providing a remedy when it would be inequitable to deny it. The enduring appeal of res ipsa loquitur is that sometimes the res speaks louder than legal orthodoxy, but claimants may remain cautious, unsure if res ipsa loquitur will speak loud enough for them.
The courts could, and should, take a principled case-by-case approach and interpret serious AI errors as prima facie evidence of negligence, with the healthcare provider ordinarily responsible. The balance to be struck is that the courts will have reservations about allowing a negligence claim in every case where an AI system is not absolutely correct. The courts have been most cautious in cases where ‘ordinary’ risks inexplicably materialise in surgery, which can happen without negligence, and uncertainty is a valuable commodity when it comes to rebutting an inference of negligence.[132] Therefore, the risk for many claimants is that the courts will only find for them in relatively extreme cases, where defending the claim starts to drift into morally unjustifiable territory or defendants are unscrupulously seeking to exploit doctrinal obstructions to a remedy.
VI Conclusion
Clinical negligence as it stands is not well suited to machine learning errors. There are compelling reasons why doctors should not be found negligent for following AI recommendations, but they are still likely to present the easiest target. The Bolam/Bolitho framework for establishing a breach of duty breaks down because of serious problems in using either human doctors or a reasonably competent computer as a comparator, and the complexity of the AI sociotechnical system makes establishing the nature of errors that occurred in the development stage prohibitively difficult. There is a significant danger that AI developers escaping the gravity of negligence law leaves them with a heads-we-win, tails-you-lose liability model, and work will have to continue on how to hold developers to account outside negligence regimes.
The hospital or NHS trust should be responsible for the totality of the care, including the use of machine learning systems for diagnostics and treatment planning. The purpose of AI is to improve the system of healthcare delivery within hospitals: it is a tool that they choose to use to achieve these ends. Where injury results from inadequate treatment or misdiagnosis by an AI system, claimants should plead res ipsa loquitur, and the courts should be able to accept that, where the injury was unexpected and would not ordinarily happen without negligence, the claim could succeed in principle. If patients cannot rely on the res ipsa loquitur framework, then access to justice and fairness will be impacted by the technological, epistemic, and financial barriers to making a claim following an AI error. All systems, including human and AI agency, should be considered under the control of the healthcare trust. The risk that the trust may then either blame the developers or join them to the claim is an open problem, but the trust will face the same evidentiary burden relating to the distributed nature of agency earlier in the development process. The courts should apply the framework set out by Brooke LJ in Ratcliffe to determine, as a matter of ‘common sense’, whether the claim should succeed in principle. However, whilst res ipsa loquitur remains a principle applied on a case-by-case basis rather than a doctrine, it will be more difficult to settle cases sensibly because advice on quantum and liability will be less certain. Remedies will be contingent on the caprice of the courts.
Res ipsa loquitur continues to have an inconsistent application in case law and an uncertain place in English and Welsh jurisprudence. It is not a solution hiding in plain sight that resolves all the legal intractabilities of machine learning systems. From a patient-centric viewpoint, res ipsa loquitur probably presents the best legal avenue available, but it is better construed as the least bad option. Res ipsa loquitur is still contentious, and the burden of establishing likely but unknowable facts is not removed. On the other hand, res ipsa loquitur may prove useful because it requires no legislative overhaul or regulatory redesign: this incomplete solution lies within existing doctrine. If newer bespoke legal processes are implemented to resolve AI liability issues, res ipsa loquitur will not stand in the way. Whilst errors within complex neural networks and falling barrels of flour, sugar or tea seem factually far removed, a trawl through the res ipsa loquitur cases can offer much insight into responding to novel technologies. The important lesson is that the inexplicable aspects of accidents and errors do not necessarily prevent the courts from allowing claims in negligence. If AI systems are increasingly used to make a greater proportion of medical decisions, then res ipsa loquitur may increase in utility to the point where its doctrinal credentials could be revisited: res ipsa loquitur is about to find new reasons to keep speaking.
Note
I would like to thank Prof Sarah Devaney, Prof Soren Holm, Dr Cath Bowden, and Dr Alex Mullock for comments on an earlier draft. I would also like to thank the anonymous reviewers and the editorial team.
© 2024 the author(s), published by Walter de Gruyter GmbH, Berlin/Boston
This work is licensed under the Creative Commons Attribution 4.0 International License.