Opinion Letter on In Vitro to In Vivo Extrapolation Applied to Drug Safety Assessment

By: Dr. sc. nat. Bruno Filippi, VP Liver Safety, InSphero AG

May 20th, 2026, Schlieren, Switzerland

The IVIVE Question: From Cells to Patients

The notion of translational toxicology, when applied to drug safety assessment, centres on a deceptively simple question: how can findings obtained outside the human body illuminate the risks faced within it?

In the context of in vitro models, this endeavour is commonly referred to as “In Vitro to In Vivo Extrapolation” (“IVIVE”) and seeks to determine whether the effects observed in cultured cells bear any meaningful resemblance to those occurring in patients.

Why IVIVE Seems Challenging

The skepticism surrounding this question is entirely justified. How comparable, after all, is a monolayer of one million hepatocytes grown on a polystyrene dish in two milliliters of chemically defined medium to a 1.5-kilogram human liver composed of countless interacting cell types, integrated into a highly interconnected physiological system, and continuously perfused by liters of pulsating blood?

And yet, despite this obvious uncertainty, numerous pharmaceutical companies have embraced in vitro models throughout the drug discovery and development process, albeit with varying degrees of enthusiasm, in the hope of improving the notoriously poor success rate of bringing new drugs to market. At the same time, a large body of academic research has been devoted to evaluating the translational value of these in vitro models.

The verdict, however, remains unresolved, and there is little indication that consensus is imminent. Unsurprisingly, the scientific community investigating IVIVE finds itself animated by a tension that echoes a philosophical divide more than two millennia old: Plato’s abstraction versus Aristotle’s empiricism.

Platonic Abstraction versus Aristotelian Empiricism

As a brief reminder, Plato’s Theory of Forms rests upon the idea that all observable phenomena are merely imperfect manifestations of “idealized forms,” and that predictive understanding may therefore be achieved through abstraction and deductive reasoning applied to a few observations. In the context of drug safety assessment, this mode of thinking might be expressed as follows: if a drug candidate induces oxidative stress in a 2D hepatocyte culture, it is inferred that it will likely induce oxidative stress in the human liver. Similarly, because hepatic transporters play a central role in maintaining bile acid homeostasis, a compound shown to inhibit these transporters in a vesicle assay is presumed likely to cause cholestasis. The logic is internally coherent and mechanistically elegant.

Aristotle, by contrast, was committed to empiricism and rejected the notion that abstraction and deductive reasoning applied to limited observations could yield reliable conclusions about the natural world. Instead, he maintained that knowledge arises through careful, exhaustive observation and rigorous experimentation, while rational deduction serves a strictly subordinate role: before experimentation, to formulate hypotheses; and afterward, to interpret their outcomes. Applied to drug safety assessment, this mode of thinking might be expressed as follows: one begins with the theoretical assumption that in vitro liver models may provide insight into the hepatotoxic potential of drug candidates. This would be followed by the systematic measurement of numerous different in vitro biomarkers across extensive compound sets, and subsequently by statistical analyses aimed at identifying those signals that genuinely correlate with clinical hepatotoxicity. Within this framework, predictive biomarkers and thresholds are not derived mainly through rational deduction but rather emerge from empirical performance itself.

Dialectic on IVIVE

The conclusions reached by these two schools of thought can at times appear almost irreconcilable. For example, our research group published a large-scale in vitro/in vivo correlation study evaluating the ability of ATP measurements in human liver spheroid cultures to “predict” the hepatotoxic potential of orally administered small molecule drugs (PMID: 39397666). We determined the ATP IC50 values of 152 FDA-approved drugs in human liver spheroids and correlated the resulting margins of safety (i.e. the ratio of ATP IC50 to total plasma Cmax) with hepatotoxicity-related warnings contained in the drugs’ clinical labels. The study revealed a strong and statistically significant correlation (p < 0.0001). From this dataset, a binary classification threshold could be empirically derived: compounds whose ATP IC50 exceeded the total plasma Cmax by more than 176-fold were generally associated with safe clinical profiles (~90% True Negative Rate), whereas compounds below this threshold were significantly more likely to carry hepatotoxicity warnings (~80% True Positive Rate).
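The arithmetic behind such a threshold can be made concrete with a minimal sketch in Python. It is an illustration only, not the analysis code of the published study: the compound names and values below are hypothetical placeholders, the function names are invented for this example, and the choice to flag a compound whose margin equals exactly 176 as "safe" is an assumption. Only the definition of the margin of safety (ATP IC50 divided by total plasma Cmax) and the 176-fold threshold come from the text above.

```python
from dataclasses import dataclass

# Threshold quoted in the text; everything else here is illustrative.
MOS_THRESHOLD = 176.0


@dataclass
class Compound:
    name: str                 # hypothetical label, not a real drug
    atp_ic50_um: float        # ATP IC50 in liver spheroids (uM)
    cmax_um: float            # total plasma Cmax (uM)
    has_liver_warning: bool   # hepatotoxicity warning on the clinical label


def margin_of_safety(c: Compound) -> float:
    """Ratio of ATP IC50 to total plasma Cmax, as defined in the text."""
    return c.atp_ic50_um / c.cmax_um


def is_flagged(c: Compound, threshold: float = MOS_THRESHOLD) -> bool:
    """Flag a compound as a predicted hepatotoxicity risk.

    Assumption: margins strictly below the threshold are flagged.
    """
    return margin_of_safety(c) < threshold


def true_positive_and_negative_rates(compounds, threshold=MOS_THRESHOLD):
    """Return (TPR, TNR) of the threshold rule against the label warnings."""
    tp = fn = tn = fp = 0
    for c in compounds:
        flagged = is_flagged(c, threshold)
        if c.has_liver_warning:
            tp += flagged
            fn += not flagged
        else:
            fp += flagged
            tn += not flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one compound with and one without a warning")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy panel, invented purely to exercise the functions.
toy_panel = [
    Compound("cmpd_A", 500.0, 1.0, False),   # margin 500 -> true negative
    Compound("cmpd_B", 50.0, 2.0, True),     # margin 25  -> true positive
    Compound("cmpd_C", 1000.0, 10.0, False), # margin 100 -> false positive
    Compound("cmpd_D", 400.0, 1.0, True),    # margin 400 -> false negative
    Compound("cmpd_E", 600.0, 2.0, False),   # margin 300 -> true negative
    Compound("cmpd_F", 10.0, 1.0, True),     # margin 10  -> true positive
]

tpr, tnr = true_positive_and_negative_rates(toy_panel)
print(f"TPR = {tpr:.2f}, TNR = {tnr:.2f}")
```

On a real dataset, the same two functions could be re-run across a range of candidate thresholds to see how the true positive and true negative rates trade off, which is the sense in which such a cut-off is derived empirically rather than deduced.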

A Peripatetic would accept the finding at face value: exposing liver spheroids cultivated in vitro to supraphysiological drug concentrations and measuring ATP depletion appears, empirically, to inform hepatotoxic risk. The rational work (for those who care to understand why) comes afterwards: How come?

Mechanistic Coherence or Predictive Accuracy

And therein lies the debate within the toxicology community surrounding translational toxicology in general: should predictive value for human outcomes be derived primarily from mechanistic insights gained through a limited number of non-human experiments, and subsequently extrapolated through rational interpretation? Or should it emerge from large bodies of empirical observation, enabling robust statistical analysis, even when the underlying biology does not fully conform to our expectations?

The field continues to wrestle with these two intellectual traditions, each illuminating different facets of an extraordinarily complex problem.

Platonic Dominance in Modern Toxicology

Modern toxicology is deeply preoccupied with uncovering the mechanisms of action that underlie toxic effects. Such mechanistic insight reveals how a compound engages with biological systems, allowing researchers to trace the molecular and cellular pathways through which toxicity emerges.

Yet mechanistic experimentation in humans has become ethically impermissible in modern societies, while findings from animal studies have repeatedly shown only limited relevance to human biology. At the same time, many physiological cell-based systems in which such studies might be performed lack the scalability, reproducibility, and robustness required to generate large, coherent datasets suitable for meaningful statistical analysis, the very observational foundation upon which a truly empirical, Aristotelian method depends (InSphero’s microtissues being, of course, the exception to this rule!). Altogether, these limitations deprive toxicology of the breadth of observation it would require, rendering the classical empirical ideal seemingly unattainable. In the absence of a better alternative, the toxicological community has, in a sense, summoned the ghost of Plato: turning toward abstraction, modelling, and rational reconstruction where direct observation remains limited.

This state of affairs is all the more striking given that the modern natural sciences remain, by and large, deeply faithful to the Peripatetic school, with Aristotle standing among the great founding figures of physiology and biology through the vast and meticulous body of observations he assembled. But is modern toxicology truly trapped in a purely rational cul-de-sac?

Regulatory Toxicology as an Empirical Rosetta Stone

In reality, a vast and comparatively homogeneous body of human observations exists for marketed drugs, owing to the long and meticulous pharmacovigilance efforts undertaken by regulatory agencies in monitoring adverse effects. Admittedly, such regulatory data constitute a comparatively “crude” and imperfect form of assessment, insofar as they frequently lack detailed mechanistic insight and may vary across historical periods and regulatory frameworks. Yet the annotations contained within drug labels ultimately reflect the considered judgment of numerous pharmacovigilance toxicologists operating under rigorous scientific and regulatory standards; and, most importantly, they remain the most reliable body of human evidence presently available to us. Consequently, this large corpus of regulatory toxicology information may provide the key to systematically and empirically determining which in vitro approaches achieve useful predictive relevance and translational fidelity in the assessment of human adverse drug events, in keeping with the Aristotelian tradition.

Such a path would require modern toxicology to relinquish some of its reliance on mechanistic explanation and return, at least in part, to a more Aristotelian mode of inquiry. Rather than seeking first to explain, it would seek first to observe. Large numbers of marketed drugs, accompanied by well-characterized regulatory safety data, would be tested in physiological and scalable in vitro systems, and the resulting observations subjected to stringent statistical scrutiny. From such a vast body of evidence, patterns could emerge before mechanisms are fully understood, allowing prediction to arise from observation rather than deduction.

Artificial Intelligence and the Future of Translational Toxicology

Moreover, the breadth of these regulatory data permits stratification according to drug class, target organ, structural characteristics, or physicochemical properties. In addition, emerging large language models will facilitate the systematic mining of regulatory datasets in order to classify compounds according to patterns of drug-induced injury or susceptibility within specific patient populations, thereby establishing a broad range of thematic reference subgroups. Such stratification is of particular importance, as each subgroup may necessitate a distinct in vitro strategy, involving tailored biomarker measurements, media compositions, and culture conditions. For example, would it make sense to assess oxidative-stress-driven toxicity under atmospheric conditions (21% O₂), when hepatic oxygen concentrations typically range between 5% and 10%? Would it make sense to evaluate steatotic risk in “classical” cell culture media containing supraphysiological concentrations of insulin and glucose?

Here again, recent advances in artificial intelligence may prove transformative. They now make it possible to analyse vast numbers of variables simultaneously, thereby enabling the exploration of putative correlations between a wide spectrum of in vitro readouts and regulatory safety data, with the potential to uncover meaningful empirical relationships. Particularly tantalising is the prospect of employing machine-learning algorithms to investigate associations between the transcriptomic profiles of highly standardised and physiologically relevant in vitro models exposed to large panels of FDA-approved drugs, and the corresponding regulatory safety information linked to those compounds.

An Aristotelian Future for IVIVE?

True, the scope for pursuing IVIVE in toxicology through a purely Aristotelian approach remains limited. Yet the historical regulatory data accumulated from marketed drugs are real, substantial, and invaluable. They constitute a foundational bedrock for empirical IVIVE studies and serve, in many respects, as an ad hoc Rosetta Stone for human toxicology.

Unrelated Endnotes - Exempla maiorum

Finally, before parting with our two old friends, it is worth recalling how the antagonism between Platonic and Aristotelian thought first emerged. Plato attempted, on three separate occasions and without success, to enlighten the Tyrants of Syracuse according to the principles of his philosophy. These failures came at considerable personal cost: twice he was held hostage, and once he was even sold into slavery! Yet Plato did not infer from these experiences that his approach itself might be flawed. Instead, he clung to an uncompromising creed of pure theoretical rationalism, attributing his failures not to any flaw in his philosophy, but to the hostility of circumstance. Later, in the eighteenth century, David Hume would brilliantly lay bare the profound frailties of the Platonic and Cartesian faith in pure reason largely severed from experience.

Aristotle, though himself a pupil of Plato, drew from this inheritance a profoundly different lesson, one whose consequences would prove immeasurably fruitful for humanity. Turning away from his master’s abstractions and from the austerity of purely deductive thought, he turned instead toward attentive observation, empirical inquiry, and systematic investigation. In so doing, he helped lay the intellectual foundations of a tradition that would, in time, mature into the scientific method: the enduring framework through which we continue to question, test, and enlarge our understanding of the world.
