
Edition 14 - AI Pathology: When Models Learn the Wrong Thing

Explore shortcut learning in pathology AI, Geneva's cross-cancer metastasis predictor, and the WHO's new global framework for AI in mental health.

  1. Pathology AI is cheating — and it's clinically dangerous

  2. Geneva's AI tool predicts cancer spread with 80% accuracy

  3. WHO draws the line on AI in mental health

  • Featured follow of the week

  • Top posts of the week across social

  • Meet the editor

  • Want a featured article?

Specialty: Pathology // Sub-Specialty: Digital Pathology // Body Site: Multiple Cancer Types

1. AI Pathology Models Are Learning Shortcuts, Not Biology — And It Matters for Patient Care

Researchers at the University of Warwick and Oxford analysed over 8,000 tumour samples across four cancer types (breast, colorectal, lung, and endometrial) and found that leading deep-learning pathology models are not learning what they claim to learn. Rather than detecting true biological signals such as gene mutations, the models exploit statistical correlations between co-occurring biomarkers. In one key example, a model predicting BRAF mutations in colorectal cancer was not detecting BRAF at all; it was detecting an indirect association with microsatellite instability (MSI). The models appeared accurate on paper, but their real advantage over simply using tumour grade was modest: approximately 80% accuracy for the AI versus approximately 75% for a measure pathologists already assess routinely. The study concludes that until more robust evaluation standards are in place, these tools should not replace molecular testing.
Read Full Article

Paul’s Thoughts:

This paper should be required reading for anyone commissioning AI pathology tools. At GMI, we ask for evidence that a model has learned the right thing — not just a headline AUC figure. An AUC of 0.85 means very little if the model is detecting a surrogate rather than the target biomarker. The Warwick and Oxford group are demonstrating what the field calls "shortcut learning" at clinical scale, across four of the most common cancer types. The regulatory implication is significant: current conformity assessment requirements do not mandate causal validation of what a model has learned, only its aggregate performance. The question every clinical AI evaluator should now be asking is: what exactly is this model responding to?
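One practical way to probe for shortcut learning is to re-evaluate a model within strata of the suspected confounder. The sketch below is a purely illustrative simulation, not taken from the Warwick and Oxford study: all prevalences, the BRAF/MSI relationship, and the AUC helper are invented for the example. It shows how a model that only detects MSI status can still score a high overall AUC for BRAF, yet fall to chance once evaluation is restricted to MSI-stable cases.

```python
# Illustrative simulation of "shortcut learning": a model that only
# detects MSI can look accurate for BRAF because the two co-occur.
# Stratifying the evaluation by the suspected confounder exposes it.
import random

def auc(labels, scores):
    """Rank-based AUC: P(score of a random positive > a random negative)."""
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

random.seed(0)
msi, braf, score = [], [], []
for _ in range(2000):
    m = random.random() < 0.20                   # ~20% MSI-high tumours (invented)
    b = random.random() < (0.60 if m else 0.05)  # BRAF enriched in MSI-high (invented)
    msi.append(m)
    braf.append(b)
    score.append((1.0 if m else 0.0) + random.gauss(0, 0.1))  # model "sees" only MSI

overall = auc(braf, score)                       # looks strong on paper
msi_stable = [(b, s) for m, b, s in zip(msi, braf, score) if not m]
stratified = auc([b for b, _ in msi_stable], [s for _, s in msi_stable])
print(f"overall AUC {overall:.2f}, MSI-stable-only AUC {stratified:.2f}")
```

In a real evaluation, the same stratified comparison would use held-out clinical cases and the candidate confounders a pathologist would nominate, such as MSI status or tumour grade.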

Timescale: Acute | 0 Years

Specialty: Oncology // Sub-Specialty: Genomics / Prognostic AI // Body Site: Colon, Stomach, Lung, Breast

2. MangroveGS: Geneva Team Builds AI That Predicts Cancer Spread With 80% Accuracy Across Tumour Types

Researchers at the University of Geneva developed MangroveGS, an AI model that predicts cancer metastasis risk by analysing gene expression signatures from tumour RNA. Trained on colon cancer data, the model generalises to stomach, lung, and breast cancers, predicting whether a tumour is likely to spread with approximately 80% accuracy and outperforming previous prognostic approaches. The model draws on hundreds of gene signatures simultaneously, making it robust to variation in any individual gene or signature. Clinically, it works directly with hospital tumour samples: RNA is sequenced from tumour cells, and a metastasis risk score is delivered securely to the clinical team. The findings were published in Cell Reports on March 21, 2026.
Read Full Article

Paul’s Thoughts:

The most interesting technical aspect of MangroveGS is its cross-tumour generalisation. Most prognostic AI models struggle to transfer between cancer types because the underlying biology differs so substantially. The Geneva team's approach of anchoring to gene expression rather than histological appearance may explain why it generalises when image-based tools do not. Clinically, an 80% accuracy rate for metastasis prediction is meaningful, but the critical next step is prospective validation in a diverse patient cohort before this enters treatment-planning workflows. When the input is molecular rather than visual, shortcut learning becomes harder — you cannot replicate BRAF status detection from RNA without actually reading BRAF. That distinction matters more than it currently receives in clinical AI discussions.
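The published paper, not this newsletter, defines how MangroveGS actually combines signatures. Purely as an illustration of the general idea of signature-based scoring, here is a minimal sketch in which each signature's activity is the mean z-scored expression of its member genes and the final risk score averages across signatures, so no single gene dominates. All gene and signature names below are invented.

```python
# Hypothetical sketch of gene-signature scoring (not the published
# MangroveGS method): per-gene z-scores -> per-signature activity ->
# one averaged risk score per sample.
from statistics import mean, stdev

def zscore(values):
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]

def risk_score(expression, signatures):
    """expression: gene -> per-sample values; signatures: name -> gene list."""
    n = len(next(iter(expression.values())))          # number of samples
    z = {g: zscore(v) for g, v in expression.items()}
    scores = []
    for i in range(n):
        activities = [mean(z[g][i] for g in genes) for genes in signatures.values()]
        scores.append(mean(activities))               # average over signatures
    return scores

# Toy data: invented genes and two invented "metastasis" signatures.
expr = {
    "GENE_A": [1.0, 5.0, 2.0, 8.0],
    "GENE_B": [0.5, 4.0, 1.0, 7.0],
    "GENE_C": [2.0, 6.0, 2.5, 9.0],
}
sigs = {"sig_emt": ["GENE_A", "GENE_B"], "sig_invasion": ["GENE_B", "GENE_C"]}
scores = risk_score(expr, sigs)
print(scores)  # higher score = more signature activity in that sample
```

The design point the sketch tries to capture is the one in the summary above: because the score pools many signatures, an outlier in any one gene shifts the result only slightly.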

Timescale: Early | 1-3 Years

Specialty: Psychiatry // Sub-Specialty: Global Governance // Body Site: Mental Health

3. WHO Convenes Global Experts to Set the Rules for AI in Mental Health Care

On March 20, 2026, the World Health Organization (WHO) published findings from an expert workshop on responsible AI in mental health, co-organised with the Delft Digital Ethics Centre at TU Delft — the first WHO Collaborating Centre on AI for health governance. More than 30 international experts in AI, mental health, ethics, and public policy identified governance priorities and safeguards for AI deployed in mental health settings. The WHO simultaneously announced the launch of a global Consortium of Collaborating Centres on AI for Health, with representatives from all six WHO regions, aimed at building shared infrastructure for AI governance grounded in ethics, evidence, and the needs of diverse populations. Dr Alain Labrique, Director of WHO's Department of Data, Digital Health, Analytics and AI, emphasised the need for accountability when AI systems interact with people in moments of emotional vulnerability.
Read Full Article

Paul’s Thoughts:

Mental health is one of the least-regulated and fastest-growing areas of health AI deployment, and this WHO initiative is overdue. Apps and chatbots offering mental health support are already in wide consumer use — often with no clinical validation, no escalation pathway, and no accountability framework. The significance of the Consortium is structural: it creates the multi-regional governance infrastructure needed to coordinate standards across jurisdictions, rather than leaving each country to regulate in isolation. From my work in Cyprus, navigating both EU and UK regulatory landscapes simultaneously, the absence of international coordination is a real operational problem. The hard question the Consortium must answer is enforcement — frameworks are only as strong as the mechanisms that make them binding.

Timescale: Early | 1-3 Years

Featured follow of the week

Benjamin Nelms

President and Founder of Canis Lupus LLC

Inventor, scientist, and entrepreneur with a primary focus on the radiation therapy industry

Top posts of the week across social

A round-up of some of the best posts we found online this week.

Was this email forwarded to you?
Our weekly email brings you the latest health trends and insights, combining top news and opinions into a straightforward, digestible format.

Want an article featured?

Have an insightful link or story about the future of medical health? Reach out below, and we may include it in a future release.

Reply
