I attended the recent ‘Digital Health Re-Wired’ conference at Birmingham’s NEC last week. There was a lot of talk about AI – in fact I think the term pretty much featured on every stand and in every stage presentation at the conference. People are excited about AI and wherever you work in healthcare AI is coming to a clinical information system near you…

At this point I need to declare an interest – I absolutely hate the term Artificial Intelligence – I think it is a totally misleading term. In fact I’m pretty sure that there is no such thing as artificial intelligence – it is a term used to glamorise what are without doubt very sophisticated data processing tools but also to obscure what those tools are doing and to what data. In medical research hiding your methods and data sources is tantamount to a crime…

An Intelligent Definition

So what is artificial intelligence? It refers to a class of technologies that consist of certain types of algorithm paired with very large amounts of data. The algorithms used in AI are variously called machine learning algorithms, adaptive algorithms, neural networks, clustering algorithms, decision trees and many variations and sub-types of the same. Fundamentally however, they are all statistical tools used to analyse and seek out patterns in data – much like the statistical tools we are more familiar with such as linear and logistic regression. In fact a key piece of the mathematics underpinning learning algorithms was worked out in the 18th century by an English Presbyterian Minister, Philosopher and Mathematician – The Reverend Thomas Bayes. Bayes’ Theorem gave a statistical model a way to update itself and adapt its probabilistic outcomes as it is presented with new data. It can be thought of as the original adaptive algorithm – one that has ultimately evolved into today’s machine learning algorithms, which are given their power by being hosted on very powerful computers and being fed very, very large amounts of data.
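
As a rough illustration of what ‘updating itself as it is presented with new data’ means, here is a minimal Python sketch of Bayes’ Theorem applied twice in a row. The function name and all the numbers (a 1% starting probability, a test that is positive 90% of the time when the condition is present and 5% of the time when it is absent) are made up for illustration, and the sketch assumes the two test results are independent of each other.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(hypothesis | evidence) using Bayes' Theorem."""
    numerator = p_evidence_if_true * prior
    denominator = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / denominator


# Hypothetical figures, chosen only to show the mechanics.
prior = 0.01                # belief before seeing any evidence
sensitivity = 0.90          # P(positive test | condition present)
false_positive_rate = 0.05  # P(positive test | condition absent)

# Each positive result turns the previous answer into the new starting point.
after_one = bayes_update(prior, sensitivity, false_positive_rate)
after_two = bayes_update(after_one, sensitivity, false_positive_rate)

print(round(after_one, 3), round(after_two, 3))
```

With these made-up figures one positive result moves the probability from 1% to roughly 15%, and a second moves it to roughly 77%. The model has not become ‘intelligent’; it has simply revised a number in the light of new data.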

The other ingredient that has given modern machine learning tools their compelling illusion of ‘intelligence’ is the development of a technology called large language models (LLMs). These models are able to present the outputs of the statistical learning tools in natural flowing human readable (or listenable) narrative language – i.e. they write and talk like a human. Chat-GPT being the most celebrated example. I wrote about them about 5 years ago (The Story of Digital Medicine) – at which point they were an emerging technology but have since become mainstream and extremely effective and powerful.

Danger Ahead!

Here lies the risk in the hype – and the root cause of some of the anxiety about AI articulated in the press. Just because something talks a good talk and can spin a compelling narrative – doesn’t mean it is telling the truth. In fact quite often Chat-GPT will produce a well crafted, beautifully constructed narrative that is complete nonsense. We shouldn’t be surprised by this really – because the source of Chat-GPT’s ‘knowledge’ is ‘The Internet’ – and we have all learned that just because it’s on the internet doesn’t mean it’s true. Most of us have learnt to be somewhat sceptical and a bit choosy over what we believe when we do a Google search – we’ve learnt to sift out the ads, not necessarily pick out the first thing that Google gives us and also to examine the sources and their credentials. Fortunately Google is able to give us quite a lot of the contextual information around the outputs of its searches that enables us to be choosy. Chat-GPT on the other hand hides its sources behind a slick and compelling human understandable narrative – a bit like a politician.

The Power of Data

In 2011 Peter Sondergaard – senior vice president at Gartner, a global technology research and consulting company – declared “data eats algorithms for breakfast”. This was in response to the observation that a disproportionate amount of research effort and spending was being directed at refining complex machine learning algorithms yielding only marginal gains in performance compared to the leaps in performance achieved by feeding the same algorithms more and better quality data. See ‘The Unreasonable Effectiveness of Data’.

I have experienced the data effect myself – back in 1998/99 I was a research fellow in the Birmingham School of Anaesthesia and also the proud owner of an Apple PowerBook Laptop with (what was then novel) a connection to the burgeoning internet. I came across a piece of software that allowed me to build a simple 4 layer neural network – I decided to experiment with it to see if it was capable of predicting outcomes from coronary bypass surgery using only data available pre-operatively. I had access to a dataset of 800 patients of which the majority had had uncomplicated surgery and a ‘good’ outcome and a couple of dozen had had a ‘bad’ outcome experiencing disabling complications (such as stroke or renal failure) or had died. I randomly split the dataset into a ‘training set’ of 700 patients and a ‘testing set’ of 100. Using the training set I ‘trained’ the neural network – giving it all the pre-op data I had on the patients and then telling it if the patients had a good or a bad outcome. I then tested what the neural network had ‘learned’ with the remaining 100 patients. The results were ok – I was quite pleased but not stunned, the predictive algorithm had an area under the ROC curve of about 0.7 – better than a coin toss but only just. I never published, partly because the software I used was unlicensed, free and unattributable but mainly because at the same time a research group from MIT in Boston published a paper doing more or less exactly what I had done but with a dataset of 40,000 patients – their ROC area was something like 0.84, almost useful and a result I couldn’t come close to competing with.
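
For anyone curious about the mechanics, the evaluation step described above can be sketched in a few lines of Python. This is not the software I used in 1998; it is a minimal illustration using only the standard library. The dataset is synthetic, the ‘risk scores’ are random numbers standing in for a trained model’s output, and every name in it is hypothetical.

```python
import random


def train_test_split(records, n_test, seed=0):
    """Shuffle a copy of the records and hold out n_test of them for testing."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


def roc_auc(labels, scores):
    """Area under the ROC curve: the chance that a randomly chosen
    bad-outcome patient scores higher than a randomly chosen
    good-outcome patient (ties count as half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one patient of each outcome")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Synthetic stand-in for an 800-patient dataset with about two dozen bad outcomes.
rng = random.Random(42)
patients = []
for i in range(800):
    bad = 1 if i < 24 else 0
    risk_score = rng.gauss(0.6 if bad else 0.4, 0.2)  # stand-in for model output
    patients.append((risk_score, bad))

train, test = train_test_split(patients, n_test=100, seed=1)

test_scores = [score for score, _ in test]
test_labels = [bad for _, bad in test]
if 0 < sum(test_labels) < len(test_labels):
    print("test-set AUC:", round(roc_auc(test_labels, test_scores), 2))
else:
    print("test set contains only one outcome class; AUC is undefined")
```

The guard at the end is there for a reason: with so few bad outcomes, a random test set of 100 patients will contain only a handful of them, and occasionally none at all. That is the small-data problem in miniature, and it is one reason a 40,000 patient dataset produces a far more trustworthy figure.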

Using AI Intelligently

So what does this tell us? As practising clinicians, if you haven’t been already, you are very likely in the near future to be approached by a tech company selling an ‘AI’ solution for your area of practice. There are some probing questions you should ask before adopting such a solution, and they are remarkably similar to the questions you would ask of any research output or drug company that is recommending you change practice:

  1. What is the purpose of the tool?
    • Predicting an outcome
    • Classifying a condition
    • Recommending actions
  2. What type of algorithm is being used to process the data?
    • Supervised / Unsupervised
    • Classification / Logistic regression
    • Decision Tree / Random Forest
    • Clustering
  3. Is the model fixed or dynamic? i.e. has it been trained and calibrated using training and testing datasets and is now fixed or will it continue to learn with the data that you provide to it?
  4. What were the learning criteria used in training? i.e. against what standard was it trained?
  5. What was the training methodology? Value based, policy based or model based? What was the reward / reinforcement method?
  6. What was the nature of the data it was trained with? Was it an organised labelled dataset or disorganised unlabelled?
  7. How was the training dataset generated? How clean is the data? Is it representative? How have structural biases been accounted for (Age, Gender, Ethnicity, Disability, Neurodiversity)?
  8. How has the model been tested? On what population, in how many settings? How have they avoided cross contamination of the testing and training data sets?
  9. How good was the model in real world testing? How sensitive? How specific?
  10. How have they detected and managed anomalous outcomes – false positives / false negatives?
  11. How do you report anomalous outcomes once the tool is in use?
  12. What will the tool do with data that you put into it? Where is it stored? Where is it processed? Who has access to it once it is submitted to the tool? Who is the data controller? Are they GDPR and Caldicott compliant?
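
Questions 8 to 10 in particular come down to simple arithmetic that you can ask a vendor to show you. The Python below is a minimal sketch of that arithmetic: counting true and false positives and negatives, turning them into sensitivity and specificity, and checking whether any patient appears in both the training and testing sets. The function names, labels and patient identifiers are all invented for illustration.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives from paired 0/1 labels."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1
        elif a == 0 and p == 1:
            fp += 1
        elif a == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Proportion of patients with the condition that the tool flags."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of patients without the condition that the tool clears."""
    return tn / (tn + fp)


def leaked_ids(train_ids, test_ids):
    """Patients present in both sets, which would inflate the test results."""
    return set(train_ids) & set(test_ids)


# Hypothetical results for ten patients: 1 = condition present / flagged.
actual = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(actual, predicted)
print("sensitivity:", round(sensitivity(tp, fn), 2))
print("specificity:", round(specificity(tn, fp), 2))
print("leaked:", leaked_ids(["p001", "p002", "p003"], ["p003", "p004"]))
```

In this toy example the tool catches two of the three patients with the condition and wrongly flags one of the seven without it, and patient p003 turns up in both datasets. If a supplier cannot produce the equivalent figures for their own product, that tells you something.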

Getting the answers to these questions is an essential pre-requisite to deploying these tools into clinical practice. If you are told that the answers cannot be divulged for reasons of commercial sensitivity – or the person selling it to you just doesn’t know the answer – then politely decline and walk away. The danger we face is being seduced into adopting tools which are ‘black box’ decision making systems – it is incumbent on us to understand why they make the decisions they do, how much we should trust them and how we can contribute to making them better and safer tools for our patients.

An Intelligent Future

To be clear I am very excited about what this technology will offer us as a profession and our patients. It promises to democratise medical knowledge and put the power of that knowledge into the hands of our patients empowering them to self care and advocate for themselves within the machinery of modern healthcare. It will profoundly change the role we play in the delivery of medical care to patients – undermine the current medical model which relies on the knowledge hierarchy between technocrat doctor and submissive patient – and turn that relationship into the partnership it should be. For that to happen we must grasp these tools – understand them, use them intelligently – because if we don’t they will consume us and render us obsolete.

I have read two stories this week.

The first was written in an interesting, contemporary literary style – you know the sort – short sparse sentences almost factual, leaving lots of ‘space’ for your own imaginative inference, not making explicit links between facts and events but leaving you to do that for yourself.  It was a love story, rather charming and quite short, describing a familiar narrative of boy meets girl, invites her to the cinema and they fall in love (probably).  It could be described as Chandleresque in style – though it isn’t that good – in fact it could have been written by an 11+ student.  It wasn’t though – it was in fact written by a computer using a form of artificial intelligence called natural language generation with genuinely no human input.  You can read how it was done here.

The second story I read is a description of a falling out of love – of the medical profession with the IT industry and the electronic patient record.  This one is very well written by Robert Wachter and is a warts and all recounting of the story of the somewhat faltering start of the digital revolution in healthcare.  It is called ‘The Digital Doctor’ and I would highly recommend you read it if you have any interest in the future of medicine.  It is not the manifesto of a starry eyed digital optimist, nor is it the rantings of a frustrated digital sceptic – he manages to artfully balance both world views with a studied and comprehensive analysis of the state of modern health IT systems.  His realism though extends to understanding and articulating the trajectory of the health IT narrative and where it is taking us – which is a radically different way of delivering medical care.  I won’t use this blog to precis his book – it’s probably better if you go and read it yourself.

From Data to Information to Understanding

The falling out that Dr Wachter describes really is quite dramatic – this is the United States, the most advanced healthcare system in the world – yet there are hospitals in the US that advertise their lack of an EPR as a selling point to attract high quality doctors to work for them.  Where has it gone wrong?  Why is the instant availability not only of comprehensive and detailed information about our patients but also of a myriad of decision support systems designed to make our jobs easier and safer to carry out – not setting us alight with enthusiasm?  In fact it is overwhelming us and oppressing us – turning history taking into a data collection chore and treatment decisions into a series of nag screens.

The problem is there is just too much information.  The healthcare industry is a prolific producer of information – an average patient over the age of 65 with one or more long term conditions will see their GP (or one of her partners) 3 – 4 times a year, have a similar number of outpatient visits with at least 2 different specialists and attend A&E at least once.  That doesn’t include the lab tests, x-rays, visits to the pharmacy, nursing and therapy episodes.  Each contact with the system will generate notes, letters, results, reports, images, charts and forms – it all goes in to the record – which, if it is a well organised integrated electronic record, will be available in its entirety at the point of care.

Point of care being the point – most health care episodes are conducted over a very short time span.  A patient visiting his GP will, if he’s lucky, get 10 minutes with her – it doesn’t make for a very satisfactory consultation if 4 or 5 of those minutes are spent with the doctor staring at a screen – navigating through pages of data attempting to stitch together a meaningful interpretation of the myriad past and recent events in the patient’s medical history.

How it used to be (in the good old days)

So what is it that the above mentioned hospitals in the US are harking back to in order to attract their doctors?  What is the appeal of how it used to be done when a consultation consisted of a doctor, a patient and a few scrappy bits of paper in a cardboard folder?  Well for a start at least the patient got the full 10 minutes of the doctor’s attention.  But what information was the doctor relying on?  What the patient tells them, what the last doctor to see them chose to write in the notes, and the other events that might have made it into their particular version of this patient’s health record.  This gives rise to what I call a ‘goldfish’ consultation (limited view of the whole picture, very short memory, starting from scratch each time).  We get away with it most of the time – mainly because most consultations concern relatively short term issues – but too often we don’t get away with it and patients experience a merry go round of disconnected episodes of reactive care.


As a practitioner of intensive care medicine one of the things that occupies quite a lot of my time as ‘consultant on duty for ICU’ is the ward referral.  As gatekeeper of the precious resource that is an intensive care bed my role is to go and assess a patient for their suitability for ICU care as well as advise on appropriate measures that could be used to avert the need for ICU.  My first port of call is the patient’s notes – where I go through the patient’s entire hospital stay – for some, particularly medical patients, this might be many days or even weeks of inpatient care.  What I invariably find is that the patient has been under the care of several different teams, and the notes consist of a series of ‘contacts’ (ward rounds, referrals, escalations) few of which relate to each other (lots of goldfish medicine even over the course of a single admission).  I have ceased to be surprised by the fact that I, at the point of escalation to critical care, am the first person to actually review the entire narrative of the patient’s stay in hospital.  Once that narrative is put together very often the trajectory of a patient’s illness becomes self evident – and the question of whether they would benefit from a period of brutal, invasive, intensive medicine usually answers itself.

Patient Stories

The defence against goldfish medicine in the ‘old days’ was physician continuity – back then you could  expect to be treated most of your life by the same GP, or when you came into hospital by one consultant and his ‘firm’ (the small team of doctors that worked just for him – for in the good old days it was almost invariably a him) for the whole admission.  They would carry your story – every now and then summarising it in a clerking or a well crafted letter.  But physician continuity has gone – and it isn’t likely ever to come back.

The EPR promised to solve the continuity problem by ensuring that even if you had never met the patient in front of you before (nor were likely ever to meet them again) you at least had instant access to everything that had ever happened to them – including the results of every test they had ever had.  But it doesn’t work – data has no meaning until it is turned into a story – and the more data you have the harder it is and the longer it takes to turn it into a story.

And stories matter in medicine – they matter to patients and their relatives who use them to understand the random injustice of disease; stories tell them where they have come from and where they are going to.  They matter to doctors as well – medical narratives are complex things, they are played out in individual patients over different timescales – from a life span to just a few minutes, each narrative having implications for the other.  Whilst we don’t necessarily think of it as such – it is precisely the complex interplay between chronic and acute disease, social and psychological context, genetics and pathology that we narrate when summarising a case history.  When it is done well it can be a joy to read – and of course it creates the opportunity for that sudden moment when you get the diagnostic insight that changes the course of a patient’s treatment.

Natural Language Generation

Turning the undifferentiated information that is a patient’s medical record – whether paper or digital – into a meaningful story has always been a doctor’s task.  What has changed is the amount of information available for the source material, and the way it is presented.  A good story always benefits from good editing – leaving out the superfluous, the immaterial or irrelevant detail is an expert task and one that requires experience and intelligence.  You see it when comparing the admission record taken by a foundation year doctor with that of an experienced registrar or consultant – the former will be a verbatim record of an exchange between doctor and patient, the latter a concise inquisition that homes in on the diagnosis through a series of precise, intelligent questions.

So is the AI technology that is able to spontaneously generate a love story sufficiently mature to be turned to the task of intelligently summarising the electronic patient record into a meaningful narrative? It’s certainly been used to that effect in a number of other information tasks – weather forecasts and financial reports are now routinely published that were drafted using NLG technology.  The answer of course is maybe – there have been some brave attempts – but I don’t think we are there yet.  What I do know is that AI technology is moving apace and it won’t be very long before NLG applied to a comprehensive EPR will be doing a better job than your average foundation year doctor at telling the patient’s story – maybe then we will fall back in love with EPR? Maybe…

On the 14th April 2003 biomedical scientists achieved the medical equivalent of the 1969 Apollo moon landings – the first entire genome sequence of a human was published.  This was a phenomenal achievement and was the culmination of 12 years of intensive research – it was announced by the US President with great fanfare along with excited promises of revolutionary advances in medicine.  We all waited with anticipation – and we waited.  Rather like the dawning of the space age – that first momentous step seemed to be followed by quite a prolonged period of rather disappointingly mundane achievements (where are the moon colonies, hotels on Mars?).  My entire medical school training and early career was filled with promises of the genetic age of medicine.  And whilst without doubt the technology of genetics has transformed our understanding of disease and created many therapeutic opportunities – the revolution seems to have been largely confined to the laboratory and some very rare inherited genetic disorders.  The impact on most doctors (and patients) has been marginal to non-existent.  I do believe this is about to change though.

The First Two Ages of Modern Medicine

I am defining modern medicine as the era in which it became possible ‘to do’ something to alter the course of disease and suffering.  It largely coincides with the medical profession’s mastery of pain and consciousness – allowing for the explosive development of modern surgery, and its mastery of infection – through vaccination, asepsis and antibiotics.  These triumphs of the late 19th and early 20th century brought about a rather (possibly justifiably) hubristic ‘doctor knows best’ attitude of the profession and a transformation from cynicism (just read the literature to find out what the Victorians thought of their doctors!) to profound trust of society in the capabilities of the profession.  I will call the first age of modern medicine the ‘Paternalistic Age’.  Of course we eventually discovered that doctors don’t always know best, and that when conferred with unreasonable trust – like all humans – doctors sometimes betray that trust.

The second age came about with the realisation that individual experts do not have privileged access to knowledge – and that true knowledge comes about through scrupulous collection of evidence, and when that process is bypassed serious harm can result.  This is best exemplified (but not exclusively) by the Thalidomide tragedy.  Another example of the consequences of unchecked, unjustifiable trust would be Harold Shipman.  Whilst the foundations of trust in the profession have not been completely undermined – there is now a healthy wariness of the claims of the profession.  The second era of modern medicine is the one I have been brought up in – it is perhaps best described as the ‘Evidence Based Age’.  It has been characterised by the ‘standardisation’ of medical care, the medicalisation of health (primary prevention – statins), increasing specialisation and a subtle shift in the powerbase in the consulting room to one of patient as consumer of medical care and doctor as informant and provider.  It has also been characterised by a proliferation of regulation as well as litigation and the practice of defensive medicine.

The two ages overlap of course – by a considerable margin – even as the third age dawns there are still doctors with unfounded self belief and patients that simply submit themselves unquestioningly to their fate at the hands of the profession.  It is also not entirely certain that the second age is always an improvement on the first.  We struggle with ‘evidence’ – it seems to change its mind, our method of gathering it is expensive and laborious, and many of the problems we need solving don’t seem to be amenable to the standard methods of evidence gathering.  This has resulted in the evidence being biased significantly towards therapeutic intervention with drugs – because that is where the evidence gathering resource lies.  We are over regulated – to an oppressive degree – and we have managed to instil in our patients both very high expectation and complete dependence.  We are also conflicted – when the evidence (that we sometimes doubt) tells us one thing, our instinct tells us another and our patients have unreasonably high expectations for something else – it can feel like we don’t have the license to do the right thing.  We end up bewildering our patients by showering them with evidence, risks and benefits – and then saying ‘over to you’ knowing full well that our patients are ill equipped to decide.

There must be a better way – and there is – but it requires the confluence of three revolutions to bring it about.

Three Revolutions

The first of these is one I have written about extensively – it is the information revolution as it applies to medicine and healthcare.  The revolution in gathering, processing, decision making and redistribution of medical information is just about getting under way.  However it has not even started to realise its full potential yet.

The second revolution is one I have also previously alluded to – which is the patient empowerment revolution – also just about getting under way if a little slowly.  This not only places the patient at the centre of care, it places them as master of their destiny through empowerment and education.  The medical professional’s task is primarily one of teaching self care backed up by judicious, co-commissioned intervention.

The third revolution I haven’t written about before – mainly because I have only really just learnt about it.  Whilst I have possibly been dimly aware of the concept of genomics – the reality of it has emerged into my consciousness in the last month as a result of two events.  The first of these was our very own consultant conference at which we were introduced to the launch of the 100,000 genome project.  The second – allied to this – was a meeting at the Institute of Translational Medicine in Birmingham where we were helping NHS England formulate a strategy for ‘Personalised Medicine’.

The Genetic Revolution Begins

So has it finally arrived – the age of genetic medicine – that I was promised as a medical student (blah years ago)?  Well not quite – and of course I don’t think that the third age of modern medicine is the genetic age that was promised.  However genetics – or more specifically Genomics – does form the third pillar of the dawning of our new age.

Returning to our space age metaphor – the 100,000 Genome Project is the equivalent to the first manned mission to Mars.  The 100,000 people that enter the project are the equivalent to the 200,000 volunteers that have put themselves forward for that mission.  Notwithstanding that we don’t know who they are yet – they will be the pioneers of the third age of modern medicine.  They don’t quite know what they are letting themselves in for, or where in fact they are going.  What is certain is that the journey is most definitely one way.

The first human genome sequence cost the US taxpayer $3 billion and took 12 years – technology has advanced somewhat since then and it now costs less than £300 and takes a couple of hours.  That’s little more than the cost of an MRI scan.  You can buy your genome sequence online – don’t ask your doctor what the result means though, they won’t know.  In fact you would be hard pressed to find anyone that can interpret the vast amount of information that is your genome.  This is where the 100,000 genome project comes in – the aim of the project is to give all that information some sort of meaning.

We are more than our genes – we are the manifestation of our genes but with a context and a history.  It is the interaction of our genes with the environment over a sustained period of time – plus the impact of pathologies and the attrition of time on our DNA that makes ‘us’.  A genetic sequence has no meaning until it is interpreted in that context.  The true power of genomics will be realised when we know how people ‘like us’ respond to environmental, therapeutic and pathological influences and the impact that genetic variance has on that.  To achieve that we have to ‘cross reference’ the vast data base that is the genome with an equally vast database that is the ‘phenome’ i.e. everything else.

The 100,000 genome project will start with recruiting people with conditions for which we know there is a genetic component either of the disease itself or the response to currently available treatments – this includes a variety of cancers and a (quite long) list of other rare diseases.  It will collect the ‘phenotype’ of these people i.e. comprehensive and structured information about individuals, their history, the environment in which they grew up and live, their response to treatment and their outcomes.  It will probably do the same for their families.  It will process huge amounts of data – and it may not even directly benefit our 100,000 pioneers – much of the significance of this information will only become clear after time and after many more individuals have been recruited.

This is a new paradigm in bio-medical research – it is the science of ‘discovery’ rather than the more familiar cycle of hypothesis testing through randomised controlled trial.  It imposes a discipline on the way we practise medicine – in particular the way we collect information.  It makes every health transaction an evidence creating one.  It is a model of continuous learning.  What is really exciting is that it is happening right here in the diverse, metropolitan beating heart of the country – Birmingham.

Where will it end?  From what I can see it certainly won’t end at the 100,000th patient.  It is quite a long way to Mars…

Interpreting the Future

So the third age – is it the ‘Genomic Age’? No – although I believe the aims and design of the 100,000 genome project epitomise third age medicine.  I am going to call the third age of modern medicine the ‘Interpretive Age’.  By this I mean the future of medicine will be personal.  We will need doctors that can interpret the large amounts of information from genomics, phenomics, proteomics, theranomics and infonomics (only the last one is my invention) relating to individual patients and interpret them in a way that has meaning for the patient – and that starts with listening to the patient and understanding their context, their wants, needs and aspirations (psychonomics? socionomics?).

In many ways good doctors already do this.  Are the GPs that don’t give statins to patients with a 10% risk of heart disease in the next 10 years (see Times Thursday 29th October 2015) – denying patients best evidence based care or are they practising personalised medicine?  Is it right to call someone only at risk of disease a patient?  Genomics is really simply another tool that gives an unheralded level of precision to the decisions we can make with our patients about what is best for them.  There are many tools in that box – some of them listed above – are we equipped to use them though?  I am certain that when we have ‘precision personalised medicine’ brought about through detailed interpretation of genetic, therapeutic and informatic data, we won’t be giving 3.5 million healthy people statins.

Are you an ‘Interpretive Doctor’?


A shocking revelation today exposed the collusion between a national newspaper and an independent health information manipulator to be based on flawed data and groundless inference.

A Perfect Business Model

Dr Bluster Insinuation Ltd is a private company that uses freely available public information about hospital activity and applies sophisticated statistical analysis to it in order to draw inferences about the quality of care provided by the NHS. A spokesman for Dr Bluster said – “Our clever tools can literally turn garbage data into pure gold” – “We take this free public information, make a big fuss of it and then sell it back to the public sector and to national newspapers”.

Accounts filed with Companies House for 2012 show that Dr Bluster Insinuation Ltd turned over £22 million. Another spokesman said “It’s alright though, one of our major shareholders is the government – so the public sector profits from our profit really – no honestly”

Making a Meal of it

It was revealed today that Dr Bluster has a covert relationship with the Daily Meal – using this national platform as publicity for its services – exploiting the paranoia of the readership about public services and the NHS in particular. “It’s a perfect partnership” said an insider “we feed them stories based on pure inference and they jump to the conclusions they want to – it’s a great way to sell papers”. It was revealed through our investigation that Dr Bluster has been releasing embargoed reports to the Daily Meal before revealing them to the Hospitals about whom they are making the groundless insinuations. “It’s quite entertaining watching hospital spokesmen trying to respond to our allegations when they haven’t even seen the report or our analysis” – “By the time they get the reports and work out that our allegations are not supported by the data everyone else has got bored and moved on”

Data Undermining

The Daily Meal made the following statement:

“Our mission is to undermine public confidence in the NHS because we believe in the private provision of healthcare – Our readers would deservedly benefit from this because they are wealthy and pay taxes whereas the poor don’t”

The Daily Meal claim that the NHS costs too much and isn’t very good – even though independent international comparisons of healthcare systems made by the Commonwealth Fund (a United States healthcare think tank) rate the NHS as the best value healthcare system in the world and rank it second for health outcomes amongst Europe, United States, Canada, Australia and New Zealand.

“We don’t report stuff like that because our readers don’t want to know” said the Daily Meal.

In an unguarded moment Dr Bluster said the following:

“We are experts in ‘Big Data’ – we mine the huge volumes of data that come out of the NHS for stories we can sell – I suppose you could call it gold digging”

A Knight in Shining Armour

Professor White (Knight in Shining Armour) – a clever man who works in public health – has said that it is a scandalous misuse of statistics to make these claims. If you look at mortality data through a PRISM you can see that there is absolutely no correlation between the numbers and the actual quality of care provided to patients. If we really want to improve the care of patients in the NHS you have to look at every case and learn from the mistakes we make. One of the biggest mistakes we make is to deny people who are inevitably dying access to palliative care.

Author’s Note

I hope you enjoyed this piece of fiction. I do not take any responsibility for any conclusions you may jump to about existing persons or organisations being referred to in this blog. The views mis-represented here are entirely my own and have nothing to do with my employer.

I have just finished reading the astonishing book by Daniel Kahneman ‘Thinking, Fast and Slow’ – it is a book that takes you on a journey of thirty years of discovery in psychological science. Once you read it you will never ‘think’ about yourself the same way again. The central tenet of the book is particularly salient to the conversation in the health service about mortality and the statistics relating to it following the recent publication of the findings of the second Francis report.

Judgements, Biases and Heuristics

Kahneman describes three dichotomous concepts – Firstly two types of thinking ‘System 1’ (Fast, intuitive, associative, innate and effortless) and ‘System 2’ (Slow, analytical, calculating, learning and effortful); Secondly two types of people ‘Econs’ (rational, consistent, logical, ideal economic agents that always make the ‘right’ choice) and ‘Humans’ (reasonable but subject to biases of their thinking such as priming, framing, narrative fallacy, imbalanced attitudes towards risk of loss and gain, excessive weighting to ‘available’ evidence, relatively blind to statistics and the ‘external’ view); Thirdly two ‘Selves’ the experiencing self (the person reading this blog here and now) and the remembered self (the person your mind has created through the narrative stitching of remembered events and experiences). For more details – go and read the book…

What is particularly interesting about his research is that he has demonstrated unequivocally time and again that being an ‘expert’ in any field, be it healthcare, economics or even psychology itself, does not protect you from these innate biases of human thinking – even when you know they exist they still influence you, and in fact being an expert simply puts you in a position where these biases are more consequential (your biases harm other people as well as yourself). He does, however, describe a number of strategies that organisations can adopt that defend against the consequences of individual judgment bias, strategies that need to be adopted systematically and deliberately. He gives examples of spectacular corporate failure, through group think, where these strategies had not been adopted.

The Statistics of Death

Why is this all relevant to us following the Mid-Staffs crisis, Francis report and the renewed scrutiny of hospitals with ‘outlying’ mortality figures? In the pantheon of spectacular corporate failure the events in 2007 at the Mid-Staffordshire NHS Foundation Trust must rank pretty highly, and of course the principal theme of the second Francis report is that this was an NHS corporate failure, not just a hospital failure. One of the central themes of the crisis centres on the role, meaning and response to mortality statistics for the hospital at the time. It is worth, therefore, taking a little time to understand how mortality statistics are generated.

At the time there was a single statistical measure routinely used to compare mortality in hospitals across the NHS – Hospital Standardised Mortality Ratio (HSMR). HSMR is calculated by taking the observed death rate and dividing it by a calculated figure for an ‘expected’ death rate. The expected death rate is essentially the death rate in hospitals for the whole of the UK with a number of adjustments made in order to compensate for differences in case mix (the diagnoses patients are admitted with), age, co-morbidity and social deprivation. The ratio is multiplied by 100, so if your observed death rate is the same as the expected then the ratio is 1.0 and the HSMR would be 100.

When observing mortality in a population it is important to recognise that over time mortality will always increase (we all have to die eventually) – therefore when comparing mortality in different populations we have to know over what time period mortality has been counted (usually 30 days, 90 days, 1 year or 5 years). One of the weaknesses of HSMR is that it does not specify the time period – it is simply the time spent in hospital – therefore a hospital with a longer than average length of stay will tend to have a higher mortality. Another weakness of HSMR is that it does not have a diagnostic category code for palliative care. The provision of out of hospital end of life services across the UK is at best patchy; where provision is poor, patients are admitted to hospital to die. HSMR does not adjust for these expected deaths, so hospitals in this situation will have a higher than ‘expected’ HSMR. As a result of these statistical biases, HSMR is being replaced by a newer indicator – the Summary Hospital-level Mortality Indicator (SHMI). SHMI is very similar to HSMR but differs in two important aspects. Firstly, it expands the number of diagnostic codes against which case mix adjustment takes place from 56 to 140, including one for palliative care. Secondly, it fixes the period over which mortality is measured: deaths are counted in hospital and up to 30 days after discharge.
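For anyone who prefers to see the arithmetic, the ratio described above can be sketched in a few lines of Python. This is only an illustration of the observed-over-expected calculation, not the method actually used to publish HSMR or SHMI: the `Stratum` class, the three case-mix groups and their death rates are all invented for the example, and the real adjustment model is far more elaborate.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    """One case-mix group (hypothetical, e.g. a diagnosis and age band)."""
    admissions: int              # admissions to this hospital in the group
    national_death_rate: float   # reference death rate for the group (made up)


def expected_deaths(strata):
    """Sum of admissions x reference death rate over all case-mix groups."""
    return sum(s.admissions * s.national_death_rate for s in strata)


def standardised_mortality_ratio(observed_deaths, strata):
    """Observed deaths divided by expected deaths, multiplied by 100."""
    return 100.0 * observed_deaths / expected_deaths(strata)


# Illustrative figures only, not real NHS data.
case_mix = [
    Stratum(admissions=1000, national_death_rate=0.02),
    Stratum(admissions=500, national_death_rate=0.10),
    Stratum(admissions=200, national_death_rate=0.25),
]

print(expected_deaths(case_mix))                    # deaths expected for this case mix
print(standardised_mortality_ratio(120, case_mix))  # observed equals expected
print(standardised_mortality_ratio(150, case_mix))  # a quarter more deaths than expected
```

The point to take away is that everything interesting is hidden in the ‘expected’ figure: change the case-mix groups or the reference rates and the same number of observed deaths produces a different ratio.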

The other important thing to recognise about mortality statistics is that they are just that: statistics. As such, their reliability depends on sample size and is expressed through confidence intervals. As a rule of thumb a condition with an expected mortality rate of 10% needs a sample size of about 200 cases before the confidence interval falls to a point where a doubling (or halving) of the observed rate can be explained by anything more than random chance. This is best illustrated graphically by the ‘funnel plot’, an example of which is below – and explained simply in this article here.
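The rule of thumb can be checked with a rough calculation. The sketch below uses the simple normal approximation for a proportion, which is my choice for illustration and is crude when the number of cases is small; the function name and the list of sample sizes are assumptions, not part of any official method.

```python
import math


def mortality_interval(rate, cases, z=1.96):
    """Approximate 95% interval for a death rate (normal approximation)."""
    standard_error = math.sqrt(rate * (1.0 - rate) / cases)
    return rate - z * standard_error, rate + z * standard_error


# Expected mortality of 10%, as in the rule of thumb above.
for cases in (20, 50, 200, 1000):
    low, high = mortality_interval(0.10, cases)
    # The approximation can dip below zero for small samples, so clip it.
    print(f"{cases:5d} cases: {max(low, 0.0):.1%} to {high:.1%}")
```

With 50 cases a halved rate of 5% still sits inside the interval, so it cannot be told apart from chance; by 200 cases both 5% and 20% fall outside it. Plotting these limits against the number of cases gives the funnel shape.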


Two Stories About Mortality

Hospital A is a medium size district general hospital in a provincial town in the Midlands. It is more than 30 miles from the nearest city and serves a relatively fixed population, bounded by open countryside, which is below the threshold for sustaining a comprehensive range of hospital services. It has recently achieved foundation trust status by importing a new board that has railroaded through a controversial savings programme, which has helped make the long term finances appear sustainable. Many staff have been made redundant and staffing ratios on the wards have been reduced. Morale is low and sickness rates are high; the wards are busy with large numbers of elderly dependent patients. The hospital has always struggled with attracting high quality medical staff because of its geographical location, tenuous affiliation with a university hospital, low numbers of doctors in training and an unexciting specialty portfolio. There have been a number of recent complaints from relatives unhappy about basic standards of care on the wards – two of these have been reported in the local newspaper, with the suggestion that neglect of care was a contributing factor in the deaths of patients. The HSMR for this hospital is 127, the second highest in the country.

Hospital B is a large university trust in a major city in England. It has a number of highly specialised tertiary services for which it has a national and international profile. One of these is complex paediatric congenital heart surgery, for which it is one of only a dozen centres in the country. The unit is celebrated locally and has a loyal following of patients and their parents who have received treatment there. The surgeons have often taken on cases others have refused and in doing so averted what would otherwise have been certain death. The unit has been threatened with closure as a result of a national consultation on re-configuration of paediatric cardiac surgery – aiming to concentrate services from eleven to seven centres. The local newspaper is outraged and has rallied support from local and national celebrities and politicians to keep the unit open. The staff are highly motivated and capable and the surgeons have published acclaimed original research in international journals. The standardised mortality ratio for paediatric heart surgery at the hospital is 200, the second highest in the country.

Beside ourselves jumping to conclusions

Death is a powerful word – just reading it on the page is likely to result in both a physiological response (increased heart rate, blood pressure and dilated pupils) and an emotional one (fear, disgust and aversion). These responses are usually rapidly attenuated by the rational part of the brain (system 2), however the alertness brought on by the physiological and emotional response will have activated system 1 – your innate, intuitive, fast thinking brain will be in overdrive (primed to deal with the ‘threat’) as will all of its biases.

So what are you thinking about the two (completely fictional) hospitals above? At whom is your outrage directed in each of the stories? Would you allow your grandmother to have her fractured hip treated at hospital A? Would you recommend that a friend have their child with a VSD operated on at hospital B? Are you angry with the hospital or the system that is trying to close it down?

Before you answer those questions it is worth taking some time to reflect on how your thinking may be being manipulated:

  • First of all – Priming – the stories I have written have a ring of familiarity to them, you have made an association with a previous experience for which you know the outcome, however hard you try to avoid it, this will be influencing your thinking in both situations.
  • Secondly – Framing – I have (rather clumsily from a literary perspective) set the scene in both stories in a way that influences your thinking, I have told you what some other people think about those hospitals and I have given you some limited facts that probably make you like or dislike each of them.
  • Thirdly – Narrative Fallacy – I have told a compelling story that ‘explains’ the numbers. We all like stories and are primed to discern patterns in randomness – there is a multi-billion pound industry built on this tendency, it’s called professional sport and is hilariously exposed on a regular basis by Daniel Finkelstein.
  • Fourthly – Availability Bias – this is also variously known as positive reporting bias and / or economy with the truth. It is ruthlessly exploited by the pharmaceutical industry through the suppression of non-supportive evidence for the efficacy of drugs. They are by no means the only culprits though, most published academic research is subject to this bias as well. Even though you ‘know’ I haven’t told the whole story in both cases and can easily entertain the idea that there may be other facts, unreported, that could change your view of the situation – your view is firmly anchored by what I have told you so far and new evidence can only move you from this position.
  • Finally – Statistical Blindness – the fact is that whilst system 1 thinking is life saving, allowing you to act quickly, decisively and intuitively in almost all everyday situations – and many non-everyday situations – it is designed to jump to conclusions, to abide by rules of thumb and to accept evidence, particularly statistical evidence, at face value. As much psychological weight is given to a statistic based on a handful of cases as to one based on many thousands, even when there is no mathematical justification for doing so.
  • This last point is particularly pertinent to my two stories, because they both have one thing in common – they are both ‘small’ – one is a small district general hospital, the other is a small highly specialised unit. Their outlier status is almost certainly more to do with their smallness than their quality of care, although this should not be ruled out of hand either.
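The effect of smallness on outlier status can be sketched with a short simulation. Everything here is invented for illustration: every simulated unit has exactly the same underlying mortality of 10%, so any difference between them is pure chance, and the unit sizes (50 and 5000 cases) and the number of units are arbitrary assumptions.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

TRUE_RATE = 0.10  # every simulated unit has the same underlying mortality


def simulate_observed_rates(units, cases_per_unit):
    """Observed death rates for units that differ only by chance."""
    rates = []
    for _ in range(units):
        deaths = sum(random.random() < TRUE_RATE for _ in range(cases_per_unit))
        rates.append(deaths / cases_per_unit)
    return rates


small_units = simulate_observed_rates(units=100, cases_per_unit=50)
large_units = simulate_observed_rates(units=100, cases_per_unit=5000)

for label, rates in (("small", small_units), ("large", large_units)):
    print(label, f"lowest {min(rates):.1%}", f"highest {max(rates):.1%}",
          f"spread (sd) {statistics.stdev(rates):.2%}")

# Build a league table of all 200 units, worst observed rate first.
league_table = sorted(
    [(rate, "small") for rate in small_units]
    + [(rate, "large") for rate in large_units],
    reverse=True,
)
top_ten = [size for _, size in league_table[:10]]
print(top_ten.count("small"), "of the ten highest rates are small units")
```

Nothing about quality of care was put into this model, yet the small units crowd both ends of the league table while the large ones sit in the pack.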

    A Prejudice Fuelled by a Bias

    Buried within the shrill, insistent and pervasive criticism of the Mid-Staffordshire Hospital Trust is a deeply held prejudice amongst leaders and policy makers within the health service: big is good, small is bad. This prejudice is affirmed and re-affirmed on a regular basis by the apparent evidence of poor performance and outcomes – the smaller you are the more likely you are to find yourself at the top or bottom of a league table. The fact that this phenomenon is a mathematical inevitability is either not recognised or overlooked because it suits ‘the system’: it provides supporting ‘evidence’ that not only are small services expensive, they offer poor value as well.

    Big of course is not inevitably good either – one of the advantages of being a large institution is that you experience the converse of the ‘small outlier’ phenomenon in your outcome statistics, you are generally a ‘large average’ institution, you routinely find yourself (reassuringly) ‘in the pack’. Reassured you shouldn’t be though – hiding in every large institution’s aggregated outcome statistics will be some great performance and some dreadful. Some of the shrillness of the commentary will be disguising the fear that every leader holds – that within their own institutions are lurking little bits of Mid-Staffs.

    The mortality figures at Mid-Staffs were probably the least surprising and least relevant part of the story. A great deal did go wrong at the hospital – particularly at board level where there is much evidence that it was the victim of ‘group think’ – almost all of the pre-requisites and risks were present and it would appear none of the defences. A topic I think I will come back to in a future blog.

    Outcome Statistics – A Health Warning

    As an intensivist I have lived and breathed outcome statistics (I’ve even written a book chapter on the subject). They are incredibly useful tools, but they take time to become useful; it took the intensive care community a good decade to start to understand the meaning and utility of the statistics produced by ICNARC. What is absolutely certain is that they can never tell the whole story, and in fact when constructing a story about the quality and safety of a service they should simply act as pointers for further and deeper investigation. There are many nuances even to the apparently binary outcome of mortality that have to be unpicked and understood before coming to any meaningful conclusion. I thoroughly welcome the fact that mortality as an outcome has found the spotlight – it used to frustrate me immensely as a clinical director that I was held to account more for my financial balance than the number of deaths on my unit. But in finding the spotlight it has been picked up, sensationalised and put to political use – not just by the press but by people within the health service who should know better.

    The conversation about mortality in hospitals needs to be held in an intelligent, un-frenzied, non-political and unprejudiced environment – we risk doing immense harm to fragile services if we don’t.

    When thinking about mortality it is vital that we think slow.

    The sea reflected the almost unblemished sky with a dark, angry meridian blue. Only the slate-grey streak above the horizon belied the otherwise benevolent August day. The rocky outcrops, punctuated by deep black caves and lightly rusted with seaweed and lichen, glistened like tarnished silver in the midday sun. The mineral white surf thrashed with frenzied futility against the oblique buttresses of rock, throwing up foamy spray that blew about like a midsummer blizzard. Occasionally it would drift up over the cliff edge to the vivid green fields capping the headland, dotted with sheep chewing with bucolic nonchalance, oblivious to the seething battle only feet beneath them.

    The Atlantic rollers were splendid, coming with just the right periodic regularity, energised by the residuum of a distant hurricane reverberating its destructive existence from across the ocean three days before. As I stood with my surfboard, each wave announced its arrival at first with a powerful sucking force, dragging sand, seaweed and debris painfully around my legs. It would then rear up, a sandy turquoise colour latticed with submerged foam, darkening suddenly as it tipped into a breaking roller. If I timed it right it would pick me up and accelerate me forwards dangerously, exhilaratingly, thrilling in a way no cosseted roller coaster ride could possibly ever achieve.

    They just kept coming and I couldn’t drag myself away – addicted to the reliable adrenaline rush with each wave I caught. I must not have stopped to look around for some time because all of a sudden the sun winked out, engulfed by the dark grey blanket that had scudded in from the horizon. The mood of the waves turned from playful energy to menacing power and my anticipation became tainted with anxiety.

    I staggered with the drag as the water level dropped from mid chest to below my knees. This wave really towered, it was clouded with the churned sand in its turbulent core and seemed to suspend itself above me whilst I decided whether to dive through or try and catch it. Of course it was playing with me, laughing at me, as I decided a fraction of a second too late to try and catch it. I felt the weight of the water first – it crushed the air out of my lungs – before picking me up and turning me over feet first, tearing the surfboard from under me and snapping the wrist tie. I was submerged and tumbling, the force of the water pushing me face first into the gritty sand, before changing direction and picking me up again. I couldn’t breathe and sandy salt water was forced into my nose and throat. It kept me under, shaking with contempt my rag doll attempts at swimming, long enough for the panic of imminent drowning to start rising from my solar plexus. Just as I began to think I couldn’t get out of this it dragged me front first into the shallow shore, sand filling the front of my wetsuit. The water hissed as it retreated away from me over the rippled sand, as if to dare me to try this again.

    I limped up the beach with my broken surfboard flapping forlornly, bruised, grazed and my head spinning slightly. I lowered myself on to the picnic blanket to a welcoming sweet biscuit and strong coffee as the light summer drizzle began to fall.
    “How is the surfing today darling?”
    “Brilliant – absolutely brilliant…”