Artificial Intelligence (AI) has been hailed as the next saviour of the NHS, with new robotic overlords predicted to save thousands of lives from cancer-related deaths over the next 15 years. AI has suddenly become a panacea, solving everything from self-driving cars to detecting fake news - but what changed so fast?
On one hand, not much: most of the hype now sold as “AI” was, only a few years ago, termed “machine learning” - a rebranding driven in part by an earlier desire to distance the field from the idea of a machine indistinguishable from a human. On the other hand, a few specific advancements have aligned: computational power has increased, and data is being generated at an unfathomable rate, with more created in 2017 alone than in all of prior human history. In other words, it isn’t the ideas or the technologies that are new, but rather the opportunities.
Alan Turing did great work both in code-cracking and in his ideas on AI; but the possibilities, and their impact, extend well beyond the eponymous Turing test. (CC 2.0 Jon Callas)
How to make a machine learn
Imagine you discover a set of notebooks from a disconnected civilisation that records temperatures with a strange set of numbers that make no sense. The numbers are far too high to be what you’re used to, but you’re confident they are consistent. As you read more of the notebooks, you manage to decipher from the markings the names of some cities you recognise along with some dates. You decide to call this archaic numerical system ‘Fahrenheit’, and you want to work out how to convert between Fahrenheit and Celsius. As a Gizmodo reader, you are cool and hip so you decide to use a machine-learning approach based on a linear model to solve this problem. You look up the temperatures in the cities you recognise on the same dates as recorded in the notebooks, and plot the data.
Teaching a machine to convert from Fahrenheit to Celsius is easier than you might think.
‘Machine learning’ and ‘linear model’ all sound very fancy, but in reality this challenge is something the average GCSE student will have seen at some point. By plotting the points on a graph (see above), we can draw a line through them (the linear model) and work out the formula for that line. The equation I got from just four data points wasn’t exactly right, but it was close: the classic formula for converting from F to C is T(°C) = (T(°F) - 32) × 5/9 (where 5/9 is approximately 0.556), while I got T(°C) = (T(°F) - 33.7) × 0.582. We’ve taught the machine to convert from this archaic system of temperature to something that is actually useful.
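The whole fitting step can be sketched in a few lines of Python. The four (Fahrenheit, Celsius) pairs below are hypothetical whole-number readings for illustration, not the actual data from the article:

```python
# Four hypothetical whole-number readings: (Fahrenheit, Celsius)
readings = [(50, 10), (68, 20), (77, 25), (90, 32)]

xs = [f for f, _ in readings]
ys = [c for _, c in readings]
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Ordinary least squares for a straight line: C = slope * F + intercept
slope = (sum((x - mean_x) * (y - mean_y) for x, y in readings)
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# Rearranged into the article's form: C = (F - offset) * slope
offset = -intercept / slope
print(f"C ≈ (F - {offset:.1f}) × {slope:.3f}")
```

With only four rounded readings the fit lands near, but not exactly on, the textbook slope of 5/9 and offset of 32 - the same kind of small error as in the article.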
Bigger data is better data!
Why isn’t my answer exactly right? Two reasons: in the example I gave, we only had a small amount of data to learn from; and the data I used were only given to the nearest whole number, introducing small errors into each point. If we now collect more data from around the world and repeat the experiment, we can get closer to the right answer.
The data says we need more data – as we train our computer on more data, and provided that data is good, the predictions become more accurate (here, T(°C) = (T(°F) - 30.7) × 0.55). However, as I’m only using temperatures within a certain range to train the model, our predictions won’t work as well outside that range.
Our formula is not going to be completely reliable at the extremes, e.g. in space or in a volcano, but for room temperatures it is pretty good. This is a common challenge: the data we use to learn from is called our ‘training set’, and as long as we don’t predict beyond the experience captured in our training set then it will work well. Otherwise, we’ll make mistakes.
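A quick sketch makes the point concrete: using the slightly-off formula fitted earlier, the error is tiny at room temperature but balloons once we extrapolate far beyond the training range (the volcano-scale 2000°F is an invented example):

```python
def learned(f):
    # The approximate formula learned from four rounded data points
    return (f - 33.7) * 0.582

def textbook(f):
    # The exact conversion
    return (f - 32) * 5 / 9

# Error is small near the training range (room temperatures)...
room_error = abs(learned(70) - textbook(70))
# ...but grows rapidly once we extrapolate far beyond it
volcano_error = abs(learned(2000) - textbook(2000))

print(f"error at 70°F: {room_error:.2f}°C, at 2000°F: {volcano_error:.1f}°C")
```

The two straight lines diverge slowly, so the further a query sits from the training set, the bigger the mistake.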
This needn’t be a problem, as long as we use high-quality data that measures the right things. Given that we live in a world with an abundance of data, and that for a positive outcome we only need to do better than the alternative, the success of these methods shouldn’t be surprising. In the case of self-driving cars, that means simply driving better than a human - and given that a computer can pull information from cameras and sensors in every direction, while humans are actually pretty terrible at concentrating, it is a question of when, not if, self-driving cars will be on every street.
Patient data in lab biology
In our recent study, we developed a method named VULCAN. We had generated data studying how the Estrogen Receptor (ESR1) responds to hormone treatment in cancer cells and we wanted to find out what was helping to guide the process. Going through all 20,000 genes that responded to the hormone would have been a slow process so, as a combined computational and experimental research lab, we looked to machine learning to help us.
Underlying VULCAN is an algorithm called ARACNe, developed by the Califano Lab at Columbia University. By building on their methods, we were able to predict what other parts of the cell were helping to drive the hormone response. It feels a long way from learning how to convert temperatures, but it’s strangely similar. ARACNe uses the huge amount of data we have from sequencing patient tumours. It calculates whether two genes show a common trend across patients (see below left), and if they do show a correlation, it puts a link in the network it generates. The process is then repeated gene after gene to give the final network (e.g. below right).
Network Building - ARACNe looks at the level at which a gene is turned on in each patient and compares it to a second gene (each point on the left is an individual patient, and the colour is the type of breast cancer). If there is a trend between the two, shown here by all the points sitting along the diagonal, then ARACNe will add a line connecting the two genes on the network (right, credit Ponder Lab).
As a human being, looking through hundreds of patient samples for hidden trends is a nightmare both in terms of the volume of data and in the potential for bias in interpreting the results. On the other hand, a well written algorithm will start to highlight trends. This is exactly what VULCAN enabled us to do: it was able to analyse our data and suggest what parts of the network in the cell were turning on and off together to spot novel biological interactions, on the basis of what the patient data had taught it. That’s very cool because it means we didn’t just learn about breast cancer in a petri dish; we learnt about what drives cancer growth from real patients.
Programmers, Pigeons and Pathologists
Beyond the research lab, machine learning has the potential to revolutionise diagnosis. A real challenge in treating cancer is working out exactly what disease the patient has. Patients present with lumps all the time, but whether to treat them - and how - is a big question.
Pathologists (the people who often look at the patient samples) are the ones deciding if something is malignant or benign, and what form of the disease has presented. The challenge is that pathologists are human, so they make mistakes and can be overworked.
Dr Sanjeev Kumar
I asked Dr Sanjeev Kumar (Medical Oncologist and PhD student at the University of Cambridge) what the work of a pathologist usually entails. Depending on the experience of the person at the microscope, a simpler analysis would take “perhaps 2-3 minutes per slide”. More complex patient samples, however, might require looking at 1000 nuclei, which Sanjeev describes as “no mean feat”, since individuals undertaking this work spend 20-30 minutes on each one.
Sanjeev says that, from his experience, automation is currently limited to the world of research and academic work where it plays a role in quality control and avoiding bias. All the same, he’s enthusiastic about the opportunities “since an enormous onus of responsibility is placed on the often subjective assessment of a pathology slide by a single clinician” who then determines the relevant course of treatment.
Machine learning isn’t going to fix everything, but machines can process much larger volumes of data than pathologists. They can work endlessly and consistently, and don’t get tired. Even better, we know it’s a skill that can be taught: in 2015, a team of scientists taught pigeons to analyse images of patient tumours. Using a special pigeon screen and a large supply of tasty pigeon snacks, the system managed to achieve over 99% accuracy - in other words, the same level as a pathologist and for a fraction of the cost. While pigeons won’t be replacing pathologists any time soon, computers certainly do have something to offer, and with much less bird poop.
Marcel Gehrung, a PhD student at the Cancer Research UK Cambridge Institute, is currently using machine learning to solve the challenges in interpreting patient data. He described the challenge of improving patient diagnosis to me as having two parts: the front end and the back end. “The front end we’ve got pretty good at, however, getting patient samples of similar quality is a major challenge. That is the main reason why the implementation of AI in medicine is more than just creating good algorithms.” CytoSponge™, the device Marcel works on, is fantastic in this respect, using a pill-on-a-string to allow rapid collection of cells from a potential cancer patient. With that challenge solved, the next part is interpreting the results - 15 to 30 minutes per patient sample may not seem much, but across a whole hospital of patients it adds up fast. Marcel hopes that, by using machine learning to analyse patient data, his research will form part of a single process from sample collection to diagnosis.
This isn’t the singularity, yet...
Despite the successes of AI, nothing is going to happen overnight - as we’ve seen with Uber, when lives are on the line a tragic mistake can massively set back a promising idea. With medicine, the challenges are amplified: if people die as a result of a mistake by the machine learning, who is responsible? How do we deal with the trade-off of lives saved against those lost?
The points of failure need thought too. As seen with my trivial example of temperatures, the training sets are critical to success, yet they are rarely representative. Since studies often poorly represent minorities, our newly developed digital pathologist may be racist from the outset. If ignored, this would mean our AI saving more patients overall, but at the expense of the lives of those not represented in the training data.
The challenges are made worse because, as the output of machine learning gets more complex, we often don’t even know what the software is looking for in the images. This property of machine learning has been exploited in the development of adversarial objects, which look relatively normal to human eyes but completely different to a machine. In 2017, a series of signs was developed to convince the algorithms in self-driving cars that a STOP sign was a 45 mph sign; to a human driver, it would look like just a bit of graffiti. A similar method was used to make a 3D terrapin that, when analysed by Google’s AI, appeared as a rifle from whichever angle it was viewed.
The fact that humans and machines see things so differently not only presents problems with people manipulating the input, but that genuine failures can lead to unexpected results. A small mark on a slide could dramatically change a patient’s predicted outcome, while a human would know to ignore it.
Adversarial objects - how an algorithm developed by machine learning sees something is not always obvious. A whole field of research exists to confuse such algorithms, leading to STOP signs that look like 45 mph signs, or a 3D model of a terrapin that a computer sees as a rifle.
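The core trick behind these attacks can be illustrated with a toy model. This is not the actual stop-sign attack, just the well-known “fast gradient sign” idea in miniature: the weights and input below are invented, and a real classifier would have millions of them:

```python
# A toy linear "classifier": positive score means STOP, negative means 45 mph.
# Weights and input are invented for illustration only.
w = [0.5, -0.3, 0.8, 0.2]    # what the model has learned to look for
x = [1.0, 2.0, 0.5, 1.0]     # a "stop sign" it classifies correctly

def score(image):
    return sum(wi * xi for wi, xi in zip(w, image))

# Adversarial nudge: shift every feature a little in the direction that
# lowers the score - each change is small, but they all push the same way.
eps = 0.4
x_adv = [xi - eps * (1 if wi > 0 else -1) for wi, xi in zip(w, x)]

print(score(x), score(x_adv))  # the sign flips: STOP becomes 45 mph
```

No single change is big enough for a human to notice, but because they are all aligned with what the model looks for, together they flip the decision - which is exactly why the graffiti-like stickers work.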
Machine learning in cancer detection still faces challenges: tissue slides are rarely prepared consistently between hospitals, which can have unpredictable effects on computer algorithms, and even ensuring the AI can work out which part of the slide it should be looking at is not trivial. Nonetheless, if we resolve these challenges, this is a technology that could save lives in the coming years through better diagnosis and, with it, better patient survival.
Getting AI out of academia and into the clinic will be incremental, not a revolution. A starting point would be running the technology alongside trained pathologists, while also training those who use the output to understand the statistics behind the results. A computer saying ‘no’ is not enough, and most of all we need to ensure that the training sets are diverse enough to make this real progress for everyone.
Are clinicians looking forward to the future of AI in medicine? Sanjeev was clear: “there will be an ongoing role for machine learning, but not limited to imaging of clinical pathology. Radiological imaging in cancer, particularly in the area of early detection, lends itself to machine learning and has given birth to the idea of Radiomics”. That said, “there is no replacing the skilled clinician in assessing complex and often nuanced phenomena. Every patient is different!”
Like Sanjeev, few expect a system that can replace a skilled pathologist in the short term. The real value is in working alongside the skills we already have: making those skills go further, obtaining more accurate outcomes, and saving lives.