Genomics and big data – unlocking the code to new therapies
Since the first human genome was sequenced in 2003, knowledge about this blueprint of life has grown quickly. The ability to process large data sets that combine many genomes with other patient information is widening access to new life-changing therapies.
Apr 27, 2018
“Genomics is not tomorrow. It’s here today,” says England’s Chief Medical Officer, Professor Dame Sally Davies. The ability to collect and access vast data sets, such as a patient's entire genome, has transformed the way we look at disease, and it’s driving change across healthcare systems. But just as elements of the human genome were largely a mystery before we began sequencing it, scientists and healthcare providers are still exploring the life-changing — and life-saving — potential of this code to life.
The pace of discovery is extraordinary. It took 13 years, a thousand scientists and USD 3 billion to sequence the first human genome. Today, we can map someone’s genome 30 times — to eliminate any scanning errors — in just 40 hours, and it costs less than USD 1,000. And soon it may be possible to decipher one of these ‘blueprints for life’ for as little as USD 100.
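To illustrate why reading the same genome 30 times helps remove scanning errors, here is a minimal sketch of a majority vote across repeated reads. It is only an illustration under simplifying assumptions: the reads are already aligned and of equal length, the example sequence is invented, and the function names `consensus_base` and `consensus_sequence` are hypothetical. Real sequencing pipelines use far more sophisticated statistical methods.

```python
from collections import Counter


def consensus_base(reads_at_position):
    """Return the most frequently observed base among repeated reads of one position."""
    counts = Counter(reads_at_position)
    base, _ = counts.most_common(1)[0]
    return base


def consensus_sequence(reads):
    """Build a consensus from equal-length, already aligned reads of the same region."""
    # zip(*reads) walks through the reads column by column, one position at a time.
    return "".join(consensus_base(column) for column in zip(*reads))


# Invented example: 30 reads of the same 8-letter stretch, two of which
# each contain a single misread letter.
true_region = "GATTACAG"
reads = [true_region] * 28 + ["GATTACAT", "CATTACAG"]

print(consensus_sequence(reads))
```

Because each misread letter appears in only one of the 30 reads, the other 29 outvote it at that position and the consensus matches the underlying sequence.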
It was only in 2003 that the first genome sequence was completed, but this giant leap in knowledge is already transforming lives, particularly in the diagnosis and treatment of cancer and rare hereditary diseases.
Professor Tim Hubbard, who was a key player in the Human Genome Project and is now Head of Genome Analysis at Genomics England, says, “Given the effort that had been required to determine that first sequence, no one expected it so quickly to become practical and affordable to sequence individuals on the large scale possible today.”1
Sequencing the genome has already led to new diagnostics, treatments and previously unimagined avenues for further research – and the more we learn, the more potential we discover. Sequencing opened this new frontier in medical research because, within the 3.2 billion letters of code, it identified approximately 19,000 genes: the sections of DNA that contain the instructions for building all the proteins needed to create and maintain human life. For the first time, scientists have been able to access and explore the blueprint of human life.
Almost every cell in the body contains a complete set of genes, our genome. Half of this genetic data comes from our mother, and half from our father, which means that if either carries a gene fault associated with a hereditary disease, we could also inherit the problem or pass it on to our children. Some variants in genes, such as BRCA1 and BRCA2, are also known to increase the risk of cancer.
Scientists had been expecting to find at least 100,000 genes, based on what Professor Hubbard describes as a “back of an envelope calculation”.2 Instead, they discovered that the genes which code directly for proteins make up less than 2% of our genome.3 Much of the genetic material in between was initially written off as ‘junk DNA,’ but we now know these sequences include complex layers of switches which turn genes off, or regulate their intensity when they’re on, and this is changing the way we look at cancers and rare diseases.4
Identifying the individual cause – and cure
The 100,000 Genomes Project, launched in 2012 and managed by Genomics England, is already helping patients across the National Health Service (NHS) affected by rare conditions. Professor Hubbard explains, “We compare the whole genomes of the parents and child and look for the differences, the unique change in the child or combination of things brought together which have caused their disease.” The patients involved are those whose existing diagnostic tests did not reveal the cause. Genomic sequencing, and access to these huge data sets, is not only revealing the drivers for some of their conditions; it is already translating into new therapeutic approaches.
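The parent-and-child comparison described here can be pictured as a simple set difference. The sketch below is a deliberately simplified illustration, not the method used by Genomics England: the variant tuples are invented, the function name `de_novo_variants` is hypothetical, and it covers only the "unique change in the child" case, not combinations of variants inherited from both parents.

```python
def de_novo_variants(child, mother, father):
    """Return variants seen in the child but in neither parent."""
    return child - (mother | father)


# Each variant is an invented (chromosome, position, reference, alternative) tuple.
mother = {("chr1", 1000, "A", "G"), ("chr7", 5500, "C", "T")}
father = {("chr1", 1000, "A", "G"), ("chr12", 900, "G", "A")}
child = {
    ("chr1", 1000, "A", "G"),    # shared with both parents
    ("chr12", 900, "G", "A"),    # inherited from the father
    ("chr17", 4200, "T", "C"),   # present in the child only
}

candidates = de_novo_variants(child, mother, father)
print(candidates)
```

Only the variant absent from both parents remains as a candidate. In practice, any such candidate would still need careful interpretation before it could be linked to a condition.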
In about 20% of the children tested to date, scientists have identified the mutation, or series of mutations, likely to be responsible for their conditions. In some cases, this knowledge has pointed to a possible mechanism and, in a few cases, has even suggested a novel treatment. The hope is that by comparing genomes from the remaining patients with similar medical histories, the results may reveal more complex patterns and interactions, and point to even more potential treatments and therapies.
With the help of modern sequencing machines, it’s possible to map someone’s genome 30 times, to eliminate any scanning errors, in just 40 hours. Credit: Getty Images/Monty Rakusen
This approach represents a major shift in focus. “We have been categorizing disease on the basis of what doctors observe and describe as Disease X, but in many cases it will turn out that Disease X is a range of different underlying mechanisms which manifest themselves as a similar set of symptoms and characteristics.” So, if there is no single Disease X, a single treatment X may not be the best solution.
This new approach has also been a key factor in the move towards increasingly personalized medicines, particularly in oncology. This is because our genetic code can also be damaged by environmental and lifestyle factors, and if this damage — or mutation — is duplicated through the proliferation of faulty cells, it may give rise to cancer. A number of targets – including HER2, VEGF and BRAF, as these regions are known scientifically – were identified before the genome was sequenced and led to the creation of new cancer therapies which target these specific faults. Now genomics and patient data are being combined to inform diagnosis and treatment, and to locate new targets for even more precise therapies.
Finding new therapeutic targets with the help of big data
At the Munich Leukemia Laboratory (MLL), scientists sequence known “hot-spots” or a series of hot-spots in panels of around 80 blood cancer genes from patients with leukemia and lymphoma. They are also beginning to use Next-Generation Sequencing (NGS) — which analyzes millions, and even billions, of DNA strands in parallel — to investigate whole genomes. “For some of these genes from the panels, we have experience with more than 30,000 cases, so we have already seen everything that can be going wrong with that gene,” says MLL’s CEO, Professor Torsten Haferlach.
More importantly, MLL focuses on mutations for which there are already therapies available. As a result, says Haferlach, “More and more, we can provide clinicians with actionable, prognostic information to guide therapy.” This method also spares patients the risk of side effects from therapies which won’t help them. But caution is needed, says Haferlach: “When you look at genes or even a genome, you see so many things. You need to be very careful you interpret them correctly. You have to be sure you are identifying only malignant changes, not polymorphisms, which are simply variants of healthy cells.”
This is where big data — in the form of both genomes and phenotypes, or observable characteristics and medical histories — from large numbers of patients can be used not simply to find the needle in the haystack, but also to pinpoint which needles have sharp tips and which are blunt and therefore not dangerous at all.
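One way to picture how large data sets help separate "sharp" needles from "blunt" ones is a frequency filter: a variant that is common among healthy people is more likely to be a harmless polymorphism. The sketch below rests on assumptions that are not from the article: the 1% cutoff, the variant labels, the frequencies and the function name `split_variants` are all hypothetical. It is not a description of MLL's actual analysis.

```python
# Assumed cutoff for illustration only: variants at or above this frequency
# in a healthy reference population are treated as likely polymorphisms.
POLYMORPHISM_THRESHOLD = 0.01


def split_variants(variants, population_frequency, threshold=POLYMORPHISM_THRESHOLD):
    """Split variants into candidates for review and likely harmless polymorphisms."""
    candidates, likely_polymorphisms = [], []
    for variant in variants:
        # Variants never seen in the reference population get a frequency of 0.
        frequency = population_frequency.get(variant, 0.0)
        if frequency >= threshold:
            likely_polymorphisms.append(variant)
        else:
            candidates.append(variant)
    return candidates, likely_polymorphisms


# Invented labels and frequencies, standing in for a large reference data set.
patient_variants = ["variant_A", "variant_B", "variant_C"]
reference_frequencies = {"variant_A": 0.25, "variant_B": 0.0001}

candidates, polymorphisms = split_variants(patient_variants, reference_frequencies)
print("Review further:", candidates)
print("Likely polymorphisms:", polymorphisms)
```

The larger the reference population, the more reliable these frequency estimates become, which is one reason why aggregating genomes and medical histories from many patients matters.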
To achieve this, MLL has teamed up with IBM and Illumina, developers of targeted NGS technology, to swiftly analyze gene sequences and patient histories in order to locate changes in only a small number of cells, or more subtle variants such as so-called “mosaic mutations” which have previously been virtually invisible. Using artificial intelligence in this way, they can also re-analyze genomic data from patients who appear to have the same disease, to identify even more specific sub-groups and pinpoint new therapeutic targets.
Professor Haferlach says, “Let’s say we have a tumor driven by only two or three genes, and the rest are passengers. As we collect more and more data on the genomic background of different diseases, we will become much better at identifying the drivers. Then we can address these genes with specific drugs, and if we can kill, or neutralize, the drivers, the passengers will also lose their power.”
Systems are now being developed to help clinicians and their patients benefit directly from the large datasets being aggregated. CancerLinQ (Cancer Learning Intelligence Network for Quality), developed in conjunction with the American Society of Clinical Oncology, is making big data more manageable by connecting electronic records from hospitals, health centers and medical practices to provide a portal for doctors. There, they can access real-world information from more than a million cancer patients.
CancerLinQ’s medical director, Dr. Robert Miller, adds, “The norm is that doctors care for cancer patients in their own clinics and own cancer centers, and the data is sitting in their EHRs [electronic health records]. It’s rarely accessed and certainly not shared across institutions. Through the CancerLinQ technology, we now have the ability to have this much larger aggregated dataset that everyone can benefit from. All of the data from these different types of sites can be blended into one data set that everyone can access.”
Looking at the big data picture in healthcare
Ideally, medicine would benefit from knowledge extracted from patients’ data worldwide, but in practice, this is much more difficult. Professor Hubbard says that there are many challenges to overcome, notably ensuring patient privacy is robustly protected and that patients trust the systems that ensure this.
There are also different legal requirements concerning patient data in different countries. And more fundamentally, widely different systems and standards of clinical data collection affect the ability of researchers to collectively analyze data.
Genomics and big data are already advancing our understanding of many diseases and accelerating the development of new drugs. Improving and expanding access to this information through digital technologies is leading to increasingly personalized therapies which are tailored to an individual’s unique genetic profile. It will also allow physicians to prescribe much more patient-specific doses, which will minimize the risk of both side effects and drug resistance.
One of the biggest barriers to the wider uptake of this tailored approach has been its price tag. But as the cost of sequencing and surrounding technologies continues to fall, these personalized medicines will become less expensive to develop and manufacture, which will make them increasingly accessible.
Genomic data could also be used to replace one-size-fits-all diet and lifestyle advice with personalized health plans which identify our individual risk factors and the strategies most likely to counter those risks. The possibilities are endless, but one thing is certain: The more we learn, the more we will discover there is to learn.