Biotechnology, which harnesses cellular or biomolecular processes to improve care, is helping healthcare reach new heights as novel collaboration and computational flexibility extend the technology's impact. This is according to James M. Ostell, director of the National Center for Biotechnology Information (NCBI), a division of the National Library of Medicine at the National Institutes of Health.
Although he stepped in as director only last September, Ostell has a long history with NCBI: he has been at the organization since Congress established it in 1988. Since then, he has helped shape it into one of the most widely used biomedical resources in the world.
Four million users a day access NCBI's resources, which include major biomedical databases, such as the literature database PubMed and GenBank for DNA sequences. The organization also offers an array of computational and analysis tools related to genes' role in health and disease.
With an eye on biotechnology's future, Ostell offers insights on where the technology stands and what it might help the healthcare industry achieve.
HEALTHTECH: What is the mission of the National Center for Biotechnology Information?
OSTELL: Originally, the NCBI focused on basic research and algorithm development, as well as on building resources, offering training and making those resources available to the public. We built some of the early tools to compare DNA sequences so as to better understand the molecular makeup of certain cancers, which pushed the notion that computation is an important aspect of discovery.
Now, we're continuing to advance genomics by helping to solve the computational problems that arise in sequencing DNA. Currently, we have teamed up with the Food and Drug Administration and the Centers for Disease Control and Prevention on a project that applies DNA forensics to bacterial surveillance. To trace a disease back to its source, we have to match bacterial samples, which calls for high-throughput, Big Data analysis. This collaboration has been incredibly constructive in tracing the source of a foodborne outbreak earlier and much more accurately.
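To make the idea of matching bacterial samples concrete, here is a minimal Python sketch. It is not NCBI's pipeline: it assumes the sequences are already aligned to the same length, it uses a plain Hamming distance as a stand-in for whatever comparison the real project performs, and every sequence and source name in it is invented for illustration.

```python
# Illustrative only: the sequences and source names below are invented,
# and Hamming distance is an assumed stand-in for the real comparison.


def hamming_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length, pre-aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_source(patient: str, sources: dict[str, str]) -> tuple[str, int]:
    """Return (name, distance) for the candidate source nearest the patient isolate."""
    return min(
        ((name, hamming_distance(patient, seq)) for name, seq in sources.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical aligned fragments: one patient isolate, three candidate sources.
patient_isolate = "ACGTTGCAAGCT"
candidate_sources = {
    "farm_a": "ACGTTGCAAGCA",
    "plant_b": "ACGATGCAAGGT",
    "market_c": "TCGATGCTAGGT",
}

best_name, best_distance = closest_source(patient_isolate, candidate_sources)
print(f"closest candidate: {best_name} ({best_distance} differing positions)")
```

The fewer positions at which two isolates differ, the more plausibly they share a recent origin. At the scale Ostell describes, the same comparison would have to run across tens of thousands of whole genomes, which is where the high-throughput requirement comes from.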
HEALTHTECH: How is biotechnology used for specific healthcare-related applications beyond research?
OSTELL: Bacterial and viral surveillance, such as the kind that can lead to a better understanding of the Zika outbreak, currently requires a large amount of sequencing. That sequencing supports DNA forensics for diseases (who's the culprit and where did it come from) and helps identify whether the bacteria carry a gene that causes antibiotic resistance.
Further, the use of genomics and genetic signals in healthcare has been gaining momentum recently, particularly for diseases that can be identified via mutations in the human genome, such as sickle cell anemia or cystic fibrosis. There's also a rise in the use of genetics in cancer treatment because certain types of mutations in tumors can occur across different cancers. It's not so much the kind of cancer you have, it's that you have a particular mutation among the various ones that give you liver cancer, for example. So, it's a different way of targeting the therapy.
HEALTHTECH: What emerging technologies might have a large impact in the near future?
OSTELL: First off, we are currently adopting some software-development practices that come out of industry: continuous integration/continuous deployment, agile development, etc. Complementary to that, we are starting to move onto commercial clouds, which will offer a number of benefits, including greater resilience.
Moreover, we expect greater flexibility in our back-end resources. We currently get around 4 million users a day, and with the cloud we would no longer have to buy and manage all the machines ourselves.
Another aspect of flexibility has to do with burst capacity. For example, we're collecting bacterial genomes at the rate of 70 to 90 genomes a day, about a terabyte of data each. There are tens of thousands of those genomes, and we assemble them, run computations on them and put them into clusters incrementally as the data comes in. If we develop a better algorithm for doing that, we'd want to go back over the existing corpus and recompute it consistently with the new algorithm. In a commercial cloud, we could burst out to 1,000 machines, do that recomputation, then shut them all down again.
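The appeal of burst capacity comes down to simple arithmetic, sketched below in Python. Only the figure of 1,000 machines comes from the interview. The corpus size of 50,000 genomes, the two machine-hours per genome and the 50-machine fixed cluster are hypothetical placeholders, and the calculation assumes the work splits perfectly across machines, which real workloads rarely do.

```python
def wall_clock_hours(genomes: int, hours_per_genome: float, machines: int) -> float:
    """Elapsed hours to recompute a corpus, assuming perfectly parallel work."""
    if machines <= 0:
        raise ValueError("machines must be positive")
    total_machine_hours = genomes * hours_per_genome
    return total_machine_hours / machines


# Hypothetical inputs, chosen only to illustrate the calculation.
CORPUS_GENOMES = 50_000      # "tens of thousands" in the interview; exact size assumed
HOURS_PER_GENOME = 2.0       # assumed cost of recomputing one genome
FIXED_CLUSTER = 50           # assumed size of an in-house cluster
BURST_CLUSTER = 1_000        # the burst size mentioned in the interview

fixed_hours = wall_clock_hours(CORPUS_GENOMES, HOURS_PER_GENOME, FIXED_CLUSTER)
burst_hours = wall_clock_hours(CORPUS_GENOMES, HOURS_PER_GENOME, BURST_CLUSTER)

print(f"fixed cluster: {fixed_hours:.0f} hours")
print(f"burst cluster: {burst_hours:.0f} hours")
```

With these made-up inputs, the recomputation takes about 2,000 hours on the fixed cluster and about 100 hours on the burst cluster. The total machine-hours are the same in both cases; the difference is that the rented machines can be shut down once the job finishes.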
Finally, there's the Big Data question. NIH consists of a large number of separate institutes, and they each have their own budgets. … Some of these institutes are developing huge data sets that are much larger than we could afford to store or process ourselves, even though we are the archive. Moving onto commercial clouds, with their greater flexibility, unlocks the option for the large institutes to directly fund the hardware capacity while we, in turn, still help out by providing expertise and resources.
HEALTHTECH: What tools or technologies do providers need to make use of this technology?
OSTELL: Making use of biotechnology often takes large amounts of compute and storage capacity, so the majority of people interface with us through our website.
Some providers download the software tools that we develop, such as the BLAST (Basic Local Alignment Search Tool) search program. Most people run that on a Linux machine in a university center or computer center.
Currently, downloading the data necessary for larger human genome sequencing projects can take up to a month. This is one of the reasons we're collaborating to push those data sets up into commercial cloud platforms, which could expedite the process.