Health Data, Research Get a Boost in the Cloud
Cloud computing is becoming as important a factor for the healthcare research community as it has been for every other industry.
And why not? “Healthcare research organizations gain the same benefits as other users of cloud computing,” says J. Peter Melrose, an independent IT consultant to healthcare providers and a contributor to the Cloud Standards Customer Council’s white paper, “Impact of Cloud Computing on Healthcare Version 2.0.”
Those benefits include the use of state-of-the-art and automatically scalable resources, assurances of specific levels of availability and throughput from commercial cloud service providers (CSPs), and the ability to pay for services on a usage basis via operating budgets.
Additionally, Melrose points to a stand-out benefit for healthcare research organizations: “Making more basic and applied research findings available sooner to clinicians for the care and treatment of their patients, and the clinical results of care and treatment available to researchers in the pursuit of their research and development activities.”
The Crohn’s & Colitis Foundation sees the value of cloud computing for such ends. Its IBD Plexus initiative, which runs on the IBM Bluemix hybrid cloud infrastructure, has so far collected and made available research-ready data (including large molecular sequencing files) from four studies. One study collected data from 20,000 patients to advance standards for inflammatory bowel disease care; another, with data from about 7,000 patients, aims to improve the ability to identify which treatment a patient is likely to respond to best, says Angela Dobes, the IBD Plexus senior director.
The cloud, she says, has made it possible for IBD Plexus to collect this data at an expedited pace, with near-real-time data transfers, and in a cost-efficient manner.
When the initiative formally launches later this year, stakeholders, including researchers and scientists in academia, pharmaceutical companies, and medical institutions and practices, will have access to rich data sets that they can provision in minutes and download directly from the initiative’s secure browsers.
Cloud Computing Leads to Healthcare Research Wins
“Before this, for the most part data was siloed and only accessible to the researchers who conducted the research,” Dobes says.
Now, data becomes a reusable asset, she explains, with “users from academia and industry gaining access to robust data sets to advance their own research projects, and sharing the project-specific data sets they generate back to IBD Plexus,” following an initial period of exclusivity. The cloud provides the platform for Plexus to prove that “centralizing the data doesn’t centralize the power, but disperses it so that everyone in the IBD community wins,” Dobes says.
Cancer researchers are winning, too, with the launch last year of the National Cancer Institute Genomic Data Commons at the University of Chicago. Already the largest data commons in the world, it relies on a cloud platform that leverages technology including OpenStack and the IBM Cloud Object Storage system.
In addition to its existing analytics toolsets, the project recently added a new data visualization front end to help researchers explore the more than 5 petabytes of cancer genome data from 39 different sources that it has harmonized. More data sets will be released over the next few months, making the cloud’s scalability even more valuable for the effort.
“This is a critical mass of genomic and associated clinical data to accelerate research for the cancer research community,” says Dr. Robert Grossman, a principal investigator for the project. “Genomic data sets have grown larger and it’s harder for individual researchers to download and analyze all this data themselves. We consistently process all data, identifying mutations and other genomic variations, to make it easier for the community to use the information.”
The Healthcare Cloud Is Safer Than You May Think
Security has been an issue in the cloud computing world in the past, and healthcare researchers understand that the protection of patients’ personal identifiers and protected health information (PHI) is critically important. IBD Plexus maintains PHI on scalable bare metal servers that are part of the IBM Bluemix infrastructure, scrubbing, cleaning and anonymizing that data before it goes to the cloud for researcher access.
But that decision, made not long after the project began three years ago, was meant more to address the perception among some stakeholders that the cloud poses security challenges. “We had to make sure we were able to make them all feel confident the data is secure,” Dobes says. Other aspects of IBM’s solution, such as encrypting data at rest and in transit, satisfy requirements found in the Health Insurance Portability and Accountability Act (HIPAA), she says.
In fact, according to Melrose, CSPs typically provide the expert security staff that healthcare research organizations need to manage sophisticated solutions such as encryption, and do so more cost-effectively than those organizations could on their own.
“CSPs marketing healthcare expertise and legal or regulatory certification will have direct and in-depth knowledge of such operating rules as the HIPAA Security Regulation, including physical, administrative and technical standards for data in motion and data at rest,” he says. “They also will be knowledgeable of implementation methods for such aspects of the HIPAA Privacy Rule as medical record de-identification, in order to prevent exposure of individually identifiable health information.”
As Grossman explains, everyone in the healthcare research field understands the importance of data security. The Genomic Data Commons works hard to balance security and data usability.
“For a lot of queries, we can make the data available through our point-and-click interface in the new data visualization portal in a very secure way,” he says. The raw data never leaves the data commons, with all the activity happening behind a secure, compliant perimeter.
“It really makes dealing with large amounts of data very secure, and more so than if every researcher built their own environment,” he says.