“One of the main benefits of the software is that it is based on industry-standard tools, so a lot of what would be run on CPU can now be run on GPU,” explains Harry Clifford, NVIDIA’s head of genomics product. “There is a huge acceleration factor with it being on GPU.”
That translates into an acceleration of more than 80 times on some of those industry-standard tools, he notes, adding that the software is also scalable.
“It’s fully compatible with all the workflow managers genomics researchers are using,” Clifford says. “There’s also the improved accuracy point, which is provided by the artificial intelligence-based deep learning and high-accuracy approaches included in the toolkit.”
Volume, Velocity and Variety of Data Pose Challenges in Genomics
Clifford points out that Big Data analysis can be split broadly into three pillars: the amount of data (volume), the speed of processing (velocity) and the number of data types (variety).
“First off, we have this huge explosion of data, this volume problem in genomics, and that’s why you need HPC solutions,” he says.
The second aspect of the Big Data challenge is velocity: each sample must pass through a wet lab process, the sequencing run itself, and then the computational analysis.
“Those sequencers are now running so quickly that compute is the new bottleneck in genomics,” Clifford explains.