Training Needed for AI in Medical Imaging
The amount of training data in medical imaging, especially publicly available data, is a fraction of what is available in other fields. The shortage of curated and representative data sets is one of the largest impediments to developing meaningful AI solutions for medical imaging, and the protection of patient privacy adds to the difficulty.
Recently, companies such as NVIDIA and Google have created software tools that enable data-distributed techniques for training AI. One example is federated learning, which works by deploying AI models to each participating institution in a discrete group (or “federation”). Each institution then trains its copy of the model on its own local data. Periodically during training, the models are sent to a central federated server, where they are aggregated, typically by averaging their parameters. The aggregated model is then redistributed to each institution for further training. This exchange is the key step in preserving privacy: the models consist only of parameters that have been tuned to the data, never the protected data itself.
Over time, this process allows AI models to receive the benefit of knowledge learned at every institution within the federation. Once training is complete, a single aggregated model is produced that has been, indirectly, trained on data from all institutions in the federation.
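The loop described above can be sketched in a few lines of code. The following is a minimal, self-contained illustration, not the software used in any of the studies mentioned: it simulates three institutions, each holding private samples of a toy linear relationship, and runs several federated rounds of local training followed by weighted parameter averaging (the aggregation scheme commonly known as federated averaging). All names and data here are synthetic assumptions for illustration.

```python
import random

def local_train(params, data, lr=0.01, epochs=20):
    """Train a copy of the global model on one institution's local data
    using plain stochastic gradient descent on squared error."""
    w, b = params
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x   # gradient of squared error w.r.t. w
            b -= lr * err       # gradient of squared error w.r.t. b
    return (w, b)

def aggregate(models, sizes):
    """Server step: average parameters, weighted by local data set size."""
    total = sum(sizes)
    w = sum(m[0] * n for m, n in zip(models, sizes)) / total
    b = sum(m[1] * n for m, n in zip(models, sizes)) / total
    return (w, b)

# Synthetic "institutions": each privately holds noisy samples of y = 2x + 1.
random.seed(0)
def make_site(n):
    xs = [random.uniform(-1, 1) for _ in range(n)]
    return [(x, 2 * x + 1 + random.gauss(0, 0.1)) for x in xs]

sites = [make_site(30), make_site(50), make_site(20)]

global_model = (0.0, 0.0)
for _ in range(10):  # federated rounds
    # 1. Deploy the current global model to every institution.
    # 2. Each institution trains on its own data; raw data never leaves the site.
    local_models = [local_train(global_model, data) for data in sites]
    # 3. The server aggregates only the parameters, never the data.
    global_model = aggregate(local_models, [len(data) for data in sites])

print(global_model)  # should approach the true parameters (2.0, 1.0)
```

Note that the only values crossing institutional boundaries are the two model parameters, which is the property that makes the approach privacy-preserving.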
Signs of Promise for Medical AI
In a study published this year, our team in the UCLA Computational Diagnostics Lab investigated using a federated learning architecture to train a deep-learning AI model to locate and delineate the prostate in MRIs using data from different institutions. We found that federated learning produced an AI model that performed better, both on data from the participating institutions and on data from outside institutions, than models trained on any single participating institution’s data alone.
To understand the enthusiasm for federated learning, imagine a federation whose members are smartphone users who have agreed to let an algorithm analyze the images on their phones, enabling model training with distributed compute and data from a vast user group. From this perspective, one can picture analogous scenarios in medicine in which patients opt in to federations in exchange for compensation. This could accelerate innovation in the medical AI space.
Successful medical AI algorithm development requires exposure to a large quantity of data that is representative of patients across the globe. Our findings demonstrate an alternative to the financial, legal and ethical complexities of assembling such data centrally: Institutions can team up into federations and develop innovative, valuable medical AI models that perform just as well as those developed on massive, centralized data sets, with less risk to privacy.