CASE STUDY
New frontiers in patient care and medical research
Exploring quantum & biomedical data
- The volume of DNA data being generated is growing beyond what the world’s most powerful classical supercomputers can manage.
- The collection, storage, and sharing of biomedical data presents significant challenges due to its sensitive nature and ethical considerations.
- Zaiku, the NQCC and OQC have begun to address these challenges by exploring new approaches to biomedical machine learning that can leverage large and diverse datasets, whilst also ensuring data privacy and security.
- A promising novel solution to these challenges is to combine the power of quantum computing with the benefits of federated learning (FL): hybrid classical-quantum federated learning.
Rodrigo Chaves
ALGORITHM DEVELOPER
Rodrigo completed his PhD in Computer Science at the Universidade Federal de Minas Gerais (UFMG), in the group of Gabriel Coutinho. He worked on quantum walks and graph theory, defining a family of graphs containing nodes at which the probability of finding the walker is zero. He joined OQC as an Algorithm Developer in the Research and Development team.
As the volume of data being created grows, the world’s most powerful classical supercomputers are unable to manage it, catalysing the need for greater computational power through quantum computers.
Quantum computing’s ability to encode large amounts of classical data in the quantum states of qubits, combined with its potential to perform unstructured search using algorithms such as Grover’s, could make it possible to manage and rapidly search vast DNA databases.
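To give a sense of scale for the unstructured-search claim, the sketch below compares the approximate number of Grover iterations, about (π/4)·√(N/M) for M marked items among N, with the expected number of lookups in a classical linear scan for a single item. The function names are illustrative, and the comparison counts oracle queries only: it assumes the database is already accessible to the quantum computer and ignores data-loading cost and hardware noise.

```python
import math


def grover_iterations(n_items, n_marked=1):
    """Approximate optimal number of Grover iterations: floor(pi/4 * sqrt(N/M))."""
    return math.floor(math.pi / 4 * math.sqrt(n_items / n_marked))


def classical_expected_queries(n_items):
    """Expected lookups for a linear scan when exactly one item matches."""
    return (n_items + 1) / 2


if __name__ == "__main__":
    for n in (10**3, 10**6, 10**9):
        print(n, grover_iterations(n), classical_expected_queries(n))
```

The quadratic gap widens as the database grows, which is why search over very large sequence collections is a natural target.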
There are promising developments: gate-based quantum algorithms for DNA sequence alignment have already been developed and have demonstrated the ability to handle the amount of data generated in the process.
Research estimates project that the DNA data storage market will grow from USD 76 million in 2024 to USD 3,348 million by 2030, a CAGR of 87.7% over that period.
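As a quick sanity check on these figures, the compound annual growth rate can be recomputed from the two endpoint values. This is a minimal sketch that assumes six compounding years (2024 to 2030); the function names are illustrative.

```python
def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate implied by two endpoint values."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value after compounding `rate` annually for `years` years."""
    return start_value * (1 + rate) ** years


years = 2030 - 2024
cagr = implied_cagr(76, 3348, years)
projected_2030 = project(76, 0.877, years)

print(f"Implied CAGR: {cagr:.1%}")
print(f"2030 value at 87.7% CAGR: USD {projected_2030:.0f} million")
```

Both results land close to the quoted numbers; any small gap is plausibly down to rounding or to the compounding convention used by the original estimate.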
Data resourcing
Biomedical data is an essential resource for developing machine learning models that can aid in the diagnosis, treatment, and prevention of diseases. However, the collection, storage, and sharing of biomedical data presents significant challenges due to its sensitive nature and ethical considerations.
Similarly, healthcare data is subject to strict regulations and privacy laws, making it challenging for researchers to access and share data. Moreover, machine learning models trained on a single dataset tend to overfit, and may not generalise well to new data, which limits their potential use in real-world applications.
The challenges of biomedical data
Biomedical data presents the following challenges for classical machine learning (Classical ML):
- Curse of Dimensionality: As the number of features in biomedical data increases, the feature space becomes sparser, making it difficult for classical ML algorithms to learn effectively without a proportionate increase in data samples. This can lead to overfitting, where the model learns noise rather than the underlying pattern.
- Computational Complexity: High-dimensional data requires more computational resources for processing and analysis. Classical ML algorithms may face challenges in terms of memory consumption and execution time, making them less efficient or even infeasible for large datasets with a vast number of features.
- Feature Interactions: Classical ML methods generally struggle to capture the complex interactions in high-dimensional spaces without substantial feature engineering or domain knowledge, leading to suboptimal performance.
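The sparsity behind the curse of dimensionality can be illustrated with simple counting. In the sketch below, each feature is split into a fixed number of bins, so the number of cells in feature space grows exponentially with the number of features, while a fixed sample count can occupy at most one cell per sample. The bin count and sample count are arbitrary illustrative choices.

```python
def grid_cells(bins_per_feature, n_features):
    """Number of cells when every feature is split into the same number of bins."""
    return bins_per_feature ** n_features


def max_occupied_fraction(n_samples, bins_per_feature, n_features):
    """Upper bound on the fraction of cells that can hold at least one sample."""
    return min(1.0, n_samples / grid_cells(bins_per_feature, n_features))


if __name__ == "__main__":
    # 1,000 samples, 10 bins per feature, increasing dimensionality.
    for d in (1, 2, 3, 6, 10):
        print(d, grid_cells(10, d), max_occupied_fraction(1000, 10, d))
```

With a fixed dataset, the fraction of feature space that is observed at all collapses as features are added, which is the condition under which a model is prone to fitting noise.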
Combining quantum with the benefits of federated learning (FL)
Zaiku, the NQCC and OQC recognised that addressing these challenges requires exploring new approaches to biomedical machine learning that can leverage large and diverse datasets, whilst also ensuring data privacy and security.
A promising novel solution is to combine the power of quantum computing with the benefits of federated learning (FL) in hybrid classical-quantum federated learning. This distributed quantum learning approach enables organisations to train hybrid quantum machine learning models on their respective classical datasets, without sharing raw data.
In federated quantum learning, model training is distributed to individual devices or servers that have access to QPUs to run the quantum part, and each of them trains the model on its own dataset. The incorporation of quantum computing represents a strategic move to harness the immense computational power required for processing the complex, high-dimensional biomedical datasets that classical machine learning struggles with. Indeed, quantum computing offers the potential to significantly expedite the analysis and interpretation of genetic data, thereby improving the efficiency and accuracy of predictive models that leverage high-dimensional biomedical data.
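The federated pattern described above can be sketched with plain federated averaging. This is not the framework built in the project: it is a minimal, synchronous, purely classical illustration in which a one-parameter model (y = w·x) stands in for the hybrid quantum model, and `local_update`, `federated_round`, and `train` are hypothetical names. The point it shows is that only model parameters, never raw data, leave each client.

```python
def local_update(w, data, lr=0.1, epochs=5):
    """A few gradient-descent steps on one client's private data (model: y = w * x)."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global, client_datasets):
    """Each client trains locally; the server averages weights by sample count."""
    updates = [(local_update(w_global, d), len(d)) for d in client_datasets]
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


def train(client_datasets, rounds=20):
    """Run several federated rounds starting from w = 0."""
    w = 0.0
    for _ in range(rounds):
        w = federated_round(w, client_datasets)
    return w


# Two clients with private samples drawn from the same relation y = 2x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(0.5, 1.0), (1.5, 3.0), (1.0, 2.0)],
]
w_final = train(clients)
print(w_final)
```

In a hybrid setup, the body of `local_update` is where a client would evaluate its parameterised quantum circuit on a QPU or simulator; the aggregation step stays classical.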
Classical computing has enabled breakthroughs in medical research and patient care, including the early adoption of Electronic Health Records (EHRs), and will likely continue to do so, particularly when combined with quantum computing.
A key area that stands to benefit is genomic data analysis. Determining the order of nucleotide bases in a DNA molecule can currently take days, even though it is critical to personalised disease diagnostics and medicine. Hybrid quantum-classical computing has the potential to play a vital role in this analysis, especially in DNA sequencing, a principal process in computational molecular biology.
Employing federated quantum learning with classical data
OQC recently worked with the National Quantum Computing Centre (NQCC) and Zaiku through the NQCC’s SparQ programme to explore the feasibility of employing federated quantum learning with classical data (i.e., DNA sequences) using NISQ (Noisy Intermediate-Scale Quantum) devices.
The availability of NISQ devices over the cloud made it possible for Zaiku to rapidly implement a proof-of-concept (POC) asynchronous federated learning framework, with a sizeable number of clients capable of securely submitting jobs to Quantum Processing Units (QPUs), including OQC Lucy.
One of the key benefits of the POC is that each end-user/client in the federation can select, via a simple script file, a real quantum computer or quantum simulator of their choice before they start training the model. For example, one end-user/client may pick a superconducting quantum computer such as OQC Lucy, whereas another may choose a trapped-ion machine such as one from IonQ.
Why combine classical and quantum computing?
Simply stated, classical-quantum hybrid computing combines the computational processes and architectures of classical and quantum computers to solve a problem. While quantum computers are expected to become the most powerful in the world, drawing on the lessons and known limitations of classical computing will allow for faster, more nuanced quantum problem solving.
The POC benchmarks gave Zaiku reason to hope that training a hybrid quantum model in a federated learning setup may require less training data than training a classical model in a classical federated setup. They also provided an interesting indication that a hybrid quantum-classical model trained in a federated setup seems to be more robust to the noise of the current generation of NISQ hardware than a conventional hybrid quantum model trained in a centralised fashion.
OQC Lucy’s QPU served as a catalyst to minimise circuit depth while maximising utility.
Because Zaiku aimed to minimise the qubit resources needed to run the framework, OQC Lucy’s 8-qubit QPU served as a catalyst for its efforts to minimise circuit depth while maximising utility. This optimisation was crucial for implementing the federated quantum learning framework effectively.
OQC Lucy’s 8-qubit QPU and ring topology, with its supported native gates, created an exciting proposition for Zaiku’s team when putting equivariant embeddings for DNA sequences into practice. This led to an interesting idea that came out of the NQCC’s hackathon in Birmingham: exploiting the reverse-complement symmetry of DNA sequences. The resulting embedding is angle-based, with two features per qubit, and the symmetry is represented on the Hilbert space via conjugation by a tensor product of Pauli-X gates.
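One way such an embedding could work is sketched below. This is an illustrative construction, not necessarily the one Zaiku used: the angle table, the pairing of each base with its mirror position, and the choice of RX and RZ rotations are all assumptions. Each qubit receives two features built from a base and its mirror base: one that flips sign under reverse complement (fed to RZ, which Pauli-X conjugation negates) and one that is unchanged (fed to RX, which commutes with Pauli-X). Because the embedding is a tensor product of single-qubit gates, checking the symmetry qubit by qubit is equivalent to checking conjugation by the full tensor product of Pauli-X gates.

```python
import cmath
import math

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
# Hypothetical angle table, chosen so that ANGLE[complement(b)] == -ANGLE[b].
ANGLE = {"A": math.pi / 4, "T": -math.pi / 4,
         "C": 3 * math.pi / 4, "G": -3 * math.pi / 4}
X = [[0, 1], [1, 0]]  # Pauli-X


def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def rx(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]


def rz(theta):
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]


def features(seq):
    """Two features per qubit: qubit q pairs base q with its mirror base."""
    n = len(seq)
    if n % 2:
        raise ValueError("this sketch assumes an even-length sequence")
    pairs = []
    for q in range(n // 2):
        x, y = ANGLE[seq[q]], ANGLE[seq[n - 1 - q]]
        pairs.append((x + y, x - y))  # (sign flips, unchanged) under reverse complement
    return pairs


def embed(seq):
    """Per-qubit unitaries RX(unchanged feature) @ RZ(sign-flipping feature)."""
    return [matmul(rx(even), rz(odd)) for odd, even in features(seq)]


def is_equivariant(seq, tol=1e-9):
    """Check embed(rc(seq)) == X @ embed(seq) @ X on every qubit."""
    lhs = embed(reverse_complement(seq))
    rhs = [matmul(X, matmul(u, X)) for u in embed(seq)]
    return all(abs(l[i][j] - r[i][j]) < tol
               for l, r in zip(lhs, rhs) for i in range(2) for j in range(2))


if __name__ == "__main__":
    print(is_equivariant("GATTACAG"))
```

With this pairing, a sequence of 2n bases uses n qubits, so a 16-base sequence would fit on an 8-qubit device. The sketch makes no claim that the embedding separates all sequences.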
Genetic information is encoded as linear sequences of nucleotides, and differences between these sequences are identified through comparative approaches such as sequence analysis, where variations can occur at the level of individual nucleotides or across groups of them. Detecting these sequence differences is vital for understanding biology and medicine, and using quantum computing to identify single nucleotide molecules is the first step toward the ultimate goal of DNA sequencing.