Getting Ahead of Viral Evolution Using Artificial Intelligence

It has been nearly three years since the emergence of SARS-CoV-2, and it is plain that the world faces a future with an ever-changing virus that is here to stay. We now have diagnostics, vaccinations and therapies to fight COVID-19, but the continued emergence of new viral variants means that we too must continually assess, adapt and respond to these threats. Prof Sai Reddy (ETH Zurich and BRCCH) and his colleagues are doing just that by using artificial intelligence to prepare for future variants.

Within the scope of the BRCCH’s Fast Track Call for COVID-19 research funding programme, Prof Reddy and his consortium have developed an artificial intelligence method, deep mutational learning, that predicts the ability of SARS-CoV-2 variants to bind to human cells and escape antibodies. This prediction for current and prospective variants may guide the future development of therapeutic antibody treatments and next-generation COVID-19 vaccines. The information can also be generated in real time to aid faster public health decision-making. And from a global health perspective, developing vaccines that protect against future variants improves their efficacy and may even help to address vaccine inequity.

In this conversation, Prof Reddy joins BRCCH to discuss this exciting new method. He explains how this research came to be: “This work was enabled by the generous support of the Fondation Botnar to promote, in the very early stages of the pandemic, innovation to overcome the challenges that were known at the time of the peak of the pandemic. But at that time, new challenges, like variants, breakthrough infections and the evolution of SARS-CoV-2, were not necessarily anticipated.

“The Fondation’s early commitment to supporting the FTC programme enabled us to respond quickly to this changing pandemic landscape and actually to come up with a new strategy focused on SARS-CoV-2 specifically, instead of applying a band-aid approach using existing, and perhaps inadequate or outdated, methods.”

In the journal Cell, Prof Reddy and his team published this new method called deep mutational learning, a machine learning-guided protein engineering technology that can help us understand how a new variant will affect vaccinated or previously infected people, potentially in real-time as the variant emerges in a population.

Graphical abstract: "Selection and emergence of SARS-CoV-2 variants are driven in part by mutations within the viral spike protein and in particular the ACE2 receptor-binding domain (RBD), a primary target site for neutralizing antibodies. The researchers develop deep mutational learning (DML), a machine learning-guided protein engineering technology, which is used to interrogate a massive sequence space of combinatorial mutations, representing billions of RBD variants, by accurately predicting their impact on ACE2 binding and antibody escape. A highly diverse landscape of possible SARS-CoV-2 variants is identified that could emerge from a multitude of evolutionary trajectories. DML may be used for predictive profiling on current and prospective variants, including highly mutated variants such as Omicron, thus guiding the development of therapeutic antibody treatments and vaccines for COVID-19." -Taft et al. 2022


The ability to make these predictions has big implications for how we may face the future of the pandemic. For example, researchers could use this method to identify therapeutic antibodies or develop next-generation vaccines that have the broadest coverage and the potential to be most effective against current and emergent variants. From a public health perspective, the method could be used for surveillance and real-time assessment, so that governing bodies could use that information to guide public health decisions sooner and more effectively.

“This method allows us to prospectively gather lots and lots of information about the potential evolutionary trajectories of any virus and that makes us a little bit more proactive, rather than always waiting and being a step behind the virus. This may even allow, someday, for us to get ahead of viral evolution.” - Prof Reddy

Additionally, this technology has the potential for societal impact by reducing vaccine inequity. In the current global health situation, countries differ greatly in access to vaccines. The inequity is exacerbated by time: the longer the gap between when a vaccine was designed and when a person receives it, the higher the probability that it is less efficacious against variants that have since emerged. In other words, people who are vaccinated later risk being protected only against older variants that are no longer circulating, and not against the newest ones.

“Part of the importance of this method could be that we eventually can make vaccines that have broader coverage and have a longer shelf life of being useful. So even if populations do not get immediate access, they are still at least getting an effective vaccine. This is where there's an opportunity for us to make a difference in global and public health. Our research team cannot solve the problems with manufacturing, distribution and the economics of vaccinations, but we can at least contribute to the science behind the design of vaccines, which represents a highly important part of the science value chain on the path to global translation.” -Prof Reddy

The science behind the technology is indeed innovative, as ETH News explains:

Since viruses mutate randomly, no one can know exactly how SARS-CoV-2 will evolve in the coming months and years and which variants will dominate in the future. In theory, there is virtually no limit to the ways in which a virus could mutate. And this is the case even when considering a small region of the virus: the SARS-CoV-2 spike protein, which is important for infection and detection by the immune system. In this region alone there are tens of billions of theoretically possible mutations.

That’s why the new method takes a comprehensive approach: for each variant in this multitude of potential viral variants, it predicts whether or not it is capable of infecting human cells and if it will be neutralized by antibodies produced by the immune system found in vaccinated and recovered persons. It is highly likely that hidden among all these potential variants is the one that will dominate the next stage of the COVID-19 pandemic.

To establish their method, Reddy and his team used laboratory experiments to generate a large collection of mutated variants of the SARS-CoV-2 spike protein. The scientists did not produce or work with live virus; rather, they produced only a part of the spike protein, and therefore there was no danger of a laboratory leak.

The spike protein interacts with the ACE2 protein on human cells for infection, and antibodies from vaccination, infection or antibody therapy work by blocking this mechanism. Many of the mutations in SARS-CoV-2 variants occur in this region, which allows the virus to evade the immune system and continue to spread.

Although the collection of mutated variants the researchers have analysed comprises only a small fraction of the several billion theoretically possible variants – which would be impossible to test in a laboratory setting – it does contain a million such variants. These carry different mutations or combinations of mutations.
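To give a feel for why exhaustive laboratory testing is out of reach, the short sketch below counts how many sequences lie within a given number of amino-acid substitutions of a reference sequence. It is an illustration only: the function name `count_variants` and the window of 20 positions are assumptions made for this example, not figures taken from the study.

```python
from math import comb


def count_variants(n_positions, max_mutations, alphabet_size=20):
    """Count sequences within max_mutations substitutions of a reference.

    The reference itself (zero substitutions) is included in the count.
    Each mutated position can take any of the other alphabet_size - 1 letters.
    """
    alternatives = alphabet_size - 1
    return sum(
        comb(n_positions, k) * alternatives ** k
        for k in range(max_mutations + 1)
    )


# Hypothetical window of 20 positions (an assumption for illustration only).
for k in range(1, 6):
    print(f"up to {k} substitutions: {count_variants(20, k):,} sequences")
```

The count grows combinatorially with each additional allowed substitution, which is why only a sample of the sequence space can be measured directly and the rest has to be predicted.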

By performing high-throughput experiments and sequencing the DNA from these million variants, the researchers determined how successfully these variants interact with the ACE2 protein and with existing antibody therapies. This indicates how well the individual potential variants could infect human cells and how well they could escape from antibodies.

The researchers used the collected data to train machine learning models that can identify complex patterns. Given only the DNA sequence of a new variant, the models can accurately predict whether it can bind to ACE2 for infection and escape from neutralizing antibodies. The final machine learning models can now be used to make these predictions for tens of billions of theoretically possible variants with single and combinatorial mutations, going far beyond the million that were tested in the laboratory. (Read full article)
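The general idea of training on measured variants and then predicting unmeasured ones can be sketched in a few lines. The example below is a toy under loudly stated assumptions and does not reproduce the models described in the Cell paper: the four-residue `WILD_TYPE`, the labelling rule in `toy_label` (which has no biological meaning) and the choice of a simple perceptron are all invented for illustration. It trains on every single and double mutant of the toy sequence and then scores triple mutants it has never seen.

```python
from itertools import combinations, product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
WILD_TYPE = "NGVE"  # hypothetical 4-residue stand-in, not a real RBD segment


def active_features(seq):
    """Indices of the 1-entries in a one-hot encoding (20 slots per position)."""
    return [pos * 20 + AMINO_ACIDS.index(aa) for pos, aa in enumerate(seq)]


def toy_label(seq):
    """Synthetic rule with no biological meaning: any proline means 'no binding'."""
    return 0 if "P" in seq else 1


def mutants(wild_type, n_mut):
    """Every variant carrying exactly n_mut substitutions."""
    out = []
    for positions in combinations(range(len(wild_type)), n_mut):
        choices = [[aa for aa in AMINO_ACIDS if aa != wild_type[p]] for p in positions]
        for combo in product(*choices):
            seq = list(wild_type)
            for p, aa in zip(positions, combo):
                seq[p] = aa
            out.append("".join(seq))
    return out


def predict(weights, bias, seq):
    """Return 1 (predicted binder) if the linear score is positive, else 0."""
    score = bias + sum(weights[i] for i in active_features(seq))
    return 1 if score > 0 else 0


def train_perceptron(seqs, max_epochs=200):
    """Classic perceptron updates; stops once an epoch makes no mistakes."""
    weights, bias = [0.0] * (len(seqs[0]) * 20), 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for seq in seqs:
            error = toy_label(seq) - predict(weights, bias, seq)
            if error:
                mistakes += 1
                bias += error
                for i in active_features(seq):
                    weights[i] += error
        if mistakes == 0:  # every training variant is classified correctly
            break
    return weights, bias


def accuracy(weights, bias, seqs):
    """Fraction of sequences whose prediction matches the toy label."""
    return sum(predict(weights, bias, s) == toy_label(s) for s in seqs) / len(seqs)


train = [WILD_TYPE] + mutants(WILD_TYPE, 1) + mutants(WILD_TYPE, 2)
held_out = mutants(WILD_TYPE, 3)  # triple mutants, never seen during training
weights, bias = train_perceptron(train)
print(f"train accuracy:    {accuracy(weights, bias, train):.3f}")
print(f"held-out accuracy: {accuracy(weights, bias, held_out):.3f}")
```

The held-out set plays the role of the variants that were never tested in the laboratory. Real binding and escape data are far noisier and less additive than this toy rule, so a sketch like this says nothing about the accuracy achieved in the actual study.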


This research was developed by a BRCCH-supported multi-disciplinary consortium as part of a larger BRCCH research programme, the Fast Track Call for COVID-19 research (FTC). The programme aims to enable research that will help mitigate medical and public health challenges and contribute tangible solutions to reduce global disease burden due to COVID-19. Lead investigator Sai Reddy is Vice Director of BRCCH and a professor in the Department of Biosystems Science and Engineering, ETH Zurich.

“This specific project was inspired and commenced as an extension of the original FTC project, which was initially based on developing a testing method for SARS-CoV-2 using a high throughput DNA sequencing method. As new variants emerged, such as Omicron this last winter, and the ensuing wave of breakthrough infections, we realised we could extend our work to develop technology and an approach where you could prospectively identify what combinations of mutations might escape from antibodies.” -Prof Reddy


Research article:

Taft JM, Weber CR, Gao B, Ehling RA, Han J, Frei L, Metcalfe SW, Overath M, Yermanos A, Kelton W, Reddy ST. 2022. "Deep Mutational Learning Predicts ACE2 Binding and Antibody Escape to Combinatorial Mutations in the SARS-CoV-2 Receptor Binding Domain." Cell. journal pre-proof.

Related articles:

ETH News article "Preparing for Future Coronavirus Variants Using Artificial Intelligence"

MORE about Prof Reddy’s FTC research DeepSARS

MORE about the BRCCH-supported COVID-19 research