Getting Ahead of Viral Evolution Using Artificial Intelligence

It has been nearly three years since the emergence of SARS-CoV-2, and it is plainly apparent the world faces a future reality of an ever-changing virus that is here to stay. We now have diagnostics, vaccinations and therapies to fight COVID-19, but the continued emergence of new viral variants means that we too must continually assess, adapt and respond to these threats. Prof Sai Reddy (ETH Zurich and BRCCH) and his colleagues are doing just that by using artificial intelligence to prepare for future variants.

Within the scope of the BRCCH’s Fast Track Call for COVID-19 research funding programme, Prof Reddy and his consortium have developed an artificial intelligence method, deep mutational learning, that predicts the ability of SARS-CoV-2 variants to bind to human cells and escape antibodies. These predictions for current and prospective variants may guide the future development of therapeutic antibody treatments and next-generation COVID-19 vaccines. The information can also be generated in real time to aid faster public health decision-making. And from a global health perspective, developing vaccines that protect against future variants would improve their efficacy and may even help to address vaccine inequity.

In this conversation, Prof Reddy joins BRCCH to discuss this exciting new method. He explains how this research came to be: “This work was enabled by the generous support of the Fondation Botnar to promote, in the very early stages of the pandemic, innovation to overcome the challenges that were known at the time of the peak of the pandemic. But at that time, new challenges, like variants, breakthrough infections and the evolution of SARS-CoV-2, were not necessarily anticipated.

The Fondation’s early commitment to supporting the FTC programme enabled us to respond quickly to this changing pandemic landscape and actually to come up with a new strategy focused on SARS-CoV-2 specifically, instead of applying a band-aid approach using existing, and perhaps inadequate or outdated, methods.”

In the journal Cell, Prof Reddy and his team published this new method called deep mutational learning, a machine learning-guided protein engineering technology that can help us understand how a new variant will affect vaccinated or previously infected people, potentially in real-time as the variant emerges in a population.

Graphical abstract: "Selection and emergence of SARS-CoV-2 variants are driven in part by mutations within the viral spike protein and in particular the ACE2 receptor-binding domain (RBD), a primary target site for neutralizing antibodies. The researchers develop deep mutational learning (DML), a machine learning-guided protein engineering technology, which is used to interrogate a massive sequence space of combinatorial mutations, representing billions of RBD variants, by accurately predicting their impact on ACE2 binding and antibody escape. A highly diverse landscape of possible SARS-CoV-2 variants is identified that could emerge from a multitude of evolutionary trajectories. DML may be used for predictive profiling on current and prospective variants, including highly mutated variants such as Omicron, thus guiding the development of therapeutic antibody treatments and vaccines for COVID-19." -Taft et al. 2022


The ability to make these predictions has big implications for how we may face the future of the pandemic. For example, researchers could use this method to identify therapeutic antibodies or develop next-generation vaccines that have the broadest coverage and the potential to be most effective against current and emergent variants. From a public health perspective, the method could be used to perform surveillance and real-time assessment, so that governing bodies could use that wealth of information to guide public health decisions sooner and more effectively.

“This method allows us to prospectively gather lots and lots of information about the potential evolutionary trajectories of any virus and that makes us a little bit more proactive, rather than always waiting and being a step behind the virus. This may even allow, someday, for us to get ahead of viral evolution.” - Prof Reddy

Additionally, this technology has the potential for societal impact and for reducing vaccine inequity. In the current global health situation, countries differ greatly in access to vaccines. The inequity is exacerbated by time. The more time that passes between when a vaccine was designed and when it is received, the higher the probability of it being less efficacious against variants that have since emerged. In other words, people who receive vaccinations later run the risk of being protected only against older variants that are no longer circulating in the population, and not against the newest variants.

“Part of the importance of this method could be that we eventually can make vaccines that have broader coverage and have a longer shelf life of being useful. So even if populations do not get immediate access, they are still at least getting an effective vaccine. This is where there's an opportunity for us to make a difference in global and public health. Our research team cannot solve the problems with manufacturing, distribution and the economics of vaccinations, but we can at least contribute to the science behind the design of vaccines, which represents a highly important part of the science value chain on the path to global translation.” -Prof Reddy

The science behind the technology is indeed innovative, as ETH News explains:

Since viruses mutate randomly, no one can know exactly how SARS-CoV-2 will evolve in the coming months and years and which variants will dominate in the future. In theory, there is virtually no limit to the ways in which a virus could mutate. And this is the case even when considering a small region of the virus: the SARS-CoV-2 spike protein, which is important for infection and detection by the immune system. In this region alone there are tens of billions of theoretically possible mutations.

That’s why the new method takes a comprehensive approach: for each variant in this multitude of potential viral variants, it predicts whether or not it is capable of infecting human cells and if it will be neutralized by antibodies produced by the immune system found in vaccinated and recovered persons. It is highly likely that hidden among all these potential variants is the one that will dominate the next stage of the COVID-19 pandemic.

To establish their method, Reddy and his team used laboratory experiments to generate a large collection of mutated variants of the SARS-CoV-2 spike protein. The scientists did not produce or work with live virus, rather they produced only a part of the spike protein, and therefore there was no danger of a laboratory leak.

The spike protein interacts with the ACE2 protein on human cells for infection, and antibodies from vaccination, infection or antibody therapy work by blocking this mechanism. Many of the mutations in SARS-CoV-2 variants occur in this region, which allows the virus to evade the immune system and continue to spread.

Although the collection of mutated variants the researchers have analysed comprises only a small fraction of the several billion theoretically possible variants – which would be impossible to test in a laboratory setting – it does contain a million such variants. These carry different mutations or combinations of mutations.
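To give a sense of the scale involved, the size of a combinatorial mutation space can be counted directly. The sketch below is illustrative only: the region length of 200 amino-acid positions and the cap of three simultaneous mutations are assumptions chosen for the example, not figures taken from the study, and `variant_count` is a hypothetical helper name.

```python
from math import comb

AMINO_ACIDS = 20  # size of the standard amino-acid alphabet


def variant_count(positions: int, max_mutations: int) -> int:
    """Count sequences differing from a reference at up to `max_mutations` positions.

    For exactly k mutated positions there are C(positions, k) ways to choose
    the positions and 19 alternative amino acids at each of them.
    """
    return sum(
        comb(positions, k) * (AMINO_ACIDS - 1) ** k
        for k in range(max_mutations + 1)
    )


# Assumed example values (not from the paper): a 200-position region,
# allowing at most three simultaneous substitutions.
if __name__ == "__main__":
    for k in range(4):
        print(k, variant_count(200, k))
```

Under these assumed numbers, allowing up to three substitutions already gives roughly nine billion distinct sequences, which illustrates why exhaustive laboratory testing is out of reach and why a library of a million variants is only a small sample.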

By performing high-throughput experiments and sequencing the DNA from these million variants, the researchers determined how successfully these variants interact with the ACE2 protein and with existing antibody therapies. This indicates how well the individual potential variants could infect human cells and how well they could escape from antibodies.

The researchers used the collected data to train machine learning models, which are able to identify complex patterns and, given only the DNA sequence of a new variant, can accurately predict whether it can bind to ACE2 for infection and escape from neutralizing antibodies. The final machine learning models can now be used to make these predictions for tens of billions of theoretically possible variants with single and combinatorial mutations, going far beyond the million that were tested in the laboratory. (Read full article)
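The general idea of learning from labelled sequences can be sketched in a few lines. The example below is not the consortium's model: it swaps in a plain logistic regression on one-hot encoded DNA, and the six training sequences, their binding labels and all function names are invented purely for illustration.

```python
import math

NUCLEOTIDES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a flat list with one indicator per base per position."""
    return [1.0 if base == ref else 0.0 for base in seq for ref in NUCLEOTIDES]


def train_logistic(X, y, epochs=300, lr=0.1):
    """Fit logistic regression weights with plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            grad = p - yi  # derivative of the log-loss with respect to z
            w = [wj - lr * grad * xj for wj, xj in zip(w, xi)]
            b -= lr * grad
    return w, b


def predict_binding(w, b, seq):
    """Return the predicted probability that a sequence is a binder."""
    z = b + sum(wj * xj for wj, xj in zip(w, one_hot(seq)))
    return 1.0 / (1.0 + math.exp(-z))


# Invented toy data: label 1 ("binds") whenever the third base is A.
TRAIN = [
    ("GCAGTC", 1), ("GTAGTC", 1), ("GCAGAC", 1),
    ("GCTGTC", 0), ("GTTGTC", 0), ("GCTGAC", 0),
]
X = [one_hot(seq) for seq, _ in TRAIN]
y = [label for _, label in TRAIN]
w, b = train_logistic(X, y)
```

Once trained, `predict_binding(w, b, "GTAGAC")` scores a sequence that never appeared in the training set, which mirrors, at toy scale, how a model trained on a million measured variants can be asked about variants that were never tested.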


This research was developed by a BRCCH-supported multi-disciplinary consortium working together and as part of a larger BRCCH research programme, the Fast Track Call for COVID-19 research (FTC). The programme aims to enable research that will help mitigate medical and public health challenges and contribute tangible solutions to reduce global disease burden due to COVID-19. Lead investigator Sai Reddy is Vice Director of BRCCH and a professor in the Department of Biosystems Science and Engineering, ETH Zurich.

“This specific project was inspired and commenced as an extension of the original FTC project, which was initially based on developing a testing method for SARS-CoV-2 using a high throughput DNA sequencing method. As new variants emerged, such as Omicron this last winter, and the ensuing wave of breakthrough infections, we realised we could extend our work to develop technology and an approach where you could prospectively identify what combinations of mutations might escape from antibodies.” -Prof Reddy


Research article:

Taft JM, Weber CR, Gao B, Ehling RA, Han J, Frei L, Metcalfe SW, Overath M, Yermanos A, Kelton W, Reddy ST. 2022. "Deep Mutational Learning Predicts ACE2 Binding and Antibody Escape to Combinatorial Mutations in the SARS-CoV-2 Receptor Binding Domain." Cell. journal pre-proof.

Related articles:

ETH News article "Preparing for Future Coronavirus Variants Using Artificial Intelligence"

MORE about Prof Reddy’s FTC research DeepSARS

MORE about the BRCCH-supported COVID-19 research





Conversation with Prof Sai Reddy

DeepSARS: Sequencing and testing at the same time

A new scientific platform called DeepSARS, developed by Prof Sai Reddy, BRCCH Vice Director and Professor of Systems and Synthetic Immunology at ETH Zurich, could support viral testing and tracking for future pandemics: While diagnostic testing and genomic surveillance are currently performed separately, DeepSARS is able to perform both tasks simultaneously. This allows for earlier detection of emerging variants and profiling of mutations at the population level. In this conversation, Prof Reddy joins journalist Irène Dietschi to discuss the consortium’s exciting new findings.

Graphical abstract: DeepSARS uses molecular barcodes (BCs) and multiplexed targeted deep sequencing (NGS) to enable simultaneous diagnostic detection and genomic surveillance of SARS-CoV-2. Image courtesy of Yermanos et al. 2022


The ongoing COVID-19 pandemic remains a major global health concern, with novel variants of SARS-CoV-2, such as alpha, beta, gamma, delta and omicron, continuously emerging and resulting in new waves of infection. Potentially exacerbating the situation, many of the masking and social distancing rules are being relaxed around the world.

The latest variant of SARS-CoV-2, omicron, carries over 50 mutations, is more infectious than previous strains and often shows substantial immune evasion, leading to many breakthrough infections in vaccinated or previously infected individuals.

Genomic surveillance of SARS-CoV-2 has been a vital component in the monitoring of the pandemic, providing valuable information for guiding public health decisions. However, despite the fact that the SARS-CoV-2 genomes from infected patients have been sequenced at unprecedented levels, they still represent a small fraction of the total number of global infections. And countries vary widely in how they prioritize viral genome sequencing. What makes it even more problematic is that countries with low sequencing rates also tend to have high infection rates and are therefore at an especially high risk for new mutations to evolve. They are hotspots for new variants.

Key region: the receptor binding domain

That is where DeepSARS comes into play. "A number of researchers have previously embraced the idea that sequencing could be used not just for genomic surveillance, but also for diagnostic testing", says immunology professor Sai Reddy, Vice Director of the BRCCH. Prof Reddy, who is the principal investigator of the Laboratory for Systems and Synthetic Immunology at ETH Zurich, has actually pursued this idea to a practical level. In a paper published recently in BMC Genomics, he, his team and collaborators have established a proof of concept showing that sequencing and testing can be done simultaneously. Their system is called DeepSARS.

"One of the main challenges of the project was to determine which sites in the viral genome would yield maximum diagnostic and genomic information while maintaining sufficient coverage for each site", explains Prof Reddy. The region most associated with new variants is the spike protein of SARS-CoV-2, and within the spike protein it is particularly the so-called receptor binding domain – the site where the virus attaches itself to the host receptor ACE2 – where mutations are most likely to occur.

This receptor binding domain also happens to be the target site of neutralizing antibodies, generated from either vaccination or previous infection. Mutations which emerge in this region can influence variants’ attachment to host cells, potentially making them more transmissible and allowing them to escape from neutralizing antibodies. "Therefore, when we are developing sequencing tests, it is very important to get information about the receptor binding domain", says Prof Reddy. "It was a central part of the design of DeepSARS."

Experimental protocol and ‘targeted deep sequencing’

Once this was established, Sai Reddy and colleagues proceeded in a stepwise manner to develop the DeepSARS system. "First we elaborated an experimental protocol, which is in fact very similar to the protocol in which all PCR tests are performed", says Prof Reddy. "The main difference is that we added little personalized markers to each sample of the PCR, so that when you perform a genomic sequencing experiment you know which patient it came from." This procedure is called molecular barcoding.
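The role of the barcodes can be illustrated with a short demultiplexing sketch. It assumes, for the sake of the example, that each read starts with a fixed-length barcode and that barcodes are matched exactly; the barcode sequences, sample names and the `demultiplex` function are hypothetical and not taken from the DeepSARS protocol.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample, barcode_len):
    """Group sequencing reads by the sample whose barcode prefixes them.

    Reads whose prefix matches no known barcode are discarded.
    """
    by_sample = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        sample = barcode_to_sample.get(barcode)
        if sample is not None:
            by_sample[sample].append(insert)
    return dict(by_sample)


# Hypothetical barcodes and pooled reads, for illustration only.
BARCODES = {"ACGT": "patient_01", "TGCA": "patient_02"}
READS = ["ACGTGGATTC", "TGCAGGATTA", "ACGTGGATTT", "NNNNGGATTC"]

groups = demultiplex(READS, BARCODES, barcode_len=4)
```

Here `groups` maps `patient_01` to two inserts and `patient_02` to one, while the read with the unrecognised prefix is dropped. Real pipelines typically also tolerate sequencing errors in the barcode, which this sketch leaves out.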

The second step the researchers undertook was to identify regions for what they call "targeted deep sequencing". Prof Reddy: "Targeted deep sequencing means reducing the viral genome from 30’000 RNA bases to about 20 percent of that number - 6000 bases. And it means identifying those 20 percent of sites which give us information that will likely track the evolution of the virus, as well as identify emerging variants." To achieve that, Reddy and colleagues performed an extensive computational analysis (i.e., computational phylogenetics). In this way they identified target candidates of the viral genome and then implemented them in the experimental protocol.
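One simple way to picture the selection of informative sites is to rank alignment positions by how much they vary and keep only those that fit a sequencing budget. The sketch below uses per-column Shannon entropy as a stand-in; it is not the phylogenetic analysis the team performed, and the tiny alignment and function names are invented for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the bases observed at one alignment position."""
    total = len(column)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(column).values()
    )


def most_variable_sites(alignment, budget_fraction=0.2):
    """Return indices of the most variable positions, limited to a fraction of the length."""
    length = len(alignment[0])
    entropies = [
        column_entropy([seq[i] for seq in alignment]) for i in range(length)
    ]
    budget = max(1, round(length * budget_fraction))
    ranked = sorted(range(length), key=lambda i: entropies[i], reverse=True)
    return sorted(ranked[:budget])


# Invented toy alignment of four equal-length genomes.
ALIGNMENT = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ATGTACGAAC",
    "ATGTACGCAC",
]

sites = most_variable_sites(ALIGNMENT, budget_fraction=0.2)
```

With a budget of 20 percent of ten positions, the sketch keeps the two positions where the toy genomes differ. The real selection also had to account for primer design and coverage, which a pure variability ranking ignores.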

Proof of concept on patient samples

Next, they validated their data on synthetic RNA templates of SARS-CoV-2, based on genomic sequences recovered early in the pandemic. "That allowed us to do very precise experiments to detect the amount of viral material, which was as low as 10 copies of the virus per sample", Prof Reddy explains.

The final step was testing DeepSARS on samples from patients. "We had samples from 30 patients, and we were able to show, based on nasal swabs or saliva, that the DeepSARS testing assay was very close to performing at the same level of diagnostic detection as a traditional PCR test", says Prof Reddy. And: "It was able to provide enough information about genomic surveillance to classify viral evolution, and whether there were any variants or not." In short: DeepSARS works. With the proof of concept, its science and technology may be regarded as well established.

A solid foundation for future pandemics

The exciting question is, could DeepSARS be applied at a larger scale? Could it even be deployed in the current COVID-19 pandemic? "DeepSARS certainly has a lot of potential", says Prof Reddy. "But right now, while we’re still in the COVID-19 pandemic, it wouldn’t make much sense to replace the current infrastructure for diagnostics and genomic surveillance with a new system." As Prof Reddy explains, that would entail numerous changes in the logistics and regulations of the ongoing pandemic, which, among other aspects, would be far too costly and require extensive regulatory approval.

Yet Sai Reddy is collaborating with public health experts in Switzerland as well as with clinicians at University Hospital Basel and other partners, discussing future applications, including clinical testing of DeepSARS. The goal is to examine the system at a larger scale. "We have shown that DeepSARS can be rapidly adapted for identification of emerging variants and for profiling mutational changes at a small scale, but for a pandemic, this requires population level implementation", says Prof Reddy. "Practically speaking, DeepSARS could be of immense benefit for future pandemics or possibly as SARS-CoV-2 transitions to an endemic stage."


DeepSARS was developed by a BRCCH-supported consortium where bioengineers, immunologists, computational biologists and clinical scientists work together and as part of a larger BRCCH research programme, the Fast Track Call for COVID-19 research (FTC). The programme aims to enable research that will help mitigate medical and public health challenges and contribute tangible solutions to reduce global disease burden due to COVID-19.

This specific project aims for rapid transfer of these state-of-the-art diagnostic methods across Switzerland and many other countries around the world, leading the way for innovative population level surveillance approaches for future variants and other disease outbreaks.


Interview: Irène Dietschi

Research article:

Yermanos A, Hong KL, Agrafiotis A, Han J, Nadeau S, Valenzuela C, Azizoglu A, Ehling R, Gao B, Spahr M, Neumeier D, Chang CH, Dounas A, Petrillo E, Nissen I, Burcklen E, Feldkamp M, Beisel C, Oxenius A, Savic M, Stadler T, Rudolf F & Reddy ST. 2022. "DeepSARS: simultaneous diagnostic detection and genomic surveillance of SARS-CoV-2." BMC Genomics. DOI: 10.1186/s12864-022-08403-0

MORE about the BRCCH-supported COVID-19 research by Prof Sai Reddy and consortium.




A New Frontier in Diagnosing Gut Health

Every year, more than 200 million children worldwide do not reach their developmental potential. This is primarily due to infectious diseases as well as malnutrition and related disorders. Normal gut development and function are critical for determining a child’s development and health throughout life. Despite this, diagnostics that can measure the health status of the gut are severely lacking. As part of the BRCCH’s Multi-Investigator funding programme, Prof Randall Platt (ETH Zürich), Prof Andrew Macpherson (University Hospital Bern) and fellow consortium members seek to develop a non-invasive, microbe-based diagnostic that is capable of sensing and recording the status of the gut.

In a new groundbreaking study published in Science*, Prof Platt, Prof Macpherson and co-authors have achieved the first critical steps towards making this ambitious idea a reality.

Behind this work is the innovative Record-seq technology pioneered by Prof Platt in 2018¹. The technology is based on CRISPR-engineered bacteria that can sense and create a molecular record of changes in their surrounding environment over time. These bacteria, or ''transcriptional recorders'', can then be analysed via sequencing approaches to reveal the history of events that they encountered. This technology holds enormous potential to provide real-time information on the status of the gut environment, which could then be harnessed to guide personalised therapeutic interventions.

In this new study, the researchers first set out to understand how the transcriptional recorders behave in a real-life gut environment and what they are able to report on whilst travelling through the intestine. The team demonstrated that these bacteria survive and traverse through the gut of mice, and that they can be successfully collected from faecal samples for further analysis. Importantly, the study revealed that the transcriptional recorders are able to capture important biological information throughout all regions of the gut. This represents a major advance over current omics-based technologies that are used to study the gut, as they are unable to provide insights into intestinal regions that are more difficult to access, such as the proximal colon.

The CRISPR-engineered bacteria (or transcriptional recorders) create molecular records of information about their surrounding environment as they transit through the gut. These bacteria can then be retrieved via faecal samples and their molecular records analysed through sequencing and computational methods. Image courtesy of Prof Randall Platt


Following these exciting results, the team then embarked on testing if the transcriptional recorders can reliably report on two elements which are critical for determining gut health: nutrition and inflammation.

To do this, mice were fed with different diets and the transcriptional recorders were collected from faecal samples over time. A Record-seq analysis revealed that these bacteria record unique molecular signatures that are diet-specific and are retained by the bacteria, even following a dietary switch. Therefore, not only are these transcriptional recorders capable of reporting on the real-time dietary status in vivo, these findings also suggest that they can provide a window into the nutritional history of the gut.

The researchers then went one step further by studying the transcriptional recorders in a mouse model of gut inflammation, to mimic the local environment in the presence of gastrointestinal disease. Remarkably, the team discovered that the molecular signatures recorded by the bacteria could be used to distinguish healthy mice from those with gastrointestinal inflammation. Moreover, they could also provide a read-out for measuring the severity and biological indicators of inflammation within the gut.

Following this landmark work, we asked Prof Randall Platt where the consortium plans to take Record-seq from here:

''This highly collaborative and interdisciplinary project lays the groundwork towards realising the technology’s true potential for improving human health. The consortium is now focusing on translation, which primarily includes further rigorous testing in animal models of human conditions as well as ensuring robust safety and environmental containment of the genetically engineered bacteria''.


*Read the paper:

Schmidt F, Zimmermann J, Tanna T, Farouni R, Conway T, Macpherson AJ, Platt RJ: Noninvasive assessment of gut function using transcriptional recording sentinel cells. Science, 12 May 2022, doi: 10.1126/science.abm6038

About the researchers 

Professor Randall Platt is an Associate Professor at the Department of Biosystems Science and Engineering (D-BSSE) at ETH Zürich and the Department of Chemistry at the University of Basel.

Professor Andrew Macpherson is Professor and Director of Gastroenterology at University Hospital Bern.

Professors Platt and Macpherson, together with fellow consortium members, lead the BRCCH Multi-Investigator Project: Living Microbial Diagnostics to Enable Individualised Child Health Interventions.

1 Related articles

Recording device for cell history (ETH News 03.10.2018)

Bacteria with recording function capture gut health status (ETH News 12.05.2022)



Conversation with Prof Alexandar Tzankov

A Cell Fitness Marker for Predicting COVID-19 Outcomes

COVID-19 is unpredictable. Identifying which COVID-19 patients are likely to develop severe disease versus those at lower risk of complications remains a major clinical challenge. In a recent collaborative study*, BRCCH-funded investigator Professor Alexandar Tzankov (University Hospital Basel) and co-authors discovered a novel biomarker that could be used to predict the prognosis of COVID-19 patients more accurately. In this conversation, Prof Tzankov joins journalist Irène Dietschi to discuss the consortium’s exciting new findings.


Assessing a patient’s risk of developing severe disease is difficult. Usually, individuals who test positive for SARS-CoV-2 are referred to their physician and sent home to isolate. Which patients will develop severe symptoms and require hospitalisation is largely unknown at this point. In a study published in EMBO Molecular Medicine, Prof Alexandar Tzankov and co-authors have now uncovered a means to predict the prognosis of COVID-19 patients more precisely: by using a genetic marker called hFwe-Lose, or simply Flower lose.

Behind this discovery is a relatively recent finding: cells constantly compare their fitness with each other in the body. Collaborators of Prof Tzankov, Prof Eduardo Moreno (Champalimaud Centre for the Unknown, Portugal) and Prof Rajan Gogna (University of Copenhagen), previously identified that the human flower gene (hFwe) can be expressed in different forms, which mark cells as either winners or losers. Fit or 'winning' cells express a form of the flower gene called hFwe-Win, whereas unfit or 'losing' cells express hFwe-Lose. This allows the body to identify unhealthy cells that need to be eliminated.

''The balance of expression of these flower genes is very important physiologically'' says Alexandar Tzankov. ''Their correct expression is critical in embryo and organ development, as well as in diseases such as cancer. hFwe-Lose is a kind of lifetime document for the whole body.'' It can provide insights into how fit a person’s body is at a given moment: what their biological age is, how much cumulative toxicity they have been exposed to during life, whether they have pathological obesity, and how well their body handles high blood sugar and hypertension.

In May 2020, Prof Tzankov had just published an autopsy study of 21 deceased COVID-19 patients, the first major observational cohort of its kind. Professors Moreno and Gogna suspected that flower genes might play a role in the progression of COVID-19 and decided to reach out to Prof Tzankov. ''They suggested examining the tissues of deceased patients for hFwe-Lose, and that's what we did'' says Alexandar Tzankov. The team also examined hFwe-Lose in patients with co-morbidities such as hypertension, diabetes, obesity and chronic obstructive pulmonary disease (COPD). The results confirmed the researchers' original idea: ''In healthy lungs, the expression of hFwe-Lose is very low. In the lungs of patients with co-morbidities, its expression increases. In patients who died of COVID-19, it is very high'', Alexandar Tzankov explains.

hFwe-Lose is a genetic marker that can be used to predict outcomes in COVID-19 patients. Source: EMBO Molecular Medicine (2021) 13:e13714


The researchers then decided to go one step further: They analysed hFwe-Lose levels in nasopharyngeal swab samples collected from 283 COVID-19 patients, unvaccinated at that time, in Wisconsin, USA during the early waves of infection. The team discovered that the higher the hFwe-Lose level was in the nasal sample, the more likely the patient was to develop severe disease and to be hospitalised and/or die of COVID-19. Remarkably, using computational modelling, the team found that hFwe-Lose levels could be used to predict the risk of hospitalisation and death with a high degree of accuracy. ''About 85% of the people for whom the level of hFwe-Lose predicted hospitalisation actually had to go to the hospital. For virtually no one who died had mathematical modelling predicted that they would survive'' explains Alexandar Tzankov.
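The two statements in this quote correspond to two standard measures of a predictive test. On one reading, the 85% figure is a positive predictive value (how often a predicted hospitalisation actually happened), while the remark about deaths describes high sensitivity (few patients who died were missed). The sketch below shows how both are computed from a confusion matrix; the counts are invented round numbers for illustration and are not the study's data.

```python
def predictive_metrics(tp, fp, fn, tn):
    """Compute common measures of a binary predictor from confusion-matrix counts.

    tp: predicted positive, outcome occurred
    fp: predicted positive, outcome did not occur
    fn: predicted negative, outcome occurred
    tn: predicted negative, outcome did not occur
    """
    return {
        "ppv": tp / (tp + fp),          # share of positive predictions that were right
        "sensitivity": tp / (tp + fn),  # share of actual cases that were flagged
        "npv": tn / (tn + fn),          # share of negative predictions that were right
        "specificity": tn / (tn + fp),  # share of non-cases correctly cleared
    }


# Hypothetical counts (not from the paper): 100 predicted hospitalisations,
# of which 85 occurred, and 5 hospitalisations the model missed.
metrics = predictive_metrics(tp=85, fp=15, fn=5, tn=95)
```

With these made-up counts the positive predictive value is 0.85 and the sensitivity is about 0.94. The two measures answer different questions, so a marker can score well on one and poorly on the other.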

hFwe-Lose is relatively straightforward to analyse via the same nasal swab used to test for SARS-CoV-2 infection. ''This makes hFwe-Lose a very useful biomarker for COVID-19 patients'' says Alexandar Tzankov. So what could this mean for clinicians? ''You could potentially identify at-risk COVID-19 patients early, instruct these patients to pay very close attention to symptoms and keep the threshold for hospitalisation lower. That way, emergency situations could possibly be avoided.''

''I admit that this is an optimistic scenario for the use of this marker - but it has the potential.''


Interview: Irène Dietschi

Research article:

MORE about the BRCCH-supported COVID-19 research by Prof Alexandar Tzankov and consortium.

MORE about COVID-19 research by the pathology team at University Hospital Basel.