A recent study posted on the bioRxiv* preprint server explored the diversity of the sarbecovirus glycan shield, adding to the knowledge needed to develop broad-range pan-coronavirus vaccines.
Background
The various coronavirus outbreaks, from the severe acute respiratory syndrome coronavirus (SARS-CoV-1) epidemic in 2003 to the latest severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic in 2020, have emphasized the importance of vaccines in limiting the severity of these zoonotic diseases. Many vaccines developed to combat the COVID-19 pandemic targeted the SARS-CoV-2 spike (S) glycoprotein and built on existing knowledge of protein structure and protein engineering techniques.
Sarbecoviruses belong to the Betacoronavirus genus and include SARS-CoV-1 and SARS-CoV-2. Many sarbecoviruses with high similarity to the SARS-CoV-2 virus have been found circulating in various animal populations, such as bats and pangolins, and pose a high risk of zoonosis. Understanding the similarities and differences of the glycan shield of various sarbecoviruses would aid in developing vaccines to combat a broad range of sarbecoviruses.
About the study
The S protein, which enables the virus to enter the host cell, undergoes various modifications inside the host cell, such as proteolytic cleavage, maturation, and N-linked glycosylation. The major components of the S protein include an N-terminal domain (NTD), a receptor-binding domain (RBD), and a transmembrane C-terminal domain, along with the fusion peptide and heptad repeats 1 and 2. Proteolytic cleavage separates the S protein into the S1 subunit (containing the NTD and RBD) and the S2 subunit. Approximately a third of the S protein mass consists of N-linked glycans, which play an important role in the correct folding and stabilization of the protein. The N-linked glycans also affect the epitopes recognized by neutralizing antibodies, and modified N-linked glycans, such as oligomannose-type N-linked glycans, contribute to the density of the glycan shield.
In this study, the researchers selected 78 sarbecoviruses with sequences similar to that of SARS-CoV-2 and investigated the conserved and variable N-linked glycosylation sites. To explore the variability of the glycan shield, they selected 11 sarbecovirus spike protein genes and introduced mutations similar to those used in current SARS-CoV-2 vaccines.
The resulting soluble, native-like S protein trimers were then purified and analyzed using liquid chromatography-mass spectrometry. Aliquots of the spike glycoproteins were also digested with trypsin, chymotrypsin, and alpha-lytic protease to investigate the glycan-processing state of each site. The researchers then mapped the N-linked glycan sites onto structural models of the sarbecoviruses to study the three-dimensional environment of these sites.
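To illustrate how conserved and variable N-linked glycosylation sites can be identified from sequence data, the following is a minimal Python sketch and not the researchers' actual pipeline. It relies on the well-established N-linked glycosylation sequon (asparagine, then any residue except proline, then serine or threonine). The function names, the `toy` dictionary, and the short sequences in it are hypothetical placeholders, not real spike sequences, and the sketch assumes the sequences are already aligned, equal in length, and free of gap characters.

```python
import re
from collections import Counter

# N-linked glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons (e.g. "NNTS") be reported separately.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(seq):
    """Return the 1-based position of the Asn in each N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


def site_conservation(aligned):
    """Map each sequon position to the fraction of sequences that carry it.

    Assumption: sequences are pre-aligned, equal in length, and gap-free.
    """
    if len({len(s) for s in aligned.values()}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter(pos for s in aligned.values() for pos in find_sequons(s))
    return {pos: n / len(aligned) for pos, n in sorted(counts.items())}


# Hypothetical toy sequences for illustration only (not real spike proteins).
toy = {
    "virus_A": "MNITQNPSANGTK",
    "virus_B": "MNITQNPSANGSK",
    "virus_C": "MQITQNPSANGTK",
}

for position, fraction in site_conservation(toy).items():
    print(f"sequon at position {position}: present in {fraction:.0%} of sequences")
```

In this toy example, the sequon at position 10 appears in every sequence, while the one at position 2 is missing from `virus_C`. A sequon only marks a potential glycosylation site; whether a site is occupied, and how its glycan is processed, has to be determined experimentally, as was done in the study using mass spectrometry.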
Results
Comparing the N-linked glycan sites of various sarbecoviruses with those of SARS-CoV-2 revealed that some glycosylation sites were highly conserved while others were highly variable. The researchers found that all the strains contained highly conserved regions on the S2 subunit. Most strains also shared one conserved site rich in oligomannose-type glycans and two conserved sites corresponding to those on the SARS-CoV-2 RBD.
The divergent sites were found on or close to the RBD and also exhibited restricted glycan processing. Subsequent variants of the SARS-CoV-2 virus have shown that the RBD is subject to immune selection pressure, and that antibody binding is reduced considerably by just a few mutations in this region. The results indicate that the glycan shields are largely conserved and play a role in maintaining the structure and function of the S protein. However, determination of the glycan-processing states revealed them to be highly variable, despite many N-linked glycosylation sites being conserved across all sarbecoviruses. The researchers also inferred that the glycan shield around the RBD of all the analyzed sarbecoviruses is sparse.
Conclusions
Overall, the study results showed that most of the N-linked glycosylation sites were conserved across multiple clades of sarbecoviruses. The variants of SARS-CoV-2 displayed variability in the N-terminal regions, with the gamma variant containing a glycosylation site (N20) not found in the original strain from Wuhan but found in two other sarbecoviruses. Moreover, the S2 region of the protein contained many conserved sites and displayed low levels of oligomannose-type glycans. Certain highly conserved sites nevertheless showed high variability in glycan processing, indicating heterogeneity.
The study provides fresh insights into potentially novel target regions of the spike glycoprotein for developing broad-range vaccines against sarbecoviruses. It also highlights the highly variable regions of the RBD, indicating potential pitfalls of developing vaccines targeted to those sites.
Given the rapid rate at which SARS-CoV-2 variants have emerged, combined with the discovery of various highly similar sarbecoviruses that circulate in nature and are potential zoonotic pathogens, there is a pressing need to develop a pan-coronavirus vaccine. While the researchers note that antibodies generated against these conserved target sites lack the potency of RBD-specific antibodies, the study emphasizes the importance and potential of exploring conserved regions for vaccine design.
*Important notice
bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.