In a recent study posted to the bioRxiv* server, researchers performed a genomic analysis of confirmed monkeypox virus (MPXV) cases diagnosed in Spain between 18 May and 14 July 2022.
Background
In the current outbreak, the number of disseminated MPXV cases has remained consistently low, and the virus appears to transmit person-to-person (PTP) after a primarily localized rash appears. If this mode is responsible for MPXV transmission, it should also be reflected at the genomic level. So far, genomic studies have focused on describing MPXV’s evolutionary history and tracking the introduction of the virus into Western countries.
These studies revealed that the 2022 MPXV cluster diverged from the related 2016-2019 MPXVs by an average of 50 single-nucleotide polymorphisms (SNPs), with around 24 non-synonymous mutations, ~18 synonymous mutations, and a few intergenic differences. They attributed the mutational bias mainly to the action of apolipoprotein B mRNA-editing catalytic polypeptide-like 3 (APOBEC3) enzymes. Moreover, they described MPXV genetic variations, including the deletion of immunomodulatory genes. Until now, selective pressure exerted by the host and the loss of virus-host interacting genes have driven MPXV evolution.
Retrospective evolutionary studies have provided evidence of MPXV’s unique adaptive strategies; however, these studies have failed to resolve genomic regions with many repeats, especially the low-complexity regions (LCRs). LCRs are not randomly located in the genome and might be associated with adaptive changes related to differences in MPXV transmissibility. They have various complexity levels, such as dinucleotide, trinucleotide, or more complex palindromic repeats. Previous studies have shown that genomic accordions are a rapid path for poxvirus adaptation during serial passaging. There is an urgent need to study these genomic features in the context of MPXV’s evolutionary adaptation.
About the study
In the present study, researchers evaluated the effects of changes in LCRs, including all short tandem repeats (STRs) and homopolymers, in the MPXV genome. They compared the distribution of LCRs between different major functional groups in the MPXV genome. Furthermore, the researchers assessed the levels of intra-host and inter-host variation in LCRs.
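To make the two LCR classes concrete, the following is a minimal Python sketch of how homopolymers and STRs could be located in a nucleotide sequence. It is not the pipeline used in the study: the function names, the length thresholds (at least six identical bases for a homopolymer, at least three copies of a unit of two to six bases for an STR), and the toy sequence are all assumptions chosen for illustration.

```python
import re


def is_primitive(unit):
    """True if the unit is not itself a repeat of a shorter unit (e.g. 'ATAT' is not)."""
    n = len(unit)
    return not any(n % k == 0 and unit == unit[:k] * (n // k) for k in range(1, n))


def find_low_complexity(seq, min_homopolymer=6, min_str_copies=3, max_unit=6):
    """Return (kind, start, unit, copies) tuples for simple repeats in seq.

    Thresholds are illustrative assumptions, not values taken from the study.
    """
    hits = []
    # Homopolymers: a single base repeated at least min_homopolymer times.
    for m in re.finditer(r"([ACGT])\1{%d,}" % (min_homopolymer - 1), seq):
        hits.append(("homopolymer", m.start(), m.group(1), len(m.group(0))))
    # STRs: a unit of 2..max_unit bases repeated at least min_str_copies times.
    for unit_len in range(2, max_unit + 1):
        pattern = r"([ACGT]{%d})\1{%d,}" % (unit_len, min_str_copies - 1)
        for m in re.finditer(pattern, seq):
            unit = m.group(1)
            if not is_primitive(unit):
                continue  # already covered by a shorter unit or a homopolymer
            hits.append(("STR", m.start(), unit, len(m.group(0)) // unit_len))
    return sorted(hits, key=lambda h: h[1])


# Toy sequence (hypothetical): an A homopolymer, an AT repeat, and a GAT repeat.
toy = "GGC" + "A" * 8 + "CGC" + "AT" * 5 + "GCC" + "GAT" * 4 + "C"
for hit in find_low_complexity(toy):
    print(hit)
```

A regular-expression scan like this only finds perfect repeats; interrupted or palindromic repeats, which the article also mentions, would need a dedicated repeat finder.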
The team first obtained a high-quality genome from the unpassaged vesicular fluid of an MPXV case in Spain. Then, they used a conventional validated MPXV nested polymerase chain reaction (PCR) targeting the tumor necrosis factor (TNF) receptor gene to confirm the presence of MPXV in cutaneous lesion swabs. Next, they combined three sequencing technologies, NovaSeq, MiSeq, and nanopore sequencing, to read the complete genome.
Study findings
Only two genomic samples, 353R and 349R, produced enough high-quality viral reads for an allele frequency comparison in most LCR areas. While LCR8 and LCR9 did not show any variation, LCR7 and LCR10/11 showed considerable intra-host variation and differences in the preponderant allele between samples. Based on heterozygosity across the whole genome, the magnitude and scale of variation were significantly higher in LCRs than in SNPs, and the same held intra-host.
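As a rough illustration of how allele frequencies and heterozygosity can be compared between sites, the sketch below computes expected heterozygosity, H = 1 - sum(p_i^2), from per-allele read counts. The article does not state which estimator the researchers used, so this formula is an assumption, and the read counts and function names are hypothetical.

```python
def allele_frequencies(read_counts):
    """Convert per-allele read counts into frequencies that sum to 1."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads support any allele")
    return {allele: count / total for allele, count in read_counts.items()}


def expected_heterozygosity(read_counts):
    """Expected heterozygosity H = 1 - sum(p_i^2); 0 means a single allele."""
    freqs = allele_frequencies(read_counts)
    return 1.0 - sum(p * p for p in freqs.values())


# Hypothetical counts: keys are repeat-copy numbers at an LCR or bases at a SNP site.
lcr_site = {15: 10, 16: 80, 17: 10}   # three repeat-length alleles in one sample
snp_site = {"G": 100}                 # a fixed SNP position, no minor allele

print(round(expected_heterozygosity(lcr_site), 2))  # about 0.34
print(expected_heterozygosity(snp_site))            # 0.0
```

Under this measure, a site where several repeat-length alleles coexist within one sample scores well above a site with a single fixed base, which is the kind of contrast the comparison between LCRs and SNPs describes.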
Four samples belonged to B.1.1, defined by the amino acid mutation R194H in OPG094. The researchers identified one sample as B.1.3, defined by the amino acid mutation R84K corresponding to position G190,660A, and the remaining samples belonged to lineage B.1 of Clade IIb. Within Clade IIb, B.1 strains consistently showed 16 repeats. A.1 strains were polymorphic, A.2 strains showed 23, 25, and 26 repeats, and older lineage A strains had 32, 43, 53, and 71 repeats. Although the researchers detected additional clusters among the Spanish isolates defined by a few SNP changes, they identified an epidemiological link in only one case.
Since the current study results showed a limited relation between SNP changes and virus epidemiology, a convergence effect has likely occurred for this class of viruses, which calls for a change of focus in their genomic epidemiology studies. The researchers identified 21 LCRs: 13 STRs and eight homopolymers. Comparing the degree of diversity among the 21 identified LCRs with SNP variability showed eight LCR areas (1/4, 2, 3, 5, 6, 7, 10/11, 21) with evident signs of intra-host and inter-sample variation.
Five LCRs were co-localized in two areas of the MPXV genome: LCRs 5, 6, and 7 in the core of the genome, and LCRs 3 and 21 in the immunomodulatory area. Three of these LCRs were inside the putative translated regions of the MPXV genes OPG153 (A26L), OPG204 (B16R), and OPG208 (B19R). The changes in the number of repeats observed in LCR3 and LCR21 followed the same pattern, extending the N-terminal region of an immunomodulatory open reading frame (ORF). Given the unusually long repetitive stretch of LCR3, Clade IIb MPXV might require large amounts of tyrosine transfer ribonucleic acid (tRNA) to translate the OPG208 gene. This may be another mechanism by which viral tandem repeat sequences help the virus adapt to a new environment, by modulating ORF expression.
The researchers noted that the changes associated with LCR21/OPG204 were subtle. They did not observe changes in the number of repeats but rather mutations. Thus, most Clade IIb viruses showed multiple alternative starts followed by a lysine amino acid, always encoded by the rarest codon. The most intriguing region of variability observed in an ORF involved the OPG153 gene, which is a significant factor in orthopoxvirus (OPXV) transmission and virulence. Notably, it is the core gene that has been “lost” most often during poxvirus evolution. The LCR7 repeat area, located in the central domain of OPG153, encodes a poly-aspartic acid (poly-D) non-structured region, which is conserved among OPXVs; however, its length is highly variable. Interestingly, Clade IIa MPXV strains have an extended 21-amino-acid poly-D.
Conclusions
The study findings expanded the concept of genome accordions in MPXV evolution. The study demonstrated that LCRs of the genome, which are currently left unresolved in genomic studies, might be crucial for addressing changes in host range or virulence and might contain important transmission information. Thus, there is a need to establish a new standardized approach to generating and analyzing sequencing data that prioritizes these regions. More functional studies are urgently warranted to complement this comparative genomic study.
*Important notice
bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.