What is a “variant” as it relates to COVID-19? (for non-geneticists)

If you watch or read the news, you have probably seen a flurry of reports of SARS-CoV-2 variants popping up in the US and around the world. Some reports generalize and say something like “The South African variant was found in South Carolina” or “The E484 variant was seen in Massachusetts”. This can be confusing because the media’s coverage is often vague and sometimes a bit misleading.

In this post, I want to bring everyone up to speed about what qualifies as a variant and how it relates to public health in this COVID-19 pandemic.

What is a Variant?

Mutation: Usually describes a change in the viral RNA or protein. (For example, “the G nucleotide base at position 123 has now mutated to a T”.)

Variant: Usually characterizes the mutation a bit further as it relates to the viral proteins. (For example, “the D amino acid at position 12 is now a Q”.) This terminology is what most scientists are using to name particular variants (e.g., E484 or N501).

We find these variants using a method called “variant calling”, which compares a new sample’s sequence to a reference sequence and looks at positions in the genome that are different between the two. Most commonly, bioinformaticians are using an early 2020 sequence from Wuhan in Hubei, China (accession: NC_045512) as the reference sequence.

For example, let’s compare the reference sequence to a sequence collected in North Carolina (accession: MT52282). We first align the new sequence to the reference sequence and then look for changes. Notice a G>T change at position 25563 below:

Alignment shown in AliView using MUSCLE.
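To make the comparison step concrete, here is a minimal Python sketch of the idea behind variant calling: walk two already-aligned sequences and report every position where they differ. The function name `call_variants` and the short toy sequences are made up for illustration (they are not real SARS-CoV-2 data), and real pipelines do far more, such as quality filtering and proper handling of insertions and deletions.

```python
def call_variants(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based reference position, reference base, sample base) per mismatch.

    Both strings must already be aligned to the same length. Gap characters
    ('-') and ambiguous bases ('N') are skipped in this simplified sketch.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")

    variants = []
    ref_pos = 0  # position counted along the reference, ignoring gaps
    for ref_base, sample_base in zip(reference.upper(), sample.upper()):
        if ref_base != "-":
            ref_pos += 1
        if "-" in (ref_base, sample_base) or "N" in (ref_base, sample_base):
            continue
        if ref_base != sample_base:
            variants.append((ref_pos, ref_base, sample_base))
    return variants


# Toy aligned sequences, invented for illustration only.
toy_reference = "ATGGATTTGTTTATGAGAATC"
toy_sample = "ATGGATTTGTTTATTAGAATC"

for pos, ref_base, alt_base in call_variants(toy_reference, toy_sample):
    print(f"{ref_base}>{alt_base} at position {pos}")
# prints: G>T at position 15
```

On real data, the same loop would run over the full aligned genomes exported from an alignment tool, and the reported positions would be coordinates on the reference sequence.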

Then, we can translate these RNA sequences into amino acids to see if this changes the resulting viral protein. Spoiler alert: it does!

Notice the Q⮞H change at position 8521 of the translated sequence (which, in terms of the specific protein ORF3a, happens to be position 57). Thus, this variant is called Q57H.
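The arithmetic behind that name can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper `name_protein_change` is hypothetical, the ORF3a start coordinate (25393 on NC_045512) is taken as an assumption you should confirm against the reference annotation, and the reference codon CAG is inferred from the G>T and Q⮞H changes described above. Sequences are written with T rather than U, as they usually appear in sequence files.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def name_protein_change(genome_pos: int, gene_start: int,
                        ref_codon: str, alt_base: str) -> str:
    """Name an amino acid change such as 'Q57H' from a single-base change.

    genome_pos and gene_start are 1-based genome coordinates; ref_codon is
    the reference codon that contains genome_pos.
    """
    offset = genome_pos - gene_start      # 0-based offset within the gene
    codon_number = offset // 3 + 1        # 1-based amino acid position
    index_in_codon = offset % 3           # 0, 1, or 2
    alt_codon = (ref_codon[:index_in_codon] + alt_base
                 + ref_codon[index_in_codon + 1:])
    return f"{CODON_TABLE[ref_codon]}{codon_number}{CODON_TABLE[alt_codon]}"


ORF3A_START = 25393  # assumed start of ORF3a on NC_045512; verify before use

print(name_protein_change(25563, ORF3A_START, "CAG", "T"))
# prints: Q57H
```

Counting codons from the very first base of the genome instead of from the start of ORF3a gives (25563 - 1) // 3 + 1 = 8521, which is where the position 8521 above comes from.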

So, when you see the news talking about E484 or N501, you now know that the letter is the reference amino acid, the number is its position in the protein, and the name means the amino acid at that location has changed.

Sequencing is Key

As sequencing becomes more common, we’ll be able to better understand which variants are common in which places and, for any variants that are a cause for concern, plan accordingly with public health guidance.

SARS-CoV-2 variants in North America as of 2/4/21. Source: janieslab.github.io/sars-cov-2 and Ford et al. (2021)

Not all Variants are Created Equal

  1. Transmissibility: Certain variants may strengthen the interaction between a viral protein and a human protein, making it easier for the virus to infect and spread. A specific example is a change in the S protein that increases its binding affinity to the human ACE2 protein in our lungs.
  2. Virulence or Pathogenicity: Certain variants may increase the severity of an infection. This could affect the chances of survival or the long-term effects of COVID-19.
  3. Vaccine Efficacy or Reinfection: Certain variants may make the virus unrecognizable to your body’s immune system, which would then need to create new antibodies. This means that some mutations may make it possible to be reinfected with SARS-CoV-2 if the variant is one that your existing antibodies don’t recognize. This directly relates to the efficacy of vaccines. Luckily, vaccine manufacturers have published press releases (see Moderna’s and Pfizer’s posts) about their vaccines’ effectiveness against new variants.
  4. Nothing at all: A mutation in a viral genome may not do anything at all, or may not do much with regard to its effect on human life.

Co-Occurrence ≟ Travel

One other option is a phenomenon called “convergent evolution” or “homoplasy”. When we put selective pressure like treatments and vaccines on a pathogen, it is possible that the pathogen will mutate at the same genomic location (or mutate in a similar way) in separate geographical locations out of necessity to survive and proliferate.

The Law of Smaller Numbers

So, stay home, wear your mask, get a vaccine ASAP, and slow the spread.

#nipitinthebud #staycurious

Cloud AI and genomics guy and aspiring polymath. I do research in machine learning and bioinformatics and I sometimes write things here.