Sunday, January 27, 2013

MERS-CoV: A virus is born

Update: In May 2013 the virus was officially named 'MERS-CoV'.  At the time of this writing, EMC/2012 appeared to be the most official name for the virus, so that is the name used in the text.  Thus, 'EMC/2012' in this article is the same as 'MERS' or 'MERS-CoV', as it will be referred to in the media and literature from now on (31 May 2013).

Viruses are everywhere.  For every species of living organism on the planet, there are probably multiple viruses that can infect it.  There are even a few viruses that infect other viruses.  The Darwinian combination of random mutation and natural selection is exploited by viruses more rapidly than by their hosts, keeping them one step ahead in the race of adaptation.  The result is often new strains of an already existing virus, which can sometimes hop from one species into another.  Case in point: this past fall the World Health Organization (WHO) reported the identification of a new human virus that has been fatal in over 50% of those unfortunate few who have been infected so far.  This virus was named EMC/2012, and it is of particular concern to the WHO because it is a coronavirus.  That is the same family of viruses that SARS emerged from a decade ago.  By comparison, SARS was fatal in 11% of the 8,422 cases in which it was identified.  So how is EMC/2012 different from SARS, and how is it similar?  Exploring this question can help gauge the probability of a new pandemic.

There are at least 29 classified (and dozens more yet to be classified) kinds of coronaviruses, which can infect a variety of mammals and birds.  Prior to 2002, only 2 of these were known to infect humans, and only with mild respiratory consequences.  However, in November of that year, clusters of severe pneumonia cases were identified in people living in the Guangdong Province of China, and then Hong Kong.  Heightened concern led to urgent research, which in turn revealed the cause to be a third, previously unidentified human coronavirus.  By the time this suspect was identified as the SARS virus, it had already fled to other parts of Asia, Europe, and North America with some ensuing fatalities.  But within 1-2 years, SARS had faded away and the face masks began to come off.  SARS was gone but not forgotten: heavy research continued to address where it came from and how it became transmissible.  Incidentally, by 2009 two additional coronaviruses were found to be milling about the human population and causing colds; we had never known about them because we never had a serious reason to look for them.  But last year a reminiscent series of events developed.  In June 2012 a man was admitted into a hospital in Saudi Arabia with severe pneumonia.  Eleven days later he died, before the cause was known.  The following month another patient was found with similar symptoms in Qatar, and eventually became severely ill.  Thanks to further investigation, thorough documentation, and the keen eyes of healthcare workers, the two dots were connected and the World Health Organization announced in late September that the two cases were caused by the same pathogen: this time not SARS, but an unknown coronavirus.  Within a little over a month the successful isolation of the virus in laboratory cell cultures was reported, and it was named EMC/2012.  And then by December the complete genome was sequenced.
In the meantime, 7 more people were found to have contracted the virus in the Middle East, and four of those cases were fatal.  In time, EMC/2012 will probably be confirmed in other cases, but for the moment it does not appear to be widespread, and may not be overly transmissible among us humans.  Analysis of the EMC/2012 genome shows that it appears to have evolved from coronaviruses in the bat population.  Although some uncertainty lingers, there is significant evidence that the SARS virus also slipped into the human population from similar coronaviruses circulating in bats.  So, why bats?  Bats make up about 20% of all mammalian species, and about 40% of all mammals in Hong Kong are bats.  A large diversity of bats equals a large diversity of mammalian coronaviruses.  And many bat species adapt well to urban environments, which brings them into close proximity with human populations.

In order to understand how a bat virus can suddenly become a human virus we need to get to know viruses on a physical level.  Viruses are composed of biological components, but are not quite living themselves.  They are the undead, the zombies of the biological world.  Like a riddle wrapped inside an enigma, they are genes wrapped inside of proteins.  The proteins determine what they will infect, and the genes determine the proteins once the infection has taken place.  To elaborate on this concept, the genes may be DNA or various forms of RNA depending on the virus.  All proteins are continuous chains made of 20 possible amino acid links.  Analogous to how programming code determines computer software, the genetic code of DNA or RNA determines the sequence of amino acids for the proteins as they are synthesized within the cell.  Each amino acid is like a small compound with inherent properties: some are positively charged, some are negative, and some have no charge; some are polar, some are non-polar; some react to pH, some don’t.  A smaller protein of, say, 100 amino acids would have (in theory) 20^100 possible arrangements of those amino acids, that is, 20 multiplied by itself 100 times.  That amounts to a lot of variability in terms of the physical nature of each possible protein as a whole.  Thus, many possible protein shells allow for many possible viruses.
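To get a feel for how large that count is, the arithmetic can be done directly: 20 choices for each of 100 positions means 20 raised to the 100th power.  The short Python sketch below is only an illustration; the function name count_sequences is made up for this example, and it counts theoretical arrangements, not proteins that actually exist or fold.

```python
def count_sequences(length, alphabet_size=20):
    """Count the theoretical chains of `length` links, where each link
    is one of `alphabet_size` amino acids (20 by default)."""
    return alphabet_size ** length


total = count_sequences(100)

# Python integers have no fixed size limit, so the exact value is available.
print(len(str(total)))  # number of decimal digits in 20^100 -> 131
```

In other words, the count is a 131-digit number, roughly 1.27 followed by 130 more digits.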

Coronavirus diagram showing the arrangement of spike (S), membrane (M), and nucleocapsid (N) proteins around a core of RNA genes.  Some coronaviruses may also have a hemagglutinin-esterase (HE) protein in the lipid membrane.  Small envelope (E) proteins can also be present in small amounts on a mature virus.

As for the coronavirus family, their genes are made of RNA, which is encased inside of a double shell constructed mainly of 3 or 4 distinct proteins.  One of these proteins is called (S) for “spike” because it protrudes from the virus, and this is what determines what type of cell the virus will stick to and infect.  Because (S) is usually about 1300 amino acids in length, there is room for considerable variability in the amino acid makeup among coronavirus “spike” proteins.  When a virus infects a cell, the virus is taken apart, and its foreign genes are copied over and over again.  But like a document that has been photocopied or faxed too many times, the fidelity is poor.  This means an abundance of mutations, and not all of the new ‘baby’ viruses will be genetically identical to their ‘parent’ virus, for better or worse.  So how many mutations need to occur to generate a new virus?  This can depend on a lot of factors, but the arrangement of amino acids on the ‘spike’ protein of EMC/2012 is 64-67% identical to that of the two bat viruses from which it is believed to have emerged.  This demonstrates another complexity: often new viruses are the result of a species being infected simultaneously by two similar viruses, creating the opportunity for a genetic swap meet.  It is impossible to predict the epidemiological path that viruses will take, being the marauding mutants that they are.  But for now it appears that we can reasonably hope that EMC/2012 was just a blip on the radar.
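For readers curious what a figure like "64-67% identical" means in practice, it is the share of positions at which two protein chains carry the same amino acid.  The Python sketch below is a simplified illustration under stated assumptions: the two chains are already lined up and equal in length (real comparisons first align the sequences and must deal with gaps), the function name percent_identity is made up for this example, and the two ten-letter fragments are invented, not real spike protein data.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two pre-aligned, equal-length
    amino acid sequences carry the same one-letter code."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


# Invented toy fragments (NOT real viral sequences); they differ at 3 of 10 positions.
fragment_one = "MKLVTSAGRE"
fragment_two = "MKIVTAAGKE"

print(percent_identity(fragment_one, fragment_two))  # 70.0
```

On a spike protein of about 1300 amino acids, 64-67% identity would correspond to several hundred positions that differ.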