Genetics


Svante Pääbo’s group at the Max Planck Institute has a paper coming out in the February issue of Current Biology. In it, they describe sequencing a complete early modern human mitochondrial genome from the Markina Gora specimen from the Kostenki 14 site in Russia. The remains date to around 30,000 years ago. This is not the oldest human sequence, but it is interesting nonetheless because the authors have identified new ways to determine whether ancient DNA sequences are genuine or the result of contamination. This is especially important for anatomically modern human fossils, whose sequences may be similar to those of extant populations.

For Neandertal mtDNA, identifying contamination is relatively simple, because their mtDNA sequences fall outside the range of variation found in modern humans. Not so for more recent fossils.  So how can researchers identify true archaic sequences?

“fragment length, deamination-induced sequence errors at ends of molecules, and purine-associated fragmentation represent features by which endogenous and contaminating populations of DNA molecules can be distinguished in at least some late Pleistocene specimens” (1).

So, fragments sequenced from ancient samples are typically shorter than modern contaminants. In many cases, the fragments are shorter than what can be amplified using PCR, meaning high-throughput direct sequencing methods are required to analyze these ancient samples. In addition, the cytosine bases at the 5′ ends of ancient DNA fragments are susceptible to deamination (loss of an amine group, -NH2), which converts them to uracil and causes those bases to be misread as thymine. The 3′ ends of ancient sequences show a complementary increase in G-to-A errors. Finally, fragmentation of ancient sequences occurs more frequently at purine bases (guanine and adenine).
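To make the damage criteria concrete, here is a minimal Python sketch of how the end-of-fragment error patterns could be tallied. It is not the pipeline used in the paper. The function names, the `window` parameter, and the input format (gap-free reference/read pairs, both written 5′ to 3′) are my own assumptions for illustration.

```python
from collections import Counter


def end_mismatch_counts(pairs, window=5):
    """Tally C->T mismatches near 5' ends and G->A mismatches near 3' ends.

    pairs: iterable of (reference, read) strings of equal length, assumed to be
    aligned without gaps and oriented 5'->3' (a hypothetical input format).
    window: how many bases from each end to inspect (an arbitrary choice).
    """
    counts = Counter()
    for ref, read in pairs:
        if len(ref) != len(read):
            raise ValueError("reference and read must be the same length")
        n = len(read)
        for i, (r, q) in enumerate(zip(ref.upper(), read.upper())):
            # 5' end: reference cytosines that the read reports as thymine
            if i < window and r == "C":
                counts["C_sites_5p"] += 1
                if q == "T":
                    counts["C_to_T_5p"] += 1
            # 3' end: reference guanines that the read reports as adenine
            if i >= n - window and r == "G":
                counts["G_sites_3p"] += 1
                if q == "A":
                    counts["G_to_A_3p"] += 1
    return counts


def damage_rates(counts):
    """Convert the tallies into per-site mismatch rates (0.0 if no sites seen)."""
    c_sites = counts["C_sites_5p"]
    g_sites = counts["G_sites_3p"]
    return {
        "C_to_T_5p": counts["C_to_T_5p"] / c_sites if c_sites else 0.0,
        "G_to_A_3p": counts["G_to_A_3p"] / g_sites if g_sites else 0.0,
    }
```

Elevated rates at both ends, combined with short fragment lengths, would be consistent with endogenous ancient DNA. Rates near zero would point toward modern contamination.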

With these criteria in mind, the researchers determined that the Markina Gora sequence belongs to mitochondrial haplogroup U2, a haplogroup still present in Europe today.

Figure 3D from Krause et al. (2010) - with the early modern human (EMH) sequence highlighted in red.

The authors determine that it is unlikely that this sequence is the result of modern contamination, because the branch separating the Markina Gora specimen from the ancestral U sequence is much shorter than the branches seen between the root and modern sequences, which have accumulated many more mutations over time. Their results also support the hypothesis of pre-agricultural genetic continuity in Europe: genetic lineages that were present on the continent prior to the Neolithic transition can still be found in modern European populations.

Krause J, Briggs AW, Kircher M, Maricic T, Zwyns N, Derevianko A, & Pääbo S (2009). A Complete mtDNA Genome of an Early Modern Human from Kostenki, Russia. Current Biology. PMID: 20045327

I ran into an issue while doing some analysis for my dissertation.  I’ve been working on comparing genetic distances between populations using a variety of molecular markers (mtDNA sequences, Y-STRs, and autosomal STRs).  I wanted to generate several neighbor-joining trees to display the results, but I also wanted a way to test the statistical significance of the tree, or how accurate a representation of the underlying genetic distance data the tree actually was.

One way to do this is with bootstrapping, where thousands of pseudo-replicate data sets are generated from the original data (by resampling it with replacement and recalculating the tree each time). In the end you have a tree with values on its internal branches, showing how often each node turned up across the replicates. It’s a standard technique, and is the method I used with my autosomal STR data. But the software I used to handle sequence data in particular (MEGA, PHYLIP) starts with the raw sequences and generates bootstrapped trees from that data. The trees created show each sequence on its own branch, rather than each population. With over 8,000 sequences in my data set, this type of analysis really wasn’t useful.
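The resampling idea can be sketched in a few lines of Python. This is a toy illustration, not what MEGA or PHYLIP do internally. The distance measure (mean absolute allele-frequency difference) and the "closest pair" stand-in for a tree node are simplifying assumptions of mine, and all names are hypothetical.

```python
import random


def mean_distance(freqs_a, freqs_b, loci):
    """Toy distance: average absolute allele-frequency difference over loci."""
    return sum(abs(freqs_a[i] - freqs_b[i]) for i in loci) / len(loci)


def bootstrap_support(pops, pair, replicates=1000, seed=0):
    """Fraction of bootstrap replicates in which `pair` is the closest pair.

    pops: dict mapping population name -> list of per-locus allele frequencies
          (all lists the same length).
    pair: two population names whose grouping is being tested.
    """
    rng = random.Random(seed)
    names = sorted(pops)
    n_loci = len(pops[names[0]])
    target = tuple(sorted(pair))
    hits = 0
    for _ in range(replicates):
        # Resample loci WITH replacement to build one pseudo-replicate data set
        loci = [rng.randrange(n_loci) for _ in range(n_loci)]
        # Recompute all pairwise distances and find the closest pair
        best = min(
            ((a, b) for i, a in enumerate(names) for b in names[i + 1:]),
            key=lambda ab: mean_distance(pops[ab[0]], pops[ab[1]], loci),
        )
        if best == target:
            hits += 1
    return hits / replicates
```

In a real analysis each replicate would rebuild the full neighbor-joining tree, and the support value for a node would be the fraction of replicate trees that contain it.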

But last week I found TreeFit, a little Windows program that generates an overall R² value by comparing the genetic distance matrix with the distances calculated based on the neighbor-joining algorithm. Basically, it appears comparable to the STRESS value used for multidimensional scaling (MDS), measuring how well the representation of the data (the NJ tree) matches the variation present in the original distance matrix. A perfect fit would generate an R² value of 1.0, while anything above 0.90 is considered a good fit (or an accurate representation of the underlying data). Values less than 0.90 suggest that another graphical display method, such as MDS, might be a better choice, as not all data fit the hierarchical model on which the NJ algorithm is based.
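For a sense of what such a fit statistic measures, here is a minimal sketch that assumes the ordinary coefficient of determination applied to the pairwise distances. TreeFit’s exact formula may differ, so treat this as an illustration. The function name and argument layout are mine.

```python
def tree_fit_r2(observed, fitted):
    """R^2 between observed distances and distances fitted by a tree.

    observed, fitted: equal-length lists holding one value per population
    pair, in the same order. Uses the standard definition
    1 - SS_residual / SS_total, which is an assumption about TreeFit.
    """
    if len(observed) != len(fitted) or not observed:
        raise ValueError("need two non-empty lists of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed distances have no variance")
    return 1 - ss_res / ss_tot
```

Here the fitted distance for a pair of populations would be the sum of the branch lengths along the path connecting them in the NJ tree.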

Using TreeFit, I got some reassurance that my NJ trees were accurate, and the statistical significance I needed to convince my committee that my data is not “merely descriptive.”

Technical specs:

  • OS: Windows (runs fine on my XP virtual machine)
  • Requires the MS .NET Framework
    • if this is not installed on your system (as it wasn’t on mine), it can be downloaded from the Windows Update site
  • Input file: any lower-left (lower-triangular) genetic distance matrix, meaning that this program works with ANY type of genetic data.
  • Output: observed and fitted genetic distances (these can be plotted for a nice visual), plus the overall R²
  • Reference: Kalinowski, ST (2009) How well do evolutionary trees describe genetic relationships between populations? Heredity (28 Jan 2009) doi: 10.1038/hdy.2008.136. (PDF available from the author’s publication page).
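As an illustration of what a lower-left distance matrix looks like in practice, here is a small Python sketch that reads one into a full symmetric matrix. The layout is an assumption on my part (one population per line, a name followed by its distances to the populations listed above it, no diagonal). TreeFit’s actual input format may differ, so check its documentation.

```python
def read_lower_triangle(text):
    """Parse a lower-left distance matrix into (names, full symmetric matrix).

    Assumed layout (hypothetical, not taken from the TreeFit manual):
        PopA
        PopB 0.12
        PopC 0.20 0.31
    """
    names, rows = [], []
    for line in text.strip().splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        names.append(parts[0])
        rows.append([float(x) for x in parts[1:]])

    n = len(names)
    full = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(rows):
        # Row i must hold exactly i distances, one per earlier population
        if len(row) != i:
            raise ValueError(f"row for {names[i]} should have {i} distances")
        for j, d in enumerate(row):
            full[i][j] = full[j][i] = d  # mirror across the diagonal
    return names, full
```

Because only pairwise distances are needed, the same structure works whether the distances came from mtDNA sequences, Y-STRs, or autosomal STRs.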

The Lost King of France

This was one of my Half-Price Books finds that had been gathering dust in my to-be-read pile for several years. I had tried reading it once, got bored, and put it away. But when I picked it up at the end of the semester, I really couldn’t put it down.

The Lost King of France tells the story of the son of Louis XVI and Marie-Antoinette, Louis-Charles, who was imprisoned in the Temple along with his parents and older sister during the Revolution. The tale is similar to what happened to the Romanovs in Russia in the early 20th century, but it was one I had never heard. I knew, of course, that King Louis XVI and Marie-Antoinette had been beheaded by Robespierre’s government, but not that their children remained imprisoned for years after their parents’ deaths.

Their daughter, Marie-Thérèse, was released (and exiled) in 1795, but her brother had been secluded years before, and rumors ran riot that he had been smuggled safely out of the Temple. As with the Romanovs, there were pretenders to the throne, whom Marie-Thérèse never openly acknowledged, due in part to the official record stating that the dauphin had died in the Temple in 1795. But no one was really sure what had happened to him. That is, until 2000, when geneticists analyzed a tissue sample from a child’s heart, reportedly taken from the Orphan in the Tower during the autopsy by the attending physician.

This was a great read, engaging, and combining two of my favorite subjects, history and genetics. Better still, it demonstrates how genetic analysis can be used to answer historical questions, unequivocally.

PLoS ONE has an article this month titled The Phylogeny of the Four Pan-American mtDNA Haplogroups: Implications for Evolutionary and Disease Studies. There are several points of interest:

  1. The authors make use of data that is publicly available, either through GenBank or other DNA databases.
  2. Complete mtDNA sequences (i.e., all 16,568 bases) were used for phylogenetic reconstruction.
  3. Among 265 “novel” mtDNA sequences reported among Hispanics and African Americans in a recent addition to GenBank, 101 were of Native American origin.
  4. All four Native American founder lineages (A2, B2, C1, D1) date to between 18,000 and 24,000 years ago.

Their results suggest that human expansion into the Americas coincided with the decline of the Last Glacial Maximum (Ice Age), knocking another hole in the “Clovis-first” hypothesis. Given that all four lineages give similar coalescent times, this study may also contribute to the “waves of migration” debate.


Achilli A, Perego UA, Bravi CM, Coble MD, Kong QP, Woodward SR, Salas A, Torroni A, & Bandelt HJ (2008). The phylogeny of the four pan-American MtDNA haplogroups: implications for evolutionary and disease studies. PLoS ONE, 3 (3). PMID: 18335039

Mendel’s Dwarf


Simon Mawer’s Mendel’s Dwarf examines the ethical implications of genetic research through a fictional account of the discovery of the achondroplasia (dwarfism) gene. The title character is Dr. Benedict Lambert, a geneticist who also happens to be a dwarf and a distant nephew of the father of genetics, Gregor Mendel.

The novel skips between Ben’s research and Mendel’s work. The historical part of the novel was much more interesting to me, being the history of my field. The modern sections spent a little too much time focused on the one part of Ben that was “normal-sized,” and as a result, Ben isn’t a likable or sympathetic character. The actions in his personal life overshadow his work and accomplishments.

The novel did give me the opportunity to think about my field in a new way. I had heard before that Darwin never read Mendel, but it hadn’t occurred to me that Mendel probably read Darwin. On the Origin of Species was a famous book, not just in England but likely on the continent as well, so it makes sense that Mendel had access to it, given his interests.

Mention is made of the Russian geneticists who were prohibited from studying Mendel by the state, whose policy considered nurture above all, with no place for the possibility that some traits might be inherited. Those scientists who refused to toe the party line were either shot, imprisoned, or exiled to Siberia, some for upwards of 15 years. Scientists in the US have faced similar censorship in recent years, though not yet with such drastic results.

Mawer also draws parallels between the eugenics movement at the turn of the 20th century and modern “family-balancing” techniques, allowing parents to choose the sex of their offspring. He sees a slippery slope here, with genetic counseling being not so different from “purifying the genome” through ethnic cleansing. As a geneticist, I’m not sure I agree. But it’s definitely an issue worth examining, as we are only just beginning to consider the ethical implications of Mendel’s work.

Overall, the book is interesting, though it may make the reader uncomfortable. I think that, ultimately, may be Mawer’s intent.
