Friday, December 28, 2012

AncestryDNA and a Possible Faux Pa: Part Deux

My mother's genetic journey began with one mystery, and led to another.

My mother's late father was a Cyr, but the family had long suspected that he was fathered by his mother's eventual second husband, a Levesque. The most pressing reason for my taking an autosomal DNA test was to find evidence of her father's paternity. I had researched both the Cyr and Levesque pedigrees, and hoped that among my genetic matches would be a cousin whom I could tie to one of the branches of one of these trees.

That mystery has since (tentatively) been solved, but another popped up when my AncestryDNA results came back. As I discussed in my earlier post, the results showed no evidence of my French ancestry. I speculated that the test could have missed (i.e. misidentified) my French genes, or else my grandmother could have been unfaithful. When AncestryDNA opened to the public in early November, I ordered a test for my mother, and two-and-a-half weeks later the results were in.

Recall that this was my "Genetic Ethnicity Summary":

My mother's summary is a bit more interesting:

Her mother was born to Finnish immigrants; her father was Franco-American. I've concluded that the Finnish/Volga-Ural came from her mother, along with the Eastern European and some of the Scandinavian. The rest came from her father. The test didn't recognize her Frenchness, though almost all of her matches who shared their family trees were of French-Canadian ancestry. In fact, two of her matches were 4th to 6th cousins through her Cyr line, making it less likely that her biological grandfather was "Grampa Levesque."

One mystery was fading, but the other deepened. Was her summary reflecting her father's "deep ancestry," or was it simply mistaken?

It wasn't AncestryDNA that helped me solve this mystery, but its rival 23andMe. I was able to apply that company's recently announced Ancestry Composition feature to test results I had previously obtained for my maternal uncle. To a point, the feature is similar to AncestryDNA's Genetic Ethnicity Summary. Beyond that point, AncestryDNA's weaknesses become apparent.

Ancestry Composition provides three resolutions: Global, Regional and Sub-regional. Sub-regional Resolution gives a summary much like AncestryDNA's:

This shows at least some French ancestry, which is better than AncestryDNA did for my mom. But notice that, whereas AncestryDNA assigned 99% of my mother's DNA to more or less specific regions, 23andMe assigns only about 77% of her brother's—the rest is labeled "Nonspecific European" or "Unassigned." 23andMe also seems hesitant to recognize Scandinavian ancestry. Obviously AncestryDNA is the better company!

But wait. When expressing genealogical conclusions, I always try to couch them in words like "probably," "apparently," or "possibly." How certain is AncestryDNA of its conclusions? It provides confidence percentages for DNA matches, but not for its ethnicity calculations. We are told simply that the latter results are held to "an extremely high standard of accuracy," and (by implication) should not be doubted. Why not? Because SCIENCE!
There are a few reasons why your ethnicity may not be exactly what you expected:
  • Your genetic ethnicity may go back further than your family tree.
  • While your ancestors lived in a certain country, there may have been genetic influence from other places.
  • You don’t necessarily share common DNA with all of your ancestors.
In other words, if our conclusions don't match your expectations, your expectations were wrong.

Blaine Bettinger quotes a concession that I can't find on the website:
Right now, your genetic ethnicity may not look quite right, with some ethnicities under or over-represented. As scientists gain a deeper understanding of the data, our prediction models will evolve to provide you with more accurate and relevant information about your family history.
The website concedes only that the ethnicity summaries "may update over time as new genetic signatures are discovered," and that they might someday be able to show "more granular ethnic regions." It promises refinement—something less than correction.

To be clear, my issue isn't with the quality of AncestryDNA's prediction models, but with the presentation of its predictions. To present a prediction without qualification is to imply that we should trust that prediction without reservation. "A hard rain's a-gonna fall" is a very different statement from "There is a 60 percent chance of precipitation."

Let's compare AncestryDNA's approach with 23andMe's. Just above my uncle's Ancestry Composition percentages is a drop-down box that allows me to adjust the estimate.

Standard Estimate is the default, but I can also see a Speculative Estimate and a Conservative Estimate. Here's the speculative estimate of my uncle's ancestral ethnicity:

Based on this estimate (and assuming a genetic Maginot Line that kept out any German genes), he's at least a quarter French, which seems closer to the truth than 3.3%.[1] Even when speculating, 23andMe leaves a sizable chunk of his ancestry somewhat vague (though the "Nonspecific European" portion has shrunk considerably). The company provides a white paper that explains exactly how they arrive at these estimates, and what "speculative," "standard" and "conservative" mean. (Briefly, they indicate 50 percent, 75 percent and 90 percent "confidence thresholds.") Scroll down to the "Testing & Validation" section and you'll see that French and German ancestry is by far the most difficult to determine.
In the worst case, the French & German population, the recall is 7%, meaning that 93% of the actual French & German DNA was not labeled as such.
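To make "confidence threshold" and "recall" concrete, here is a toy Python sketch of how the two relate. Everything in it is invented for illustration: the segments, the confidence values, and the `recall` and `call_segment` helpers are my own assumptions, not 23andMe's actual method or data. The only figures borrowed from the white paper are the three thresholds.

```python
# Toy illustration only: the segments and confidence values are invented,
# and the fallback rule is an assumed simplification, not 23andMe's algorithm.

def recall(true_labels, predicted_labels, population):
    """Fraction of segments truly from `population` that were labeled as such."""
    truly = [p for t, p in zip(true_labels, predicted_labels) if t == population]
    if not truly:
        raise ValueError(f"no segments truly from {population!r}")
    return sum(p == population for p in truly) / len(truly)


def call_segment(best_label, confidence, threshold, fallback="Nonspecific European"):
    """Keep the specific label only if its confidence clears the threshold."""
    return best_label if confidence >= threshold else fallback


# Hypothetical segments: (true origin, best-guess label, confidence in that guess)
segments = [
    ("French & German", "French & German", 0.95),
    ("French & German", "French & German", 0.60),
    ("French & German", "British & Irish", 0.55),
    ("French & German", "French & German", 0.52),
    ("Finnish", "Finnish", 0.99),
    ("Finnish", "Finnish", 0.97),
]

true_labels = [origin for origin, _, _ in segments]
thresholds = [("speculative", 0.50), ("standard", 0.75), ("conservative", 0.90)]

results = {}
for name, threshold in thresholds:
    calls = [call_segment(label, conf, threshold) for _, label, conf in segments]
    results[name] = {
        "French & German": recall(true_labels, calls, "French & German"),
        "Finnish": recall(true_labels, calls, "Finnish"),
    }
    print(name, results[name])
```

In this made-up data the Finnish segments are called correctly at every threshold, while most of the French & German segments fall back to "Nonspecific European" once the threshold rises above the speculative level. That is the same pattern, in miniature, that the white paper describes.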
Scroll back up and you'll learn that "Finns are so distinct from other populations, they actually get their own reference population in Ancestry Composition." So, my mother's ancestry demonstrates the best- and worst-case scenarios for identifying specific European ancestries.[2] I would not have learned this from the AncestryDNA website. When your competitor is educating your customers, that's a problem.

With the introduction of the Ancestry Composition tool at 23andMe, there is one less reason to recommend AncestryDNA. I hope that AncestryDNA is reconsidering its decision to "dumb down" its test results, and will begin providing more and better data for us to evaluate. It needs to pull its scientists out of the lab to tell us where these percentages are coming from. What reference populations are they using? How confident are they in their estimates? I would rather be confused by too much information than by too little. Blind me with science!

[1] The Iberian ancestry fits with my uncle's Y chromosome haplogroup, which is most prevalent on the Iberian Peninsula and in southern France.

[2] Having such distinct maternal and paternal ancestries provides an opportunity to distinguish which of each chromosome pair comes from the mother, and which from the father. (In labspeak, this is part of the "phasing" process.) 23andMe's new tool lets me view which of my uncle's chromosomal segments come from which part of Europe. The Eastern European shows up as part of his X chromosome, indicating that it came from his Finnish mother. I'm also able to see that those segments identified as Scandinavian in every case share a chromosome with those identified as Finnish. That means that these Scandinavian genes (if phased correctly) most likely came through Sweden rather than Normandy.
