Counting Chromosomes
A blog of random musings on genealogy, genetics, science, and history

It's a riddle that scores of websites will try to solve for you: which DNA test should you buy? But today's answer may surprise you.

Confused
An image I'll no doubt find many other uses for. Thank you to Robin Higgins, from Pixabay.

How much advertising money is spent trying to convince you one DNA test is better than another? How many websites and "experts" will break it down for you—usually starting with accuracy of ethnicity predictions—and tell you which test you should buy?

Truth is, the only rational response from someone truly informed about genetic genealogy will sound like a corporate consultant: "It depends." But there's a good reason for that. Our currently popular, over-the-counter autosomal DNA tests are remarkably similar in the lab; there really are only three major variants. Choosing a test is more about personal goals and objectives, portability, comparison and analysis tools, where other family members have already tested, and the list goes on.

I usually don't reply to public questions about which test to buy. If asked privately, I'll start with, "It depends," and then dive into the list of considerations. Today, though, the matter came up and I broke my own pattern. I provided an explicit answer about which test, for a first-time buyer, I would recommend right now. As in today.

My response has nothing to do with any of the usual suspects (e.g., ethnicity pie charts, eye color prediction), and I thought it might be worthwhile sharing it here.

The net message first: 1) My prognostication is that Whole Genome Sequencing (WGS) will completely supplant our current microarray genotyping tests for genealogy by 2022 or so. But I wouldn't wait for that to happen. Leading to, 2) If the funds are available, I would get an AncestryDNA test as soon as possible, then consider also taking a test from 23andMe, MyHeritage, Family Tree DNA, or Living DNA.

Why the urgency with Ancestry? Well, we're due for an AncestryDNA shake-up very, very soon.

It all has to do with those little microarray, microscope-slide-looking things that are programmed to attract our nucleotides to line up in specific places so that a small part of our genome can be digitized.

Microarray

At issue is the "Intel" of DNA testing companies, a manufacturer called Illumina. Right now, we have about 29 million people who have taken direct-to-consumer autosomal DNA tests. Of those tests, roughly 20 million or so have been performed using Illumina's OmniExpress chip; the next largest chunk were done on Illumina's Global Screening Array, or GSA, chip; and a small percentage use a chip made by Illumina's only real competitor, Thermo Fisher Scientific.

With no fanfare, and a seemingly successful exfil/infil unbeknownst to many genealogists, Family Tree DNA switched from the OmniExpress to the GSA chip last April. And, ta dah!, the Family Tree DNA lab does the testing for MyHeritage...which means that MyHeritage switched chips at the same time.

AncestryDNA is the only testing provider, large or small, that still processes samples on the OmniExpress chip. Why? Illumina had intended to retire the chip long before now. In fact, when Living DNA first began planning its operational debut circa mid-2016, the company was advised by Illumina that the OmniExpress chip would soon no longer be available, so it went with the GSA chip from the outset (subsequently, as of November 2018, Living DNA became the first, and so far only, tester to use the Thermo Fisher Scientific Affymetrix chip). And on 5 September 2017, Roberta Estes wrote on her blog that Illumina "has obsoleted their OmniExpress chip previously in use, forcing companies to utilize their new Global Screening Array (GSA) chip when their current chip supply runs out."

That didn't happen, but it also didn't mean that Illumina had changed its mind about retiring the chip. That retirement is happening now. We can speculate that the unprecedented (and unexpected) explosion in direct-to-consumer testing spurred by Ancestry's lederhosen-for-a-kilt advertisements persuaded Illumina to postpone the inevitable for a time. But while genealogy may not be quite a rounding error to Illumina's bottom line, it doesn't represent a large share of their earnings. For Q4 2018, Illumina booked $867 million in revenues; revenues for fiscal year 2018 were $3.33 billion. The entirety of their microarray chip business makes up about 12% to 14% of that, the genealogy market being only a portion of that percentage, and the OmniExpress chip a portion of that portion. Add to that some slippage in the growth rates of DNA test sales beginning around April 2018 (see the latest DNA database numbers and analysis from Leah Larkin). Continuing to manufacture and support a microarray chip that it has "tooled down" and wants to retire costs Illumina substantially more money and resources than producing the newer GSA chip does.

Bottom line: the OmniExpress chip is going away. Completely. AncestryDNA will be (is currently being?) forced to make a very significant decision, and they really have only two choices: switch to Illumina's GSA chip, or join Living DNA and go with Thermo Fisher Scientific. Either way, I believe the change will be made before the end of this year, and quite possibly by the fourth quarter.

Okay. So why is this a big deal to us? Both tests are accurate, both tests provide good, solid data.

SNP Venn Diagram
Venn Diagram displaying approximate in-common overlap of tested SNPs between the Illumina OmniExpress and GSA BeadChips

Ah! But they provide different data! Here's a simple Venn diagram I put together for a previous blog post on the subject that I believe makes the issue clear.

Both chipsets test a fraction (about 650,000, give or take) of the roughly 4 or 5 million SNPs (single nucleotide polymorphisms) that are meaningful in your genome. But most of the SNPs they test are not the same. Note that in the diagram, the maximum number of SNPs tested is used: each chip comes with a core set of SNPs preprogrammed for testing, plus the capability to customize the targets with a few tens of thousands more SNPs. Ergo, we can't know the exact overlap because AncestryDNA doesn't reveal what SNPs it has customized its OmniExpress chip with, and 23andMe and Family Tree DNA customize their GSA chips differently from each other.

The end result, though, is that the two different chipsets test only about 20% of the same SNPs. Without going into excruciating detail, this discrepancy of data has been keeping a bunch of people in the genetic genealogy community up at night, and is the core reason it took GEDmatch so long to get Genesis out of beta...and why they had to drop the minimum SNP count to validate shared segments from a default of 700 to "size will be adjusted dynamically between 200 and 400 SNPs." Generally speaking, for segments calculated to be the same length in centiMorgans, the lower the SNP density within the segment, the less reliable the segment is.

Regardless of the imputation and estimates and margin-of-error allowed, the two chipsets simply do not play well together...which is intuitively understandable if 80% of what they measure is unique and not considered at all by the other chip.
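To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes the round figures used in this post (about 650,000 SNPs per chip, about 20% in common) and works through two readings of "20%," since the true overlap isn't public. The function and variable names are purely illustrative.

```python
# Back-of-the-envelope SNP overlap, using this post's round figures.
# Assumptions: both chips test ~650,000 SNPs; "20% shared" is ambiguous,
# so both plausible readings are computed.

PER_CHIP = 650_000
SHARED_PCT = 20


def combined_if_pct_of_each_chip(per_chip: int, pct: int) -> int:
    """Distinct SNPs if the shared set is pct% of each chip's own SNPs."""
    shared = per_chip * pct // 100
    return 2 * per_chip - shared


def combined_if_pct_of_union(per_chip: int, pct: int) -> int:
    """Distinct SNPs if the shared set is pct% of the combined set.

    union = 2 * per_chip - shared, and shared = union * pct / 100,
    so union = 2 * per_chip * 100 / (100 + pct).
    """
    return 2 * per_chip * 100 // (100 + pct)


if __name__ == "__main__":
    print(combined_if_pct_of_each_chip(PER_CHIP, SHARED_PCT))
    print(combined_if_pct_of_union(PER_CHIP, SHARED_PCT))
```

Under either reading, each chip tests somewhere between roughly 430,000 and 520,000 SNPs that the other never looks at.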

That's the urgency. Somewhere around 70% of the atDNA results out there at the moment come from the OmniExpress chip. Short of waiting for common and inexpensive whole genome sequencing, and for the technologies needed to compare and match those results, the only way to get the best of both worlds is to test with AncestryDNA before they abandon the OmniExpress chip, and then at some point afterward also test with a different company that uses the GSA chip.

What you'll end up with then is somewhere around 1.1 to 1.2 million SNPs tested. There are already standalone utilities that will help you merge the results, and GEDmatch Genesis has a "combine multiple kits into 1 superkit" as a standard option in its Tier 1 tools.
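For the curious, the core idea behind merging two kits can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each kit has already been parsed into a dictionary mapping an rsID to a two-letter genotype, "--" stands for a no-call, and the rsIDs and genotypes in the example are made up. It is not how GEDmatch or any particular utility actually does it; real tools also have to handle things like strand orientation and genome build differences.

```python
# Minimal sketch of combining two kits keyed by rsID.
# Hypothetical inputs: dicts of rsID -> genotype (e.g. "AG"), already parsed
# from each company's raw data download. "--" marks a no-call here.

NO_CALL = "--"


def normalize(genotype: str) -> str:
    """Sort the two alleles so "GA" and "AG" compare as equal."""
    return "".join(sorted(genotype))


def merge_kits(kit_a: dict, kit_b: dict):
    """Return (merged kit, sorted list of rsIDs where the kits disagree)."""
    merged = {}
    discordant = []
    for rsid in kit_a.keys() | kit_b.keys():
        a = kit_a.get(rsid, NO_CALL)
        b = kit_b.get(rsid, NO_CALL)
        if a == NO_CALL:
            merged[rsid] = b          # only kit B has a call (or neither does)
        elif b == NO_CALL or normalize(a) == normalize(b):
            merged[rsid] = a          # only kit A has a call, or both agree
        else:
            merged[rsid] = NO_CALL    # conflicting calls: trust neither
            discordant.append(rsid)
    return merged, sorted(discordant)


if __name__ == "__main__":
    # Made-up example data, not real SNPs.
    omni_kit = {"rs0001": "AG", "rs0002": "CC", "rs0003": "TT"}
    gsa_kit = {"rs0002": "CC", "rs0003": "CT", "rs0004": "GA"}
    merged, discordant = merge_kits(omni_kit, gsa_kit)
    print(len(merged), discordant)
```

Setting conflicting calls to a no-call is a conservative choice made for this sketch; a real utility might instead prefer one chip or flag the SNP for review.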

There simply is no highly accurate way to compare a set of OmniExpress data with a set of GSA data. While not apples and oranges, they're apples and...well, some other fruit in the Rosaceae family; call it a pear. Comparing the two sets of data requires a boatload of assumptions and estimates, even guesswork, and the results can be suspect.

If you're looking to employ genetic genealogy in the near future to form triangulation groups more distant than 2nd cousins, you'll need the ability to identify and analyze detailed segment information. The smaller the segment, the more you need assurance the segment is valid...and that means comparing apples to apples, not apples to pears.

Please Note: I have no affiliation with, or insider information about, AncestryDNA. This article is purely speculative and based solely on personal opinion drawn from marketplace observation. In no way should this be considered actionable advice or a definitive statement about AncestryDNA operations.