Counting Chromosomes
A blog of random musings on genealogy, genetics, science, and history
Red hair and freckles
Her parents both have brown hair. Is that proof DNA skips a generation?

Q: I've been in touch with a gentleman who says that he is related to my deceased father's family line. His family tree shows this—six generations back—but on GEDmatch he doesn't match me, my two siblings, or two known 1st cousins. He told me that DNA will skip a generation, and insists that if I don't match him I'll probably match his son, whose test results, autosomal and Y-chromosome, are pending. I have no male immediate family members to test the Y-chromosome. I know we don't all get the same DNA from our ancestors, but aren't we limited to the DNA of our parents? Nothing new could show up in his son that he doesn't have but that his father, the son's grandfather, did, correct?

A: Assuming the gentleman and his wife aren't related (e.g., their great-grandparents were a case of two brothers marrying two sisters, making them double 2nd cousins and allowing differing autosomal DNA segments to pass down from both lines), you are absolutely correct. Barring pedigree collapse within the last several generations of the tree, there should be no surprises in the son's results. I'm going to digress a moment before addressing the skip-a-generation thing.

Detour to Imputation

I'm hedging my bets by saying "should" because the outcome may depend upon where those results are viewed. If on GEDmatch, I can say "no" with greater assurance. The recent algorithm updates at MyHeritage are an example of how assumptive math plays a greater role in our DNA interpretations than we may realize: the company has begun using a form of genotype imputation to "stitch" together two small, otherwise insignificant segments into a larger one that becomes meaningful for matching purposes.

We'll talk more about imputation at a future date, but the theory behind MyHeritage's segment "stitching" is the same one that allows us to compare atDNA results from the Illumina OmniExpress chip (which was used in most of the 10 million test kits now out there) and the new GSA chip, even though only 23% of the SNPs (single nucleotide polymorphisms) tested are the same. The concept is simple: with a database that is large enough and comprehensive enough, we can assumptively but reliably fill in the blanks.

The easiest example to illustrate this idea is language. If you are a native (or extremely fluent) speaker of English, you have many years of using the language in a structured way, have learned and ingrained both spellings and common phrasing and syntax, and have used English words millions and millions of times throughout your life. That's the comprehensive database you bring to the task. When you see this, your brain can almost instantly supply the missing letters for you:

Word Imputation

"Mary had a little lamb." Right? And 99.9% of the time that assumption would be correct. But it can never be 100% correct simply because there might be something in that one-tenth of one percent that throws us a curve. What if the girl in question had been named for an Irish ancestral line, O'Mara? And what if, for some unfathomable but entirely possible reason, she decided to conceal a very small tree branch?

"Mara hid a little limb" fits the defined criteria, even if unlikely. It's all about pattern recognition and, while very good, machine pattern recognition may never be completely perfect and incontestable. As a hypothetical, this is one way a son might seem to have atDNA matches on the paternal line that the father does not.

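The word analogy can be sketched in a few lines of Python. This is only a toy, not how any testing company actually imputes genotypes: the masked sentence, the tiny word list, and the frequency counts are all invented for illustration, and the function names `candidates` and `impute` are hypothetical.

```python
import re

# Hypothetical word counts standing in for a large reference database.
# Every number here is invented purely for illustration.
REFERENCE = {
    "Mary": 900, "Mark": 400, "Mara": 3,
    "had": 5000, "hid": 60,
    "a": 9000, "little": 800,
    "lamb": 120, "limb": 90, "lump": 70,
}

def candidates(masked_word):
    """Return reference words that fit the mask, most frequent first.

    An underscore in the mask stands for one unknown letter.
    """
    pattern = "".join("." if ch == "_" else re.escape(ch) for ch in masked_word)
    hits = [word for word in REFERENCE if re.fullmatch(pattern, word)]
    return sorted(hits, key=REFERENCE.get, reverse=True)

def impute(masked_sentence):
    """Fill each masked word with its most frequent fitting candidate."""
    filled = []
    for word in masked_sentence.split():
        hits = candidates(word)
        # Leave the word untouched if nothing in the reference fits.
        filled.append(hits[0] if hits else word)
    return " ".join(filled)

if __name__ == "__main__":
    print(impute("Mar_ h_d a little l_mb"))
    print(candidates("Mar_"))
```

The most frequent candidate wins every time, which is why the sketch returns "Mary had a little lamb" even if the truth happened to be "Mara hid a little limb." The rarer reading is still in the candidate list; it simply gets outvoted.
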
Digression into imputation done. Is the "skip a generation" thing only a genetic fantasy? Not at all. It has, however, nothing to do with what we typically consider matching data for genealogy. This is another digressive rabbit hole that I will only spend a little time on, but "skip a generation" refers to dominant and recessive genes, and the expression of those genes. Let's quickly consider a gene that determines hair color.

Red Hair and Freckles

Red hair is a recessive trait. A child's pair of genes can be either homozygous (identical from both parents) or heterozygous (one gene type from the father, the other type from the mother). Geneticists usually display the dominant with a capital letter, and the recessive with a lowercase letter; let's call them "R" and "r" in this case. If both the mother and the father are homozygous dominant, they would both be indicated by "RR," neither would have red hair, and neither would have a recessive gene to pass on to their children. None of their kids or their kids' descendants would have red hair.

But what if each parent were "Rr," in other words, carrying both a dominant and a recessive gene? Neither of them would have red hair because the dominant gene is in control, but they could both pass an "r" recessive to one or more of their children. So they could have three children, one each "RR," "Rr," and "rr." The first would not have red hair and could not pass along the red-hair recessive gene. The second, like her parents, would not have red hair but would be able to pass along either the dominant or the recessive gene. The third would have red hair, and could only pass along the recessive red-hair gene.
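
For readers who like to see the arithmetic, here is a minimal Python sketch of the cross just described. It assumes the same simplified single-gene model, with one dominant "R" and one recessive "r"; the function names `cross` and `has_red_hair` are hypothetical and chosen only for this example.

```python
from collections import Counter
from itertools import product

def cross(parent1, parent2):
    """Return the probability of each child genotype for a one-gene cross.

    Each parent is a two-letter string such as "Rr"; the child receives
    one letter from each parent with equal probability.
    """
    # Sort each pair so "rR" and "Rr" are counted as the same genotype.
    counts = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}

def has_red_hair(genotype):
    """In this simplified model, only two recessive copies give red hair."""
    return genotype == "rr"

if __name__ == "__main__":
    for genotype, probability in cross("Rr", "Rr").items():
        print(genotype, probability, "red hair" if has_red_hair(genotype) else "not red")
```

Two "Rr" parents give each child a one-in-four chance of being "RR," a one-in-two chance of being "Rr," and a one-in-four chance of being "rr." Two "RR" parents can only ever produce "RR" children.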

This is how DNA can skip a generation. Nothing actually skips anything, of course. But it can most certainly appear like it when one grandparent has red hair, both parents have dark brown hair, and their daughter then turns out to have lovely scarlet locks.

Bottom line is that, as you noted, you can only get your 3 billion base pairs of DNA from two places: your mother and your father. We're purposely going to ignore unusual cases here, like some forms of stem cell transplantation or future CRISPR genetic engineering. That said, you get only approximately 25% of each of your grandparents' DNA. In fact, Graham Coop at UC Davis looked at 1,500 real-world DNA tests and determined that roughly one in 200 grandchildren would obtain only about 10% of his or her autosomal DNA from one of the paternal grandparents. With each generation, things get shuffled up more and more.
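
As a rough illustration of the averages involved, the expected autosomal share from any one ancestor halves with each generation. The sketch below computes only that average; as the figures above show, the amount an individual actually inherits can stray well away from it. The function name `expected_share` is hypothetical.

```python
def expected_share(generations_back):
    """Average fraction of autosomal DNA inherited from one ancestor.

    Parents are 1 generation back, grandparents 2, and so on.
    This is an average only; real inheritance varies around it.
    """
    return 0.5 ** generations_back

if __name__ == "__main__":
    for generation in range(1, 7):
        print(f"{generation} generations back: {expected_share(generation):.2%}")
```

By the sixth generation back, the level of a 4g-grandparent, the average contribution from any single ancestor is 1/64, or a little over 1.5%.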

Diminishing atDNA Returns

What this means is that if you and a cousin share an ancestor six generations back (as is possibly the case with your newly discovered cousins), a lack of autosomal DNA matching is not evidence disproving the relationship. In fact, the odds that we share any detectable DNA at all worsen rapidly the more distant the cousinship. A study headed by Brenna Henn (formerly the head scientist for 23andMe) demonstrated that only 45.9% of your 4th cousins will share any detectable DNA with you, and that drops to 14.9% for 5th cousins, whose common ancestors with you are your 4g-grandparents, six generations back. At that level, the odds against matching an actual 5th cousin are about 11:2, or roughly two matches showing for every 13 cousins tested. This is one reason autosomal DNA triangulation, as a process, is really much more complex and detailed than some assume it to be.
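
The step from a 14.9% match rate to "two in 13" is just a matter of rounding the percentage to a nearby simple fraction. Here is a small Python sketch of that conversion; the function name `odds_against` and the choice to cap the denominator at 13 are assumptions made for this example, not part of the cited study.

```python
from fractions import Fraction

def odds_against(probability, max_denominator=13):
    """Approximate a probability as a simple fraction, then express it as odds.

    Returns (misses, hits): for every `hits` cousins who match,
    about `misses` cousins will not.
    """
    # Parse from the decimal string so 0.149 is treated as exactly 149/1000.
    simple = Fraction(str(probability)).limit_denominator(max_denominator)
    return simple.denominator - simple.numerator, simple.numerator

if __name__ == "__main__":
    misses, hits = odds_against(0.149)
    print(f"5th cousins: about {hits} matches per {misses + hits} tested "
          f"({misses}:{hits} against)")
```

For 14.9% the nearest fraction with a denominator of 13 or less is 2/13, which gives the 11:2 figure quoted above.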

If the common ancestor is in both your and your new cousin's direct paternal lines, then yDNA testing could assist with validating the relationship. The Y-chromosome doesn't go through crossover at meiosis, and is passed down from father to son unchanged save for occasional (but typically quite slow) generational mutations. This means the male taking the test doesn't necessarily need to be your sibling or father. If your father has a living brother or you have a known male 1st cousin on that direct paternal line, then yDNA testing could give you and your new-found cousin evidence substantiating a relationship back to that 4g-grandfather MRCA.

Oh, and by the way. While red hair is recessive, having freckles is a dominant trait...even though the same gene, MC1R, is responsible for both! Ain't genetics grand?