Counting Chromosomes
A blog of random musings on genealogy, genetics, science, and history

People tend to hold overly favorable views of their abilities in many social and intellectual domains.... This overestimation occurs, in part, because people who are unskilled in these domains suffer a dual burden: Not only do these people reach erroneous conclusions and make unfortunate choices, but their incompetence robs them of the metacognitive ability to realize it.
     —David Dunning and Justin Kruger,
         Journal of Personality and Social Psychology,
        1999 Dec;77(6):1121-34

Dunning-Kruger Effect
Image courtesy of YLMSportScience

This will be somewhat of a departure in content: I have a serious point to make, then I need some cathartic comedy time. Welcome to the ride.

I'm sure most of you are familiar with the Dunning-Kruger Effect. Since its proposal as a theory in 1999, it's been borne out in multiple studies—more than 100 to date—across varying environments. In a nutshell, what it tells us is that we're not very good at evaluating our own levels of knowledge and skill. Illusory Superiority: we judge ourselves as being better than others to degrees that violate the laws of mathematics. Those with the least ability are most likely to overrate their skills to the greatest extent, as the little infographic at right shows.

For example, 88% of American drivers rated themselves as having above-average driving skills. When software engineers at two different companies were asked to rate their performance, 32% and 42% of them, respectively, put themselves in the top 5%. College debate teams in the bottom quartile (they were losing four out of every five rounds) thought they were winning over 60%.

People who, when tested, are measurably poor at everything from logical reasoning to grammar, math, emotional intelligence, and even chess all tend to rate their expertise almost as favorably as, or in some instances even more favorably than, actual experts rate themselves. Poor performers lack the knowledge and expertise to recognize how badly they're actually performing. It isn't an ego thing...at least, not most of the time. As people gain a better understanding of the subject matter, they begin to recognize how poorly they previously performed and how much they have yet to learn.

Actual experts become more aware of just how knowledgeable they are and can more objectively rate their expertise, though they may also be more likely to slightly underrate themselves. But they often fall into another mistake: they assume that others have a grasp of the subject similar to their own. The second chart at right is another way of visualizing the progression.

Progression of Understanding
An annotated representation of the Dunning-Kruger Effect and the acquisition of subject matter expertise

Genetic Genealogist Accreditation

I have long been an advocate of a formal method of accreditation for genetic genealogists. I supplied a position to BCG this summer about their initial proposal for a new genetic genealogy standard, and much of my opinion centered around this. Without BCG taking on the role of formal accreditation of genetic genealogists (arriving at a tested "CGG" certification, for example, to accompany CG and CGL), I fear the final version will not be able to go far enough. And I know of no other certification body at this time that might properly undertake the task of managing accreditation.

In part, the for-comment draft of the genetic genealogy standard includes:

In the first five months of 2018 we went from 12 million direct-to-consumer DNA testing kits sold to about 17 million, an average of 1 million new kits per month. The extrapolated growth trend may slow, but it's still within reason that we might see 10 million tests sold through the course of 2018.

DNA Testing Update
Graph by Antonio Regalado, Senior Editor, MIT Technology Review.

New data are in, and business is positively booming. Antonio Regalado, Senior Editor for the MIT Technology Review, has been keeping tabs on testing trends since at least 2016, and he posted this updated chart August 6.

These numbers aren't yet affected by GDPR because they run through the month of May; GDPR went into effect May 25. The Golden State Killer genetic genealogy brouhaha surfaced last April, so the trend might also be blunted somewhat by new privacy concerns. Time will tell.

At the beginning of 2018 we were indicating that about 12 million direct-to-consumer DNA tests had been sold, up from about 5 million as of January 2017, a 140% increase. As an averaged trend, we were selling about 583,000 test kits each month. I may not be a fan of the kilt or lederhosen "ethnicity" advertising, but it obviously works.

The first part of 2018 blows that performance out of the water. We went from 12 million kits to about 17 million, or an average of right at 1 million kits sold per month. I personally do expect the extrapolated growth trend to slow, but it's still within reason that we might see 10 million tests sold through the course of 2018. If that's the case, there would be as many tests sold in 2018 as there were in total from the opening of Family Tree DNA all the way through mid-year 2017!
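For anyone who wants to check the arithmetic, here is a minimal Python sketch using only the rounded totals quoted above (5, 12, and 17 million kits). The function names are mine, purely for illustration, and the full-year projection is a what-if, not a forecast: it simply asks what monthly pace the remaining seven months would need in order to land on 10 million tests for the year.

```python
# Back-of-the-envelope check of the kit-sales figures quoted in the post.
# All totals are the rounded numbers from the text, in millions of kits.
# Function names are illustrative only.

def percent_increase(start, end):
    """Percentage growth from start to end."""
    return (end - start) / start * 100

def monthly_average(start, end, months):
    """Average kits sold per month (in millions) over the period."""
    return (end - start) / months

# Calendar 2017: about 5 million -> about 12 million over 12 months.
growth_2017 = percent_increase(5, 12)
rate_2017 = monthly_average(5, 12, 12)

# January through May 2018: about 12 million -> about 17 million.
rate_2018 = monthly_average(12, 17, 5)

# What-if: to reach 10 million tests sold in 2018, what monthly pace
# is needed for June through December (7 months)?
sold_so_far_2018 = 17 - 12
required_rate = (10 - sold_so_far_2018) / 7
required_pace = required_rate / rate_2018  # fraction of early-2018 pace

print(f"2017 growth: {growth_2017:.0f}%")
print(f"2017 average: {rate_2017 * 1_000_000:,.0f} kits per month")
print(f"Early 2018 average: {rate_2018 * 1_000_000:,.0f} kits per month")
print(f"Pace needed for 10 million in 2018: {required_pace:.0%} of early 2018")
```

In other words, sales could slow to a bit over 700,000 kits a month for the rest of the year, still well above the 2017 average, and 2018 would come in at 10 million.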

The other thing the chart illustrates is just how rapidly the catalogs of data are accumulating. Just five years ago, in 2013, the numbers were barely a blip on the chart. Makes those of us who have been around genetic genealogy for a decade-and-a-half feel positively ancient. Speaking of ancient...

I have no source for numbers, but we're seeing a not dissimilar trend in peer-reviewed scientific studies of anthropology and population genetics. Huge technological changes—both hardware and software—have been going on in the background at companies like Illumina that make HiSeq sequencing of genomic data from small, ancient, and degraded samples possible, and at lower costs to institutions and universities than ever. David Reich, author of the excellent new book Who We Are and How We Got Here: Ancient DNA and the New Science of the Human Past, estimates that since the early 2000s the cost of analyzing ancient genomes has come down over 10,000-fold.

There is always a significant time lag from study conception and proposal to peer-reviewed publishing, but we began to see important new work start coming in around 2013 and 2014, and the rate of publication has been steadily escalating, so much so that I now have Google Alerts set to search for new pre-prints and papers on a daily basis to help me keep up. Some of these can't be included via the news aggregator service for The Tribune, but some general-interest papers do find their way to that daily newspaper.

Back on the consumer homefront,

Market Research

If you follow genealogy and genetics news like I do, no doubt you have seen—multiple times over the past few weeks—notification of a market study from a company based, I believe, in India called Absolute Reports. The report is titled "2018-2025 Genealogy Products and Services Report on Global and United States Market, Status and Forecast, by Players, Types and Applications."

Sounds intriguing, doesn't it? It was published 22 June 2018; I first saw it advertised in July, and it is offered for the price of US$3,600 for a single-user license with the option to receive a limited, redacted preview. You can view a safe description of the study at Absolute Reports' website.

It remains unclear to me whether Absolute Reports is solely a reseller of externally prepared market reports, or whether they write any of the studies themselves. Regardless, the sample copy of this seven-year genealogy industry forecast that I received contained no attribution of authorship. In fact, section 15.4, "Author List," was redacted of all names or contact information. The only notice of copyright anywhere in the 79-page document (advertised as being 117 pages) is in section 15.3, "Disclaimer": "All trademarks, copyrights and other forms of intellectual property belong to their respective owners and may be protected by copyright."

The beginning of that particular section is, I believe, crucial to understanding the value of the report: "The information and opinions in this report were prepared by [REDACTED]. The information herein is believed to be reliable and has been obtained from authentic public sources.... [REDACTED] research and analysis publications consist of the opinions of [REDACTED]'s research and should not be construed as statements of fact. [REDACTED] disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose."

The redactions in this instance are not mine, but represent areas left blank in the sample report itself.

The summary of the report reads:

This report studies the global Genealogy Products and Services market, analyzes and researches the Genealogy Products and Services development status and forecast in United States, EU, Japan, China, India and Southeast Asia. This report focuses on the top players in global market, like:

  • DNAPrint Genomics, Inc. (USA)
  • Familybuilder (USA)
  • Family History Library
  • Family Tree DNA (USA)
  • Sorenson Molecular Genealogy Foundation (USA)
  • Ancestry.com (Formerly known as Generations Network, Inc.)
  • Genealogy.com
  • MyFamily.com
  • RootsWeb.com
  • WorldVitalRecords (USA)

Your own sense of the current genealogy and genetics marketplace no doubt has you tilting your head at that list of companies for focused research. The listing I received in the copy of the sample report was different, but not really any less brow-furrowing:

The goal of the collaboration is to gather insights and discover novel drug targets driving disease progression and develop therapies for serious unmet medical needs based on those discoveries.
     —GlaxoSmithKline

GlaxoSmithKline
GlaxoSmithKline headquarters,
Brentford, London, UK

UK-headquartered Big Pharma company GlaxoSmithKline (GSK) has invested $300 million (£228 million) for an equity stake in direct-to-consumer DNA testing company 23andMe. The official GSK press release describes the investment as a "multi-year collaboration expected to identify novel drug targets, tackle new subsets of disease and enable rapid progression of clinical programmes."

The GSK press release highlights three primary goals of the collaboration; quoting:

  • Improve target selection to allow safer, more effective "precision" medicines to be discovered. Genetic data can significantly improve our understanding of diseases, their pathways and mechanisms, supporting the design and development of

Taking the two precedents cited by Ancestry.com in their motion to dismiss, it seems their position is that neither natural phenomena nor abstract ideas are patentable unless there is a new, additional, inventive concept involved. Maybe we'll end up with the "do it on a computer" argument, and the "DNA is DNA" argument.

Court of Law
Invoking two Supreme Court precedents, Ancestry.com files for dismissal

This week, Ancestry.com responded to the lawsuit from 23andMe by filing a motion to dismiss with a California federal court. The filing indicated the 23andMe patent consisted of "abstract and non-inventive steps" of collecting two DNA samples and then comparing them to find a correlation based on phenomena that occur naturally. The U.S. Patent in play is number 8,463,554, titled "Finding Relatives in a Database," issued 11 June 2013.

The lawsuit was filed May 12 by 23andMe demanding, among other things, payment for damages and invalidation of the "Ancestry" trademark. An article at Law360 said, in part:

The European Parliament will now be able, in an open debate, to improve the text and defend freedom of expression ahead of the next elections.
     —Diego Naranjo, Senior Policy Advisor at EDRi

Next Steps from the EDRi
Infographic from EDRi, the European Digital Rights organization

I wrote recently about the sea of criticism mounting against the Orwellian proposal by the EU for a "Directive on Copyright in the Digital Single Market" (see "Will the Proposed EU Copyright Directive Irrevocably Damage the Internet?" and "Controversial Copyright Proposal Passes First Step in European Union Parliament"). Yesterday, July 5, Members of the European Parliament (MEPs) voted by a slim margin of 318-278 to remove from its Committee on Legal Affairs (JURI) the mandate to negotiate with the EU Council the proposed copyright directive as it is currently written.

In a plenary vote yesterday, 20 June 2018, the Legal Affairs Committee (JURI) of the European Union Parliament voted for a proposed copyright directive as presented, which includes measures to monitor, filter, and control (if not outright censor) uploads to the World Wide Web. In an article titled "Will the Proposed EU Copyright Directive Irrevocably Damage the Internet?" I commented a few days ago on the highly controversial proposal for a "Directive on Copyright in the Digital Single Market," EU Interinstitutional File: 2016/0280 (COD).

EDRi Infographic
Infographic by the European Digital Rights (EDRi) organization

Considering the number of voices raised in opposition to the proposal, it is surprising to many that the current text passed without further alteration or amendment. In particular, Chapter 2, Article 13 (pages 56 through 60 of the 66-page proposal) is drawing dire warnings from many open-information notables. In essence, it will require that providers of web services be responsible and liable for pre-screening everything people post online to make certain none of it potentially infringes on copyrighted material.

There is no such thing as an "international copyright" that will automatically protect an author's writings throughout the world. Protection against unauthorized use in a particular country depends on the national laws of that country. However, most countries offer protection to foreign works under certain conditions that have been greatly simplified by international copyright treaties and conventions.
     —the United States Copyright Office

Handcuffed at Keyboard

Just as we were all getting over the serious and painful surgery on May 25 that was enactment of the GDPR (General Data Protection Regulation), we have a new issue on the immediate horizon. Just three days from now, June 20-21, at the European Parliament meeting in Brussels, up for an initial vote on plenary approval is the controversial "Directive on Copyright in the Digital Single Market," EU Interinstitutional File: 2016/0280 (COD).

In particular, Chapter 2, Article 13 (pages 56 through 60 of the 66-page proposal) is drawing not only ire, but some dire warnings

To recap: About 9:00 p.m. EST on June 4, MyHeritage announced that a data breach of their systems had been discovered that affects 92.3 million accounts, users who had registered at MyHeritage up to and including the date of the breach, 26 October 2017.

Internet Security
The data breach affects all MyHeritage user accounts created before 27 October 2017

Summary of Events

About 9:00 p.m. EST on June 4, MyHeritage announced that a data breach of their systems had been discovered that affects 92.3 million accounts, users who had registered at MyHeritage up to and including the date of the breach, 26 October 2017. Approximately eight hours earlier, an independent security researcher notified the company that he had discovered a file "on a private server outside of MyHeritage" that contained the email addresses and so-called "hashed" passwords of these accounts.

Staggering to comprehend, but the company has stated that "other websites and services owned and operated by MyHeritage, such as Geni.com and Legacy Family Tree, have not been affected by the incident." Further information shows that they have added about 4 million new accounts since the breach

The popular online service for autosomal DNA matching and comparison, GEDmatch.com, has been facing not just pressure from the impending GDPR regulations, but also an unfortunate media backlash.

DNA
Forensic Genealogy is a term now moving into the mainstream media

Interest in and concern about the European Union's General Data Protection Regulation (GDPR), taking effect May 25, has rapidly escalated over the past several weeks, even for U.S.-based organizations, commercial or not. Small genealogy organizations and websites are feeling the pressure, so much so that some, notably Ysearch.org, Mitosearch.org, and WorldFamilies.net, are closing permanently.

The popular online service for autosomal DNA matching and comparison, GEDmatch.com, has been facing not just pressure from the impending GDPR regulations, but also an unfortunate media backlash over how law enforcement are using the tools to focus on investigating cold-case violent crimes. The now month-long media blitz began April 27 when Paul Holes, a retired investigator with the Contra Costa County District Attorney's Office,

Marketing ethnicity/admixture as the primary reason to take an autosomal DNA test is, frankly, a bit disingenuous at best; at worst, it might be interpreted by some as deceptive.

Kilt
Traditional kilt, Scottish Gaelic: fèileadh
first recorded in the 16th century

Q: My AncestryDNA test results came back, and they don't make much sense compared to our family history. My mother's father was Italian. His grandfather came to America from Italy, but I'm not showing anything at all that looks like that side of the family in my results. Should I take another test at a different company?

A: Thanks for the question. You're touching on a matter that is of concern to me, one that I believe is the primary downside to the marketing tack that AncestryDNA employed, that all others had to follow or see their market shares get eaten alive, and for which serious genealogists are paying the price. Marketing ethnicity/admixture as the primary reason to take an autosomal DNA test is, frankly, a bit disingenuous at best; at worst, it might be interpreted by some as deceptive. The whole "traded my lederhosen for a kilt" nonsense.