- Data Management
This first section looks at the organizational and physical structures of the Threlkeld One-Name Study. Section two discusses data management, including data ownership and copyright, as well as particulars about the data repositories used.
Organizationally, there are four key elements to the Study:
- Volunteer Driven
- Project Based
All aspects of the Study are free to use. To keep it that way, all work effort involved is contributed on a volunteer basis. And every volunteer effort applied is significant and appreciated, from getting the word out about the Study, to taking on a project management role.
For a description of how to volunteer and examples of how to get involved, see How to Participate.
A corollary to the voluntary nature of the Study is its integral aspect of collaboration. It's a joint effort, not the isolated work of only a few. Simply entering your lineage in WikiTree is a large contribution to bettering the overall understanding of the Study surnames. We want accuracy over volume, and welcome any and all discussions about source materials, ancestries, research conclusions, and Study objectives and directions. We chose WikiTree as our repository of record because it was founded as a collaborative, one-world tree, and we created the Threlkeld Forum as a venue for open conversation about all things related (and some unrelated) to the Study.
We mention one collaborative aspect of the Study many times throughout, and it bears repeating yet again: surname variants. In a brief research paper to be released in early 2018, we feel we have identified the use in historical documents of well over 100 possible variants of the surname. Preliminary analysis suggests over 30 spellings are likely true variants; over 40 are often associated with the surname, but have different etymologies and origins; and the remainder are what the Guild of One-Name Studies refers to as "deviants": spellings, often one-offs, that appear to be phonetic errors or simply gaffes on the part of census enumerators, tax collectors, county bureaucrats, etc.
The Study, to be registered with the Guild, must use a single surname as its title. Threlkeld seems the oldest and best-documented of the believed-valid spellings. And, quite honestly, we couldn't title the Study with a string of 10 different surnames; that not only would never fit across the top of a page, but it would still leave out 20 or more variant spellings.
However, we can never emphasize this enough: the Old Norse words that gave rise to our surname have many, many variant spellings. We may never reliably prove the shared ancestry of even two of those spellings, but we're going to try. And all those with surnames similar to but spelled differently than "Threlkeld" are not only welcome, they are vital to our purpose and we seek them out.
This may, at first glance, seem an unusual term, something more suited to an academic, scientific study. And we mean it in somewhat the same way. The broad scope of the Study touches on many disciplines; to name only a few: history, linguistics, microbiology (DNA), population studies and demography, statistics, writing and editing, media composition and editing (photography, video, audio), marketing, social media, and, of course, genealogy.
Members may have particular passions or skillsets in a specific area, and wish to share that expertise to help develop the Study. We encourage leveraging those particular skills within projects that would benefit.
This does not preclude the individual family researcher who is concentrating only on deepening the understanding of his or her line. Well-constructed and sourced family trees at WikiTree are vital, and articles and biographies about specific ancestors or family branches are always sought as valuable additions to our collective body of knowledge.
The final key, as described in How to Participate, is that the Study is project-driven. By that we mean objectives are met not by trying to move forward under a cumbersome structure toward a broadly-stated goal, but rather in small, well-defined segments, bite-sized chunks, that have clearly defined, attainable objectives that can be met within a specified timeframe.
A project might be as simple as a Member choosing to write a short piece about a great-grandfather's emigration and arrival at Ellis Island; or it might be more complex, such as an investigation spanning 100 years in colonial Virginia to trace the activities and movements of a surname line, including related and associated families, and such a project might require the help of several project volunteers to divide the research tasks.
Once a project is approved—a quick and simple process—the Project Manager is the owner of the effort and the deliverable, and works without operational interference from the Study administrators. The Study's role is to facilitate projects, to help gather resources, and to monitor approved and active projects so that efforts aren't duplicated.
In this structure, we are loosely following the Project Management Institute's best practice as described in the Guide to the Project Management Body of Knowledge, 5th Edition, generally referred to by the acronym PMBOK. We say "loosely" because this is an extensive and thorough framework. It is most certainly a valuable reference for any project-driven effort, but we don't want anyone to feel they need to be ready to take the PMP certification test to manage a project under the Study. The definitions of portfolio, program, and project management are important, as is a general understanding of what a project is and four of the five basic project management processes.
Everyone has been a project manager many times over; he or she just may not realize it. Within the Study the role is really quite simple, and we'll provide all you need to know in a very brief, two-page document.
The Study is exclusively an online, internet-based effort. While we hope that, as a result, cousins who don't know each other have a chance to talk and possibly meet in person, all Study materials are digitized and communication and collaboration are via the website, email, Google utilities, or social media.
It would be prohibitively expensive to design and build a single, integrated web-presence that could do it all, so we utilize different platforms and third-party companies to meet our needs.
Some third-party sites, notably our social media presence, store no meaningful Study data. The following describes what data is stored where, and how it is managed.
Data is stored in multiple places, and precautions are taken for information continuity: data recovery and restoration should accidental or malicious damage occur.
The Study website is hosted on a managed SSD (solid state disk) VPS (virtual private server) cluster housed in a Tier 3 data center in Salt Lake City, Utah. Hosting and domain name registration are prepaid in two-year increments, and renew automatically. All systems are redundant with automatic failover.
SiteLock is employed for server vulnerability scanning and automatic malware removal. Sitting between the data center and the internet is the CloudFlare CDN (content delivery network), which distributes our content globally via a cloud-based nodal structure to improve end-user throughput, and which includes a built-in web application firewall (WAF).
The data center, as part of the management agreement, performs daily backups of all website files and databases. In addition, we run a specialized utility on the server that performs nightly website backups to Amazon Web Services' (AWS) S3 cloud storage. The cloud is distributed, but we use the U.S. East (Virginia) endpoint to physically separate the primary I/O destination from the Utah data center. All uploads to S3 are automatically encrypted using a DoD-compatible 256-bit AES standard.
Most website development takes place in Houston, Texas, where copies of the website and all developmental files are maintained on a NAS (network attached storage). Weekly, all files and databases are copied, using the same 256-bit AES encryption, to external hard drives and stored in a bank safety deposit box for physical, offsite storage.
The hosting data center runs weekly full backups and daily incremental (data added or changed) backups which, by contract, they maintain at their own facilities. On both the AWS S3 cloud and local backups in Houston, we maintain:
- An incremental backup for each of the last seven days
- A full backup for each of the last six weeks
- A full backup for each of the last 12 months
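As an illustration only, the retention schedule above can be expressed as a small pruning rule. The sketch below is not the Study's actual backup utility. The function name `backups_to_keep`, the `(date, kind)` tuple layout, and the choice to keep the newest full backup within each week and each calendar month are all assumptions made for the example.

```python
from datetime import date, timedelta


def month_index(d):
    """Map a date to a running month number so months can be subtracted."""
    return d.year * 12 + d.month


def backups_to_keep(backups, today):
    """Return the set of (date, kind) backups to retain.

    backups: iterable of (date, kind), kind is "full" or "incremental".
    Hypothetical policy, mirroring the schedule described in the text:
      - incrementals from the last 7 days
      - the newest full backup in each of the last 6 weeks
      - the newest full backup in each of the last 12 calendar months
    """
    keep = set()
    weekly = {}   # weeks ago  -> newest full backup date in that week
    monthly = {}  # months ago -> newest full backup date in that month

    for day, kind in backups:
        age_days = (today - day).days
        if age_days < 0:
            continue  # ignore anything dated in the future
        if kind == "incremental":
            if age_days < 7:
                keep.add((day, kind))
            continue
        weeks_ago = age_days // 7
        if weeks_ago < 6:
            weekly[weeks_ago] = max(weekly.get(weeks_ago, day), day)
        months_ago = month_index(today) - month_index(day)
        if months_ago < 12:
            monthly[months_ago] = max(monthly.get(months_ago, day), day)

    for day in list(weekly.values()) + list(monthly.values()):
        keep.add((day, "full"))
    return keep


if __name__ == "__main__":
    today = date(2018, 3, 4)  # arbitrary example date
    history = [(today - timedelta(days=n), "incremental") for n in range(60)]
    history += [(today - timedelta(days=n), "full") for n in range(0, 60, 7)]
    for day, kind in sorted(backups_to_keep(history, today)):
        print(day.isoformat(), kind)
```

Anything not returned by the function would be eligible for deletion. Weekly and monthly selections can overlap, so the number of full backups retained is usually smaller than 6 + 12.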
- Family Tree DNA
The Threlkeld DNA Project operates at Family Tree DNA, and yDNA results are posted automatically, as soon as the lab in Houston finishes the test and the results are published. FTDNA takes responsibility for all data on its site. However, as part of our local backups and offsite storage, a CSV file of all yDNA data is downloaded and included. Note that surname DNA projects at FTDNA have no ability to publish autosomal (Family Finder) test information, so these data/matches are not included in the backup.
WikiTree was designed and built as a "one-world" family tree for collaborative contributions and maintenance. For that reason, we use WikiTree as the repository of record for pedigrees in the Study.
However, we suggest you still maintain your own family tree in whatever application you feel most comfortable using. The reason has nothing to do with WikiTree's capabilities, but with the ability to download the family tree data. A GEDCOM download option exists, but WikiTree does not store information in a GEDCOM-compliant format. The vast majority of the information in individual profiles, including biography and source information, will not, when subsequently uploaded to a GEDCOM-compliant application, populate expected GEDCOM fields but rather will commingle in the "notes" section of records for individuals. The information is all there, but it won't appear as expected and likely will not be searchable by the application.
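To see how much of an exported file ends up in notes, a GEDCOM can be inspected with a few lines of code. The following is a minimal sketch, assuming the standard GEDCOM line layout (`LEVEL [@XREF@] TAG [VALUE]`, with `CONT`/`CONC` continuation lines under a `NOTE`). The helper names and the sample record are made up for illustration and are not taken from any real WikiTree export.

```python
from collections import Counter

# Made-up sample record, used only to exercise the functions below.
SAMPLE = """0 HEAD
1 GEDC
2 VERS 5.5.1
0 @I1@ INDI
1 NAME Example /Person/
1 NOTE == Biography ==
2 CONT Born in Cumberland.
2 CONT == Sources ==
0 TRLR
"""


def parse_line(line):
    """Split one GEDCOM line into (level, xref, tag, value)."""
    parts = line.strip().split(" ", 2)
    level = int(parts[0])
    if len(parts) > 1 and parts[1].startswith("@"):
        # Record line such as "0 @I1@ INDI": second field is the xref id.
        xref = parts[1]
        rest = parts[2].split(" ", 1) if len(parts) > 2 else [""]
        tag = rest[0]
        value = rest[1] if len(rest) > 1 else ""
    else:
        xref = None
        tag = parts[1] if len(parts) > 1 else ""
        value = parts[2] if len(parts) > 2 else ""
    return level, xref, tag, value


def tag_summary(gedcom_text):
    """Count level-1 tags inside INDI records and total NOTE text length."""
    tags = Counter()
    note_chars = 0
    in_indi = False
    in_note = False
    for raw in gedcom_text.splitlines():
        if not raw.strip():
            continue
        level, _xref, tag, value = parse_line(raw)
        if level == 0:
            in_indi = (tag == "INDI")
            in_note = False
        elif in_indi and level == 1:
            tags[tag] += 1
            in_note = (tag == "NOTE")
            if in_note:
                note_chars += len(value)
        elif in_indi and in_note and tag in ("CONT", "CONC"):
            note_chars += len(value)
    return tags, note_chars


if __name__ == "__main__":
    tags, note_chars = tag_summary(SAMPLE)
    print(dict(tags), note_chars)
```

A file in which `NOTE` text dominates and tags such as `BIRT`, `DEAT`, or `SOUR` are rare would be consistent with the commingling described above. The counts say nothing about the quality of the information itself.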
Prior to the weekly backups locally in Houston and on AWS S3, we export GEDCOMs of the descendants of a few of the older, major surname lines in the Study, and then upload them to our Assets & Archives area.
The Study's primary use of Google services is for collaboration and reporting on individual projects. If the Project Manager desires, he or she will be provided a Google Drive folder under their control, and a formatted and ready-to-use Google Sheet for assigning and tracking project tasks. Project-related files can be uploaded to the Drive and shared as needed with any other project members. The Study will not maintain backups of these folders during the project's lifecycle. When the project is closed, a compressed archive of the folder will be archived for permanent storage.
Additionally, it is possible to utilize Google Hangouts for real-time meetings among project members. Proceedings or transcripts of these meetings will not be archived.
- The Guild of One-Name Studies
The Guild provides registered studies with space on its servers for web hosting. Our web presence is too complex and large to mirror at the Guild, but beginning sometime in the first half of 2018 we will start copying certain elements of the Study there, probably on a monthly basis, for additional information continuity and preservation.
The Threlkeld One-Name Study is a not-for-profit endeavor, and seeks only to aid genealogists in their research. Use of the Study's website, or any associated feature or function, is done with the understanding that there is no warranty of accuracy or suitability for purpose. Use of the website is voluntary and on an as-is basis. The Threlkeld One-Name Study and CaseStone.com assume no responsibility or liability for errors or omissions of any kind.
For full use of the Study's web presence, we require registration. Registration involves both mandatory and optional information. The only mandatory information that is displayed on the Study's website is the Member-selected username, and the surname-related pedigree information the Member has chosen to share. The Member's real name, email address, and password are never revealed, and all are stored in encrypted format only, never in clear text. As an extra level of security, the password remains encrypted at all times; it is not visible even to website administrators. If needed, administrators can create a new password at the Member's request, but no one has the ability to see the existing password.
Optional information provided by the Member at registration or later carries with it the permission to publish. These are identifiers that make it easier for researchers of the Member's family to locate him or her. Examples are links to public family trees, kit numbers at public DNA matching/comparison sites, etc.
Our stance is that any Member posting to the Study's website, writing or contributing anything to the Study, or uploading any file—whether the media is text, photographic, artistic, video, or audio—is the owner of the copyright of that work as soon as it is created.
The Study's website bears a collective, or "blanket," copyright statement. That pertains to all material presented as a collection. Some published material may carry a copyright statement with the Study as owner, but if not specifically designated, copyright ownership resides with the individual author(s) of the work.
Any contribution by any Member carries with it non-exclusive, irrevocable, royalty-free, worldwide rights (i.e., a license) to use the contribution in connection with the Threlkeld One-Name Study. If the contribution is not an original work, the contributor affirms the work to be free of other copyright or that he or she is in possession of appropriate rights to publish, and assumes responsibility and permissions thereof.
Similarly, unless otherwise stipulated, any work created by the Study for the furtherance of research into surnames or genealogies carries with it permission for Study Members to reproduce or share the intact and unmodified work with non-exclusive, irrevocable, royalty-free, worldwide rights. However, if the work is in the Members-only Assets & Archives area, the link to the file cannot be shared. We occasionally post works in areas where the links can be publicly distributed. If there is a specific item you would like publicly accessible, use the Contact Us form to request it and, after a brief evaluation, we will send you a public link if approved.
For Member-contributed works, we will provide whatever attribution is requested. Sometimes an upload is offered simply as a reference, and no attribution is desired. On the other hand, it may be an original work or the result of a project where all authors should be acknowledged.
Any work created by the Study will include source citation information. That citation should be used as attribution whenever the work is linked to or shared.