Handout on Genetic Genealogy

Getting Started in Genetic Genealogy

Genetic genealogy is the process of using DNA tests to determine how people are related through shared DNA (by “blood”). To better understand this rapidly evolving field, the International Society of Genetic Genealogy (http://www.isogg.org/) has a useful guide and glossary for “newbies”. Also see “Definitions of the terms used in genetic genealogy” at the FamilyTreeDNA website for more definitions: https://www.familytreedna.com/faq/answers/default.aspx?faqid=21

There are four main types of DNA used for genetic genealogy. Autosomal DNA is the most useful for general genealogy in recent generations, although all types may help to answer particular questions.

Y chromosome (Y DNA): passed only from fathers to sons, so it traces only a single line of descent (patrilineal). It is very useful for testing even distant relationships through that line.

Mitochondrial (mtDNA): passed only from mothers to children, so it also traces only a single line of descent (matrilineal). It is useful for testing even distant relationships through that line, but challenging to use because surname changes generally make female lines more difficult to trace.

Autosomal (auDNA or atDNA): all chromosomes except the sex chromosomes. Shared and recombined from both parents, it represents all lines but is only reliable at detecting relationships within about 5-6 generations (see Figure 1).

Figure 1: Inheritance of Y DNA (tall male figures), mtDNA (tall female figures), and autosomal DNA (other figures). Figure from the Your Genetic Genealogist blog by CeCe Moore, http://www.yourgeneticgenealogist.com/2012_04_01_archive.html

X chromosome (X DNA): passed from fathers to daughters or from mothers to either sons or daughters, giving it a unique pattern of descent (see Figures 2A and 2B). It can be particularly useful in narrowing relationships identified through autosomal DNA tests.
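The X-DNA inheritance pattern just described can be sketched in a few lines of Python. This is only an illustrative sketch and not part of the original handout: the function names `x_ancestor_lines` and `can_pass_x` and the "M"/"F" encoding are assumptions made for the example. It lists which ancestral lines could have contributed X DNA; it says nothing about how much X DNA actually came from each line.

```python
def x_ancestor_lines(sex, generations):
    """List ancestral lines that could have contributed X DNA.

    sex: "M" or "F" for the tested person (hypothetical encoding).
    generations: how many generations to go back (1 = parents).
    Each line is a tuple read from the tested person upward,
    e.g. ("mother", "father") means the mother's father.
    """
    lines = [((), sex)]  # (line so far, sex of the person at the end of it)
    for _ in range(generations):
        next_lines = []
        for line, person_sex in lines:
            # Everyone receives an X chromosome from their mother.
            next_lines.append((line + ("mother",), "F"))
            # Only daughters receive an X chromosome from their father.
            if person_sex == "F":
                next_lines.append((line + ("father",), "M"))
        lines = next_lines
    return [line for line, _ in lines]


def can_pass_x(sexes_top_down):
    """Check one line of descent, given as sexes from ancestor down to the tested person.

    X DNA is blocked only where a father is followed by a son.
    """
    return all(
        not (parent == "M" and child == "M")
        for parent, child in zip(sexes_top_down, sexes_top_down[1:])
    )


if __name__ == "__main__":
    # Possible X-DNA lines for a male, two generations back.
    for line in x_ancestor_lines("M", 2):
        print("'s ".join(line))
```

Under these assumptions, a male has three possible X-DNA lines three generations back and a female has five, because a father-to-son step always ends a line.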
Tests for autosomal DNA include X-DNA, but only FamilyTreeDNA, 23andMe, and GEDMATCH show your X-DNA results (AncestryDNA does not).

Figure 2A: X-DNA inheritance for a male from his mother’s side. The chart shows the average percentage of X DNA from each generation, but note that you should not necessarily “expect” the average, because the X chromosome recombines somewhat unpredictably (http://www.thegeneticgenealogist.com/2009/01/12/more-x-chromosome-charts/).

Figure 2B: X-DNA inheritance chart for females (http://www.thegeneticgenealogist.com/wp-content/uploads/2008/12/1b.png). Shaded boxes show possible sources of X DNA. A simple rule to remember is that X DNA cannot pass through 2 males in sequence. For these charts, each person’s father is shown on the left side and the mother on the right side, above that person’s cell.

Basics of Autosomal DNA Testing Strategies

Each of the companies offers the basic DNA test for about $99. There are often sales throughout the year, especially around the winter holidays, Father’s and Mother’s Day, and DNA Day on April 25. Buying multiple kits can save money too. This site compares the three major companies: http://www.isogg.org/wiki/Autosomal_DNA_testing_comparison_chart

FamilyTreeDNA is particularly helpful for testing hypothesized relationships, because matches are generally responsive and the site has the best tools for examining your matches in detail.

AncestryDNA is particularly helpful for identifying unknown ancestors and relatives, because it has the largest database of users with family trees and will generally yield the most matches. However, the site lacks tools for examining your matches (see below for how to get tools!).

23andMe may be helpful for finding living relatives, because it has a broad database of users (not just genealogists!). However, many users are anonymous and lack family history information, so it is challenging to use for genealogy. The latest version of its testing chip is no longer compatible with GEDMATCH.com, an important site for comparing results across platforms. One perk of 23andMe is that its test provides haplogroup information for Y DNA (for males) and mitochondrial DNA. Its Y results are not directly comparable to FamilyTreeDNA’s Y tests, but they can serve as clues that help to exclude some families.

To get the most answers to your questions, you may choose all three. FamilyTreeDNA accepts transfers from the other two companies for $39.

The Geno 2.0 test from National Geographic provides “deeper” (older) ancestry than is used by genealogists, but there is a free transfer to FamilyTreeDNA, where the results may be useful because they include some Y and mitochondrial DNA information.

Interpreting Autosomal DNA Results

The centimorgan (cM) is a unit of genetic distance based on how often recombination occurs. The longer a shared segment is in cM, the more likely it is to have been inherited from a common ancestor.

>10 cM block: definite shared ancestry.

5-10 cM block: probable shared ancestry. Most companies and GEDMATCH use a threshold of about 7 cM to determine matches. GEDMATCH also uses 7 cM as a common threshold for matches based upon X DNA, although those values are more complicated to interpret because men and women have different amounts of X DNA.

Smaller segments can indicate shared ancestry, but they may also be false positives (see the posts by Roberta Estes for more: http://dna-explained.com/category/ancient-dna/).

NOTE that average amounts of shared DNA are based on the assumption that folks share a single ancestor or pair of ancestors, rather than multiple pairs of ancestors, in recent genealogical time (the past few hundred years).
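The rough segment-length bands described above can be written as a small helper. This is a sketch for illustration only: the function name `classify_segment`, the labels, and the handling of values that fall exactly on 5, 7, or 10 cM are assumptions, and the bands would need adjusting for endogamous populations.

```python
def classify_segment(length_cm, match_threshold_cm=7.0):
    """Label one shared segment using the rough cM bands from the text.

    Returns (label, counts_as_match). The 7 cM default mirrors the
    approximate matching threshold mentioned in the text; it is an
    adjustable assumption, not a fixed rule.
    """
    if length_cm > 10:
        label = "definite shared ancestry"
    elif length_cm >= 5:
        label = "probable shared ancestry"
    else:
        label = "possible shared ancestry or false positive"
    return label, length_cm >= match_threshold_cm


if __name__ == "__main__":
    # Hypothetical segment lengths in cM.
    for length in (12.5, 8.0, 6.0, 3.2):
        print(length, classify_segment(length))
```

A 6 cM segment, for example, falls in the "probable" band but would not be reported as a match under a 7 cM threshold.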
The expected values and thresholds for matching will be higher in endogamous populations (ones that have a high degree of intermarriage).

Table A: Likelihood based upon length of shared segment
Source: Tim Janzen, http://www.isogg.org/wiki/Identical_by_descent
Likelihood = likelihood that you and your match share a common ancestor within 6 generations (values will be different for endogamous populations)

Length of shared segment    Likelihood
>30 cM                      90%
20-30 cM                    50%
12-20 cM                    20%
6-12 cM                     5%
<6 cM                       <1%

Table B: Likelihood of matching actual relatives
Sources: http://www.isogg.org/wiki/Autosomal_DNA_statistics and http://www.familytreedna.com/faq/answers/default.aspx?faqid=17#628

Shared DNA   Average cM shared   Likelihood of matching   Relationship
50%          3400                >99%                     Mother, father, siblings
25%          1700                >99%                     Grandparents, aunts, uncles, half-siblings, double first cousins
12.50%       850                 >99%                     Great-grandparents, first cousins, great-uncles, great-aunts, half-aunts/uncles, half-nephews/nieces
6.25%        425                 >99%                     First cousins once removed, half first cousins
3.13%        212.5               >99%                     Second cousins, first cousins twice removed
1.56%        106.25              >90%                     Second cousins once removed
0.78%        53.13               >90%                     Third cousins, second cousins twice removed
0.39%        26.56                                        Third cousins once removed
0.20%        13.28               >50%                     Fourth cousins
0.10%        6.64                                         Fourth cousins once removed
0.05%        3.32                >10%                     Fifth cousins
0.01%        0.83                <2%                      Sixth cousins or more distant

Triangulation is the process of determining that a particular autosomal DNA segment has been inherited from a common ancestor by identifying two or more cousins who share that segment. This does not mean that all descendants of that ancestor will have that segment, but it suggests that the segment might be an indicator of descent from that family line.

Tools for Triangulation

GEDMATCH.com is a free, donation-supported site for comparing results across the three major companies. By donating $10, you can become a “Tier 1” member with access to some additional tools, including Triangulation.

The Autosomal DNA Segment Analyzer (https://www.dnagedcom.com/adsa/index.php) triangulates your FamilyTreeDNA matches.

Genome Mate allows you to keep track of your matches across the platforms from FamilyTreeDNA, 23andMe, and GEDMATCH.

Matches in common: finding all matches shared by two or more individuals. This feature is available at FamilyTreeDNA and GEDMATCH.

Some Common Questions

Why does my known cousin not appear as a DNA match?
The odds of matching depend on the degree of the relationship (see Table B above). Known cousins may appear closer or more distant due to random inheritance of DNA. As a result, some cousins, even as close as 3rd cousins, may not appear in your match lists. Non-paternity may also account for a lack of a relationship: the presumed relatives actually had different fathers and/or mothers than expected.

Why does my sibling have different matches than I do?
Because autosomal DNA is randomly inherited from one’s parents, siblings will have somewhat different autosomal DNA. Also, note that females inherit X DNA from both their fathers and their mothers, while males inherit X DNA only from their mothers, so brothers and sisters have different X-DNA results. For this reason, it may be helpful to have results from your siblings in addition to your own.

Why do I have a relatively close match to someone, yet we cannot find our relationship?
In cases where people actually know their recent ancestors, this result may reflect having more than one shared line of ancestry. Look for “cousin marriages” in the trees of such individuals, which increase the DNA passed down by those ancestors in common.

Supercousins or “Up cousins”: A person one or more generations higher than anyone alive in your direct line, whose DNA results can help you make connections.

Two pathways for searches

Records to DNA
1. Find a suspected common ancestor based upon records or family lore
2. Find one or more descendants to test
3. Get DNA results
4. Check whether shared segments and matches in common support the hypothesis

DNA to Records
1. Find matches in common with one or more shared DNA segments
2. Search the trees of those matches for families and places in common
3. Identify likely common ancestors
4. Determine if paper records support a connection

Ten strategies for using genetic genealogy to break through brick walls

1) Secure samples from the oldest generations: In your immediate family, recruit DNA samples from the highest generation available on the line of interest. Once processed and stored with a company like FamilyTreeDNA, DNA samples may be used for additional testing in the future.
   a. Note that siblings will have somewhat different results, so it can be worth getting samples from each. In particular, males and females have different X DNA results.
   b. For general searches, I recommend starting with AncestryDNA and transferring results to FamilyTreeDNA ($39) and GEDMATCH.com (free, or $10 to get triangulation).
   c. If you want to validate a hypothesized relationship, going straight to FamilyTreeDNA may be a more efficient solution because they offer more sophisticated analysis tools.

2) Build a cousin network for genetic genealogy: Recruit 1st, 2nd, 3rd, 4th, and 5th degree cousins who share descent on your lines of interest to take DNA tests. Especially seek out “supercousins” from higher generations who carry more DNA from the ancestors of interest.
   a. Note that results from cousins who have multiple lines of descent from the same ancestors will have greater power to detect matches with those ancestors.
   b. Cousins whose ancestors were half-siblings of your ancestor of interest will have a weaker match, but their results can help you to isolate that paternal or maternal line.
   c. Living cousins who would have an X DNA, Y DNA, or mitochondrial DNA connection may be particularly valuable for validating relationships, including ones that may be too distant for autosomal DNA to reliably trace.

3) Find the cousins to fill out your network: To identify cousins who would be helpful in your search, you can use Wikitree and other online family trees to identify living descendants who have tested or might be willing to test. Genealogy sites will generally yield more responses, but even general sites like Facebook can work, although response rates can be low.

4) Share your information so others can help you: Link your DNA results and all of your known surnames to complete family trees so that folks can better find points of connection. Let them help you! Avoid posting partial trees (for example, your paternal or maternal family only) and clearly identify lines that represent a known or suspected adoption.

5) Contact your matches but give them details to understand the connection: When you contact individuals with whom you share a match, be sure to identify the type of match (e.g., autosomal) and the name associated with the kit. Genetic genealogists often manage results from many individuals.

6) Systematically search your DNA results using multiple strategies:
   a. Find matches in common with known relatives; make notes associating those individuals with the shared surnames and/or locations. Note that even if you can’t trace a particular matching individual to your family, you may be able to figure out where they connect to your tree, and triangulate on their results to find other matches in common.
   b. Search matches by surnames, particularly relatively rare ones (all three companies permit this, although Ancestry works the best, and the AncestryDNA Helper tool allows you to search by full names). Based on paper genealogy, you may have hunches about which families are connected to yours. If you can find someone with a rare surname in that family, you might try searching for that surname in your matches.
   c. Search matches by place names, particularly when relatively rare (Ancestry.com is best for this kind of search).
   d. Search matches by shared DNA segments (use the tools under “Tools for Triangulation”).

7) Group and sort your results: Generate lists or spreadsheets that show clusters of shared matches by family group. You can also generate spreadsheets showing shared matches by DNA segments on each chromosome. If someone unknown shares one of those segments, you may be able to guess to which line they relate.

8) Use the hints (shaking leaves) at AncestryDNA to identify folks who appear to share a common ancestor. Contact those people and encourage them to upload their results to GEDMATCH so that you can search for matches in common and compare matching DNA segments.

9) Break out the advanced tools: If you have AncestryDNA results, install Jeff Snavely’s “AncestryDNA Helper” tool (available in the Google Chrome store for use with the Chrome web browser) to automatically obtain lists of matches and ancestors of matches. If you have multiple kits in your Ancestry account, you can use this to easily identify shared matches.

10) Hunt for new leads: You can use the “Ancestors of Matches” results from the AncestryDNA Helper to find individual names (first and last name combined) that are particularly common in your ancestry. You can sort by the “incidence” column to find individual names that appear multiple times. This is currently one of the best options when trying to trace a common surname like Smith.

Final word: Genetic genealogy adds a powerful scientific tool for family historians. You will want a skeptical frame of mind when pursuing possible leads and matches. Do not discount that folks may be related along multiple lines or that family trees may have errors.
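Sorting matches by shared DNA segments on each chromosome, as suggested in the strategies above, comes down to spotting matches whose segments overlap. The following is a minimal sketch of that bookkeeping, not a description of how any company or tool does it: the function name `overlapping_pairs`, the tuple layout, and the use of base-pair positions are all assumptions made for the example. Note that an overlap found this way is only a candidate for triangulation. Because you carry two copies of each chromosome, the two matches should also be confirmed as matching each other on that segment.

```python
from collections import defaultdict
from itertools import combinations


def overlapping_pairs(segments, min_overlap=0):
    """Find pairs of matches whose shared segments overlap on the same chromosome.

    segments: list of (match_name, chromosome, start, end) tuples describing
    the segments each match shares with you (hypothetical layout; positions
    are assumed to be in base pairs).
    Returns a list of (chromosome, name_1, name_2, overlap_start, overlap_end).
    """
    by_chromosome = defaultdict(list)
    for name, chromosome, start, end in segments:
        by_chromosome[chromosome].append((name, start, end))

    pairs = []
    for chromosome, segs in by_chromosome.items():
        for (name_1, start_1, end_1), (name_2, start_2, end_2) in combinations(segs, 2):
            if name_1 == name_2:
                continue  # two segments from the same match are not a pair
            overlap_start = max(start_1, start_2)
            overlap_end = min(end_1, end_2)
            if overlap_end - overlap_start > min_overlap:
                pairs.append((chromosome, name_1, name_2, overlap_start, overlap_end))
    return pairs


if __name__ == "__main__":
    # Made-up example data, not real results.
    example = [
        ("Cousin A", 5, 10_000_000, 30_000_000),
        ("Cousin B", 5, 20_000_000, 45_000_000),
        ("Cousin C", 7, 1_000_000, 9_000_000),
        ("Cousin D", 5, 50_000_000, 60_000_000),
    ]
    for pair in overlapping_pairs(example):
        print(pair)
```

In the made-up data, only Cousin A and Cousin B overlap (on chromosome 5), so they would be the pair worth checking against each other.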
If you have questions about any of this content, you can email me at longjonathanw@gmail.com.

Jonathan W. Long, 1 May 2015