GENEALOGY DNA REPORT FOR JOSEPH PHILIP RHEIN
(McKinney and Stewart Website)

In the study of the ancient ancestry of humans, scientists focus on "haplogroups", the classification of all humans into ancient family clans based on the unique pattern of genetic markers called "SNPs" (single-nucleotide polymorphisms) found in their DNA. SNPs are small changes in the DNA which occur naturally over time. Once a SNP occurs, it becomes a unique lineage marker that is passed down to all future generations. Humans who have descended from the same ancient family clan will share the same pattern of SNPs.

Using SNPs, scientists have been able to plot the haplogroups of all humans living today into a single phylogenetic tree of mankind, which shows how all humans are connected to each other in a complex worldwide tree that stems from Africa over 150,000 years ago. Dozens of haplogroups have been discovered to date, each haplogroup representing a major branch in the phylogenetic tree of mankind. Each haplogroup can be further refined into "subclades" (finer sub-branches of the tree). As new SNPs are discovered, the phylogenetic tree becomes increasingly detailed, with finer branches and enhanced resolution. By testing individuals from around the world and analyzing their precise placement in the phylogenetic tree of mankind, scientists are attempting to piece together the intricate puzzle of ancient human connections and migrations.

My Results

The results of my three DNA tests by Family Tree DNA, ordered on December 28, 2013, follow. They are listed in order of their relevance for those who follow this website.

Mitochondrial DNA (mtDNA)

This tests a man or woman along their direct maternal line. Known individuals in the past five generations of my mother's maternal line (and mine) were born in the Province of Pennsylvania in Colonial America or later in the United States. Their forebears emigrated almost exclusively from Scotland, the north of Ireland, and Germany.
The seven women in this line are my mother, Mabel Florence McKinney (1901-1996); her mother, Rosa Linda Stewart (1868-1943); her mother, Christena Hilliard (1840-1911); her mother, Anna Maria Schick (1811-1869); her mother, Anna Margaret Mueller (1786-1817); her mother, Susanna Lauback (1757-1837); and her mother, Margaretha Elizabeth Jansen (born 1730), my maternal 5th great-grandmother.

My mtDNA Haplogroup is H5a1. There are three levels of matching. The number of matches that I have at each level is as follows.

Level                                                             Matches
HVR1 (52 generations, about 1,300 years)                          96
HVR2 (28 generations, about 700 years)                            331
HVR1, HVR2 and Coding Region (5 generations, about 125 years)     300

For each match at the above levels, I have the name of the individual and his or her maternal forebear. Unfortunately, the majority of these individuals do not include any detailed information on the forebear, nor do they post a GEDCOM, so I am generally unable to determine any specific relationship line.

In addition to the above, I have listed these maternal forebears on several project sites on Family Tree DNA with the following results.

German: listed in the name of my maternal 5th great-grandmother, Margaretha Elizabeth Jansen, born May 15, 1730, Danville, Province of Pennsylvania in Colonial America. I have a direct match with Susannah Wheeler, born 1677 in Long Compton, Warwickshire.

Ireland: I have direct matches with Almira Porter; Ellen Carol, born 1810; Mary Manning, born 1873, County Cork, Ireland; and Keziah Campbell, born about 1755, North Carolina.

Stewart Clan Society of America: my maternal grandmother, Rosa Linda Stewart, is listed. There are no matches.

Autosomal DNA (atDNA)

This tests for ancestry on all lines. It covers both the maternal and paternal sides of the family tree, so it covers all lineages. In my case it reflects the combined DNA ancestry of my father, Joseph Peter Rhein, and my mother, Mabel Florence McKinney, and their forebears.
It has a feature called Family Finder that includes a component called Population Finder. It applies principal component analysis to the same autosomal data to conduct biogeographical analysis (BGA) of the autosomal DNA. The results of this test provide percentages of a person's DNA associated with general regions or specific ethnic groups (e.g., Western Europe, Asia, Jewish, Native American). Unlike some other testing companies, Family Tree DNA chose to strip out markers for Mendelian medical issues, mtDNA results, and Y-DNA SNP results. X-chromosome results are available for download but are not used by Family Tree DNA's matching program.

I have 555 matches, generally 2nd to 5th cousins. None of these appears to be on my paternal side. My mix is as follows: British Isles 40%, Scandinavia 28%, Southern Europe 19%, Eastern Europe 6%, Asia Minor 4%, and Central Asia 3%.

Again, the majority of these individuals do not include any detailed information on the forebear, nor do they post a GEDCOM on any of the websites, so I am unable to determine a specific relationship line. I assume they have not done a significant amount of research on their genealogy lines and are looking for a direct match.

Y-Chromosome (Y-DNA) Genealogy

This tests a male along his direct paternal line. It requires a male sample provider.

My initial Y-DNA Haplogroup was E-M44. This line originated in Africa or Asia over 60,000 years ago, and successor groups were later found in what are now the Balkans, the Middle East and southern France 22,000 to 18,000 years before the present. I have identified three individuals in Haplogroup E whose early forebear has the same first twelve markers as my forebear, but that relationship dates back at most a few thousand years and there is no other information. The predecessor, or ancestor, haplogroups are DE, then D, then E1, then M96, then E-M44.
The following chart on the St. Clair/Sinclair web site illustrates the movement of Haplogroups DE, then successor D, then successor E, E1-M96, E-M44, and E1a1, my line. Note the migration of the R groups principally to Europe, Scandinavia and Britain.
http://www.stclairresearch.com/images/ChartPath-F.jpg

On July 28, 2015 I was advised by FTDNA that my Haplogroup was changed to E-L632. I have listed my DNA results on the following projects at FTDNA.

Project: Alsace
Website: http://www.familytreedna.com/public/Alsace/
Approximately 50 individuals. I am in Haplogroup E with two other individuals (E-V13). Other groups are G with 1, I with 4, J with 1, Q with 1, and R1b1a2 with 8. One individual is ungrouped. The R1b1a2 group includes three individuals from the Herman, Frey, and Gross families, which married into the Rheins in Herrlisheim.

Project: E1a1
Website: http://www.familytreedna.com/public/e1a1/default.aspx/
This is Doug Phelps' group.

Project: French_Heritage_DNA
Website: http://www.familytreedna.com/public/frenchheritage/
Approximately 2,500 individuals. Primary group is R. Also shown are E-M78, E-L117, E-M34, E-M183, E-V13, and E-CTS6143. I am ungrouped at E-L632.

Project: Germany-YDNA
Website: http://www.familytreedna.com/public/germany/
Approximately 3,000 individuals, principally R, G and I. E-M34, E-M35, E-M78, E-M81, E-V13, and E-V22 are also represented. I am ungrouped at E-L632.

Project: E1a-(M33, M132)
Website: http://www.familytreedna.com/public/HaplogroupE1andE/
Approximately 96 individuals. There are four of us from your group currently listed.

Project: Rinehart and Rhein
Website: http://www.worldfamilies.net/surnames/rinehart/
Approximately 35 individuals. Majority in R1b. I am in E along with two other individuals.

Project: Switzerland
Website: http://www.familytreedna.com/public/switzerland/
Approximately 200 individuals, principally R. I am ungrouped at E-L632.

****

Also enclosed is a listing of E-L632 from your group.

Joe Rhein
July 31, 2015

I am also registered on Clan Stewart Society. Over 750 individuals are listed. I am in Group E1b - Haplogroup E1b1b1 Unassigned.
https://www.familytreedna.com/public/Stewart/default.aspx?section=yresults

I am not registered on the site for Greece. They show eight individuals with E: L117 (3), M78 (2), L542 (2), and DE-M145.

Recently in Ergolding, Bavaria, Germany, not too far southeast of Alsace, an archaeological dig by the Bavarian State Department of Monuments and Sights revealed more than 440 graves. So far, DNA analysis has been performed on six men of early adult age.

“These six men were buried together in a wooden chamber, a grave identified as #244. The individuals were marked as 244A to 244F. Individuals found in the western part of the chamber (244A, 244B, and 244C) lied straight on the back, body-by-body, and all 3 men were buried with swords, spears, shields, and spurs, like heavily armored mounted warriors. Historic value of the artifacts found in the grave 244 makes this place one of the richest Bavarian burial sites from the late Merowig period. Grave 244 dates to the period around 670 AD. The eastern part of the burial chamber with the individuals 244D, 244E, and 244F was robbed and therefore no valuable artifacts were found. Of the six skeletons tested, four were of the R1b Haplogroup. Two were of the G2a Haplogroup. It’s unusual to find this later group in this part of Europe, but it may match up with the Sarmatians, of Persian nomadic tribes, which moved gradually from the Caspian plains to Eastern Europe. They lived on the plains between the Black Sea and the Caspian Sea, north of the Caucasus. These people of the Steppes, with their horsemanship, armor, and female warriors, were early precursors of the knights of the middle ages. This seems to fit with the fact that three of these skeletons were found with swords, spears, shields, and spurs. The archaeologists said they appeared to be heavily armored warriors.”

Twenty-four Y-chromosome haplotypes were obtained for most of the men. The estimated haplogroup for four of the men is R1b and for the other two is G2a.
I matched seven of the markers for the R1b men and six for the G2a men, not enough for anything conclusive. I mention this as an example of the sort of investigative work that needs to be done in pursuing one's Y-DNA.

The R1b Haplogroup is the dominant paternal lineage marker of Western Europe. It is the dominant marker for the Y-DNA German Project, which contains approximately 3,000 individuals. G2a is also frequently listed. There are a number of males with the E haplogroup (including mine) also listed, but no matches. R1b is also the dominant marker for the Y-DNA Alsace Project, which contains 28 individuals. I am listed with one other individual in the E haplogroup, but there are no matches.

A number of the forebears of Germans and French who have been Y-DNA tested appear to have left Africa going farther north to Persia, then later to what is now Kazakhstan, and later to Europe: the Visigoths, the Alemanni, the Huns, the Anglo-Saxon (Angles) invaders, etc. A number later went to Scandinavia (the Vikings) following the end of the Ice Age some 18,000 to 8,000 years before the present. And there are those who came to Gaul and to Britain with the Roman Legions, some remaining there.

Some Limited Background On My Paternal Forebears

I am the 8th great-grandson of Johann Gaspard Rhein, born 1595 in low Alsace in the jurisdiction of Hanau-Lichtenberg, the House of Hesse-Darmstadt, the Holy Roman Empire, later Bas-Rhin, France, later Alsace-Lorraine, Germany, and again later Bas-Rhin, France. My grandfather, Joseph Rhein, born 1866 in Herrlisheim, Alsace-Lorraine, immigrated to the United States in 1890 following his military service in the German Army.

On the basis of the research I have done over the past 45 years (none of it resulting from DNA testing), my grandfather appears to be the only male descendant of Johann Gaspard Rhein who immigrated to the United States. I have been unable to locate a living male descendant of Johann Gaspard Rhein in Europe.
I have a paternal web site at http://www.rheinandlaeng.net/index3.html that contains 1,178 individuals; no living persons are listed on the site.

Some of the early settlers in Herrlisheim, low Alsace, may have been there in the year 743 AD, when it was conveyed to the Abbey of Wissembourg under the name of Hariolfesvilla, the farm of Hariolf (Harold). In 1251 the village was the property of the Counts of Oetigen, landowners of low Alsace, who ceded it to the lords of Lichtenberg in 1332. In 1480, with the death of Jacques de Lichtenberg, the heritage was divided between Phillipe de Hanau and Simon Wecker, the Count of Two-Bridge-Biche. The village was incorporated in 1570 into the property of Hanau-Lichtenberg with the extinction of Two-Bridge-Biche.

On September 17, 1570, Phillipe IV of Hanau-Lichtenberg, one of the largest jurisdictions in low Alsace, ordered the prohibition of Mass and imposed the Protestant religion in the area. This was known as "cujus regio, ejus religio" ("whose realm, his religion"); that is, the religion of the prince is the religion of the land. The former Roman Catholic church buildings and benefices were taken over by Protestant churches. Subsequent to 1570, Hanau-Lichtenberg became a part of the house of Hesse-Darmstadt.

In December 1621 and January 1622, during the Thirty Years War, Mansfeld's mercenaries razed the area, and the inhabitants of Herrlisheim and Drusenheim took refuge in tents on the islands of the Rhine River. In the year 1681, Herrlisheim was converted by force from Protestantism to Catholicism. Herrlisheim lies on a fertile plain between the Vosges and the Rhine River.

I have 11 great (plus) grandfathers in Alsace whose male descendant married a female descendant of Johann Gaspard Rhein, so my Y-DNA and mtDNA testing is not applicable to these lines.

Some background on the area now known as the Balkans.
Archaeological evidence indicates that the area in what is now the Balkans was populated well before the Neolithic Period (New Stone Age; about 10,000 years ago). At the dawn of recorded history, two Indo-European peoples dominated the area: the Illyrians to the west and the Thracians to the east of the great historical divide defined by the Morava and Vardar river valleys. The Thracians were advanced in metalworking and in horsemanship. They intermingled with the Greeks and gave them the Dionysian and Orphean cults, which later became so important in classical Greek literature. The Illyrians were more exclusive, their mountainous terrain keeping them separate from the Greeks and Thracians.

Thracian society was tribal in structure, with little inclination toward political cohesion. In what was to become a persistent phenomenon in Balkan history, unity was brought about mostly by external pressure. The Persian invasions of the 6th and 5th centuries BCE brought the Thracian tribes together in the Odrysian kingdom, which fell under Macedonian influence in the 4th century BCE.

The Illyrians, ethnically akin to the Thracians, originally inhabited a large area from the Istrian peninsula to northern Greece and as far inland as the Morava River. During the 4th century BCE they were pushed southward by Celtic invasions, and thereafter their territory did not extend much farther north than the Drin River. Illyrian society, like that of the Thracians, was organized around tribal groups who often fought wars with one another and with outsiders. Under the Celtic threat they established a coherent political entity, but this too was destroyed by Macedonia. Thereafter the Illyrians were known mainly as pirates who disturbed the trade of many Greek settlements on the Adriatic coast. The Romans were also affected and took police action, annexing much of Illyrian territory in the early 3rd century BCE.
An Illyrian kingdom based in modern-day Shkodër, Albania, remained an important factor until its liquidation by Roman armies in 168 BCE.

The Romans were different from other major conquerors of the Balkans in that they first arrived in the west. Later attacks were launched from the southeast as well, so that by the 1st century CE the entire peninsula was under Roman control. At the height of Roman power, the Balkan peoples were the most united of any time in their history, with a common legal system, a single ultimate arbiter of political power, and absolute military security. In addition, a vibrant commerce was conducted along the Via Egnatia, a great east-west land route that led from Dyrrhachium (modern Durrës, Albania) through Macedonia to Thessalonica (modern Thessaloníki, Greece) and on to Thrace.

The northwestern part of the peninsula, including Dalmatia along the Adriatic coast as well as Pannonia around the Danube and Sava rivers, became the province of Illyricum. What is now eastern Serbia was incorporated into Moesia, which reached farther eastward between the Balkan Mountains and the Danube all the way to the Black Sea. The southeastern part of the peninsula was ruled as Thrace, and the southern part was brought into Macedonia.

The Romans largely regarded the Danube River as their northern frontier, but in the 2nd and 3rd centuries their authority was extended northward into Dacia, in what is now western Romania. Dacia had been the home of a people closely related to the Thracians. The Dacians had suffered invasion by a number of peoples, including the Scythians, a mysterious people probably of Iranian origin who were absorbed into the resident population. In the 3rd century BCE they managed to contain Macedonian pressure from the south, but in later years they were much less able to fend off Celtic invaders from the northwest.
By the 1st century CE a substantial Dacian state extended as far west as Moravia and threatened Roman command of the Danube in the Balkans. The extension of the Dacian state and Dacian raids across the river into Moesia prompted the emperor Trajan in the first decade of the 2nd century to march into Dacia, obliterate the Dacian state and Dacian society, and establish a Roman colony that lasted until barbarian incursions forced a withdrawal back across the Danube beginning in 271.

Christendom

The abandonment of Dacia in the second half of the 3rd century was a symptom of Rome’s decline, leading to major changes in the 4th century. In 330 the imperial capital was moved to Byzantium, so that any tribe intent on attacking the seat of Roman power and opulence would thenceforth move through the Balkans rather than into Italy. In 391 Christianity became the official religion, and in 395 the empire was divided in two. The dividing line ran through the Balkans: Illyricum went to the western sector under Rome; the remainder went to the eastern half and was ruled from Byzantium (by this time named Constantinople).

This deep and long-lasting division did little to alleviate the barbarian incursions of the times. The 5th century saw devastation by, among others, the Alani, the Goths, and the Huns. Most of these invaders soon left or were assimilated, but such was not to be the case with the Slavs, who first arrived in the 6th century. The Slavs were settlers and cultivators rather than plunderers and within 100 years had become a powerful factor in the region. They separated into four main groups: Slovenes, Croats, Serbs, and Bulgarians (the last being a Turkic tribe, the Bulgars, that was eventually absorbed by Slavs who had already settled in the eastern Balkans). Although in 681 the Bulgars established their own state, the Slavs acknowledged the suzerainty of the emperor in Constantinople.
In the second half of the 9th century, Christianity was adopted by the Bulgarians and the Serbs, both of whom chose the Byzantine rather than Roman variant of the new religion. To the north of the Danube, the Romanians, though not Slav, made the same choice, while the Croats, together with most of the rest of what had been Rome’s section of the divided empire, became part of the western Christian community. The Albanians, isolated behind their mountain chains, were not much affected by either branch of Christianity. The divisions and competition between Rome and Constantinople intensified, with the two communities separating irrevocably in 1054. The dividing line of 395 was thus reinforced: the Croats and Slovenes became an integral part of Roman Catholic Europe, with its Latin script and culture, and the Serbs, Bulgarians, and Romanians joined the Greeks in their allegiance to Eastern Orthodoxy.

Conclusion

Of the many genealogy DNA software packages on the market today, Family Tree DNA, in my view, has the most features and functionality and offers access to many surname and geographical project sites. I encourage those of you on the list to do some research on the benefits and limitations before ordering any DNA testing to make sure it meets your needs. The price differential can be significant.

Genealogy DNA has come a long way, but it is not quite the exact science that it would appear. It is an evolving work. The hundreds of thousands of individuals who have been tested have overwhelmed the limited resources of the companies in this field that analyze and categorize the results. The results should improve in the coming years as the technology gets better.

For those of you who may be interested, I use Family Tree Maker software, currently year 2014, Version 22.0.0.120. I have 9,954 individuals listed, including the family of my daughter-in-law and the family of my son-in-law.

I am pleased to report that Kevin W. Stewart, President of The Clan Stewart Society in America and a descendant of our Lt. William Stewart, has been DNA tested and will be reporting his result to me shortly. Also, Alec Stewart, a descendant of Lt. William Stewart, is to be DNA tested and will be reporting his results. Kevin and Alec were born in Clarion County, Pennsylvania, and the three of us are cousins.

There are four Stewart sites with a total of 1,345 individuals on Family Tree DNA. There is at least one listing for a descendant of each of the following three individuals.

John Stewart of Jedworth and DeForesta (1350-1402), my 16th great-grandfather.
Sir John Stewart of Bonkyl (killed at the battle of Falkirk, July 22, 1298), my 19th great-grandfather.
Walter Fitz Alan (1106-1177), the First High Steward of Scotland, my 23rd great-grandfather, and beyond.

I am hopeful Kevin’s and Alec’s test results will confirm our male lineage as reported on the McKinney and Stewart website.

Joe Rhein
Sarasota, Florida
July 31, 2014
Updated July 31, 2015

Addendum

Attached is an excellent article on Y-DNA testing.
http://www.isogg.org/wiki/Y-DNA_SNP_testing_chart

Following are some observations on the Big Y test that I recently completed. This article is by Ray Banks.

Each male has Y-DNA. It is composed of 30 million sites. The DNA exists as double strands with a molecule at each end of the strand. Every now and then the two strands unwind to be part of some process. The strand and the two compounds at either end are called a base pair. These molecules vary in 12 combinations, and each component is designated A, C, G or T, representing the first letter of the name of the molecule.

For the type of DNA testing in which we are interested, they force the DNA to unravel. They then add enzymes that make numerous copies of everything. Then they try to get the fragments of DNA to align to known sequences of molecules. A molecule like this is called just a base pair once it is separated from the other molecule.
They try to align a series of about 100 adjacent base pairs to a known sequence. Helping in this process is the fact that only about 1 of the 20 million stable base pairs will be replaced by another each generation. However, there are also lots of sections where the same base pair will be repeated multiple times. This means that sometimes a section with lots of repeats of the same base pair will look somewhat similar to multiple parts of the Y-DNA, and multiple different fragments will align with a reference section.

I mentioned that 20 million sites are stable. The other roughly 10 million are useless, unstable sites which apparently do not serve any purpose and suffer repeated mutations. One of the most striking problem areas involves the sites numbered 11 million to 12 million. These are so bad that the results are never reported. They are in the centromere, where the short arm of the chromosome joins. At the beginning of the chromosome, in numbers 1 through 2.6 million, is a poorly understood area that is never reported. At the end of the chromosome is another, huge section that is almost never reported, from about 28 million to the end.

In between there are problem areas. The mentioned area around the centromere can be problematic. The area from 24 million to 28.4 million comprises palindromes, as do several much smaller sections. These palindromes can be complicated to explain, but often there will be multiple components. These palindromes also tend to be far less stable than other parts of the Y-DNA. They can't be tested in many types of tests. In the very first scientific Y-DNA work, multiple mutations in the 26 million palindrome area were created as tests and used to define branches of the Y tree. We have had to retire almost all of these (such as P16, P18, P20) because close relatives were getting varying results. So in my analysis of your Big Y, I no longer include any comments for the sections reported by Family Tree DNA which fall within most of the palindromes.
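The problem regions listed so far can be sketched as a small lookup. This is only an illustration in Python, not anything the labs or the article's author are known to use: the boundaries are the article's round figures, the 30,000,000 end point is an assumption taken from its "30 million sites", and the names `PROBLEM_REGIONS` and `problem_region` are hypothetical.

```python
from typing import Optional

# Round-figure boundaries (site numbers) as given in the article above.
# Y_LENGTH is an assumption based on the article's "30 million sites".
Y_LENGTH = 30_000_000
PROBLEM_REGIONS = [
    (1, 2_600_000, "start of chromosome, never reported"),
    (11_000_000, 12_000_000, "centromere, never reported"),
    (24_000_000, 28_400_000, "palindromes, unstable"),
    (28_000_000, Y_LENGTH, "end of chromosome, almost never reported"),
]


def problem_region(position: int) -> Optional[str]:
    """Return the label of the first listed problem region containing
    the site, or None if the site falls outside all of them."""
    for start, end, label in PROBLEM_REGIONS:
        if start <= position <= end:
            return label
    return None


# A site outside every listed region returns None.
print(problem_region(14_343_888))
# A site in the 11 to 12 million range is flagged as centromere.
print(problem_region(11_500_000))
```

Because the palindrome range and the end-of-chromosome range overlap in the article's figures, the sketch simply reports whichever is listed first.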
One final section of immense problems is the one from 22.21 to 24.45 million. The other companies that do Y analyses have typically reported everything in this section as useless. This is the final section with underlying problems for us. It is not clear why Family Tree DNA chose to report this section to you, but it often contains almost half their reported items. Quite often most customers will have a positive reading at the same sites in this section over and over because it is so unstable. I have stopped making comments on this section, as no use can be made of it. But they have also included this section when doing matching analyses. Because of this, half of the men who show matches to their Big Y at Family Tree do not have the man who is truly nearest to them listed as the nearest match.

The Analysis

When Family Tree DNA reports your results, you will find three sections. The first section is titled Known SNP mutations. We are currently aware of about 110,000 shared SNPs which have names. Family Tree reports only a small percentage of the truly known SNPs in this section. In almost all cases, we are in any event only interested in the one or two of these that will identify your most precise categorization. In some cases, the SNPs best categorizing you will not be in this section.

The second section is titled Novel Variants, meaning new mutations not seen before. This is not a good title. For the great bulk of customers, we are already aware of many, many of these. Where there is already a named SNP representing a site, I will list this in column E of your results analysis. These named SNPs occurred earlier than your most recent branch and are shared by many more men. Also in the Novel Variants section are lots of items that are actually useless. This is because Family Tree DNA does not compare your results with all the men in the database. These useless items are seen in multiple haplogroups, and the underlying site is unstable.
It may also be a site known to provide inconsistent or conflicting results. But also within the Novel Variants section are the truly useful mutations, which I will designate as just new. These are so far unique to you, but whenever someone else in your haplogroup shares the same mutation, it will be the basis of a new subgroup and tree branch. The Big Y provides results for about half the 20 million useful Y-DNA sites, so you will actually have twice as many unique mutations as they report. It is thought that a Y mutation in Big Y occurs about once every 150 years. So you can multiply the number of your new unique mutations by 150 to see how long it has been since you shared a common paternal ancestor with someone else who has had sequencing. Identifying this branching allows us to trace your ancestral migrations as more and more men are tested.

What is a positive reading?

It is not possible to use an electron microscope to determine one by one the results for each of the 10 million Big Y sites. The labs instead rely on software to aggregate the information on what base pair was found at each site. In Big Y this is attempted 50 times on average. When all 50 reads have the same result, there is no question the man is either positive or negative. The result is compared to a reference sample from haplogroup R to make the determination of positive or negative. But this whole process is not otherwise 100% clear-cut. If a man has 48 reads saying he has C at site 14,343,888 and two that say T, it is extremely likely that the C reading is correct. But what if there are 35 saying C and 15 saying T? To me this should be reported as inconclusive, but the labs that do analyses have here and there instead reported a false positive or false negative by not requiring a huge majority of the reads to be one way or the other. I suspect this is due to the software.
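The read-count reasoning above can be sketched in a few lines of Python. This is a hypothetical illustration, not the labs' actual software: the function name `call_site`, the 0.9 "huge majority" threshold, and the simplification that a base differing from the reference counts as positive are all assumptions made for the example.

```python
from collections import Counter


def call_site(reads: str, reference: str, min_fraction: float = 0.9) -> str:
    """Classify one site from its reads.

    reads        -- one base letter per read, e.g. "C" * 48 + "T" * 2
    reference    -- the base found in the reference sample
    min_fraction -- assumed share of reads that must agree before a call
                    is made (the article gives no exact figure)
    """
    if not reads:
        return "no read"
    # Find the most frequently read base and how many reads support it.
    base, count = Counter(reads).most_common(1)[0]
    if count / len(reads) < min_fraction:
        return "inconclusive"
    # Simplifying assumption: matching the reference means negative.
    return "negative" if base == reference else "positive"


# 48 reads of C and 2 of T: 96% agree, so a call is made.
print(call_site("C" * 48 + "T" * 2, reference="T"))
# 35 reads of C and 15 of T: only 70% agree, so no call is made.
print(call_site("C" * 35 + "T" * 15, reference="T"))
```

With a lower threshold, such as a simple majority, the 35-to-15 case would be called one way or the other, which is the kind of false positive or false negative the article describes.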
The analysis software is really intended for the medical community, whose researchers do not want to miss the possibility of a mutation. Your Family Tree DNA report will almost always provide enough new mutations for future use. However, Family Tree will not report an item unless they were able to get at least 10 reads, and so may overlook a mutation. Another complication is that there may be a valid mutation present at a particular site, but the alignments muddy the picture or the software gives a mediocre score to the site. With individual testing of the site, it may eventually prove to be a useful site. So the whole process involving sequencing is very good, but not perfect.

The problem of them only reporting an item with 10 reads can be overcome by paying up to $50 at YFull or Full Genomes Corp to have a separate analysis done, which could uncover a few more mutations. However, I have seldom needed an additional analysis for my work, and I simply do not have time to enter alternative data from yet another source into the master spreadsheet.

While Family Tree DNA reports all your positive results, there are only about 800 pertinent to a haplogroup G man and about 1,200 to C and D men. Most of these results can be predicted in advance if we can pinpoint your subgroup. But there are results available for 10 million sites in your raw data file. It is only in the raw data file that we can distinguish a negative from an inconclusive from no read at all. This data file is called a BAM file, and it is huge, often 1 gigabyte in size. At some point, I will ask for access to this. It is also the only place in the future where we can look up your results for a new mutation someone else has. I commonly use the BAM file to confirm that a man within a subgroup is negative for a new mutation found in another member of the subgroup.

My analysis.

At this point, I have analyzed the sequencing results from about 2,500 samples with the help of others, including about 60 Big Y tests.
I am the content expert at ISOGG for haplogroups C, D, and G. I am the only person who can approve a new subgroup on these Y trees there. This is the one for haplogroup G:
http://www.isogg.org/tree/ISOGG_HapgrpG.html

These Y trees have now been cited in over 200 publications. This is not the case, for example, with the trees shown at Family Tree DNA, YFull, 23andMe, BritainsDNA, etc. These ISOGG trees are used by scientists frequently, and we have a few who insist that their new mutations be included in the index before they appear in publication. I do not know why you chose to do a Big Y test, but one of your goals should be to have your results used to better identify the tree branching pertinent to you and to have that incorporated into the ISOGG tree.

I am also your haplogroup project administrator or co-administrator. I work full time on identifying new Y-DNA branches. In this role, I always go through results to see what new can be identified. Presently, on 11 Jan, I am 3-4 weeks behind in Big Y processing. Most of the problem is related to a separate project to provide for the first time a mega Y tree that tries to merge all the subgroups identified in about a dozen sources. This is a huge undertaking, but I only have 1 of the 30 haplogroups yet to do. At the top of each ISOGG page you will find a link to my Composite Y tree.

I am also co-author of a paper with one of the most ambitious Y trees yet attempted.
http://biorxiv.org/content/early/2013/11/24/000802
I covered 8 haplogroups for this. The lead author is Dr. Gregory Magoon, a graduate of M.I.T. in the United States who does analysis for Full Genomes Corp., and the senior author is Dr. Andy Grierson from the University of Sheffield, a molecular biologist with a special interest in population genetics.

My analysis is free to you. I am a volunteer with no staff.

Final Goals

There are two sets of goals for me, and I hope they will also be your goals.
The first is to improve the knowledge of which mutations comprise a subgroup. This may also include finding the results from one subgroup for a sister subgroup. In some cases the subgroups are well defined, but in other cases not enough men have provided sequencing to even figure out yet how they fit together.

The second goal is to create yet another subgroup from the SNPs that are unique to you. For this to work, we need to have a second man fairly near to you in marker values but not a relative. He must have at least five marker values different at 67 markers to qualify. But once a man has 10 or more markers different, the chances of him sharing one of your new SNPs, randomly chosen, diminish significantly, and the cost of testing lots of SNPs can become quite high. ISOGG criteria require this diversity of 5 marker values in the two men validating a new subgroup. The criteria also require that the submitted evidence involve individual testing (the $39 test at Family Tree DNA, the $35 test at YSeq). Big Y and Geno 2.0 are not reliable enough to validate the new subgroup. These individual tests use what is called Sanger sequencing and allow making sure there is a correct alignment.

Family Tree DNA will allow testing new SNPs, but it has to be a small number of them. We generally restrict such testing to obvious new subgroups. Where we are randomly testing new unique SNPs, we have to use YSeq.
http://www.yseq.net/
YSeq will test any SNP. It is operated by persons who used to do this same type of testing at Family Tree. If the additional testing involves both testing of obvious subgroups and expected additional random testing of unique SNPs, YSeq is the first choice. We have been subsidizing this type of testing mostly from donations, and we are adding about one new subgroup a week to the ISOGG trees.

Ray Banks
dnagrouper2@gmail.com