Monday, July 25, 2016

Are 111 Marker Tests Better at Predicting Relationships? A Case Study Perspective.


Two years ago, I looked at whether Y-DNA genetic distance was an adequate predictor of relationships. Using a number of participants at both 37 and 43 markers, I concluded that genetic distance was an insufficient predictor of relationship range. This post examines the following question: “Would an STR marker test at 111 markers enhance the ability to predict relationships?”

In the past two years, the Owston/Ouston DNA project has had the opportunity to upgrade fifteen men to 111 markers. At this resolution, our 15 project members only have matches within our surname group, which indicates it is a sufficient tool to narrow the results to a particular family group. By comparison, each of us matched several men with a number of surnames at 67 markers. At 37 markers, we matched nearly 400 men with different surnames. We can testify that the 111-marker test has been sufficient to separate the wheat from the chaff in our one-name project.

Our 15 subjects represent three families of a low-frequency surname group that totals an estimated 296 males. These 296 men and boys (yes, I counted them) reside in the UK, the USA, Australia, Canada, New Zealand, Finland, and France. Testing subjects represent all the aforementioned countries except France; however, the son of the lone Frenchman, who lives in England, has tested. A brief synopsis of these families is found below.

SHERBURN FAMILY

A total of 19 men from the Sherburn family have tested: six exhibited ancestral non-paternity events, two remain to be upgraded from 43 markers, and 11 have been tested at 111 markers. Of those 11 subjects, the relationships range from siblings to 11th cousins, once removed. The Sherburn family’s most recent common ancestor was alive in the 1550s and died in 1602. Seventy percent of all Owston/Ouston males descend from this family. The most distant relationship within the family is that of 13th cousins.

The Sherburn family includes the Cobourg line that is well represented in the project with seven participants – six at the 111 level. The Cobourg line shares a most recent common ancestor who lived between 1778 and 1857. The most distant relationship represented in this project is that of fourth cousins, once removed. The most distant relationship among all males of this line is at the sixth cousin level. A total of 22 Owston and non-Owston relatives in the Cobourg line have also participated in autosomal testing. This is the author’s line, and its overrepresentation is the result of convenience sampling.

GANTON FAMILY

A total of four men from the Ganton family have tested: two were tested at 43 markers, but died before upgrading at FTDNA. The remaining two, who have tested at 111 markers, are fifth cousins. The family’s most recent common ancestor lived from 1753 to 1823, making the family’s most distant relationship that of seventh cousins. The Ganton family represents 21% of living Owston males.

THORNHOLME FAMILY

A total of five men from the Thornholme family have tested and all four extant lines are represented. Two men exhibited ancestral non-paternity events, one has not yet upgraded from 43 markers, and the remaining two have tested at 111 markers. These two individuals are seventh cousins once removed.

The family’s most recent common ancestor died in 1739 and was probably born in the 1670s. Being the smallest of the three families, the Thornholme family has only 25 males, which constitutes 9% of the total number of Owston/Ouston males. The most distant relationship found within the Thornholme family is that of 9th cousins.

INTERFAMILY RELATIONSHIPS

Data for this one-name study comes from the individual and combined research of Timothy J. Owston, Roger J. Ouston, and James M. Owston. While each began researching the surname in the 1970s, their combined efforts began in 1990 when they crossed research paths.

As noted, a total of 15 men have tested at 111 markers; this represents 105 relationships. While intrafamily relationships are easily tracked, the difficulty arises in cross-family relationships, as records prior to 1550 are spotty. Matching Y-DNA has confirmed that the three families are related and they are from the same region; however, documentation on connections among the three families does not appear to exist.
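The figure of 105 relationships is simply the number of distinct pairs that can be formed from 15 men. A minimal sketch of that arithmetic follows; the participant labels are placeholders invented for illustration, not the project's kit names.

```python
from itertools import combinations
from math import comb

# Placeholder labels for the 15 men tested at 111 markers (hypothetical names).
participants = [f"P{i:02d}" for i in range(1, 16)]

# Every unordered pair of participants is one relationship to classify.
pairs = list(combinations(participants, 2))

# n * (n - 1) / 2 pairs for n participants.
pair_count = len(pairs)
print(pair_count, comb(len(participants), 2))
```

Each added participant increases the count by the number of men already tested, so a sixteenth 111-marker result would bring the total to 120 pairs.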

To address the interfamily relationship problem, I have created a plausible tree based on naming conventions from the three current families and two extinct families that originated in the Vale of Pickering, which spans the historic border of the former North and East Ridings of Yorkshire. The first reference to the surname in this region appeared in 1452. I am confident that the actual relationships among these lines are no more than two generations more distant than I’ve charted. For this analysis, I used the closest possible relationship that could be presumed.

RELATIONSHIPS

The following charts and table enumerate the known (and assumed) relationships in this family.



Relationship                        Number
Brothers                                 1
Uncle/Nephew                             2
2nd Cousins                              2
2nd Cousins, Once Removed                1
4th Cousins                              7
4th Cousins, Once Removed                3
5th Cousins                              1
7th Cousins, Once Removed                1
8th Cousins                              5
8th Cousins, Once Removed                8
8th Cousins, Twice Removed               2
9th Cousins                              6
9th Cousins, Once Removed                7
9th Cousins, Thrice Removed              1
10th Cousins                             1
10th Cousins, Twice Removed              7
11th Cousins, Once Removed               2
12th Cousins                             3
12th Cousins, Once Removed               8
12th Cousins, Twice Removed              6
12th Cousins, Thrice Removed             1
13th Cousins                             7
13th Cousins, Once Removed              16
13th Cousins, Twice Removed              1
14th Cousins                             4
14th Cousins, Once Removed               2


RESULTS

By comparing the results of 15 subjects at 111 markers and the additional five participants at 43 markers, a modal haplotype has been constructed. Three participants shared the modal signature at 111 markers: Ganton03, Ganton04, and Cobourg08. The late Ganton01 also exhibited the modal haplotype at 43 markers. Several others who shared the modal haplotype at 37 and 43 markers did not at 111 markers.
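For readers unfamiliar with the idea, a modal haplotype is the most common repeat value at each marker across the tested men. The sketch below shows one way to compute it; it is not the exact procedure used in this study, and the marker values and kit labels are made up for illustration. Men tested at fewer markers simply contribute nothing for the markers they lack.

```python
from collections import Counter

def modal_haplotype(haplotypes):
    """Return the most common repeat count at each marker.

    `haplotypes` is a list of dicts mapping marker name -> repeat count.
    Ties are broken by whichever value was seen first (an arbitrary choice).
    """
    markers = set().union(*(h.keys() for h in haplotypes))
    modal = {}
    for marker in sorted(markers):
        counts = Counter(h[marker] for h in haplotypes if marker in h)
        modal[marker] = counts.most_common(1)[0][0]
    return modal

# Hypothetical results; the last man tested fewer markers and lacks DYS643.
samples = [
    {"DYS393": 13, "DYS643": 11},
    {"DYS393": 13, "DYS643": 12},
    {"DYS393": 14, "DYS643": 12},
    {"DYS393": 13},
]
modal = modal_haplotype(samples)
print(modal)
```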

There was a noted convergence with Cobourg08, who had a back mutation on DYS643 to 12 repeats, the value found in the modal haplotype. All of the other matching Cobourg line members have an 11 at this marker. This back mutation contributed to some of the outlying results in this analysis. The genetic distance (GD) for the 105 relationships at a 111-marker resolution ranges from 0 to 9. A GD of 2, however, was not recorded for any of the relationships.
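To illustrate how a back mutation can hide a branch, here is a simplified sketch of a genetic distance calculation. It assumes a plain stepwise count (the sum of repeat differences over shared markers); testing companies apply additional rules for some markers, so treat this as an approximation. Only the DYS643 values of 11 and 12 come from the results above; the other marker and its values are hypothetical.

```python
def genetic_distance(a, b):
    """Sum of absolute repeat differences over the markers both men tested.

    Simplified stepwise model; real vendor calculations treat some
    multi-copy markers differently.
    """
    shared = a.keys() & b.keys()
    return sum(abs(a[marker] - b[marker]) for marker in shared)

# DYS643 values follow the post; DYS393 is a hypothetical filler marker.
modal = {"DYS393": 13, "DYS643": 12}
typical_cobourg = {"DYS393": 13, "DYS643": 11}   # carries the line's mutation
cobourg08 = {"DYS393": 13, "DYS643": 12}         # back-mutated to the modal value

print(genetic_distance(modal, typical_cobourg))      # one step away from modal
print(genetic_distance(modal, cobourg08))            # looks modal despite the branch
print(genetic_distance(typical_cobourg, cobourg08))  # differs from his own line
```

In this toy case the back-mutated man appears identical to the modal signature even though his line diverged from it, which is the kind of convergence that produces a GD of 0 between distant relatives.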

  
The following table delineates the generational range, mean, the adjusted mean relationship, and the standard deviation for the results.

GD   MIN TMRCA   MAX TMRCA   MEAN TMRCA   ADJUSTED MEAN RELATIONSHIP    STANDARD DEVIATION
0    1           14.5        6.06         5th Cousins                   5.15
1    5           15          11.46        10th Cousins, Once Removed    4.68
3    3           15.5        10.60        9th Cousins, Once Removed     4.88
4    5           14.5        12.48        11th Cousins, Once Removed    2.34
5    5.5         14.5        11.68        10th Cousins, Once Removed    2.71
6    9           15          10.53        9th Cousins, Once Removed     1.99
7    9.5         15.5        12.50        11th Cousins, Once Removed    2.17
8    11.5        14          12.83        12th Cousins                  0.98
9    9.5         13          10.38        9th Cousins, Once Removed     1.80
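The TMRCA values in the table appear to follow a simple convention: siblings are 1 generation from their common ancestor, nth cousins are n + 1 generations, and each "removal" adds half a generation. The adjusted mean relationship then looks like the mean rounded to the nearest half generation. The sketch below encodes that reading; the convention is inferred from the table rather than stated in the post, the function names are invented, and the sample group is hypothetical.

```python
from statistics import mean, stdev

def generations_to_mrca(degree, removed=0):
    """Generations to the common ancestor (degree 0 = siblings, 1 = 1st cousins)."""
    return degree + 1 + removed / 2

def ordinal(n):
    """Format 1 -> '1st', 2 -> '2nd', 11 -> '11th', and so on."""
    if 10 <= n % 100 <= 20:
        return f"{n}th"
    return f"{n}" + {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")

def adjusted_relationship(mean_generations):
    """Round a mean TMRCA to the nearest half generation and label it."""
    half_steps = round(mean_generations * 2)
    degree = half_steps // 2 - 1
    if degree < 1:
        return "Closer than 1st Cousins"
    label = f"{ordinal(degree)} Cousins"
    return label + ", Once Removed" if half_steps % 2 else label

# Hypothetical group: 4th cousins, 4th cousins once removed, 8th cousins.
group = [generations_to_mrca(4), generations_to_mrca(4, 1), generations_to_mrca(8)]
print(mean(group), stdev(group), adjusted_relationship(mean(group)))
```

The post does not say whether a sample or population standard deviation was used; `stdev` (sample) is shown here only as one possibility.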


Outside of a GD=0, the adjusted mean relationship for genetic distances of 1 to 9 ranges from ninth cousins, once removed to 12th cousins. The plot below provides a visual representation of the interquartile range and the outliers based on genetic distance (GD) and the time to the most recent common ancestor (TMRCA).
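For anyone wishing to reproduce this kind of summary with their own project data, the sketch below computes an interquartile range and flags outliers for a single genetic distance group. It assumes the common 1.5 × IQR rule for outliers, which may differ from the software used for the plot, and the TMRCA values are invented for illustration.

```python
from statistics import quantiles

def iqr_outliers(values):
    """Return (q1, q3, outliers) using the 1.5 x IQR convention (an assumption)."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    outliers = [v for v in values if v < low or v > high]
    return q1, q3, outliers

# Hypothetical TMRCA values (in generations) for one GD group.
tmrca = [1, 1.5, 1.5, 5, 5, 5, 5, 14.5, 14.5]
q1, q3, outliers = iqr_outliers(tmrca)
print(q1, q3, outliers)
```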

ANALYSIS

Largely due to two outlying relationships because of convergence and three very close relationships in the Cobourg line, those sharing a GD=0 have the greatest standard deviation (SD) of 5.15 generations. GD=1 and GD=3 are not far behind with standard deviations of 4.68 and 4.88 generations respectively. The relationships that are represented by these three genetic distances (0, 1, & 3) are more heterogeneous. This heterogeneity indicates that relationships at these levels are likely to be more different than similar.

Contrariwise, those with a greater genetic distance have a lower SD and are more likely to be similar in relationship. With a genetic distance of four through nine, the SD ranges from 0.98 to 2.71 generations. The most homogeneous group is GD=8 with the lowest SD of 0.98 generations. Therefore, it appears that the greater the genetic distance, the slightly more predictable relationships become, at least within our surname.

CONCLUSION

While 111 markers aided in fine-tuning our connectivity to those sharing our genetic and genealogical roots, genetic distance was not an accurate predictor of most relationships. Outliers can and do happen, as experienced with a GD=0; however, 78% of the participants at a GD=0 fell within the predicted level of six generations or less with a p ≤ .01. Two did not, and as explained earlier, this was due to convergence. We have seen close relatives (5th cousins and closer) having genetic distances up to 5, while 13th cousins, once removed have a GD=0.

The caveat is that this is one family of meager size from one haplogroup, I-M253 (fine-tuned with SNP testing to I-A10207). It may not be representative of everyone’s experiences; however, this information can be used as a reminder to exercise caution in using genetic distance as an indication that someone is more closely or more distantly related than he actually is. A prediction based on genetic distance alone might just be wrong.

EPILOGUE

It may be suggested by some that SNP testing is a more accurate predictor; however, we have not experienced this with our four participants thus far. We have seen genealogically closer participants not sharing the same SNPs as those who are more distant. We have tested two Sherburn family members (one from the Cobourg line), one Ganton, and one Thornholme.

The Ganton participant and one of the Sherburn family members share two extra SNPs (at acceptable quality) that are not shared by the other Sherburn member and the Thornholme participant; all other acceptable SNPs are shared by all four. The two Sherburn participants are eighth cousins. We need to continue to test others with the BigY; however, our results thus far are inconclusive in determining relationships or even identifying known family branches. Time will tell.

Friday, June 10, 2016

Exogenous Ancestry – Proposing a Replacement for NPE


If I were genetic genealogy king for a day, I would replace the term “Non-Paternity Event (NPE)” with a more comprehensive term – specifically, “Exogenous Ancestry.”

Exogenous ancestry? That’s a mouthful, but what does it mean?  Well, it’s a term that I have borrowed from biological studies to explain some of the discontinuity of single source surnames with Y-DNA from outside of the family in question.  For some time, I have been contemplating using a different term from the one now commonly used in genetic genealogy – non-paternity event (NPE).

Bryan Sykes and Catherine Irven (2000) first used non-paternity event in the context of genetic genealogy to explain haplotypes that differed from the typical Y-DNA signature of a surname.  It was a borrowed term as well, as it was used in anthropology and sociology where the presumed father was not the father of a child.  Generally, this referred to infidelity on the part of the mother. 

In genetic genealogy circles, the International Society of Genetic Genealogy’s Wiki cites at least 13 different categories which have been considered as non-paternity events.  While infidelity is one of these, there are other scenarios where genetic genealogists have used this moniker to describe the discontinuity between surnames and ancestry.  

What's the Beef?

The term non-paternity event and its synonyms don’t neatly fit every situation where it is used.  It assumes that the designated father (and even the child) is unaware of the child's ancestry.  This is not always the case. 

In some cases, there may not be a father in the picture and the surname traveled from mother to child.  The birth father’s name was not associated with the child and there was no “official” father from whom false paternity could be claimed.  It wouldn’t be a surname discontinuity as it continued from the mother; it would be a Y-DNA discontinuity.

In the case of complete adoptions, not only would the paternity be different, but the maternity would be as well.  Using a term such as “Exogenous Ancestry” would better fit full adoption circumstances as not only is the paternal DNA different, so is the maternal DNA.  This term would be applicable to discontinuities found in mitochondrial and autosomal DNA. 

Name changes are often considered NPEs; however, these can be voluntary, and NPE doesn’t fit the situation. I am not sure any term other than “name change” would fit this scenario.

Finally, the term appears to pinpoint a given “event”; however, we may not be able to identify a specific generation when this discontinuity occurred.  While a person’s recorded ancestry may have confirmation going back several centuries, Y-DNA tells a different story.  Yes, there was some sort of misattributed paternity, but where did this “event” occur in the lineage?  Can we find it? Sometimes, but not always.  We know that somewhere along the ancestral line exogenous DNA entered the picture. 

Where did this Term, Exogenous Ancestry, Originate?

It isn’t an original term, although I have been sparingly using “exogenous Y-DNA” since 2012 to soften the blow when reporting NPEs in my study. While recently performing Google searches for terminology relating to DNA from outside the family/clan/tribe, I found it used in the study of wolf and coyote populations of North America. 

Lupine biologists used it to describe DNA found in certain wolf populations that originated from outside the pack – sometimes considered an unusual occurrence.  In addition, it was also used when wolf DNA was present in populations of coyotes – especially in areas where no known wolf populations existed – hence an ancestral occurrence (vonHoldt, Kays, Pollinger, & Wayne, 2016).

Exogenous ancestry is a broader term than non-paternity event; it is already used in mammalian DNA studies, and it is a better fit for a variety of DNA discontinuities. Will it gain in popularity?  I hope so, but sometimes teaching an old dog, wolf, or coyote new tricks isn’t that easy.  I would be interested in hearing your spin on this term.

Sources

Non-Paternity Event (n.d.). International Society of Genetic Genealogy Wiki. Retrieved June 10, 2016 from http://isogg.org/wiki/Non-paternity_event

Sykes, B., & Irven, C. (2000). Surnames and the Y chromosome.  The American Journal of Human Genetics, 66(4), 1417-1419. doi:10.1086/302850

vonHoldt, B. M., Kays, R., Pollinger, J. P., & Wayne, R. K. (2016). Admixture mapping identifies introgressed genomic regions in North American canids. Molecular Ecology, 25(11), 2443-2453.  doi:10.1111/mec.13667

Friday, February 12, 2016

He Inspired a Genealogist – Mr. George T. Ihnat




Today, I received notification that a teacher I had in junior high school and high school had passed away on Wednesday, February 10, 2016.  I hadn’t seen Mr. George T. Ihnat since the day I graduated in June 1973; however, he had a profound effect on me by instilling a love for family history.
 
George T. Ihnat in 1972
Beginning in 1967, I attended Park Terrace Junior High School in North Versailles, PA – where we moved from teacher to teacher instead of having one teacher all day.  I barely remember any of my instructors from Park Terrace, as there were so many – but one who made a lasting impression was Mr. George T. Ihnat who taught 8th grade English. I would later have him as my 11th grade American literature instructor at East Allegheny High School.
 
As I had many great teachers during my life, I can’t say I remember the specifics of the vast amounts of knowledge he imparted in either class; however, I do recall an assignment that influenced my primary life’s interest.  One day in 1968, Mr. Ihnat assigned us a project to create a family tree – a typical project that occurs during many people’s school experiences.  I hadn’t thought about my ancestry until then and I haven’t looked back.
 
The assignment prompted me to ask my mother about her and my dad’s families.  Since my dad had passed away in 1962, I knew very little concerning my paternal lineage.  Mom knew my dad’s mother’s family, but only my grandfather’s name and a few scattered details about his siblings. She went into her secretary and pulled out a piece of folded paper in my father’s handwriting that had the names and dates of my father’s grandparents. He had jotted down these notes after visiting relatives in Ohio during the summer of 1960. She also found an old obituary about my great-great grandmother, Sarah Ann Jones Merriman, who was the oldest woman in McKeesport, PA at the time of her death in 1929.

Later that day, my mom and I went to McKeesport-Versailles Cemetery and found Sarah Merriman's and my second great grandfather’s grave – John Merriman was a Civil War veteran in the 101st Pennsylvania Volunteers. My research also inspired me to query my only living grandparent – my mother’s mother about her lineage. I was given a wealth of information about her and my grandfather’s sides of the family.

I also asked my Aunt Nath, my dad’s oldest half-sister who attended the same church as us, if she could provide some additional information. She gladly wrote down names of family members that she could remember. That was a little over 47 years ago and I still have all of these notes and clippings. It got me interested in family history and this was later rekindled in 1978 with the return of my great-grandparents’ family bible to its bloodline.


Mr. Ihnat’s assignment continues to inspire me even to this day in discovering family – old and new. This interest has expanded from archives, library, and cemetery research to DNA testing of relatives – a keen hobby thanks to an English teacher who went beyond the scope of grammar and composition with an assignment about a family tree.
 
Mr. Ihnat:  I am sorry that I never connected with you during my adult years to tell you how that one assignment changed my life forever. Thanks to you it did. While I am hard pressed to remember any of my junior high teachers, you’ll never be forgotten. Rest in Peace. 

Sunday, January 10, 2016

Case Study: Blaine Bettinger



How did you enter the field of genetic genealogy? What and who influenced you?  Were you an innovator, an early adopter, or are you still a laggard who hasn’t tested? Although I sent in my first DNA kit in 2007, I still feel like a DNA adolescent among some of my peers. If I had to categorize my experiences, I would rank myself in the early majority.   

That first kit was inspired by the article “Shaking the Family Tree with Recreational Genetics” in Newsweek.  I saw it in November 2007 at my optometrist’s office and I showed it to my wife, who is adopted. Within days, Ancestry had a sale on their Y-DNA and mtDNA tests and both of us took the plunge. 

By the end of the year, I found out that my haplogroups were I1a (old designation) and H.  My wife’s mtDNA was also an H.  We were not too impressed by these results, as they told us little; however, my haplogroups confirmed what I already believed concerning these lines:  my patrilineal line was likely Norse when taken to its logical conclusion and my matrilineal line came from central Europe.  Both haplogroups pointed in these directions. To me, this was still a giant genetic leap.

During 2008, Ancestry partnered with two other companies:  Sorenson Molecular Genealogy Foundation (SMGF) and 23andMe.  I signed up for accounts at both and submitted my Y-DNA and mtDNA results to Sorenson. At that time, 23andMe only offered health and trait information for a hefty price tag ($499), so I passed on their product, as I wasn’t interested in spending that kind of money for this info.  I had a login account, but no data of my own – yet.

Fast forward to 2010.  Wanting to know more about my genetic ancestry, I subscribed to a wonderful online resource, the now defunct DNA-Forums.org, and began learning about this new service at 23andMe called Relative Finder (now DNA Relatives).  DNA-Forums also alerted me in March 2010 that 23andMe was having a month-long sale of their product with $200 off the $499 price – it was called the Oprah sale, as it had been advertised on her show.  Curious, I bit, er, spit and had my results in May.  I also encouraged my brothers, mother, wife, children, and cousins to test and thus began a process of collecting relatives’ DNA.  Needless to say, I was hooked. We now have 50 of our relatives tested.

That same year, GeneTree (part of the SMGF family and also now defunct) had a $79 sale on their Y-DNA-46 test and I began my surname project with six participants.  We were able to confirm that, except for those with non-paternal events in their ancestry, everyone with our surname and its variants came from a single progenitor.  This was something we couldn’t have done with traditional genealogical records as they didn’t go back far enough.

But the more I learned, the more I questioned.  I was curious about the X-chromosome, as my match to my brothers was extremely small.  So with a Google search in May 2010, I found two enlightening posts on the X at Blaine Bettinger’s blog The Genetic Genealogist.  He made it easy to understand and his fan charts were a true blessing to me and others trying to wrap our collective brains around the differences in transmission of the X among males and females. For those posts on the X-Chromosome, see the following links:  “Unlocking the Genealogical Secrets of the X Chromosome” and “More X-Chromosome Charts.”

Since 2010, a number of changes have occurred.  Ancestry no longer offers Y-DNA and mtDNA tests, DNA-Forums vanished out of thin air in the middle of the night in early 2012, and GeneTree and SMGF were absorbed by Ancestry and folded.  Gone, gone, and gone.  Several aspects of Genetic Genealogy, however, have remained constant; one of those is Dr. Blaine T. Bettinger’s blog The Genetic Genealogist. 

Just recently, I enrolled in a graduate Social Media Course at Southern New Hampshire University for professional development. This week we were challenged to write a case study on a “thought leader” who used social media.  Since Blaine’s blog was the first I encountered on the subject, I wanted to analyze his work.  He agreed and supplied some answers to very specific questions that I posed.

Blaine has influenced well over a million individuals and continues to enlighten others on a daily basis.  He has given me permission to reproduce this case study here.  I hope you learn something about The Genetic Genealogist and gain a greater appreciation of the power of bloggers in our discipline.