What Next Generation Sequencing (NGS) does and how it works to help us
Next Generation Sequencing has been available as a consumer genealogy test for only about a year. It is being used for complete sequencing (full genomes) and for sequencing just the Y chromosome. The Y chromosome contains YSTRs and YSNPs. (mtDNA is not part of the Y chromosome; it sits in the mitochondria and is passed down the mother's line.) Whether you test a Big Y at FTDNA or a Y PRIME or ELITE at Full Genomes Corporation, this new sequencing process is looking for newly discoverable SNPs. Most of the Y chromosome never undergoes recombination and so is passed from father to son to son and so forth, intact apart from the occasional mutation. That’s how we can trace any man back from now to the proverbial Adam.
YSTRs are the markers that men have tested in our DNA project. They tend to mutate on the order of once every 100-500 years and have helped us identify who is related within what we call a genealogical time frame (since the advent of surnames). An STR is a stretch of a string of values such as “AAAGGGGTTTGAG” in which a short sequence is repeated back to back, and we count how many times that sequence is repeated. The number of repeats is recorded for that STR, for example:
DYS19 = 16 [repeats]
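To make the idea of counting repeats concrete, here is a minimal Python sketch. It is only an illustration: the function name, the motif “GATA” and the toy sequence are made up for this example and are not real DYS19 data or the way a testing lab actually calls STR values.

```python
def count_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the longest run of back-to-back copies of motif in sequence."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    # Try every starting position and count consecutive copies of the motif.
    for start in range(len(sequence)):
        run = 0
        pos = start
        while sequence.startswith(motif, pos):
            run += 1
            pos += len(motif)
        best = max(best, run)
    return best


# Toy sequence (hypothetical): 16 copies of a made-up motif with
# a couple of unrelated letters on either side.
toy_sequence = "CC" + "GATA" * 16 + "TT"
repeat_count = count_tandem_repeats(toy_sequence, "GATA")
print(f"Toy STR = {repeat_count} [repeats]")
```

The recorded STR value is just that count, which is why two men can be compared marker by marker on their repeat numbers.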
YSNPs are markers where the ancestral value has mutated to a new value. This is generally a one-time event: at a given position on the Y chromosome, a SNP mutates. Here is one of our newly discovered shared SNPs:
At position 18099238 the ancestral value is “A” but ours is “G” (the SNP is named FGC22538)
That change is a mutation that happens once, in one man, and then every man descended from him carries that mutation. By going from the most recent SNP backwards we can essentially reconstruct the whole Y tree.
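The two ideas above, checking whether a man carries the mutated value and walking backwards from his most recent SNP, can be sketched in a few lines of Python. The FGC22538 details come from the example above; the function names and the three SNPs in the little tree (SNP_1, SNP_2, SNP_3) are hypothetical placeholders, not real branches of the Y tree.

```python
from typing import Dict, List, Optional

# The SNP from the example above: position, ancestral value, mutated value.
FGC22538 = {
    "name": "FGC22538",
    "position": 18099238,
    "ancestral": "A",
    "derived": "G",
}


def is_derived(snp: dict, observed_value: str) -> bool:
    """True if the tester shows the mutated (derived) value for this SNP."""
    return observed_value == snp["derived"]


# Hypothetical tree: each SNP points to the older SNP directly above it.
# None marks the top of this toy tree.
PARENT: Dict[str, Optional[str]] = {
    "SNP_3": "SNP_2",
    "SNP_2": "SNP_1",
    "SNP_1": None,
}


def path_to_root(snp_name: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Walk from the most recent SNP backwards to the top of the tree."""
    path = []
    current: Optional[str] = snp_name
    while current is not None:
        path.append(current)
        current = parent[current]
    return path


print(is_derived(FGC22538, "G"))      # our value at position 18099238
print(path_to_root("SNP_3", PARENT))  # most recent SNP first
```

Because each mutation is inherited by every descendant, a man who carries the most recent SNP in a line is assumed to carry all the older ones above it as well.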
Let’s review
Now Group C continues
So find Z1 and continue here
So you can see that in the above example only KINCAID and FRANK share the FGC12993 SNP with our Group C WHEATON. Beyond that, KINCAID has its own SNP, Z1370.2. This is essentially the way in which SNPs can be family defining.
Then Group B continues
Since this chart was drawn, many more sub-clades have been found under L2, and our newly discovered one looks like this:
Note the HUGE difference in the number of sub-clades: Group C WHEATON has 13 sub-clades between it and its ancestor U106 (S21), while Group B WHEATON has only 2 below L2. U106 is a more prolific group in the UK and also has many more testers. L2 is widely scattered in the UK and does not have as many testers.
Here’s a slightly different example that is not from our group but shows where we can go. [This happens to be R-L21-DF63.] The long line of SNPs at the top is shared by all the men below. All of the men sharing these SNPs are FRANKLINS, except one who has an NPE (non-paternity event) and is also a Franklin by blood. Below the top block, you can see that some SNPs are shared by some FRANKLINS and not others. These SNPs can be used to identify who is more closely related to whom and, when the lineage is known, to identify the different lines below the common ancestor. In this case the common ancestor was born about 1780.
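The same sorting can be sketched in Python with sets: the SNPs every tester has form the top block, and the remaining SNPs show which testers cluster together. The tester labels and SNP names below are hypothetical placeholders, not the actual FRANKLIN results.

```python
from typing import Dict, Set


def top_block(results: Dict[str, Set[str]]) -> Set[str]:
    """SNPs shared by every tester (the long line at the top of the chart)."""
    return set.intersection(*results.values())


def carriers_below_top(results: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each SNP below the top block, list the testers who carry it."""
    shared_by_all = top_block(results)
    carriers: Dict[str, Set[str]] = {}
    for tester, snps in results.items():
        for snp in snps - shared_by_all:
            carriers.setdefault(snp, set()).add(tester)
    return carriers


# Hypothetical results: three testers and made-up SNP names.
results = {
    "tester_1": {"S1", "S2", "S3"},
    "tester_2": {"S1", "S2", "S3"},
    "tester_3": {"S1", "S2", "S4"},
}

print(sorted(top_block(results)))
for snp, testers in sorted(carriers_below_top(results).items()):
    print(snp, sorted(testers))
```

In this toy data, tester_1 and tester_2 share a SNP that tester_3 lacks, so they would sit on the same line below the common ancestor, while tester_3's extra SNP marks a separate line (or is private to him until someone else tests positive for it).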
So this means that if we had more WHEATON Group C testers we would find similar clustering. And since we have a pool of about 18 potential test takers for WHEATON Group B, we might eventually be able to give Glen, Adam and others who don’t know how they connect a definitive path to ROBERT WHEATON. We could also tell whether MALLENBY, RAINES, HOWELL and HANCOCK are related downstream of ROBERT or before him, that is, before 1600. And once we have identified SNPs that might be shared, people can test those specific SNPs and we can tell how WHEATON, MALLENBY, RAINES, HOWELL and HANCOCK are related.
So what we in Group B have now is a long list of new SNPs that must be confirmed by others within our group. Some of them are likely to be private, like the one we discovered in Jerry’s Walk Through the Y, and others, like those in the example above, are going to be shared by many. I have requested information on a panel at YSEQ, and a couple of you are considering other options. If we get one or more results from NGS, we should know which direction to head.
If this is still not clear please ask questions. More info to come.