Friday, January 05, 2018

Metspalu v Lazaridis: trying to understand

I am trying to square Metspalu et al.'s 2011 paper with Lazaridis et al.'s 2016 paper.

Here's what's bugging me.  The diagram below is a crop from a diagram in the Metspalu paper, a graphical representation of the genetic make-up of various populations.

You can see South Asia is mainly k5 and k6.  k6 is mostly confined to South Asia, while k5 extends into Central Asia, the Caucasus, the Middle East and into Western Europe.

They write:
We found no regional diversity differences associated with k5 at K = 8. Thus, regardless of where this component was from (the Caucasus, Near East, Indus Valley, or Central Asia), its spread to other regions must have occurred well before our detection limits at 12,500 years. Accordingly, the introduction of k5 to South Asia cannot be explained by recent gene flow, such as the hypothetical Indo-Aryan migration. The admixture of the k5 and k6 components within India, however, could have happened more recently—our haplotype diversity estimates are not informative about the timing of local admixture.
PS: Thanks to guest's comment below, I see that clarification is needed. Metspalu et al. run ADMIXTURE, which estimates, for each individual in a modern sample of unrelated individuals, how much of their ancestry derives from each of a set of postulated ancestral populations.
The typical dataset consists of genotypes at a large number J of single nucleotide polymorphisms (SNPs) from a large number I of unrelated individuals. These individuals are drawn from an admixed population with contributions from K postulated ancestral populations. Population k contributes a fraction q_ik of individual i’s genome.
You try various values of K, and ADMIXTURE also estimates standard errors on the results. Reich et al. (2009) used Principal Components Analysis (PCA) to come up with ANI/ASI. ADMIXTURE was apparently also released in 2009.
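
To make the q_ik notation concrete, here is a minimal sketch of the kind of model described in the quote. It assumes the usual formulation in which the chance of seeing a reference allele in individual i at SNP j is the q-weighted average of the ancestral populations' allele frequencies. The names `allele_prob`, `genotype_probs`, `f` and `q`, and all the numbers, are invented for illustration and are not taken from either paper.

```python
# Toy illustration of an admixture model (all numbers are invented).
# q[i][k]: fraction of individual i's genome from ancestral population k.
# f[k][j]: frequency of the reference allele at SNP j in population k.

def allele_prob(q_i, f, j):
    """Chance that one allele copy of this individual at SNP j is the reference allele."""
    return sum(q_ik * f[k][j] for k, q_ik in enumerate(q_i))

def genotype_probs(q_i, f, j):
    """Probabilities of carrying 0, 1 or 2 reference alleles (two independent copies)."""
    p = allele_prob(q_i, f, j)
    return [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]

# K = 2 hypothetical ancestral populations, J = 3 SNPs.
f = [
    [0.9, 0.2, 0.5],  # population 0
    [0.1, 0.6, 0.5],  # population 1
]
q = [
    [1.0, 0.0],  # individual 0: entirely from population 0
    [0.5, 0.5],  # individual 1: an even mix of the two
]

for i, q_i in enumerate(q):
    # Ancestry fractions of one individual must sum to 1.
    assert abs(sum(q_i) - 1.0) < 1e-12
    print(i, [round(allele_prob(q_i, f, j), 3) for j in range(3)])
```

ADMIXTURE works in the opposite direction: it is given only the genotypes and searches for the q and f values that make them most likely, for whatever K you chose.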

We are told:

Choice of an appropriate value for K is a notoriously difficult statistical problem. It seems to us that this choice should be guided by knowledge of a population’s history. Because experimentation with different values of K is advisable, admixture prints values of the familiar AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) statistics, widely applied in model selection.
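
As a rough sketch of how AIC and BIC are used to compare values of K, the snippet below applies the textbook formulas to invented numbers. The log-likelihoods, the sample sizes, and the way parameters are counted in `n_params` are all assumptions made for illustration; they are not values from the Metspalu paper, and this is not necessarily how ADMIXTURE itself counts parameters.

```python
import math

def aic(log_lik, n_params):
    """Akaike Information Criterion: lower is better."""
    return 2 * n_params - 2 * log_lik

def bic(log_lik, n_params, n_obs):
    """Bayesian Information Criterion: lower is better."""
    return n_params * math.log(n_obs) - 2 * log_lik

# Hypothetical study: 100 individuals, 1000 SNPs.
n_individuals, n_snps = 100, 1000
n_obs = n_individuals * n_snps  # assumption: one observation per genotype

def n_params(k):
    # Assumed count: K-1 free ancestry fractions per individual,
    # plus one allele frequency per population per SNP.
    return n_individuals * (k - 1) + k * n_snps

# Invented maximised log-likelihoods for K = 2, 3, 4.
log_liks = {2: -95000.0, 3: -93000.0, 4: -92800.0}

best_aic = min(log_liks, key=lambda k: aic(log_liks[k], n_params(k)))
best_bic = min(log_liks, key=lambda k: bic(log_liks[k], n_params(k), n_obs))
print("K preferred by AIC:", best_aic)
print("K preferred by BIC:", best_bic)
```

With these made-up numbers the two criteria prefer different values of K, because BIC penalises extra parameters more heavily. That is one reason the choice of K remains a judgment call rather than something read off a single statistic.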

Strictly speaking, their detection limit is 500 generations, and they assume 25 years per generation. The point is that k5 in South Asia dates to more than 500 generations, or 12,500 years, ago.
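
The conversion from generations to years is simple arithmetic, sketched below. The 25 years per generation is the figure from the paper; the 30-year alternative is only a hypothetical value included to show how sensitive the date is to that assumption.

```python
def generations_to_years(generations, years_per_generation=25):
    """Convert a number of generations to calendar years."""
    return generations * years_per_generation

print(generations_to_years(500))      # 12500, the detection limit quoted above
print(generations_to_years(500, 30))  # 15000, with a hypothetical 30-year generation
```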

The next step is a small leap of mine; is it justified? I take it that in the Ancestral North Indian/Ancestral South Indian (ANI/ASI) model, ANI corresponds to k5 and ASI corresponds to k6. That this is so is not entirely clear to me.

We now come to Lazaridis et al. They do an ancient DNA (aDNA) analysis of Near Eastern samples (Near Eastern with respect to Europe) dating from 12,000 to 1,400 years ago, and they refer to aDNA analyses of Steppe inhabitants; as far as I know, the Steppe aDNA does not go back further than 12,000 years. Now, the Lazaridis paper says:
We show that it is impossible to model the ANI as being derived from any single ancient population in our dataset. However, it can be modelled as a mix of ancestry related to both early farmers of western Iran and to people of the Bronze Age Eurasian steppe...
But if ANI == k5, and k5 spread before 12,500 years ago (strictly speaking, 500 generations, while the aDNA dates are presumably radiocarbon dates), why would one expect to explain ANI in terms of peoples contemporary with, or younger than, ANI?

Perhaps one can say that Near Eastern aDNA, Steppe aDNA and ANI (k5) all arose from the mixture of two ancestral populations X and Y (ancestors of the people of 12,000 years ago), that the Near Eastern aDNA and the Steppe aDNA represent relatively unmixed descendants of X and Y respectively, and that ANI is a descendant mixture of X and Y. I don't think this is what the Lazaridis paper does.

My guess is that Lazaridis et al., instead of sticking just to genetics, also buy into the predominant theory of the spread of Indo-European languages, and hence attempt to explain ANI in this illogical way; or else Lazaridis et al. think the Metspalu paper is wrong and ANI (k5) is younger than 12,500 years; or else I have misunderstood Lazaridis; or else ANI != k5.

PS: The most probable of the above alternatives is that I misunderstand Lazaridis; the second most probable is that Lazaridis et al. don't think Metspalu is correct, and that ANI is (much) younger than 12,500 years.

PS: Jan 6: This larger excerpt of a diagram in Metspalu shows ADMIXTURE with K=8 and K=12 (K is the number of hypothetical ancestral populations), and you can see that it does not really change the story that most Indian ancestry traces to two components. One would hope for an objective definition of k5 and k6 that comes out the same when computed from different but adequate samples.


tim drake said...

There is a small k4 (dark blue) component also in some populations of UP :). Besides, some old samples like the >15,000-year-old Mal'ta-Buret' boy (who happens to be of Y haplogroup R*) show some ASI. Check out the image below.

Arun said...

Where is that image from?

tim drake said...


See the comments section. A poster known as @TruthPrevails posted it

tim drake said...

Also, there is unnecessary attention on R1a subclades; let's not forget the other son of R, which is R2 (which btw has a very deep subcontinental presence across Indo-European and Dravidian both).
For example check this
and check this for haplogroup L

YFull has a very detailed tree-like description of Y haplogroups, though I am not sure whether the data they have is the result of private DNA tests or taken from various papers and university open-source labs!

tim drake said...

Hi Arun, I was lurking on the "indian-defenders" forum and saw the SI video on the AMT discussion. At 26:30 in the video, Dr Chaubey talks about the aDNA samples from Rakhigarhi, and he says that the preliminary results show some kind of "Dravidian" expansion towards north-west India but no migration from west to east. What does he mean by Dravidian here? Is it the k6 (dark green) component in the figure you have cited from Metspalu? Could you please forward this doubt :)?
