Sunday, April 01, 2018

Hoisted from the comments: The Genomic Formation of South and Central Asia

A pre-print on bioRxiv: The Genomic Formation of South and Central Asia

Some excerpts follow.


4) the first ever ancient DNA from South Asia from Iron Age and historical settlements in the Swat Valley of Pakistan (1200 BCE–1 CE) from 7 sites...
Finally, we examined our Swat Valley time transect from 1200 BCE to 1 CE. While the earliest group of samples (SPGT) is genetically very similar to the Indus_Periphery samples from the sites of Gonur and Shahr-i-Sokhta, they also differ significantly in harboring Steppe_MLBA ancestry (~22%). This provides direct evidence for Steppe_MLBA ancestry being integrated into South Asian groups in the 2nd millennium BCE, and is also consistent with the evidence of southward expansions of Steppe_MLBA groups through Turan at this time via outliers from the main BMAC cluster from 2000-1500 BCE. Later samples from the Swat time transect from the 1st millennium BCE had higher proportions of Steppe and AASI derived ancestry more similar to that found on the Indian Cline, showing that there was an increasing percolation of Steppe derived ancestry into the region and additional admixture with the ASI through time.
Comment: See below for all the acronyms and terms.  This is well after the Indus Valley Civilization and the potential Rakhigarhi aDNA, and after the hypothetical arrival of Aryans around 1500 BC.  More on the above later.
The absence in the BMAC cluster of the Steppe_EMBA ancestry that is ubiquitous in South Asia today — along with qpAdm analyses that rule out BMAC as a substantial source of ancestry in South Asia — suggests that while the BMAC was affected by the same demographic forces that later impacted South Asia (the southward movement of Middle to Late Bronze Age Steppe pastoralists described in the next section), it was also bypassed by members of these groups who hardly mixed with BMAC people and instead mixed with peoples further south.  In fact, the data suggest that instead of the main BMAC population having a demographic impact on South Asia, there was a larger effect of gene flow in the reverse direction, as the main BMAC genetic cluster is slightly different from the preceding Turan populations in harboring ~5% of their ancestry from the AASI.
qpAdm: See here.
BMAC = Bactria–Margiana Archaeological Complex
Steppe_EMBA = Steppe Early-Middle Bronze Age
AASI = "Ancient Ancestral South Indian (AASI)-related": a hypothesized South Asian Hunter-Gatherer lineage related deeply to present-day indigenous Andaman Islanders
This image from Wikipedia puts this in context.
CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=868992
On Steppe_EMBA
Lazaridis et al. show that Early to Middle Bronze Age steppe groups, including Yamnaya, tagged by them as Steppe EMBA, are best modeled with formal statistics as a mixture of Eastern European Hunter-Gatherers (EHG) and Chalcolithic farmers from western Iran. The mixture ratios are 56.8/43.2, respectively.

However, they add that a model of Steppe EMBA as a three-way mixture between EHG, the Chalcolithic farmers and Caucasus Hunter-Gatherers (CHG) is also a good fit and plausible.
If the purportedly "ubiquitous" Steppe_EMBA ancestry in South Asia arrived without going through the BMAC, a simpler explanation suggests itself to me: the same ancestral hunter-gatherers diffused through Iran into India, as well as into the Steppes, before the BMAC region was populated. The alternative is that this ancestry diffused into the Steppes and then into India through the BMAC without affecting the BMAC.  But we shall see.
...we estimate that the time of admixture between Iranian agriculturalist-related ancestry and AASI ancestry in the three Indus_Periphery samples was 53 ± 15 generations ago on average, corresponding to a 95% confidence interval of about 4700-3000 BCE assuming 28 years per generation.  This places a minimum date on the first contact between these two types of ancestries.
Indus_Periphery samples: three "outlier" aDNA samples:
...between 3100-2200 BCE we observe an outlier at the BMAC site of Gonur, as well as two outliers from the eastern Iranian site of Shahr-i-Sokhta
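The date arithmetic in the excerpt above can be checked with a short sketch. Admixture estimates of this kind count generations back from when the sampled individuals lived, so a reference date is needed. The excerpt gives only the 3100-2200 BCE range for the three samples, so the 2400 BCE used here is purely my assumption, chosen because it lands near the quoted interval. The function name and the 1.96 multiplier for a 95% interval are likewise mine, not the paper's.

```python
def admixture_interval_bce(sample_date_bce, generations=53, std_err=15,
                           years_per_generation=28, z=1.96):
    """Convert 'generations before the sample' into calendar dates (years BCE).

    Returns (oldest, point_estimate, youngest). z=1.96 gives an
    approximate 95% interval under a normal approximation.
    """
    point_years = generations * years_per_generation      # 53 * 28 = 1484 years
    half_width = z * std_err * years_per_generation       # about 823 years
    point = sample_date_bce + point_years
    return point + half_width, point, point - half_width


# Hypothetical reference date: NOT stated in the excerpt, illustration only.
REFERENCE_DATE_BCE = 2400

oldest, point, youngest = admixture_interval_bce(REFERENCE_DATE_BCE)
print(f"point estimate: about {point:.0f} BCE")
print(f"95% interval:   about {oldest:.0f} to {youngest:.0f} BCE")
```

With a different reference date inside the 3100-2200 BCE range the whole interval shifts by the same number of years, so the width (roughly 1650 years) is the robust part of the calculation.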
 Continuing:
Previous work has shown that the Indian Cline — a gradient of different proportions of West Eurasian related ancestry in South Asia — can be well modeled as having arisen from a mixture of two statistically reconstructed ancestral populations (the ANI and the ASI), which mixed mostly after 2000 BCE.  Ancient DNA analysis has furthermore revealed that the populations along the Indian Cline actually descend more deeply in time from at least three ancestral populations, with ancestry from groups related to early Iranian agriculturalists, Steppe_EMBA, and Onge. 

To shed light on the mixture events that transformed this minimum of three ancestral populations into two (the ANI and ASI), we used qpAdm to search for triples of source populations — the AASI, all sampled ancient Iran/Turan-related groups, and all sampled ancient Steppe groups — that could fit as sources for South Asians.  As South Asian test populations we used an Indian Cline group with high ANI ancestry (Punjabi.DG), one with high ASI ancestry (Mala.DG), early Iron Age Swat Valley samples (Swat Protohistoric Grave Type - SPGT), and Early Historic Swat Valley samples (Butkara_IA).  Fig. 3A shows that the only models that fit all four test South Asian groups are combinations that involve the AASI, Indus_Periphery and Steppe_MLBA (in the analyses that follow, we therefore pooled the Steppe_MLBA).  The evidence that the Steppe_MLBA cluster is a plausible source for the Steppe ancestry in South Asia is also supported by Y chromosome evidence, as haplogroup R1a which is of the Z93 subtype common in South Asia today was of high frequency in Steppe_MLBA, but rare in Steppe_EMBA (absent in our data).
Steppe_MLBA = Steppe Middle-to-Late Bronze Age.  This is a pre-print, but the Steppe ancestry said earlier to be ubiquitous in South Asia as Steppe_EMBA is now really Steppe_MLBA; that is a bit disconcerting.

One of the problems I have with all of this: India was almost certainly more densely populated than the surrounding areas, and it was one of the refuges of humanity during the glacial periods; but because there is no aDNA from India, Indian populations are always modeled as having arrived from outside.
To obtain a richer understanding of the ancestry of the entire Indian Cline, we took advantage of previously published genome-wide data from 246 ethnographically diverse groups from South Asia, from which we sub-selected 140 groups that fall on a clear gradient in PCA to represent the Indian Cline (the other groups either fall off the cline due to additional African or East Asian-related ancestry or had small sample size or heterogeneous ancestry).

The per-group qpAdm estimates for the proportions of ancestry from these three sources are statistically noisy.  We therefore developed new methodology that allows us to jointly fit the data from all Indian Cline groups within a hierarchical model.
The analysis confirms that the great majority of all groups on the Indian Cline can be jointly modeled as a mixture of two populations, and the analysis also produces an estimate of the functional relationship between the ancestry components.  Setting Steppe_MLBA to its smallest possible proportion of zero to estimate the minimum fraction of Indus_Periphery ancestry that could have existed in the ASI, we obtain ~39%.  Setting AASI to its smallest possible proportion of zero to estimate the maximal fraction of Indus_Periphery ancestry that could have existed in the ANI, we obtain ~72%.
Notice one thing: out of 246 modern groups, 106 were discarded as noise, but aDNA from one population in Swat and three Indus_Periphery outliers is taken to be typical.  This is a constraint introduced by the paucity of aDNA data, not something that necessarily reflects reality.
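To make the "functional relationship between the ancestry components" in the excerpt concrete, here is a minimal sketch that treats the two quoted bounding cases (~39% and ~72% Indus_Periphery) as the ends of a straight line and mixes between them. The complements (61% AASI, 28% Steppe_MLBA) are my own arithmetic, assuming the three components sum to one. The linear form, the function name and the dictionary layout are assumptions for illustration, not the paper's hierarchical model, and the real ANI and ASI may lie anywhere inside these bounds.

```python
# Bounding cases from the excerpt; complements assume the three parts sum to 1.
ASI_BOUND = {"AASI": 1 - 0.39, "Indus_Periphery": 0.39, "Steppe_MLBA": 0.0}
ANI_BOUND = {"AASI": 0.0, "Indus_Periphery": 0.72, "Steppe_MLBA": 1 - 0.72}


def cline_ancestry(ani_fraction):
    """Three-way ancestry for a group that is `ani_fraction` ANI-bound
    and (1 - ani_fraction) ASI-bound, under a simple linear mix."""
    if not 0.0 <= ani_fraction <= 1.0:
        raise ValueError("ani_fraction must be between 0 and 1")
    return {
        source: (1 - ani_fraction) * ASI_BOUND[source]
        + ani_fraction * ANI_BOUND[source]
        for source in ASI_BOUND
    }


# A hypothetical group halfway along the line between the two bounds.
midpoint = cline_ancestry(0.5)
for source, share in midpoint.items():
    print(f"{source}: {share:.1%}")
```

The point of the sketch is only that, on such a line, fixing any one component fixes the other two, which is why two source populations suffice to describe three ancestries.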

In fact, this paper dates the formation of the ANI and ASI to the Bronze Age.
These results suggest that the ASI and ANI were both largely unformed at the beginning of the 2nd millennium BCE, and imply that the ASI may have formed in the course of the spread of West Asian domesticates into peninsular India beginning around 3000 BCE (where they were combined with local domesticates to form the basis of the early agriculturalist economy of South India), or alternatively in association with eastward spread of material culture from the Indus Valley after the IVC declined.
So the ANI and ASI ancestries, largely unformed at the beginning of the 2nd millennium BCE, then formed and started mixing almost immediately (the Indian Cline "can be well modeled as having arisen from a mixture of two statistically reconstructed ancestral populations (the ANI and the ASI), which mixed mostly after 2000 BCE").

Here's what is confusing to me:

The three Indus Periphery samples date to 3100-2200 BCE.  They are taken to be emigrants from India ("we hypothesize that these outliers were recent migrants from the IVC"). But they are taken to be ancestors of ANI rather than descendants.  The general prejudice that Iran and the Steppes were more densely populated and/or had more population than the IVC continues. All to save the Indo-European languages' hypothesis.

PS:--- some observations

If aDNA population X in geography x has characteristics that modern population A in geography a does not have, then X is unlikely to be ancestral to A  (those characteristics could have gone extinct, so unlikely, not impossible).  But then, some of the ancestors of population A who were located in geography a could be ancestors of the aDNA population X in geography x.  What precludes that hypothesis is the lack of aDNA from geography a plus a liberal application of Occam's Razor.

PPS: the geographic isolation of Swat, e.g., Wiki: "Swat was home to the last isolated pockets of Gandharan Buddhism, which lasted until the 11th century, well after most of the area had converted to Islam." 

PPPS:

I cannot dispute "Steppe ancestry" for Indians if genetics shows it. My main contention is that geneticists try to "match" their findings to existing fake theories made up by linguists using fake languages. The logic used to link Steppe grave with India language is garbage.


 

37 comments:

raj said...

After reading the paper it does seem like historians were right about some things and very wrong about some others. The ridiculous projections of colonial fantasies onto our ancestors are proved completely false. The Indus Valley people probably had a range of skin colour similar to modern Indians and definitely weren't 'black' or noseless. Far from having some notion of racial purity, they (assuming the steppe people were Indo-Aryan speaking) frequently assimilated and intermarried with other surrounding groups. Likewise the Aryan-Dravidian divide has also been hugely exaggerated. The only difference is that Indo-Aryan speakers have slightly higher steppe ancestry and Dravidians slightly more Iranian farmer ancestry. The only real genetic divide seems to be between settled populations and the Austro-Asiatics, who themselves were migrants from the east. Hope the Indian team will release the Rakhigarhi data soon.

PPS - As usual, Wikipedia has tried to downplay Hindu links with Swat. The Swat valley was the Oddiyana pitha, sacred to Buddhist and Hindu tantrikas. At Mangalapura/Manglaor the goddess was worshipped as Mangaladevi. It is also home to the peak Elum Ghar, known as Sripada/Vishnupada by the Buddhists and Hindus respectively. Even after the occupation of Swat by Pashtun Muslims, Natha yogins still undertook pilgrimage there, even in the 15th century.

tim drake said...

Well, out of the many Iron Age samples from Pakistan, only 1 R1a has been found. If the haplogroup assignments in the pre-print are right (which may not be the case, as Mal'ta boy has been incorrectly placed), the R1a found is R1a1a1b, that is Z-645, which is ancestral to the European Z282 and South Asian Z93. Moreover, I am seeing 4 R2a males in the 8000 BCE - 7000 BCE range in Iran (quite interesting if the assignments are correct).

tim drake said...

Imo the rakhigarhi R1a would come under Iranian farmer like ancestry if it happens to be prior to 2000 BC :). Moreover, since Dr Rai is already part of the study, we all know how it's going to turn out for rakhigarhi.

tim drake said...

"Now coming to Niraj Rai well that isn't possible because he has given not one but two interviews saying exactly what I am saying and opposite to what you just said" ----- Shiva Krishna, one interview was in the Jagran paper; can you please point out where the other was?

Jijnasu said...

Nick Patterson says that they don't have any DNA from the IVC yet. I don't know why different scientists are saying different things. Maybe the initial samples were contaminated with modern DNA. Anyway, I think Dr Chaubey was humouring his audience in that conference, not wishing to get into complex issues. If they did actually have evidence for OIT, I'm pretty sure he'd have mentioned it when asked about migrations to and from India. Instead he only mentioned an old Paleolithic migration.

Jijnasu said...

Nick Patterson was actually defending his Indian colleagues, whom people online were accusing of delaying the data. All of it is pretty vague and, with all the data we have, very very unlikely. So until some revolutionary data actually comes out, the OIT is dead.

Shiva Krishna said...

@Jijnasu @tim drake From YOUR side - on the one hand @Jijnasu "Nick Patterson says that they don't have any DNA from the IVC yet. I don't know why different scientists are saying different things. Maybe the initial samples were contaminated with modern DNA. Anyway, I think Dr Chaubey was humouring his audience in that conference, not wishing to get into complex issues." On the other, @tim drake "I watched that but a bird told me few weeks ago that he had conversed with Dr Chaubey and Dr Chaubey believes in the arrival of certain people between 2500 BC and 5000 BC from the area around Ukraine :/. May be these were Dr Chaubey's earlier views and may be he changed it." So according to you guys do we have aDNA or not??? Both the statements that I quoted above contradict each other.

I guess the birds are fluttering eh ;)

Shiva Krishna said...

@Jijnasu Dead for you! Anyway, with the way the science is progressing and the availability of aDNA increasing every day, sooner rather than later we will have all the answers. Time is all that is in between. In a few years' time a lot of people will end up looking stupid, and will be trolled like never before. Which side that will be, only time will tell; meanwhile we all believe what we believe!

Shiva Krishna said...

@tim drake Dr. Premendra Priyadarshi lol good one! Will not get into that debate! haha but tell me, I am curious: when did he say this, chubby stuff, to you? And where did you communicate with him? Curious! Thanks

Shiva Krishna said...

@tim drake ok cool thanks… Peace.

tim drake said...

"Steppe EMBA is mostly R1b and only steppe MLBA is full of R1a, suddenly, that is 2000 BCE - 1400 BCE - direction" --- hasn't the oldest R1a-M417 been found in eastern Europe (some samples more than 7000 years old)? What is your expansion model here, btw?

Jijnasu said...

@timdrake actually Dr Vagheesh clarified that Indians do have EHG ancestry. https://mobile.twitter.com/vagheesh/status/980409143793209344

Jijnasu said...

@timdrake
EHG can be thought of as WHG + ANE. This WHG component is absent in the hunter-gatherers of Siberia and in the Chalcolithic and Bronze Age populations of Iran/Central Asia. There are some deep parallels between steppe populations and Central Asians, both being mainly combinations of West Asians (CHG and Iran Neolithic respectively), ANE-rich groups (EHG and Siberian HGs) and populations descended from Anatolian farmers, but they are not the same and can be identified distinctly.

Baji Rao 1 said...

@jijnasu and Tim Drake,

The out of India theory of Indo-European expansion is the correct theory as it is based on archaeological and literary evidence. So if there is any genetic connection between India and the steppes, it means there was a northward emigration from India into the steppes. What is branded as steppe DNA might very well be Indian DNA that was infused into the steppes. The geneticists are simply trying to retrofit the genetic interpretation to the Aryan invasion theory paradigm. They are likely wrong in their genetic interpretation.

Baji Rao 1 said...

The Rig Veda can be dated, in terms of its composition, to a period from before 3000 BC to 1500 BC. This dating of the Rig Veda is done based on the Mitanni records of West Asia, which are scientifically dated. So, in 3000 BC, when the oldest parts of the Rig Veda were composed, Indo-Aryan speakers were already present in north India, as the Rig Veda was composed there in the above-mentioned period in an Indo-European language. How then can this genetic study say that Indo-Europeans entered India in the mid to late 2nd millennium BC, i.e. 1500 BC to 1000 BC, bringing Indo-European language, when the evidence of the Rig Veda, composed starting around 3000 BC, shows that Indo-Europeans were already present in India in the early Bronze Age itself? Therefore the conclusion of this genetic study is nonsense.

Baji Rao 1 said...

Arun, read my above two comments and tell me what you think about this genetic study?

Arun said...

The British came with English, an Indo-European language, to India in the 17th century, but Indo-European was already there.

The Sakas (Indo-Scythians) came to India 2000 years ago, speaking some now-lost Indo-European language, but Indo-European was already there.

So Steppes people could have come 3500 years ago, speaking some Indo-European language; but as you point out, if we take the content of the Rg Veda seriously, Indo-European was already there.

Baji Rao 1 said...

Arun,
The British and Scythian examples are not valid because they involved only small groups which had no impact on the Indian genome. Whereas, according to this study, the steppe people supposedly shaped much of the modern Indian genome. Is it believable? I don't think so, because the steppe didn't have the population density, numbers or drive to impact the population of India at the time. The steppe never developed agriculture, so it could not have had a population expansion large enough to impact the very large population of India. What's your view on this?

tim drake said...

"So if there is any genetic connection between india and the steppes, it means there was a northward emigration from india into the steppes" ---- I doubt it, Bajirao. There is no evidence for that unless one finds such samples from the IVC and Neolithic samples from the north-west part of the subcontinent.