2025: Indo-European STILL comes from Eastern Europe
And they say no good story ever started with a glass of milk...
An Update to this post:
Proto-Indo-European comes from Eastern Europe
In August of 2022, the largest and most controversial archaeogenetics study in a while came out, probably the largest since the initial floodgates of Indo-European debate broke in 2015. In this study, peppered with prominent names in genetics like David Reich and Iosif Lazaridis, the Steppe Urheimat seemingly cemented in 2015 was rejected, for a new “So…
2025 started out with a series of controversial events happening in the Indo-European Studies sphere of the internet. Iosif Lazaridis and colleagues officially published a paper that had long existed as a preprint, The Genetic Origin of the Indo-Europeans. We got our first Iranian DNA from the historical period, with shockingly little and potentially no Steppe ancestry. Paul Heggarty continued to peddle his 8000-year-old Indo-European hypothesis in response to Lazaridis and Nikitin, while simultaneously suggesting that Lazaridis’s study is actually just blatantly Out-of-Armenia and is burying these bones under a layer of fatty words to encourage the Steppe Hypothesis crowd to reconcile. And finally, some shitlib academic went viral on Xitter for claiming that the Indo-European homeland simply does not exist, and to suggest that it does exist is a white supremacist dogwhistle. All of these events, combined with the recent war on Indians, have led to a bolstering in the ranks of those Indian Chauvinists (as well as other sorts of ethnic narcissists south of the Caucasus) who deny the origin of Indo-European, and perhaps even Indo-Iranian, in Europe.
But this could not be further from the truth. The latest two important papers have found much of the evidence that the supporters of the Steppe Hypothesis were hoping for. We now know that the Yamnaya Culture is not genetically very distinct from the Sredny Stog culture that preceded it; rather, it sits at the end of a genetic cline that the more heterogeneous Sredny Stog culture occupies. Copper-Age “Kurganized” Balkan groups have been shown to have had significant ancestry from the Steppe, which works in favor of the theory that Anatolian spread through the Balkans rather than through the Caucasus. It is these cultures that have archaeological connections with early layers of Troy, an area yet to be properly sampled during the Bronze Age.
The Out-of-Armenia Hypothesis in the Southern Arc study was defended by the claim that the Yamnaya got CHG from south of the Caucasus, not from the Balkans. Although I am still skeptical of this, it’s no longer relevant, as we now have genetic evidence of CHG-EEF hybrid ancestry north of the Caucasus far before the formation of the Yamnaya, in the form of the Nalchik (c. 4900 BC) and Remontnoye (c. 3900 BC) samples. Even the Eneolithic Steppe samples we knew about earlier may have had some of this more southerly ancestry, as the Nalchik study suggests. Simply put, there is no longer any need to account for this ancestry by going south of the Caucasus, because it has been demonstrated to have already existed north of the Caucasus before the formation of the Yamnaya genetic profile, and far before the formation of the Yamnaya as a culture. The Yamnaya genetic profile also seems to predate the Yamnaya culture proper by at least 700 years, based on the admixture date estimates of the recent Lazaridis study.
The reason people were so confused about this is that the study has a very sly way of interpreting these findings. I feel the Harvard crew has a bit of an interest in not completely reversing their findings from 2022, where the same people involved in this study came out strongly in favor of an origin of Indo-European south of the Caucasus. So we are led to believe in the significance of these “clines” rather than in the significance of certain genetic profiles. Admixture is listed as combinations of clines (ex: “Yamnaya is 4/5 CLV Cline, 1/5 NPR Cline”), which is obviously quite vague, as these clines include pretty different populations. The emphasis on clines exists only to keep alive the possibility of a Proto-Indo-Anatolian urheimat that is either across the Caucasus, or transcends it altogether. However, I still feel the tone of this paper is slightly in favor of the Steppe hypothesis.
“Yamnaya and Anatolians share CLV ancestry (Fig. 2e,f), which must stem from proto-Indo-Anatolian language speakers, except for the possibility of an early transfer of language without admixture. That the CLV ancestry in Central Anatolians during the Hittite presence included lower Volga-related ancestry implies an origin north of the Caucasus (Fig. 2f and Extended Data Fig. 1). Long (30 cM or longer) IBD segments shared by Igren-8 Serednii Stih and Areni-1 with Berezhnovka-2 document Eneolithic links of lower Volga ancestry (Extended Data Table 5), and one link (15.2 cM) between the north Caucasus Vonyucka-1 with early Bronze Age Ovaören (MA2213) ties Central Anatolia to this once expansive network. Even so, only two Indo-Anatolian descendant groups transmitted their languages to posterity: the Yamnaya, aided by their horse-wagon technology, and Anatolian speakers, surviving long enough for their languages to be committed to clay around 2000 BC, vanishing in late antiquity and fortuitously deciphered in the twentieth century. Our reconstruction, based on genetics (Extended Data Fig. 5), has traced both groups to the CLV people north of the Caucasus, but it cannot discern who first spoke pre-Indo-Anatolian languages.”
There’s just no reason, in my opinion, to talk about a “CLV people”, when Eneolithic Steppe itself seems to simply be an extremity of the Khvalynsk Culture’s preexisting genetic variation (similar to Yamnaya being an extremity of Sredny Stog), and the Eneolithic Steppe samples share long IBD segments with Sredny Stog individuals. The “B” in the BP Group, the Berezhnovka-2 Kurgan, is culturally Khvalynsk-like from what I understand. The Remontnoye samples, which are more genetically southerly than the geographically more southern Eneolithic Steppe samples, are stated in the study to have come from burial sites with clear Sredny Stog and Khvalynsk traits:
Steppe Eneolithic burials were placed in both simple pits and pits with a side chamber (catacombs). Bodies were usually arranged supine with raised knees, like Khvalynsk and Serednii Stih, or sometimes in a contracted position on their sides. Inventory items typically include saiga astragali, bone rods, and pottery vessels similar to Serednii Stih.
According to David Anthony, Nalchik is also a Khvalynsk site, but I’ve seen conflicting information on this.
Remontnoye and Eneolithic Steppe burials are rich in R-V1636. V1636 is kind of a controversial haplogroup, because it’s an earlier branch than even the R1b found in Samara EHGs. It is rare today and found in very distant places. While it certainly can’t be called evidence of a Khvalynsk origin, it’s probably more likely that it is of Khvalynsk origin than southern origin, as R1b is most certainly a European Hunter Gatherer Haplogroup. Interestingly, V1636 is found in a Kura-Araxes (Hurro-Urartian?) sample in Armenia, and Kura-Araxes samples had Eneolithic Steppe ancestry. It is also found at a royal tomb in Arslantepe. This, combined with the apparent IBD sharing between Eneolithic Steppe individuals and a Bronze Age Anatolian from Ovaören, and passing qpAdm models using Eneolithic Steppe in BA Anatolians, has convinced the Lazaridis crew that Anatolian came through the Caucasus Mountains, rather than through the Balkans. I disagree with this, and remain convinced that Anatolian entered Anatolia through the Balkans for the reasons I discussed in my original post. But as for the connection between Anatolia and Steppe Eneolithic, this can be explained in a Balkan approach as well.
Firstly, V1636 in Anatolia isn’t necessarily good evidence of anything. I’ve heard that the V1636 sample isn’t actually royal; it’s some sort of servant sacrificed or otherwise buried within or on top of the tomb of the royal. So, not an “elite”. Even if it was, it could have easily found its way to Anatolia through chance over the course of 2000 years through Chalcolithic Armenia → Kura-Araxes → Hurrians.
Also, the Sredny Stog haplogroup I-L699 is found in Anatolia:
Secondly, Eneolithic Steppe individuals were probably among the Sredny Stog invaders of the Balkans. In fact, the Steppe component in Trypillian farmers is best represented with Steppe Eneolithic. There is also a sample which is basically just pure Steppe Eneolithic in the study, that is actually from the Balkans — Giurgiulesti, and another very Steppe Eneolithic-rich sample as far as Hungary, Csongrad.
There is apparently an R1b-Z2110 (downstream of Z2103, the primary patrilineage of the Yamnaya) sample in North Mesopotamia in the study, but it’s low-resolution. Possibly a bad call or maybe even contamination. Either way, Z2110 branched off too late, around the beginning of the Yamnaya expansions (according to YFull) for it to be that meaningful to the origin of Z2103. There remains very little reason to think that Z2103 originated south of the Caucasus, but Out-of-Armenia types will pretend it’s reasonable to say so.
This is what I think the story of the Western Steppe and Caucasus is in 2025:
Pre-6000 BC: EHG lives in Volga-Ural Region, WHG-EHG cline present in Ukraine, Baltic, Scandinavia, Balkans (Iron Gates). CHG at least exists in the Caucasus Mountains but may exist as far up as the lower Volga. I have heard some sources say that Samara EHGs have slight CHG ancestry, not sure if this is true or not. Elshanka Culture is probably EHG but possibly has WSHG and Southern influences as it was an early adopter of pottery.
5500-4500 BC: Groups with majority ancestry from CHG and EEF sources, and possibly some Zagrosian, introduce agriculture and pastoralism to the Steppe, develop farming. Khvalynsk Culture is the product of trade (including bride-exchange) between the native hunter-gatherers and southern pastoralists and farmers. They adopt pastoralism and copper tools but retain EHG patrilineages and majority EHG ancestry. Khvalynsk individuals also acquire minor ancestry from West Siberian Hunter Gatherers. TTK001 (Tutkaul) is used as a proxy for this but I suspect it is really Kelteminar ancestry, which is probably intermediate between Tutkaul, EHG, and Botai based on the geography. The CHG/EEF populations can be represented by the Darkveti-Meshoko Culture, who seem to be genetically similar to the later Maykop Culture, being a mix of CHG-Iran and EEF ancestry. Khvalynsk expands southwards to the lower Volga and into the Manych Depression, peppering the Piedmont region above the Caucasus.
4500-3500 BC: One of these Khvalynsk groups, with some mixture from CHG-rich groups and probably inhabiting the Don region, expands into Ukraine forming the Sredny Stog and Suvorovo Cultures. This group mixes with the Ukrainian Hunter Gatherers, who are mostly EHG with some WHG, and sometimes with the Neolithic Farmers. They adopt a mutualistic relationship with the Cucuteni-Trypillia Farming Culture that lives around the more forested but fertile rivers, but also expand into farmer territory. This group is Proto-Indo-Anatolian. Proto-Anatolians diverge in the Balkans, and travel into Anatolia at some point between this time and the earliest written evidence of Anatolian, through the Bosporus region. At the same time, some sort of Western Siberian HG group invades the Piedmont region, mixing with Eneolithic Steppe individuals and somehow adopting Maykop cultural artifacts.
3500-2900 BC: An eastern tribe of the Sredny Stog group, the Yamnaya, probably only around 10,000 in number, expands rapidly across the steppe while retaining a very homogeneous genetic profile. Around the mouth of the Don, the Sredny Stog survive and adopt Yamnaya ways, but elsewhere they are destroyed. I suspect that it is at this stage that the Corded Ware Culture forms, being comprised of Sredny Stog clans that mixed with the expanding Yamnaya but also retained some haplogroups that are pre-Yamnaya. The Sredny Stog seem to be predominantly I2a2, but one of the Don Yamnaya is R1b-L51, and one of the Usatovo males is R1a. R1a is also found in Ukrainian Hunter Gatherers. Tocharian splits off from Yamnaya.
3000-2500 BC: The Corded Ware Culture, a hybrid of Yamnaya, Sredny Stog(?), Hunter-Gatherers of the Belarusian Swamps, and the farmers of the Globular Amphora Culture, expands rapidly across Northern Europe. Produces every Indo-European language other than Tocharian, Hittite, Greek, Albanian(?), and Armenian. The latter three are produced by the Yamnaya-descended Catacomb Culture, but they all might also have Corded Ware influence.
So, that’s the Lazaridis stuff. But then there’s this whole other issue, the Iran Fiasco… Ancient DNA indicates 3,000 years of genetic continuity in the Northern Iranian Plateau, from the Copper Age to the Sassanid Empire. Many of the Out-of-India types, or the people who subscribe to the Heggarty Hypothesis, are using it as evidence that Iranian was not spread into Iran by Andronovo people, but rather was already there since very early on. But there are many, many pitfalls with this interpretation.
Firstly, the sample is quite poor. Of the timespans we’re interested in, only seven people were sampled. All of them from Mazandaran and Gilan. The study markets itself as having samples from the Achaemenid period, but it really doesn’t. If you go into the Supplemental Materials, you’ll see that the oldest samples are dated to, at earliest, the late Achaemenid period, but have an average date after the Macedonian invasion or even into the Seleucid period.
The Iranians had been on the Plateau for hundreds of years, and had undergone centuries of Imperial rule by this time. Guenther talks about this… The decline of the Aryan elite in Achaemenid Persia… Make of that what you will.
Furthermore, Mazandaran is geographically unique. It is sheltered by the Alborz Mountains, leaving it with a very different climate from the rest of Iran. The Caspian coast of Iran is covered by the dense Hyrcanian Forest. The natural barriers of forests and mountains, combined with a possibly larger preexisting Neolithic population due to the fertile farmland, may have stifled the invading Iranians from having as large of an impact on the region.
This is backed by genetic diversity within contemporary Iranians. Mazandaranis consistently score very low in Steppe ancestry, and very high in BMAC-type ancestry on G25. I tried many models and this one had the lowest distances:
TKM_IA is half-BMAC, half-Sintashta more or less. It models very well with Iranians and is basically a Tajik without East Asian and Southern ancestry. Some people might not like that I am using the Zagrosian Iran_Ganj_Dareh with groups derivative of it, but I think it’s justified. There could have been quite pure Zagrosians in the Iranian Plateau well into the Bronze Age, but here’s a run without it:

And here is the first model with samples from a different dataset, the Mariopolous Collection:
Gilakis are probably genetically similar to Mazandaranis as they speak very similar dialects, and the Hyrcania region extends into modern Gilan. If anything, they’re probably less Steppe than Mazandaranis proper, because they’re further westward and Steppe entered Iran from the East.
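For anyone unfamiliar with what “distance” means in these G25-style models: each population is a point in coordinate space, and a model is the weighted blend of source populations that lands closest to the target. Here is a minimal sketch of the two-source case. The coordinates are made up for illustration (they are not real G25 values), and fit_two_way is a hypothetical helper name, not part of any real tool; actual tools fit many sources at once.

```python
import math

def fit_two_way(target, src_a, src_b):
    """Best blend share_a*src_a + (1 - share_a)*src_b for a target point.

    Returns (share_a, distance), where distance is the Euclidean gap
    between the target and the best blend. The share is clipped to
    [0, 1] because negative ancestry makes no sense.
    """
    diff = [a - b for a, b in zip(src_a, src_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        share = 0.5  # identical sources: every split fits equally well
    else:
        num = sum((t - b) * d for t, b, d in zip(target, src_b, diff))
        share = min(1.0, max(0.0, num / denom))
    blend = [share * a + (1 - share) * b for a, b in zip(src_a, src_b)]
    return share, math.dist(target, blend)

# Toy coordinates, invented for this example only.
steppe_like = [0.10, 0.02, -0.03]
local_like = [0.02, -0.08, 0.01]
# A target built as exactly 25% of the first source and 75% of the second.
target = [0.04, -0.055, 0.0]

share, distance = fit_two_way(target, steppe_like, local_like)
print(f"steppe_like share: {share:.2f}, distance: {distance:.4f}")
```

A lower distance only means the blend sits closer to the target in this coordinate space. It says nothing by itself about whether a competing set of sources would fit nearly as well, which is why I tried many models before settling on one.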
Plus, the study doesn’t really provide evidence rejecting Steppe ancestry in these samples so much as evidence that Steppe ancestry is not a necessary component in them. A fair number of runs with Steppe ancestry have passing p-values, but many null runs with just pre-IE components also pass, basically saying that these samples are genetically close enough to pre-IE populations for the difference to be chalked up to chance. This is probably something that becomes problematic in general when admixture is below 10%, and this was partially the issue in Anatolia as well. It’s possible that single-digit Steppe ancestry exists in some of the Anatolian samples. Certainly, there are successful qpAdm runs with Steppe in Anatolia, and you can do it in G25 as well. But allegedly they don’t need Steppe to be modeled. Allegedly. The model preferred by the study is a combination of Shah Tepe BA, similar to the Bustan BA I used in my model, and Hajji Firuz IA. Hajji Firuz IA is probably an Armenian group, as it has slight Steppe ancestry and is heavy in R1b-Z2103. The Iron Age samples from Iran are from the far west, Hasanlu and Hajji Firuz Tepe, and are likely Armenians. That’s why it looks like a lot of R1b in IA Iran, but they’re not real Iranians. At the time the sites, both around Lake Urmia, would have been controlled by Urartu, who again probably had some Steppe ancestry themselves just through interaction with Steppe groups over many thousands of years, but this region would later be inherited by the Armenian minority of the empire and controlled by them for a long time.
“IA Hajji Firuz is the only additional western source that provides plausible models for all historical groups. Using this source, Marsin Chal can be modelled with 78% BA Shah Tepe and additional ∼22% IA Hajji Firuz, while for the Liarsangbon group these amounts are 37% BA Shah Tepe and 63% IA Hajji Firuz. The combined group, referred to as Iran_North_Historical, produces similar results and can be modelled with 52% BA Shah Tepe and 48% IA Hajji Firuz-type contributions (Supplementary Table S14). We interpret these findings as evidence that the genetic profiles of historical-period groups of the northern Iranian Plateau reflect their position along the broader east-west genetic cline, rather than resulting from specific admixture events.”
In my opinion, the western source in Iran is almost certainly from Mesopotamia, not northern but southern, as this region became the new Imperial heartland under the Achaemenids and continued to be such until the Arab invasion. Having such a large population, they probably seeped into the rest of Iran. The good fit of Hajji Firuz may have something to do with the minor Steppe (~12%) ancestry of Hajji Firuz.
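To put a rough number on that last point, you can multiply the Hajji Firuz share in each of the study’s quoted models by the ~12% Steppe figure. This is back-of-the-envelope arithmetic, not a qpAdm result, and it rests on two assumptions of mine: that Shah Tepe BA contributes no Steppe ancestry at all, and that ancestry proportions simply multiply through a two-way model.

```python
# Steppe share of IA Hajji Firuz, taken from the "~12%" figure above.
HAJJI_FIRUZ_STEPPE = 0.12

# Hajji Firuz contribution in each model quoted from the study.
hajji_firuz_share = {
    "Marsin Chal": 0.22,
    "Liarsangbon": 0.63,
    "Iran_North_Historical": 0.48,
}

def implied_steppe(source_share, source_steppe=HAJJI_FIRUZ_STEPPE):
    """Steppe ancestry a group would inherit via one source.

    Assumes the other source (Shah Tepe BA) carries zero Steppe
    ancestry, which is a simplification made for this sketch.
    """
    return source_share * source_steppe

implied = {group: implied_steppe(share)
           for group, share in hajji_firuz_share.items()}

for group, value in implied.items():
    print(f"{group}: {value:.1%}")
```

Under those assumptions every group comes out in the single digits (roughly 3%, 8%, and 6%), which is exactly the range where Steppe and no-Steppe models become hard to tell apart.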
But, for the sake of argument, let’s suppose that these samples really don’t have any Steppe ancestry whatsoever. Would it vindicate Heggarty? Would it demonstrate that Iranian did not come from the Andronovo Culture? Of course not! The evidence is overwhelming. Every other Iranic group we have sampled in ancient times (the Scythians, the Sakas, the Sogdians, the Alans) has Sintashta-related ancestry in particular. The ancient Gandhara civilization, which is obviously some sort of Indo-Iranic group (doesn’t really matter which, nobody’s saying Indic and Iranic came from different sources), has MLBA Steppe ancestry. All modern Iranic groups also have Steppe ancestry. Iranians proper, Afghans, Tajiks, and Balochs all have Sintashta ancestry, and many historical and modern Iranic groups are rich in R1a. Indians have Sintashta ancestry. Where did it come from? The Yaz III culture was associated with early Iranians, possibly Avestans, before any DNA evidence came out, and then genetics confirmed that it was not a continuation of the preceding BMAC culture, but half Sintashta in ancestry! (TKM_IA and UZB_IA).
And if Iranic was already on the plateau before the Sintashta-Andronovo culture’s interaction with it, where’s the evidence? How come the peoples of Iran as recorded by the Mesopotamians are not Indo-European? The Gutians, the Subarians, the Elamites, the Kassites… No, the Iranians do not show up in the historical record with respect to the Plateau until the Iron Age.
Heggarty’s theory is just straight up bad. It provides no true historical explanation for its fancy language model. I’m not a linguist, but I see no reason to believe his methodology is more effective than the methods that linguists have been using previously to estimate an Anatolian breakoff around 4000 BC. Somehow we are supposed to believe that the Anatolian Farmers spread the same language family into Europe that the Neolithic Iranians spread into South Asia and Iran. These groups are genetically separated by possibly tens of thousands of years. We also have plenty of examples of languages spoken by populations extremely rich in these ancestries that are certainly not Indo-European. This is what Lazaridis himself had to say about Heggarty’s nitpicking of his paper:
Related posts:
(Where I debunk Out-of-India Hypothesis)
The no good story ever started with a glass milk vid edit was crazy in 2019
i’m surprised balochi is so high in steppe ancestry while gilanis are lower, i never even knew nor expected that . also, what’s up with the khuzestani? I thought they were just arabic speaking iranians , didn’t know they were that distinct genetically.
> certainly from Mesopotamia not northern but southern, as this region became the new Imperial heartland under the Achaemenids and continued to be such unil the Arab invasion
who could’ve guessed that today’s southern iraq is where you will find the iranian militias… some irony in how the neocons invaded iraq just to give it away to iran