It is well known that the Iranian languages (Farsi, Kurdish, Pashto, Balochi, and a few others) form a clade with the Indic languages (Hindi, Urdu, Punjabi, et cetera). How they came together, and how they fell apart, is a much more controversial topic. Many questions feed into it, and the answers to many others depend on it. It determines the dating of the Vedas and the Gathas, and in turn the identity of both the Rigvedic tribes and the speakers of the earliest attested Iranic language, Old Avestan. There is also not much of a consensus on when Indo-Iranian diverged from the other Indo-European languages. It definitely diverged after Anatolian and Tocharian, and it definitely diverged after Germanic, Italic, and Celtic. Does it form a clade with Greek and Armenian, the so-called “Graeco-Aryan Hypothesis”? Or is it downstream of Greek, but upstream of everything else? Or does it form a clade with Balto-Slavic?
The Graeco-Aryan hypothesis is rarely seen anymore. Linguistic associations between Greek and Indo-Iranian, if they were ever real, probably have to do with Catacomb influence on the succeeding Srubnaya Culture, which derives from the Corded Ware Culture (CWC). The Catacomb Culture is genetically more or less identical to the Yamnaya Culture, and was its final true successor. It would eventually meet its doom, overwhelmed by Corded Ware raiders, but some Catacomb lineages survived in the succeeding Srubnaya Culture and spread into Central Asia. The Catacomb Culture shares archaeological similarities with the Mycenaeans (funerary masks, spearhead shapes), modern Greeks have the Yamnaya-derived Y-Haplogroup R1b-Z2103, and from what I’ve heard early Greeks (Logkas samples) model well as a two-way mix between Helladic Greeks and Catacomb/Yamnaya. If Graeco-Aryan existed, you would expect Greeks to be Srubnaya-derived and have R1a-Z93, but they don’t.
The Indo-Baltic hypothesis is more reasonable, but I think it is beyond the realm of archaeology to determine whether it is true or not. A recent book on Indo-Iranic and a semi-recent study on Indo-European phylogeny supported it. It is true that Balto-Slavic languages are on the Indo-Iranian side of the famous “Centum-Satem divide”, but that divide isn’t considered as big a deal as it used to be. If the Indo-Baltic Hypothesis is true, the sequence likely goes like this:
1. CWC expands into the forest-steppe, fleeing expanding Z2103 Yamnaya lineages.
2. Western IE languages go westward; Indo-Balts are represented by the Middle Dnieper Culture.
3. Indo-Iranians split off and form the Fatyanovo Culture; Middle Dnieper mixes with Baltic Hunter-Gatherers.
If it is not true, then it just means the order was reversed: Fatyanovo splits off first, then the Western branches split apart. I don’t really have an opinion on it and don’t think it’s terribly relevant, though it could invigorate some Russian Scythianists if true. The biggest piece of evidence (outside of linguistics) that it isn’t true, in my opinion, is the fact that both the Slavic and Germanic R1a branches belong to the same clade, outside of the Z93-derived Indo-Iranian clade. Balto-Slavic also could not have originated too far east, because Western Russian Hunter-Gatherers were mostly EHG-derived and could not have provided the elevated WHG ancestry associated with Slavs today.
Map for reference:

After this point, there are still some questions. It’s unlikely that Andronovo (which is obviously Indo-Iranic, unless you’re some OIT or Heggarty type) is Sintashta-derived, despite the two often being treated as synonyms. It is more plausible that Sintashta is a dead end of Indo-Iranian and both Srubnaya and Andronovo come from the Abashevo culture. Historically the earliest phase of Andronovo (Alakul) was seen as Srubnaya-adjacent.
Given how expansive Andronovo was, it is no surprise that it did not stay culturally or linguistically unified for very long. The migration southward was fast. We know that by the late 16th century BC, the Indic and Iranic languages had diverged, as it is around this time that we get the first historical records of the Mitanni (who were an Indo-Aryan ruling elite of the Hurrians). However, I hear many incorrect statements about the rise of the Iranians. People claim that the Iranic peoples syncretized with the Oxus Civilization, and that there was a religious war that split apart the Indo-Iranians late in the Andronovo period. The evidence, at least to me, suggests that the Indo-Iranic split occurred before the diversification of the Andronovo culture, likely before 1700 BC and possibly well before it. By the 15th century BC the Karasuk Culture forms, which I think is quite clearly Proto-Scythian. It is a Siberianized, highly mobile branch of the Andronovo Horizon that has the Siberian ancestry that all later Scythian cultures have to some extent (and it can be used to model other Scythian-Saka cultures quite well). It was quickly followed by the genetically more or less identical Mezhovskaya Culture in Northwestern Kazakhstan and Russia, and then by the Cimmerians in Southern Russia and Ukraine.
It is likely that Zoroaster, if he existed (which I believe he did), lived during the late Andronovo period rather than during the more BMAC-influenced Iron Age Yaz II Culture, which corresponds to Younger Avestan geography. It’s difficult to say anything conclusive about Airyanem Vaejah (the wintry homeland of the Aryans), because it is likely that even the composers of the Vendidad and the Yasht did not know precisely where it was. However, even if it was relatively southern, in Khorasan or Transoxiana (which may be quite likely, as Transoxiana was the seat of the leader of the Zoroastrian priesthood), these places retained a MLBA Steppe genetic profile for the most part until the complete end of the Andronovo period. The three most southerly sets of Andronovo samples are the Kokcha samples from near the Oxus river (archaeologically associated with the Tazabagyab subculture of the Andronovo Culture), the Dashti Kozy samples from Northwestern Tajikistan, and the Kashkarchi samples from the Ferghana Valley, which are dated to 1100 BC (very late for Andronovo) and likely correspond to the Chust subculture of the Andronovo Culture.
These samples are, for the most part, 80-100% Steppe on G25, but I knew of a qpAdm run which modeled one of the Dashti Kozy samples as only around 70% Steppe, so I did some qpAdm runs of my own. Indeed, Dashti Kozy comes out lower in qpAdm: around 80% Steppe on average (likely dragged down by I4160, the 70% sample I saw and the lowest sample in G25), 14% BMAC, and 6% West Siberian Hunter-Gatherer (the Tarim mummies and Botai are examples of this ancestry), while the Kashkarchi (Ferghana) samples are around 94% Steppe and 6% BMAC. The Kokcha samples did not require anything other than MLBA Steppe to pass, and when other populations were included they were modeled as 98+% MLBA Steppe. You can see the full tibbles for these at the very bottom of this post.
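For anyone who wants to read the popdrop tables at the bottom of the post the same way, here is a rough Python sketch of the logic. The p-values and feasibility flags are copied from those tables; the 0.05 cutoff and the "fewest sources that still pass" rule are my own assumptions for illustration, not a fixed standard, and the function names are made up. In the `pat` column a `1` marks a source that was dropped from the model.

```python
# Source order matches the weight columns of the popdrop tables.
SOURCES = ("Russia_Srubnaya", "China_Xinjiang_Xiaohe_BA", "Turkmenistan_Gonur_BA_1")

# (pat, p, feasible) for each row, copied from the tables at the bottom of the post.
POPDROP = {
    "Kokcha": [
        ("000", 9.12e-1, True), ("001", 6.63e-1, True), ("010", 7.84e-1, True),
        ("100", 9.65e-130, True), ("011", 6.40e-1, True), ("101", 0.0, True),
        ("110", 2.36e-149, True),
    ],
    "Ferghana": [
        ("000", 6.82e-1, True), ("001", 9.40e-2, False), ("010", 3.05e-1, True),
        ("100", 2.38e-110, True), ("011", 4.77e-2, True), ("101", 0.0, True),
        ("110", 1.75e-132, True),
    ],
    "Dashti Kozy": [
        ("000", 8.68e-1, True), ("001", 2.45e-3, True), ("010", 3.22e-2, True),
        ("100", 1.02e-93, True), ("011", 1.65e-5, True), ("101", 0.0, True),
        ("110", 8.96e-124, True),
    ],
}

ALPHA = 0.05  # assumed pass/fail cutoff, not something the tables prescribe


def kept_sources(pat):
    """Return the sources still in the model ('0' = kept, '1' = dropped)."""
    return [src for src, flag in zip(SOURCES, pat) if flag == "0"]


def simplest_passing(rows, alpha=ALPHA):
    """Pick the feasible model with p >= alpha that uses the fewest sources."""
    passing = [r for r in rows if r[2] and r[1] >= alpha]
    if not passing:
        return None
    # Fewest sources first; ties are broken by the higher p-value.
    best = min(passing, key=lambda r: (len(kept_sources(r[0])), -r[1]))
    return kept_sources(best[0])


if __name__ == "__main__":
    for site, rows in POPDROP.items():
        print(site, "->", simplest_passing(rows))
```

Under those assumptions Kokcha passes with Srubnaya alone, Ferghana needs Srubnaya plus Gonur, and Dashti Kozy only passes with all three sources, which lines up with the percentages quoted above.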
A few months ago, an Indian scientist threatened to sue a Twitter user, whom I will not name, for posting an unreleased, likely never-to-be-released sample from the Sinauli site of the Indus Valley Civilization. It looked quite similar to some of the Southern Andronovo samples: 82% Steppe and 18% BMAC. I don’t believe it was an actual Indo-Aryan invader; I think it’s more likely it was a tourist or something. It’s probably not high-quality, as there is a large standard error, but it shouldn’t have been hidden.
It is still possible that this is representative of the early Vedic invaders. The amount of BMAC and WSHG in the early Indo-Aryans could not have been too large, as modern Indians do not need to be modeled with anything BMAC or WSHG. Whether Indo-Aryan split off within the Andronovo Culture or as a sort of early breakaway from it is hard to say, because there isn’t one archaeological culture that is clearly representative of the Indo-Aryan invaders. The historical presence of Indo-Aryans outside of India is limited to the Mitanni Aryans and, according to some people, the Wusun of Western China. Some people will point to the Gandhara Graves culture, but I disagree. It may have been Indo-Aryan, but it does not represent the core Early Vedic population. It encompasses only a small portion of the Aryavarta, it has too little Steppe and too much IVC ancestry to represent the source of Steppe ancestry in later Indian populations, and it has too little R1a relative to its autosomal Steppe ancestry compared to modern Indians. The modern Dardic people who inhabit the area are very rich in R1a, and although they are among the most similar to Swat Valley samples, they have more Steppe than all but the outliers. There is an outlier from BMAC, near the Afghan-Uzbek border, who is genetically very similar to Swat Valley samples and predates any of them by a few centuries. These samples may be products of the Vakhsh Culture of Northern Afghanistan and Southern Tajikistan, which appears to be something of a hybrid culture: mostly BMAC-like, with some Andronovo influences. Proto-Nuristanis maybe? Who knows. Not Proto-Dardic, as Dardic is probably post-Sanskrit.
By the Classical Period, the Swat Valley and the surrounding area had been thoroughly Indicized (or Dardicized) and spoke the probably Dardic Gandhari language. This is reflected in a decrease in IVC ancestry and an increase in Andamanese-like and Steppe ancestry over the sampled time period. I’ve seen mixed evidence on the presence of BMAC in Swat Valley samples. You can model them as BMAC + IVC + Steppe, but according to the Narasimhan study they can be modeled as Steppe + IVC alone, so it may just be that BMAC adjusts for high Andamanese ancestry in some of the IVC samples. They are clearly pulled either towards Iran-rich IVC or towards BMAC.
There is a popular theory that the split between Iranians and Indic speakers was not due to geography, but due to a sort of religious schism. In Hinduism, the title Deva is used to refer to the gods and Asura is used to refer to demons, while in Zoroastrianism the title Daeva is used to refer to demons and the title Ahura is given to the gods, particularly Ahura Mazda. There are many differences between Zoroastrianism and Hinduism that one probably wouldn’t expect from peoples with such recent common heritage that the Avestan tribes considered the peoples of Punjab to be their kin. Several important Vedic deities are not preserved among the Iranian Yazatas, and in Younger Avestan texts are referred to as Daevas. One of the major exceptions is Mithra, who is referred to as an Asura in Vedic texts.
I used to believe in this theory, but once you peel back the layers the problems begin to emerge. First of all, Zoroastrianism isn’t a pan-Iranic religion. Although you begin to see some elements of it among the Sarmatians and Alans, it doesn’t appear to be anything like the Scythian religion as recorded by the Greeks. Secondly, the terms “Asura” and “Ahura” just mean “Lord”. All gods are technically also Asuras, but not all Asuras are gods. Early Vedic texts did not treat Asura as analogous to a necessarily evil or demonic entity. Likewise, the term “Daeva” was originally more neutral and over time became very negative and equivalent to what we would call a demon. Many gods in the Vedic pantheon are also Yazatas in the Zoroastrian texts, and despite Mitra’s Asuric title in the Vedas he is still worshipped as a god. It is even quite possible that Indra actually is represented in the Zoroastrian pantheon, but that only his epithet Vrtrahan (Slayer of Vritra) survives (as Verethragna).
The identification of Vedic deities with demons was probably a reaction to rising tensions between the Indo-Aryan and Iranic peoples, rather than an in-built feature of Zoroastrianism. The Punjab is referred to in Younger Avestan texts as a land of Daeva-worshippers, “too hot for reason”. Meanwhile, Classical Indians characterize the Iranic tribes as unclean, warlike, meat-eating, impious, inbred barbarians. Most of the peoples labeled “Mleccha” were Iranic tribes. I don’t think early Zoroastrianism in essence is particularly different from the other Indo-European religions, but if a true religious distinction existed that could have resulted in the divide, it would probably be the different characterizations of deities in India and Iran.
To the Hindus, the gods were in some sense mortal just like humans. They died and were reborn at the end of every universal cycle. They lost even more prestige in Buddhism, where they are characterized only as powerful spirits that aren’t integral to the structure of the universe. The Bodhisattvas were far superior to them in power, and more worthy of worship. This wasn’t what all Buddhists believed; some, such as the Japanese, believed that the gods were simultaneously Bodhisattvas. To the Iranians, the gods are immortal and existed before the creation of the universe, but are still contingent on Ahura Mazda. This would explain why Zoroaster identifies the Daeva with “beings ignorant of the distinction between truth and falsehood”: they are capable of dying. But I’m just playing devil’s advocate, and I don’t believe in the significance of the Ahura-Asura distinction.
Kokcha BA:
# A tibble: 7 × 14
  pat      wt   dof   chisq         p f4rank Russia_Srubnaya China_Xinjiang_Xiaohe_BA Turkmenistan_Gonur_B…¹ feasible best
  <chr> <dbl> <dbl>   <dbl>     <dbl>  <dbl>           <dbl>                    <dbl>                  <dbl> <lgl>    <lgl>
1 000       0     8    3.33 9.12e-  1      2           0.985                  0.00290                 0.0117 TRUE     NA
2 001       1     9    6.75 6.63e-  1      1           0.999                 0.000752                     NA TRUE     TRUE
3 010       1     9    5.55 7.84e-  1      1           0.993                       NA                0.00735 TRUE     TRUE
4 100       1     9    630. 9.65e-130      1              NA                    0.149                  0.851 TRUE     TRUE
5 011       2    10    7.89 6.40e-  1      0               1                       NA                     NA TRUE     NA
6 101       2    10   1823.         0      0              NA                        1                     NA TRUE     NA
7 110       2    10    725. 2.36e-149      0              NA                       NA                      1 TRUE     NA
# ℹ abbreviated name: ¹Turkmenistan_Gonur_BA_1
# ℹ 3 more variables: dofdiff <dbl>, chisqdiff <dbl>, p_nested <dbl>
Ferghana BA:
$popdrop
# A tibble: 7 × 14
  pat      wt   dof   chisq         p f4rank Russia_Srubnaya China_Xinjiang_Xiaohe_BA Turkmenistan_Gonur_B…¹ feasible best
  <chr> <dbl> <dbl>   <dbl>     <dbl>  <dbl>           <dbl>                    <dbl>                  <dbl> <lgl>    <lgl>
1 000       0     8    5.69 6.82e-  1      2           0.942                 0.000573                 0.0569 TRUE     NA
2 001       1     9    14.9 9.40e-  2      1            1.01                 -0.00518                     NA FALSE    TRUE
3 010       1     9    10.6 3.05e-  1      1           0.944                       NA                 0.0561 TRUE     TRUE
4 100       1     9    539. 2.38e-110      1              NA                    0.146                  0.854 TRUE     TRUE
5 011       2    10    18.5 4.77e-  2      0               1                       NA                     NA TRUE     NA
6 101       2    10   1846.         0      0              NA                        1                     NA TRUE     NA
7 110       2    10    647. 1.75e-132      0              NA                       NA                      1 TRUE     NA
# ℹ abbreviated name: ¹Turkmenistan_Gonur_BA_1
# ℹ 3 more variables: dofdiff <dbl>, chisqdiff <dbl>, p_nested <dbl>
Dashtiqozi BA:
$popdrop
# A tibble: 7 × 14
  pat      wt   dof   chisq         p f4rank Russia_Srubnaya China_Xinjiang_Xiaohe_BA Turkmenistan_Gonur_B…¹ feasible best
  <chr> <dbl> <dbl>   <dbl>     <dbl>  <dbl>           <dbl>                    <dbl>                  <dbl> <lgl>    <lgl>
1 000       0     8    3.87 8.68e-  1      2           0.797                   0.0600                  0.143 TRUE     NA
2 001       1     9    25.5 2.45e-  3      1           0.951                   0.0489                     NA TRUE     TRUE
3 010       1     9    18.3 3.22e-  2      1           0.868                       NA                  0.132 TRUE     TRUE
4 100       1     9    461. 1.02e- 93      1              NA                    0.200                  0.800 TRUE     TRUE
5 011       2    10    40.1 1.65e-  5      0               1                       NA                     NA TRUE     NA
6 101       2    10   1714.         0      0              NA                        1                     NA TRUE     NA
7 110       2    10    606. 8.96e-124      0              NA                       NA                      1 TRUE     NA
# ℹ abbreviated name: ¹Turkmenistan_Gonur_BA_1
# ℹ 3 more variables: dofdiff <dbl>, chisqdiff <dbl>, p_nested <dbl>
right = c("Turkey_Boncuklu_N",
          "Iran_GanjDareh_N",
          "Georgia_Kotias.SG",
          "Italy_North_Villabruna_HG",
          "Russia_Karelia_HG",
          "Russia_Samara_EBA_Yamnaya",
          "Indian_GreatAndaman_100BP.SG",
          "Nganasan.HO",
          "Peru_RioUncallane_1800BP.SG",
          "Jordan_PPNB",
          "Russia_AfontovaGora3")