In 2018, the UK’s Natural History Museum revealed a reconstruction of a hunter-gatherer from Mesolithic Britain. Many on Britain’s left quickly misread it as showing that the earliest Britons were “Black” in the colloquial sense, because Cheddar Man was depicted with very dark skin. Mesolithic Europeans had zero African ancestry, strikingly European cranial and facial morphology, and higher rates of blue eyes than any modern population on earth, but this didn’t stop the BBC from labeling Cheddar Man as one of history’s first “Black Britons” in their utterly abysmal music video “Been Here From The Start”, where they also erroneously list the Roman emperor Septimius Severus as Black.
In recent years, many lay facial reconstruction artists have emerged on the internet, and more recent reconstructions seem to depict Western Hunter Gatherers (WHG) with only moderate pigmentation, perhaps akin to that of modern Arabs. I have talked personally to some of these reconstruction artists. They say that higher-coverage WHG samples tend to average more moderate skin tones, but I have never been given evidence that, as a general rule, lower coverage pushes samples toward darker predictions. Usually, when predicting phenotypes, academic papers use HIrisPlex-S. Just for clarification, IrisPlex was the original tool, and it predicted eye color. Then HIrisPlex predicted eye and hair color. Then HIrisPlex-S predicted eye, hair, and skin color.
HIrisPlex is quite good at predicting phenotypes in modern, homogeneous Western populations, for which it was created as a forensic tool, but being good at predicting traits for individuals is not the same as being able to predict traits for populations. It quite clearly struggles with East Asians, whom it predicts to be darker-pigmented than Indians and Bedouins. According to the official HIrisPlex manual, missing certain alleles in a sample causes a large loss of accuracy, especially in East Asians but to a lesser extent in general, because the model then has difficulty distinguishing intermediate from dark/black colors. This loss of accuracy does not result in the model withholding a prediction. East Asians typically have intermediate or light brown skin tones, and I don’t think anyone can say they’ve met a Chinese person whom they would call “dark-black”.
In modern Brazilians, HIrisPlex has been shown to fail to make a reasonably trustworthy call two to three times more often for pale samples than for dark ones. The sample sizes aren’t great, though, at least not for the darkest skin category, which only 12 of the 276 participants fell into. The methodology is also questionable, as skin colors were classified subjectively and may have been influenced by environmental factors such as tanning or room lighting. A Spanish study had a similar issue, which its authors blame for the unreliability of its HIrisPlex skin predictions. But two other studies I have found also suggest that admixed populations give inaccurate HIrisPlex scores.
The reason East Asians are so misrepresented is probably that HIrisPlex relies on a fairly small number of SNPs to make its predictions: 6 for eyes, 22 for hair, and 36 for skin. It is likely that many pigmentation alleles were not known when HIrisPlex was created, are still unknown, or won’t be known for the foreseeable future. Because skin color has undergone recent evolution, East Asians likely evolved light skin on their own, and have their own alleles for lighter skin that the population HIrisPlex was made for doesn’t carry at high frequencies. Why it would show such a bias towards dark features is confusing, though, and I wonder if the same could be true for Mesolithic Europeans. It’s more likely than you think. A great deal of genetic drift has occurred in the last few thousand years, along with a great deal of selection, not just for lighter skin but most likely for light skin genes that minimize the deleterious effects of light skin, which may in the process have selected against older depigmentation genes. I think everyone can agree that people are lighter today than they were thousands of years ago, but there is a big difference between making an inequality statement like that and trying to pinpoint the skin tone of our Mesolithic ancestors.
Typically, higher reliability in one cohort does not suggest that a predictor is biased in favor of that cohort. The errors, if they are random, should cancel out with a large enough sample size. At least, this is how it works with polygenic risk scores, but maybe the reliance on only a small number of alleles and the categorical layout of HIrisPlex leads to different starting assumptions. It’s hard to say. I think the system is quite dumb, and it would be much better if a continuous variable such as the melanin index were used to predict skin color in modern DNA. Something similar could presumably be done for hair color. Even if HIrisPlex were unbiased, even the most well-sampled ancient populations usually won’t have more than a few dozen samples that are fit for a full HIrisPlex prediction, and the ones that are fit are still very low-coverage compared to modern samples, which results in accuracy loss. Most ancient DNA is between 0.5x and 5x coverage.
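To make the point about random errors concrete, here is a minimal Python sketch. It assumes each prediction's error is a fixed bias plus Gaussian noise on some continuous scale, which is my own simplification and not how HIrisPlex works internally. The function name `mean_error` and its parameters are made up for illustration.

```python
import random
import statistics


def mean_error(n_samples, bias, noise_sd, seed=0):
    """Average prediction error over n_samples simulated individuals.

    Each error is modeled as `bias + Gaussian noise` (an assumption for
    illustration only; this is not the HIrisPlex model).
    """
    rng = random.Random(seed)
    errors = [bias + rng.gauss(0.0, noise_sd) for _ in range(n_samples)]
    return statistics.fmean(errors)


if __name__ == "__main__":
    # Compare a few dozen samples (typical for ancient populations)
    # with a very large sample, with and without a systematic bias.
    for n in (30, 100_000):
        unbiased = mean_error(n, bias=0.0, noise_sd=1.0)
        biased = mean_error(n, bias=0.5, noise_sd=1.0)
        print(f"n={n}: unbiased mean error={unbiased:.3f}, "
              f"biased mean error={biased:.3f}")
```

Under these assumptions, purely random noise averages toward zero as the sample grows, while a systematic bias stays in the average no matter how many samples are added. With only a few dozen samples, the noise alone can still move the average noticeably.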
Meanwhile, HIrisPlex is rather unreliable on samples below 8x coverage. There are means of improving the accuracy of low-coverage HIrisPlex samples. The Southern Arc study simulated 10 allele profiles based on the allele likelihoods of the original low-coverage samples, ran these through HIrisPlex, and took the average of the outcomes. The study I linked above, though, used 1,000 simulations, and downscaled[1] samples still deviated somewhat in phenotype predictions from their parent samples. Imputation is another method that researchers have used to get workable results from low-coverage DNA, although I’ve only ever seen it used for component analysis, and the probabilistic study found that imputation isn’t effective (it is possible they just imputed suboptimally). These methods are not without their biases. Imputation, for example, can skew the results away from recessive traits including blondism, blue eyes, and very pale skin.

The most commonly used aDNA pigmentation predictions are those on Ancient Dna Explorer, a site made by Genetiker. He used to have some very odd theories about the Solutrean Hypothesis and published his own DNA pigmentation predictions. Then he kind of dipped, came back, dropped this site, and has been silent again ever since. He said on Anthrogenica that he just ran every sample through HIrisPlex-S, so I’m assuming he used the direct method and probably didn’t include ambiguous calls.
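The likelihood-sampling idea described above for the Southern Arc study can be sketched roughly as follows. This is only my own illustration in Python, not the study's code. The `toy_predictor` function is a hypothetical stand-in and is NOT the HIrisPlex-S model, the three category names are simplified, and I summarize the draws as category frequencies, which is one possible reading of "average of the outcomes".

```python
import random
from collections import Counter


def sample_genotype(likelihoods, rng):
    """Draw a genotype (0, 1 or 2 copies of an allele) in proportion
    to its likelihood at one SNP."""
    return rng.choices([0, 1, 2], weights=likelihoods, k=1)[0]


def toy_predictor(genotypes):
    """Hypothetical stand-in for a categorical predictor.

    NOT HIrisPlex-S: it just counts 'light' alleles and cuts the
    fraction into three made-up categories.
    """
    fraction_light = sum(genotypes) / (2 * len(genotypes))
    if fraction_light >= 2 / 3:
        return "pale"
    if fraction_light >= 1 / 3:
        return "intermediate"
    return "dark"


def predict_with_resampling(site_likelihoods, n_draws=10, seed=0):
    """Sample n_draws full allele profiles from per-SNP genotype
    likelihoods, predict each one, and return category frequencies."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_draws):
        profile = [sample_genotype(lk, rng) for lk in site_likelihoods]
        counts[toy_predictor(profile)] += 1
    return {category: n / n_draws for category, n in counts.items()}


if __name__ == "__main__":
    # Made-up likelihoods for six SNPs in one low-coverage sample:
    # (P(0 copies), P(1 copy), P(2 copies)) of the 'light' allele.
    low_coverage_sample = [
        (0.6, 0.3, 0.1),
        (0.2, 0.5, 0.3),
        (0.4, 0.4, 0.2),
        (0.1, 0.3, 0.6),
        (0.5, 0.4, 0.1),
        (0.3, 0.4, 0.3),
    ]
    print(predict_with_resampling(low_coverage_sample, n_draws=10))
    print(predict_with_resampling(low_coverage_sample, n_draws=1000))
```

The point of the sketch is that a low-coverage sample yields a spread of predictions and not a single call. More draws give a steadier estimate of that spread, but they cannot fix a predictor that is itself biased.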
There has been a new pigmentation predictor going around, the Andrei DNA Trait Predictor, which gives much fairer predictions for Western Hunter Gatherers, estimating them to be around as dark as modern Arabs. A lot of people seem to have tried it on themselves and say it’s accurate, but I am skeptical of it. Andrei DNA has made some... odd content in the past, to say the least. Content that would suggest he has ulterior motives.
I support the work of amateur geneticists and don’t think their work should be cast aside because of a lack of credentials, but Andrei’s trait predictor has never been tested on modern samples, let alone on an international scale. I would have liked to make this article longer. I originally intended to do some analysis myself, but it never went anywhere because of mediocre data quality. I’m not saying to disregard phenotype predictions, but I think they should be taken with a grain of salt.
[1] A high-coverage sample that has been artificially made low-coverage.