Power and limitations of genetics as a tool to study racial differences

by Anthony Greenberg, posted on Apr 6, 2018

The recent New York Times opinion piece by the geneticist David Reich on genetics of differences between human races has generated much discussion among my scientist friends. It was followed by a rebuttal from 67 scholars of diverse backgrounds, and a sympathetic article by the blogger and essayist Andrew Sullivan. I typically stay away from this topic because, given its horrific history, it deserves very careful treatment by experts in a variety of disciplines. I am not well equipped to discuss most aspects of this, but it did strike me that a few key points are either missing in the debate or scattered across the various contributions. I will attempt to collect and lay out the most essential arguments here. I will confine myself to a fairly narrow scope, in order to not stray beyond my area of expertise. Truth be told, I am writing this to some extent in order to clarify my own thinking, but I hope it will help others as well.

In a nutshell, my concerns are that a somewhat ill-defined concept of “population” is used interchangeably with the even more nebulous, and historically fraught, concept of “race”; that too much attention is focused on small effects, while variation around these estimates is all but ignored; that within-population observations of trait heritability are used to infer the nature of among-population differences; and that the existence of a genetic component of variation in a trait is used to imply that it cannot be modified by environmental interference.

Before I go into detail on these points, let me lay a bit of groundwork. For the purposes of this discussion, a phenotype is a measurable characteristic of an individual, such as height, and I use this term interchangeably with “trait.” We are concerned with the variation of phenotypic values among individuals, and want to know how genetic variation contributes to it. Note that this is not the same as studying the genetic basis of the trait itself. For example, we can look at two people of differing height. Each had to attain their stature by developing from an egg. A multitude of genes had to be turned on or off, executing a developmental program that produced the outcomes we see in these people. However, the DNA sequence of most of these genes is identical between our pair of individuals, and thus these genes do not contribute to the variation we observe.

We next need a way to talk about genetic variation, since this is the material that can potentially contribute to phenotypic differences. Genetic variation is simply a difference of DNA sequence among genomes of individuals. To find such variation, we use any of the available technologies to determine the DNA sequence of each person we picked for the study. We then take advantage of the fact that more than 99% of the genome will be identical between any two people on average to line up the DNA sequences. Looking at this alignment, we find the positions in the genome where there are mismatches. We are interested not only in the number of such positions, but also the frequency of the alternative state (which can span one or more nucleotides). We will call such states at the same position “alleles.” For example, if out of 100 individuals two people carry one allele, while the rest carry another, we say that the minor allele frequency is 2%.
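The allele-frequency example above can be sketched in a few lines of Python. This is only an illustration under the same simplification the paragraph uses: each person is recorded as carrying a single allele at the site (real human genomes are diploid, so a real calculation would count two copies per person). The function name and the allele labels "T" and "C" are made up for the example.

```python
from collections import Counter

def minor_allele_frequency(alleles):
    """Return the frequency of the rarer of two alleles at a single site."""
    counts = Counter(alleles)
    if len(counts) != 2:
        raise ValueError("expected exactly two alleles at this site")
    return min(counts.values()) / len(alleles)

# Hypothetical sample: 2 people carry allele "T", the other 98 carry "C".
sample = ["T"] * 2 + ["C"] * 98
maf = minor_allele_frequency(sample)
print(f"minor allele frequency: {maf:.0%}")  # prints "minor allele frequency: 2%"
```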

Populations and races

Now we are equipped to discuss the issues raised by David Reich. From the start, he clearly lays out the evidence that historical notions of race are ill-defined and do not necessarily line up with “populations.” But what are these populations? The bloodless technical definition is: groups of individuals that can freely interbreed, but are at least somewhat reproductively isolated from other groups. The emphasis on mating is not accidental or prurient. Groups that do not interbreed do not exchange genetic information and thus can evolve independently from each other. Typically (but not always), population boundaries reflect some sort of geographical separation. So it is with humans. Throughout the piece, when Reich gives examples of research findings, he talks about geographical populations (such as “West Africa”). Yet, despite clearly describing the difficulty of relating the social concept of race to the geographical definition of populations, Reich still slips up and uses them interchangeably. Look at the following two paragraphs:

Beginning in 1972, genetic findings began to be incorporated into this argument. That year, the geneticist Richard Lewontin published an important study of variation in protein types in blood. He grouped the human populations he analyzed into seven “races” – West Eurasians, Africans, East Asians, South Asians, Native Americans, Oceanians and Australians – and found that around 85 percent of variation in the protein types could be accounted for by variation within populations and “races,” and only 15 percent by variation across them. To the extent that there was variation among humans, he concluded, most of it was because of “differences between individuals.” In this way, a consensus was established that among human populations there are no differences large enough to support the concept of “biological race.” Instead, it was argued, race is a “social construct,” a way of categorizing people that changes over time and across countries.

Note the transition from Lewontin’s study of geographical populations to the description of the consensus opinion on “biological race.” A casual reader (as most readers undoubtedly are) would be forgiven for assuming the two terms are interchangeable. This mingling of concepts led to the rebuke by the group of 67 researchers voiced in the BuzzFeed piece. The rebuttal justifiably complains that Reich does not draw a sharp enough line between geographical populations and the historically defined races. However, it then appears to deny any population differences whatsoever. The most offensive paragraph is:

Given random variation, you could genotype all Red Sox fans and all Yankees fans and find that one group has a statistically significant higher frequency of a number of particular genetic variants than the other group – perhaps even the same sort of variation that Reich found for the prostate cancer–related genes he studied. This does not mean that Red Sox fans and Yankees fans are genetically distinct races (though many might try to tell you they are).

It is absolutely true that, given enough genetic variants, any random partition of a group of humans into sub-groups will result in some of these variants differing in frequency between the sets of individuals. But every population geneticist knows that, and robust methods that are theoretically well motivated and extensively tested in practice exist to correct for these random differences. Claiming otherwise betrays a lack of understanding of even the basic methods used to analyze genetic data.
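To make the first point concrete, here is a small simulation of an arbitrary split of one homogeneous group. It is a toy sketch, not a reconstruction of any published analysis: the sample sizes, the one-allele-per-person simplification, and the two-proportion test are all my assumptions. I use a Bonferroni threshold as the correction only because it is the simplest one to write down; it is a stand-in, and the methods actually used to account for population structure are more sophisticated.

```python
import math
import random

def two_group_p_value(count_a, n_a, count_b, n_b):
    """Two-sided p-value for an allele-frequency difference (normal approximation)."""
    pooled = (count_a + count_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # the site does not vary in this sample
    z = (count_a / n_a - count_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))

def random_partition_p_values(n_people=400, n_variants=1000, seed=1):
    """Split one homogeneous group at random and test every variant."""
    rng = random.Random(seed)
    people = list(range(n_people))
    rng.shuffle(people)  # an arbitrary split, like fans of two baseball teams
    half = n_people // 2
    group_a, group_b = people[:half], people[half:]
    p_values = []
    for _ in range(n_variants):
        freq = rng.uniform(0.1, 0.9)  # same true frequency for everyone
        carries = [rng.random() < freq for _ in range(n_people)]
        count_a = sum(carries[i] for i in group_a)
        count_b = sum(carries[i] for i in group_b)
        p_values.append(
            two_group_p_value(count_a, len(group_a), count_b, len(group_b))
        )
    return p_values

p_values = random_partition_p_values()
nominal = sum(p < 0.05 for p in p_values)
corrected = sum(p < 0.05 / len(p_values) for p in p_values)
print(f"variants 'significant' at p < 0.05: {nominal} of {len(p_values)}")
print(f"variants passing the Bonferroni threshold: {corrected}")
```

With no true difference between the groups, about one variant in twenty should clear the uncorrected threshold by chance alone, and essentially none should survive the correction.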

To recap, there are robustly identifiable differences in allele frequencies among geographically defined human populations, beyond what we would expect by chance. That said, these populations are clearly not the same as “races,” a term with a horrific history of abuse and unclear definition. The study of the relationship, if any, between the two concepts requires cooperation at a minimum between geneticists, anthropologists, and sociologists. Although I am by no means an expert in the latter two areas, it seems fairly clear that races are at most poor proxies for geographical populations and thus not particularly useful for any purely genetic research program.

Small differences, large variation

If geographical populations are entities that we can discuss from a genetic perspective, why did I say in the beginning that they are “ill-defined”? If you go back to the definition I presented above, you will note that it is quantitative. There is more interbreeding within than among populations. But what level should we set as a boundary? In practice, if we are using genetic information to estimate the number of populations in our data set, we look for evidence of allele frequency mismatch between groups. If we find such evidence, beyond what we would expect by chance, we declare that the groups are “populations.” As we get more and more data, adding individuals and DNA sequence variants, our power to detect subtle population structure increases. Lewontin, in his 1972 study mentioned in Reich’s article, had data for 14 markers and a range of a dozen to 100 individuals, depending on the marker. His estimate was that 15% of variation in allele frequency among people from different continents was explainable by their geographic origin. A newer examination, published in 2012 using three million markers and 602 individuals, came up with a remarkably similar estimate (12%). But while these additional data do not seem to overturn the older results at continental scale, we are now able to make finer geographical distinctions, even on the scale of countries within Europe.

Despite the greater statistical confidence that we can attach to these fine distinctions, the magnitudes of differences are still tiny. Only about 1% of genetic variation is attributable to differences among populations within a continent. There are two ways to understand what this means. If we take two people from the same population on a continent and compare their DNA sequences, they will be roughly 12% more similar to each other than if we pick pairs from different continents, but only 1% more similar than two people from distinct populations on a continent. Another way to look at this is to try to predict a person’s genetic composition. If we know nothing about the individual other than the fact that they are human, we can go through each site in their genome that is variable in the whole human population and assign the major allele at that site as our predicted value. Because most humans deviate from the mean, we will often be wrong in our prediction. If we additionally know what continent the person is from, we can increase our accuracy roughly by 34% (square root of 12%). If we further know the population, that information only gives us a 10% advantage (square root of 1%). The upshot is that well over half of the difference between people coming from different continents is due to individual, not geographic, factors. Once again, we are not talking about racial groups here. Since the alignment between race and population is at best imperfect, the explanatory power of race is even less.
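The square-root arithmetic in the paragraph above is easy to check. The snippet below only reproduces that arithmetic, using the 12% and 1% figures quoted in the text; the function name is mine, and the rule that the gain in prediction scales as the square root of the variance fraction is taken from the text rather than derived here.

```python
import math

def prediction_gain(variance_fraction):
    """Gain in prediction accuracy, taken as the square root of the variance fraction."""
    return math.sqrt(variance_fraction)

among_continents = 0.12         # share of genetic variation among continents
within_continent = 0.01         # share among populations within a continent

print(f"knowing the continent:  {prediction_gain(among_continents):.1%}")   # 34.6%
print(f"knowing the population: {prediction_gain(within_continent):.1%}")   # 10.0%
print(f"variation left among individuals: {1 - among_continents:.0%}")      # 88%
```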

Finally, let me point out that up to now we did not even mention any phenotypes. We have been dealing only with differences in DNA sequence. Reich moves quickly between discussing allele frequencies and phenotypes, in my opinion not doing enough to direct the reader’s attention to the transition. The extent to which variation in phenotypes among individuals is explained by the variation in their genetic makeup is called heritability. The statistical machinery used to calculate it is the same as the one used for estimating what fraction of genetic variation is due to geographical factors, but the biological meaning is obviously different. Heritability depends on many factors, and can be vastly different among phenotypes. Thus, when Reich writes that “The ancestors of East Asians, Europeans, West Africans and Australians were, until recently, almost completely isolated from one another for 40,000 years or longer, which is more than sufficient time for the forces of evolution to work,” he is talking about the divergence of genotype frequencies. Much difficult work is still required to establish what, if any, fraction of this genetic differentiation has phenotypic consequences, and how the genetics interact with the distinct environments found at these locations.

Heritability of traits and population differences

The crucial thing to understand about heritability is that it can be estimated only within populations. While in theory DNA variation may contribute to between-population differences in phenotype, we cannot directly observe these effects. We instead rely on statistical models to infer the genetic contribution to phenotypic variation. Random mating within the sample is a crucial assumption in these models. While they depend on the data set structure and the kind of data available (most notably, do we have genome sequence or are we relying on similarities between relatives?), essentially these models amount to estimating correlations between genotypes and phenotypes. As most people know, correlation is not causation. To convince yourself of this you can play with the data sets in Google Correlate. Since we know from biology that information flows exclusively from genotype to phenotype, we are not concerned about misinterpreting the direction of causality from the correlations between genotypes and phenotypes. But spurious correlations can also arise if the things we measure are both influenced by an unobserved factor. In our case, population structure is one of the most problematic confounding factors in heritability estimates. Population differences in phenotypes can be due to local environmental effects, while genotype differences can be due to historical contingencies that we have no way of re-tracing. More subtle, and harder to control for, problems such as sampling bias can also occur.
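A toy simulation shows how population structure can manufacture a genotype-phenotype correlation out of nothing. Everything in it is an assumption chosen for illustration: two populations, one variant whose frequency differs between them (0.2 versus 0.8), and a phenotype that depends only on a local environmental effect plus noise. By construction, the variant has no effect on the phenotype at all.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def simulate_population(rng, n, allele_freq, env_mean):
    """Genotype = copies of the allele (0, 1 or 2); phenotype ignores genotype."""
    genotypes = [
        sum(rng.random() < allele_freq for _ in range(2)) for _ in range(n)
    ]
    phenotypes = [rng.gauss(env_mean, 1.0) for _ in range(n)]
    return genotypes, phenotypes

rng = random.Random(42)
geno_a, pheno_a = simulate_population(rng, 2000, allele_freq=0.2, env_mean=0.0)
geno_b, pheno_b = simulate_population(rng, 2000, allele_freq=0.8, env_mean=2.0)

r_a = pearson(geno_a, pheno_a)
r_b = pearson(geno_b, pheno_b)
r_pooled = pearson(geno_a + geno_b, pheno_a + pheno_b)

print(f"within population A: r = {r_a:+.2f}")
print(f"within population B: r = {r_b:+.2f}")
print(f"pooled, ignoring structure: r = {r_pooled:+.2f}")
```

Within each population the correlation hovers around zero, as it should. Pooling the two samples without accounting for their origin produces a sizeable correlation that reflects nothing but the coincidence of an allele-frequency difference with an environmental one.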

Given this background, it is easy to see that heritability of a trait within populations says nothing about the genetic basis of between-population differences. This is the glaring error made by Sullivan in his article:

… genetics have a significant part to play (heritability ranges from 0.4 to 0.8) in explaining different racial outcomes in intelligence tests…

David Reich of course knows better, but even he makes a more subtle form of this mistake. After discussing the studies of the genetics of educational attainment and IQ (at least this is my best guess at the papers he mentioned), he writes

Is performance on an intelligence test or the number of years of school a person attends shaped by the way a person is brought up? Of course. But does it measure something having to do with some aspect of behavior or cognition? Almost certainly. And since all traits influenced by genetics are expected to differ across populations (because the frequencies of genetic variations are rarely exactly the same across populations), the genetic influences on behavior and cognition will differ across populations, too.

But the studies he cites were all conducted within populations. Furthermore, as is typical when estimating genome-wide associations between phenotypes and genotypes, the authors explicitly control for population structure within their samples. Non-genetic differences between populations can work in the opposite direction of the genetic variants identified within populations. Worse, the relationship can be non-linear, with the magnitude and direction of the genetic effect itself depending on the environment. Let me say this again, the magnitude of within-population heritability says absolutely nothing about the genetics of between-population differences.
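The same point can be seen in a deliberately artificial simulation. Here two populations are given identical distributions of genetic values, the trait is highly heritable within each, and the only thing that differs between them is a uniform environmental shift. All numbers are assumptions picked for illustration, and the genetic values are directly observable only because this is a simulation; in real data they are not.

```python
import random
import statistics

def simulate(rng, n, env_shift, genetic_sd=2.0, env_sd=1.0):
    """Phenotype = genetic value + environmental value (shifted by env_shift)."""
    genetic = [rng.gauss(0.0, genetic_sd) for _ in range(n)]
    phenotype = [g + rng.gauss(env_shift, env_sd) for g in genetic]
    return genetic, phenotype

def heritability(genetic, phenotype):
    """Share of phenotypic variance that is due to genetic variance."""
    return statistics.pvariance(genetic) / statistics.pvariance(phenotype)

rng = random.Random(7)
gen_1, phen_1 = simulate(rng, 5000, env_shift=0.0)
gen_2, phen_2 = simulate(rng, 5000, env_shift=3.0)  # same genetics, other environment

h2_1 = heritability(gen_1, phen_1)
h2_2 = heritability(gen_2, phen_2)
phenotype_gap = statistics.fmean(phen_2) - statistics.fmean(phen_1)
genetic_gap = statistics.fmean(gen_2) - statistics.fmean(gen_1)

print(f"heritability within population 1: {h2_1:.2f}")
print(f"heritability within population 2: {h2_2:.2f}")
print(f"difference in mean phenotype:     {phenotype_gap:.2f}")
print(f"difference in mean genetic value: {genetic_gap:.2f}")
```

Both within-population heritabilities come out near 0.8, yet the gap in mean phenotype is entirely environmental. Nothing in the within-population estimates reveals that.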

I am not saying that there are no meaningful genetically-influenced phenotypic differences between human populations. I am saying that our research on trait heritability and genome-wide associations does not help us find such differences and does not allow us to form any expectations as to what the eventual answers will look like. I believe this is true even for model systems and well-studied agricultural species, and remains an area of active research.

While we are on the subject of genome-wide associations, let me loop back to what I said before about effect magnitudes. As with the power to detect allele frequency differences between populations, our ability to detect genotype-phenotype associations grows with sample size. The studies mentioned above have very large samples, and so are able to measure very subtle effects. But tiny effects are all they find. Okbay and colleagues, for example, find that the variants they identify explain between 0.01 and 0.035% (yes, percent) of the variance. Knowing an individual’s genotype at these positions in the genome yields virtually no useful information if we want to predict that person’s educational attainment. As a fellow geneticist, I understand why David Reich is excited about these results. There are reasons other than phenotype prediction to go after such effects, although it is appropriate to question whether genome-wide associations like these are cost-effective given limited research budgets. But when we communicate these results to the public we must make an effort to step back and put them in proper perspective, and realize with a measure of humility that genetics is not all there is to biology.

Directional environmental intervention and genetics

So far, I have argued that even though we can confidently identify even subtle differences in genotype frequencies among, and associations of DNA variants with phenotypes within, populations, we still have no idea what, if anything, the genetic variation contributes to among-population phenotypic differentiation. Nevertheless it is possible, although not very likely, that genetic variation will turn out to drive most population differences even for moderately heritable phenotypes such as IQ. Does this mean that we can do nothing to mitigate these differences with environmental interventions? The answer is unequivocally “no.” David Reich is very emphatic on this point. The 67 signatories do not appear to accept that populations are in any way genetically differentiated. I wonder if they are driven to this untenable position by the misconception that a finding of a genetic basis of such differences would mean that nothing can be done about them, a possibility they would find catastrophic. Andrew Sullivan comes at this from the opposite perspective. While he does not quite argue that nothing can, and therefore should, be done about group disparities in IQ because of their genetic nature, he does write

It’s both undeniable to me that much human progress has occurred, especially on race, gender, and sexual orientation; and yet I’m suspicious of the idea that our core nature can be remade or denied.

Charles Murray, the author of The Bell Curve, and Sam Harris, in their conversation recorded about a year ago, go much further and flatly say that there is nothing we can do about traits, IQ in particular, that are “moderately heritable,” i.e. about 50%.

To see why this is completely wrong, we can look at men’s height measurements. This is a highly heritable trait, with 89 to 93% of variation explained by genetic factors. Nevertheless, average male height has been steadily increasing over the past 150 years. Most, if not all, of this increase is likely due to changes in environmental factors such as nutrition. Perhaps an even more striking example is provided by the inborn errors of metabolism, diseases that are caused by mutations in genes required in metabolic pathways. These are not only close to 100% heritable, but also caused by single mutations. This is as genetic as we can get, yet many of these disorders can be managed and sometimes completely cured by dietary modifications.

Given the dark history of bigotry and xenophobia, discussions of race cannot be viewed as purely intellectual exercises. The influence of these debates on public policy and attitudes affects the lives of millions of people. Geneticists have an important role to play in the discussions, and I am happy to see a prominent scientist like David Reich calmly and very ably engaging the public on this topic. However, as a close look at his contribution reveals, it is extremely hard to present a clear, yet nuanced view of the genetics of racial differences. We have to work extra hard to choose our words appropriately and always be clear what aspect of this tangled problem we are talking about. Unconscious professional biases also creep in. The only remedy is a group effort, with constructive criticism and engagement with scholars from other disciplines and the public. Those who are bent on misconstruing genetics to rationalize bad acts are probably mostly unreachable, but the majority of the public will likely respond reasonably given correct information. Well-meaning policy makers who are worried that any discovery of a genetic basis for socially important traits like educational attainment or IQ would render their efforts moot should be reassured that genes are not destiny, even when they play important roles in trait expression.