Correlation vs causation and the “associated” gene

Written by: Kostas Kampourakis


Nowadays it is common to come across media reports about scientific studies that report statistical associations between particular genes, or other sites on DNA, and particular conditions, including diseases as well as other characteristics such as intelligence or educational attainment. What happens in these cases is that particular sites on DNA are found more often among people with, for instance, high educational attainment than among people with low educational attainment. The take-home message from the media reports is often that those DNA sites are responsible for the high educational attainment, even though what scientists have actually found is only an association between the two. This is what I have called the “associated gene” concept: it refers to DNA sequences that have been found to be statistically associated with a condition or disease. Such sequences can even be single nucleotides, metaphorically single “letters” in DNA, in which case they are called single nucleotide polymorphisms, or SNPs.

Whereas media reports (and sometimes experts themselves) often represent associations as causal explanations, thus misleading non-experts, most experts know well that association does not necessarily entail causation. Insofar as we do not understand the respective biological mechanisms (developmental and physiological), associations may be informative but they are not explanatory, and so can be meaningless. However, some experts have claimed that we do not really need to understand anything about the biology underlying these associations in order to use them in practice. If we know which DNA sequences a person has, the argument goes, and we know how strong their effect on educational attainment is, we can estimate that person’s chance of doing well at school without knowing anything about the underlying biological processes.

Consider a similar argument: Each summer we can find a strong association between getting a sunburn and eating ice cream. We do not need to understand the processes that connect getting sunburns and eating ice cream. Finding an association between them can help me predict whether you will get a sunburn simply by knowing how much ice cream you eat, without knowing how eating ice cream causes sunburns. This is what these experts claim. Does it make any sense? As you of course know, eating ice cream does not cause sunburns. What connects these two conditions is exposure to sunlight and the resulting high temperatures, for instance when going on vacation to Greece during the summer. So, to make good use of any such association, we do need to understand the underlying biological processes. Many genes can be associated with many conditions, so finding an association between a gene and a condition is meaningless unless we know how the two are connected.
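The ice cream example can be made concrete with a small simulation. The sketch below is only an illustration under made-up assumptions: the numbers, the linear relationships, and the function names (simulate_summer, pearson, partial_corr) are all hypothetical and do not come from any study. Sun exposure drives both ice cream eating and sunburn, and ice cream has no effect on sunburn at all. The raw correlation between the two is nonetheless clearly positive, and it shrinks towards zero once sun exposure is taken into account.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after linearly controlling for zs."""
    rxy = pearson(xs, ys)
    rxz = pearson(xs, zs)
    ryz = pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def simulate_summer(n=5000, seed=0):
    """Hypothetical data: sun exposure causes both outcomes.

    Ice cream never appears in the sunburn equation, so by
    construction it has no causal effect on sunburn.
    """
    rng = random.Random(seed)
    sun = [rng.uniform(0, 8) for _ in range(n)]          # hours in the sun
    ice = [0.5 * s + rng.gauss(0, 1) for s in sun]       # ice cream eaten
    burn = [0.3 * s + rng.gauss(0, 1) for s in sun]      # sunburn severity
    return sun, ice, burn


sun, ice, burn = simulate_summer()
print("raw correlation, ice cream vs sunburn:", round(pearson(ice, burn), 2))
print("after controlling for sun exposure:   ",
      round(partial_corr(ice, burn, sun), 2))
```

In this toy world, ice cream consumption does help to predict sunburn, yet telling people to eat less ice cream would not prevent a single sunburn. Knowing what actually connects the two variables is what makes that difference visible.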

An association at the population level, such as “an association has been found between allele a and phenotype X”, should not be interpreted as causation at the individual level, such as “a causes X”. There is a vast difference between the two phrases. The issue here is that genes are implicated in the development of individuals, and we cannot separate their roles from those of other factors, such as environmental ones. Even though it is possible to statistically estimate the impact of genes at the population level (for instance, 80% of differences in height are due to differences in DNA), this tells us nothing on its own about what is going on at the individual level, where genes and other factors interact with one another (and so we cannot say that 80% of a person’s height is due to that person’s genes).
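The difference between the population level and the individual level can also be illustrated with a toy simulation. The sketch below rests on strong simplifying assumptions that are mine and not a claim about real biology: each person’s height is a population mean plus a “genetic” deviation plus an “environmental” deviation, the two are independent and simply add up, and the variances are chosen so that 80% of the variation comes from the genetic term. All names and numbers are hypothetical. Even in this best case for the statistic, the 80% figure describes the spread of heights across the population, and it does not carry over to any one person.

```python
import random


def variance(xs):
    """Population variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def simulate_heights(n=10000, seed=0, mean_cm=170.0):
    """Hypothetical additive model: height = mean + genetic + environment.

    The variances (39.2 and 9.8) are chosen so that the genetic term
    accounts for 80% of the total variance. Real development is not
    additive like this; the model is deliberately oversimplified.
    """
    rng = random.Random(seed)
    genetic = [rng.gauss(0, 39.2 ** 0.5) for _ in range(n)]
    environment = [rng.gauss(0, 9.8 ** 0.5) for _ in range(n)]
    heights = [mean_cm + g + e for g, e in zip(genetic, environment)]
    return genetic, environment, heights


def variance_share(genetic, heights):
    """Share of height variance associated with the genetic term."""
    return variance(genetic) / variance(heights)


def fraction_near_population_share(genetic, environment, low=0.7, high=0.9):
    """Fraction of individuals whose own deviation from the mean is
    70-90% 'genetic', i.e. for whom the population figure roughly fits."""
    hits = 0
    for g, e in zip(genetic, environment):
        total = g + e
        if total != 0 and low <= g / total <= high:
            hits += 1
    return hits / len(genetic)


genetic, environment, heights = simulate_heights()
print("population-level variance share:", round(variance_share(genetic, heights), 2))
print("individuals for whom ~80% fits: ",
      round(fraction_near_population_share(genetic, environment), 2))
```

The population-level share comes out close to 0.8 by construction, while only a minority of the simulated individuals have a personal deviation that splits anywhere near 80/20. For many of them the two terms even point in opposite directions, so a percentage split has no sensible meaning. Once genes and environments interact rather than simply add up, as they do in real development, even this toy decomposition is no longer available.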

DNA and genes are important. We just need to understand that whatever they “do” occurs within a certain context that has a local impact. Genes are important, but not the only important factors. With the exception of some rare genetic diseases where a single mutation is sufficient to bring about a diseased phenotype, most of us are not prisoners of any genetic fate.

Understanding Genes by Kostas Kampourakis

Title: Understanding Genes

Part of the series: Understanding Life

Author: Kostas Kampourakis

Paperback ISBN: 9781108812825

Hardback ISBN: 9781108835473

About the Author: Kostas Kampourakis

Kostas Kampourakis is the author and editor of books about evolution, genetics, philosophy, and history of science, and the editor of the Cambridge book series Understanding Life. He is a former Editor-in-Chief of the journal Science and Education, and two other science education book series. He is currently a researcher at the University of Geneva...
