10

It is widely known that correlation does not necessarily imply causation. However, once we recognize that correlation alone doesn't prove causation, what needs to happen in order to evaluate whether causation exists? What's the missing ingredient that makes some subset of correlations indicative of actual causation?

The following videos discuss the issue, mostly about why you can't infer causation from correlation, but they also touch on what's missing:

Both videos state the basic idea that in order to infer causation from correlation, you need to know the causal mechanism. You have to know how the one item caused the other.

This strikes me as incorrect. I would say that in order to distinguish spurious correlations from causal ones, you have to investigate which hypotheses generate accurate predictions. I can postulate seemingly plausible mechanisms for many spurious correlations, and I don't need to know the underlying mechanism to infer the existence of a correlation. The only thing that really matters in establishing a hypothesis is the track record of predictions.

Am I on base here? Or is my understanding of the scientific method flawed?

Joseph Weissman
Winston Ewert
  • I was going to ask this too! Something like: Given that the scientific method can only ever demonstrate a repeatable experiment, isn't it the case that all you can ever do is establish a correlation between events x and y? I'd suggest that it boils down to the interactions at a molecular level, which are still only observed. I guess my question would have been very similar but would have ended up asking whether causality is unprovable, just demonstrable statistically. – user2808054 Feb 19 '15 at 09:51

8 Answers

5

Causation has been an open question in philosophy at least as far back as Hume's problematization of causation in An Enquiry concerning Human Understanding. As such, there's no one universally accepted, uncontroversial understanding of causation in the philosophical world.

The specific problem with your proposal is that there can easily be multiple theories that do match a given set of data, including ones that call A a cause of B, when in reality A and B are independent consequences of C. This problem is sharpened in the case where you legitimize attribution of causation without explanation of the underlying mechanism.
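To make the common-cause point concrete, here is a minimal simulation sketch. The linear model, the noise levels, and the names `a`, `b`, `c` are all assumptions chosen for illustration, not anything taken from real data. A and B are each generated from C plus independent noise, so neither influences the other, yet they come out clearly correlated; once C's contribution is subtracted, the correlation should be close to zero.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000

# Hypothetical model: C drives both A and B; A and B never influence each other.
c = [rng.gauss(0, 1) for _ in range(n)]
a = [ci + rng.gauss(0, 1) for ci in c]
b = [ci + rng.gauss(0, 1) for ci in c]

r_ab = pearson(a, b)

# Subtract C's contribution (its coefficient is 1 by construction) and re-check.
a_rest = [ai - ci for ai, ci in zip(a, c)]
b_rest = [bi - ci for bi, ci in zip(b, c)]
r_given_c = pearson(a_rest, b_rest)

print(f"corr(A, B)                = {r_ab:.3f}")
print(f"corr(A, B) with C removed = {r_given_c:.3f}")
```

A theory saying "A causes B" and a theory saying "C causes both" would predict the same observed correlation here, which is why the track record of predictions alone does not separate them.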

Chris Sunami
5

One can have quantitative predictions only if a hypothesis is well formulated within a background theory, which de facto makes the hypothesis some sort of "mechanism". Also note that the track record is not the only thing that matters in establishing a hypothesis. There is also compatibility with other accepted theories, with common knowledge and auxiliary assumptions, and with methodological principles, as well as the simplicity of the hypothesis, etc. For these reasons, a hypothetical causal relation between two types of phenomena will generally rely on (or take the form of) a hypothetical mechanism, framed in a background theory, which explains and predicts the observed correlation.

As noted in another answer, different hypotheses (one in which you have a spurious correlation between two effects of a common cause, another with a direct causal relation) can make the same predictions. A track record won't be sufficient in this case.

One way to distinguish causation from spurious correlation is through interventions. If changing a factor produces an observable change, then you have a causal relation. One can also vary the experimental setup to force two hypotheses to make different predictions in certain circumstances, which is also a form of intervention.
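As a rough illustration of the intervention idea, the sketch below compares two made-up worlds: one where A directly causes B, and one where a hidden factor C causes both. The function `sample`, the world labels, and the linear Gaussian model are assumptions invented for this example. Passive observation shows a correlation in both worlds, but when A is set from outside (ignoring its usual causes), the correlation should survive only in the world where A really does cause B.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def sample(world, n, rng, intervene):
    """Draw n (A, B) pairs from a toy world, optionally forcing A from outside."""
    a_vals, b_vals = [], []
    for _ in range(n):
        c = rng.gauss(0, 1)  # hidden factor; only matters in "common_cause"
        if intervene:
            a = rng.gauss(0, 1)      # experimenter sets A, cutting its usual causes
        elif world == "common_cause":
            a = c + rng.gauss(0, 1)  # A is an effect of C
        else:
            a = rng.gauss(0, 1)      # "direct": A has no modeled cause
        if world == "direct":
            b = a + rng.gauss(0, 1)  # B is an effect of A
        else:
            b = c + rng.gauss(0, 1)  # B is an effect of C only
        a_vals.append(a)
        b_vals.append(b)
    return a_vals, b_vals


rng = random.Random(1)  # fixed seed so the run is repeatable
results = {}
for world in ("direct", "common_cause"):
    for intervene in (False, True):
        a_vals, b_vals = sample(world, 20000, rng, intervene)
        results[(world, intervene)] = pearson(a_vals, b_vals)
        label = "intervened" if intervene else "observed"
        print(f"{world:13s} {label:10s} corr(A, B) = {results[(world, intervene)]:.3f}")
```

The two worlds make the same kind of prediction under observation and different predictions under intervention, which is the sense in which an intervention forces rival hypotheses apart.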

You can read this for more ways to distinguish correlation and causation: http://people.hss.caltech.edu/~fde/papers/Eberhardt_PhilComp.pdf

Quentin Ruyant
3

The short answer to how you demonstrate causation is that you don't. Any explanation covers a lot of stuff you don't and can't observe (e.g. - what's happening in the core of the sun). There are also more basic logical reasons why you can't demonstrate causation, or anything else. To create knowledge, you make guesses about what is happening in reality, and then you test the guesses by critical argument and experiment. Guesses that survive tests may be true.

If your theory predicts some particular correlation between X and Y, and you find the correlation, then your theory has passed that test and it may not be false. That's all a correlation can tell you.

You say you don't need to know the underlying mechanism, but you're wrong. For a start, if you don't know the underlying mechanism, you can't tell what will count as a test. For example, if I want to trap atoms in a magnetic field, then unless I know how the magnetic field traps the atoms, I won't know if I've set it up correctly, how long I should wait for the atoms to be trapped and stuff like that. Another problem is that if you don't know the underlying mechanism it is difficult to test whether an apparent failure is a result of the theory being wrong. If you know the mechanism you can test whether your experiment is working properly by doing related experiments that test different parts of the experiment, e.g. - look at currents in magnetic field producing coils.

alanf
1

Generating accurate predictions is a basic demand on scientific hypotheses, but it has no specific relation to causation. It therefore does not bear on the question of the gap between correlation and causation.

It isn't clear whether an "underlying mechanism" is the best way to fill the gap between correlation and causation. What could such a mechanism be? For one thing, it cannot be a causal mechanism, for this would just push the question back instead of answering it. Anyway, the consideration of accurate prediction is unrelated to this issue.

Ram Tobolski
  • I don't really understand. Isn't the point of a scientific hypothesis to postulate causation? – Winston Ewert Feb 19 '15 at 02:44
  • @WinstonEwert Newton faced this issue. His theory of gravity gave a mathematical description of how we observed bodies falling; but he supplied no underlying mechanism. He was famously challenged on this point and said: "I frame no hypotheses." The entire quote is well worth reading. http://en.wikipedia.org/wiki/Hypotheses_non_fingo Newton understood what so many people still don't understand today: that science is ultimately descriptive and not necessarily explanatory. Even if we have an "explanation" we never have an ultimate explanation. In the end, science can only describe. – user4894 Feb 19 '15 at 03:01
  • @user4894, it seems to me that Newton was postulating a causative relationship between massive bodies and falling. He didn't know anything about the causal mechanism, and wisely refused to guess at it. But he was still able to infer the same sort of cause and effect relationship existed. – Winston Ewert Feb 19 '15 at 03:07
  • @user4894 Newton's statement that he "frames no hypothesis" is widely recognized to be false. For example, the principle of inertia is not falsifiable since no inertial system exists in reality, and masses and forces are free parameters that he could not deduce from pure observation without hypotheses about the form of his laws. Newton's mechanics is explanatory. – Quentin Ruyant Feb 19 '15 at 06:28
  • @user4894 the project of reducing scientific theories to pure phenomenal descriptions was pursued by the logical empiricists, and notoriously failed. Importantly, one aspect of this failure is the fact that physical properties are generally dispositional (i.e. causal), and dispositions do not reduce to regularity descriptions. – Quentin Ruyant Feb 19 '15 at 06:33
  • @WinstonEwert Science is a marginal issue here. What is really at issue is to distinguish the metaphysics of causes from epistemology. Correlation belongs to the metaphysics of causes. A correlation exists regardless of whether we know about it, or not. Prediction belongs to epistemology. To bring predictions to the question of the nature of causes would be a category mistake. – Ram Tobolski Feb 19 '15 at 18:30
  • @RamTobolski, my question wasn't about the nature of causes. My question was about the scientific method and inferring the existence of causative relationships. – Winston Ewert Feb 20 '15 at 01:51
  • @WinstonEwert It comes down to the same thing. Correlation belongs to "what we infer". Prediction belongs to "how we infer". They are not on the same level. They should not be mixed. – Ram Tobolski Feb 20 '15 at 04:10
  • @Ram Tobolski I don't agree. Correlations are observed, not inferred. Causation is inferred (or not) on the basis of correlations. Also regarding your answer: of course the underlying mechanism will be causal, that's precisely the point: inferring a causal mechanism from observed correlations. – Quentin Ruyant Feb 20 '15 at 10:36
  • @quen_tin 1 It doesn't matter for my point whether correlation is observed or inferred. What does matter is that correlation is a "what", while prediction is a "how". Correlation is inferred (or observed), while prediction is an inferring. 2 If the criterion for a causal connection is a causal mechanism, it will lead to an infinite regress. Because each causal step in the mechanism will require an additional causal mechanism. – Ram Tobolski Feb 20 '15 at 13:15
  • @Ram Tobolski no, the regression will stop at the fundamental level, where a background theory simply assumes causal relations. In any case the question is how to infer causation, so I don't see why predictions would be irrelevant. – Quentin Ruyant Feb 20 '15 at 15:33
  • @RamTobolski, my question was about how we infer causation. It seems to me that you are trying to answer a question about what causation is. It simply isn't the question I asked. – Winston Ewert Feb 20 '15 at 15:37
  • @quen_tin The background theory cannot assume causal relations, or the regress won't stop. – Ram Tobolski Feb 20 '15 at 16:03
  • @WinstonEwert The question "how to infer X" leads naturally to the question "what is X". – Ram Tobolski Feb 20 '15 at 16:08
  • @Ram Tobolski what regress? There are primitive causal relations postulated by the theory. One who accepts the theory accepts them as primitive. Other causal relations must be rooted in them through a mechanism. – Quentin Ruyant Feb 20 '15 at 18:42
  • @quen_tin To make the issue a bit less abstract, can you give an example of such a background theory, and of some primitive causal relations which the theory postulates? – Ram Tobolski Feb 20 '15 at 19:15
  • A correlation between a virus and a given symptom explained through a genetic mechanism (the base theory is biology). Or you can find examples in everyday life: each time you open a door, a window opens on the other side of the room (the base theory is fluid mechanics). – Quentin Ruyant Feb 21 '15 at 00:09
  • @quen_tin Thanks. What would be, in these examples, the "primitive causal relations which the theory postulates"? – Ram Tobolski Feb 21 '15 at 10:44
  • In the case of fluid mechanics, interactions between molecules idealized as point particles. In the case of biology, it could be how certain genes produce certain chemical compounds, perhaps. – Quentin Ruyant Feb 21 '15 at 13:49
  • @quen_tin I don't think that particle interactions are causal relations. For one thing, an interaction is mutual, bidirectional. There is no inherent separation to "cause" and "effect" in a particle interaction. So I don't think that particle theories "postulate" causal relations. The term "cause" is external to the theory. It is not postulated by it. – Ram Tobolski Feb 21 '15 at 19:10
  • Ok, I agree it's an issue, but that's not really the point, I think. You can interpret theories differently, but if you accept causal interpretations (which most people implicitly do; causal talk is pervasive in science) then the way you go from correlation to causation is through a mechanistic explanation in terms of a base theory. – Quentin Ruyant Feb 22 '15 at 11:03
  • In classical physics, a causal interpretation is not precluded at all: just say that the state of a system at t causes its state at t+dt through the action of newtonian forces. – Quentin Ruyant Feb 22 '15 at 11:06
  • It's even more true in biology where terms are often defined by their causal profile (a gene codes for a protein, an enzyme causes such reaction...) – Quentin Ruyant Feb 22 '15 at 11:15
  • @quen_tin 1 The relation between two momentary states of a system, if it is a causal relation at all, is entirely generic. It does not depend on the details of the theory. It is not, then, a theory-specific relation. It cannot be said that such a relation is postulated by the theory. 2 Concerning causal interpretations: Given the theory, the causal interpretations are pretty much straightforward. The scientists don't have freedom in the decision, which relations are causal. It is not given to them to postulate which relations are causal. – Ram Tobolski Feb 22 '15 at 18:41
  • I don't know what you mean by entirely generic. The relations between two physical states at two instants depend on the laws of nature postulated by the theory. Postulating evolution laws amounts to postulating causal relations, or to endowing physical properties with causal powers (dispositions). It's straightforward indeed, but what is the problem? – Quentin Ruyant Feb 22 '15 at 22:24
  • In any case I can assure you that there are sound causal or dispositional interpretations of physics and mechanistic accounts of higher level causation in contemporary philosophy (as well as many who, like you, disagree but do not deny that such positions exist) and I have no interest in pursuing the discussion any further. You can have a look at the Stanford Encyclopedia of Philosophy entries on causation, dispositions or laws of nature if interested. – Quentin Ruyant Feb 22 '15 at 22:25
  • @quen_tin Thank you for the discussion then, and for the recommendations. – Ram Tobolski Feb 22 '15 at 23:51
  • @Tobolski: If they have no freedom in determining which relationships are causal, then what is it that they are doing when they do? – Mozibur Ullah Feb 24 '15 at 11:47
  • @Mozibur They are doing science. This includes developing theories with terms like mass, gravity, energy. The term 'cause' itself is not scientific, but comes to science from without. – Ram Tobolski Feb 24 '15 at 18:41
  • @tobolski: doesn't this, to some extent, depend on where one draws the line in science? After all, Aristotle wrote in a book called the Physics that actions cause reactions at a point of contact, which, I expect, after a long historical genesis and suitable modifications, became one of Newton's laws. – Mozibur Ullah Feb 27 '15 at 10:17
  • Having said that, I doubt many working physicists are aware of this genealogy, and if they were, most wouldn't care; to that extent, which is quite a large extent, I think you are right. – Mozibur Ullah Feb 27 '15 at 10:21
  • @Mozibur thanks. And even Aristotle's Physics is not so much empirical science as an a priori foundation for science. – Ram Tobolski Feb 27 '15 at 12:44
1

I think that in a purely theoretical world, the only time correlation could imply causation is if you invoke ceteris paribus to the max. Ceteris paribus means holding all other variables equal. If every single variable apart from your dependent and independent variables is kept identical, I think it would be a fair assumption that correlation implies causation.

The scientific method is not perfect, because it 'affirms the consequent'.

Consider the example of Hooke's law: our hypothesis is that extension is proportional to the attached load. Experimentally, we see that this is the case. So we assume our hypothesis is correct.

We are basically saying:

If P, then Q

Q occurs, therefore P occurs

This is flawed logic. What if something else caused Q?

Take a clearly false example:

If it rains, the grass is wet

The grass is wet, therefore it has rained

What if I was watering the garden?
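The rain example can also be checked mechanically. The sketch below is a small truth-table checker; the helper names `implies` and `is_valid` are made up for this illustration. An argument form counts as valid only if no assignment of truth values makes every premise true and the conclusion false.

```python
from itertools import product


def implies(p, q):
    """Material conditional: 'if P then Q' is false only when P is true and Q is false."""
    return (not p) or q


def is_valid(premises, conclusion):
    """Return True if no truth assignment makes all premises true and the conclusion false."""
    for p, q in product([True, False], repeat=2):
        if all(premise(p, q) for premise in premises) and not conclusion(p, q):
            return False  # found a counterexample
    return True


# Affirming the consequent: if P then Q; Q; therefore P.
affirming_consequent = is_valid([implies, lambda p, q: q], lambda p, q: p)

# Modus ponens: if P then Q; P; therefore Q.
modus_ponens = is_valid([implies, lambda p, q: p], lambda p, q: q)

# Modus tollens: if P then Q; not Q; therefore not P.
modus_tollens = is_valid([implies, lambda p, q: not q], lambda p, q: not p)

print("affirming the consequent valid?", affirming_consequent)
print("modus ponens valid?", modus_ponens)
print("modus tollens valid?", modus_tollens)
```

The counterexample the checker finds for affirming the consequent is P false and Q true, which is exactly the watered garden: the grass is wet although it has not rained. Modus tollens, by contrast, is valid, so a failed prediction does count against a hypothesis.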

Hope this helps.

lagrange103
0

You are absolutely right.

As Hume's famous "problem of induction" suggests, the difference between correlation and causation may be purely numerical. Attributions of "causation" simply assume that the "past resembles the present" and "the future resembles the present," and that what happened "before" is our best Bayesian prediction of what will happen in the future.

That really is our best scientific consensus. Each "guess" gets stronger the longer it survives and thus becomes a surviving "prediction." We predict the sun will rise tomorrow, because it has risen for millions of tomorrows. We have no better application to certainty. Certainty is only a reduction of possibility.

However, this is why modern "causation" is, in many people's view, a very suspect concept, and Aristotelean causation may be making a comeback, I really don't know. We now see ideas like "overdetermination" or "strange attractors" or "statistical states" or various other mediums of change introduced into even physical science in the place of old "billiard-ball" causation.

It might be better now to say that "causation" is a subset of "correlations." As you suggest. What limits it to this subset would seem to be that it happened "before" the effect. Before and after. Which leads us, unwillingly and miserably, into theories of time, which are the most difficult and unsettled regions of philosophy and science.

Nelson Alexander
0

We recently discussed a kindred but different question: Is the idea of a causal chain physical (or even scientific)?

"The only thing that really matters in establishing a hypothesis is the track record of predictions." - OP

The Ptolemaic system of predicting planetary positions, with tables refined over generations, initially gave better predictions than the heliocentric model. But consider how much more parsimonious the heliocentric model is, giving a clear insight into why retrograde motion occurs, rather than only predicting it.

I would follow Nancy Cartwright's approach in How the Laws of Physics Lie, that our theories rely on making correct abstractions, which can then have necessary relations, but are only true if the abstractions are correct. Say Newton's law of gravity, which he proved geometrically, follows from a set of axiomatic statements. But one of them was not properly recognised: the assumption that space is flat (parallel lines never meet), which no one understood how to challenge until later.

Science, the domain of observation rather than proof, is always tentative. Correlations always face the Problem of Induction. But we can say with confidence that if nothing has changed, the predicted outcome will occur, and that if it doesn't, something has changed: the explicit or implicit assumptions and abstractions used to build the model are not valid.

Hypothesis generation is a challenge for a simplistic understanding of science and how it works, because there simply is no universal rule, or set of rules, for making a hypothesis. It requires an act of creativity and imagination, which do not belong to the realm of the mechanistic or the purely logically inferable. In fact, doing mathematics well also requires creativity and imagination: the system may follow necessarily, but there will be many equivalent formalisations, like different descriptions of the same object, and these can be more or less skillful (Gödel's theorems are a great example of clarifying the consequences of axioms).

In physics and other sciences, the world behaves as it does regardless of our theories. But a good hypothesis points to the easiest possible ways to differentiate between sets of assumptions, not just between theories but between paradigms, ideally even shedding light on the foundational assumptions we hold about the world, like space being flat.

Popper rejected the idea of a logic of discovery, suggesting instead something more like a fitness landscape and evolutionary selection of hypotheses. That is fine where there is continuity in the 'discovery space'. But some anomalies require creating a whole new space of ideas, an entirely new kind of idea that has not occurred before.

CriglCragl
0
  • Is correlation sufficient to establish that X causes Y? ---- No!
  • Is correlation necessary to establish that X causes Y? ----- Yes!

If correlation is not a sufficient condition for causation, then what is? Demonstrating causation is proving the existence of a cause (X) and effect (Y) relationship (causality); demonstrating the causative link, its mechanisms, and its direction.

Does correlation imply causation?

  • Is a perfect correlation between X and Y ever sufficient to demonstrate causation? What if the coefficient of determination (a measure of how well the correlated data fit) is 100%?
  • Does a perfect correlation even count as evidence (not proof) towards establishing causation?

How can one demonstrate a cause and effect relationship between X and Y? "X causes Y" means that X is the cause (event) and Y is the effect (event). How can one determine which of X and Y is the cause and which the effect?

How can one determine the direction of causation? How can one determine whether 'X is the cause and Y the effect' or whether 'Y is the cause and X the effect'?

"Correlation does not imply causation" means there is no way to legitimately deduce (i.e., derive) a cause and effect relationship between two variables X and Y solely on the basis of an observed correlation between them, no matter the strength of the correlation. Correlation alone cannot be sufficient to establish a cause and effect relationship (i.e., to demonstrate causation); more is required to determine which of X and Y is the cause and which the effect (i.e., the direction of causation).

Correlation is a necessary condition for causality, not a sufficient condition! Correlation is not sufficient to demonstrate causality, no matter how strong the correlation between X and Y, because the mere fact that X and Y co-occur does not exclude the possibility that both X and Y are caused by a third variable Z.

Moreover, from the mere fact that X and Y co-occur one cannot deduce (deductively derive) the direction of causation from X to Y or in reverse; that is, correlation can never be sufficient to determine which one of the variables X and Y is the actual cause and which the effect!

The following causal relations can hold between two events (X, Y). In some, exactly one of X and Y is the cause and the other the effect; these are called direct and reverse causation. There is also the case in which neither X nor Y causes the other (both X and Y are effects of a common cause Z), and the case in which X causes Y and (simultaneously) Y causes X, so that each is both a cause and an effect of the other.

  • H0: The Null Hypothesis: There is no connection between X and Y, called: coincidental correlation.
  • H1: Direct Causation: "X causes Y"; let this direction of causation be henceforth 'forward'.
  • H2: Reverse Causation: "Y causes X"; the reverse of 'forward' (i.e. the "converse").
  • H3: Common Cause: "X and Y are both caused by a third variable Z".
  • H4: Bidirectional Causation: "X causes Y" and "Y causes X". When X and Y cause one another at the same time and in the same sense, it is referred to as bidirectional causation. If instead 'X causes Y' and then 'Y causes X' and so forth, this type of causation is called cyclic causation.

The following approaches to analyzing causality exist in contemporary philosophy:

  • Empirical Regularity: constant conjunctions of events.
  • Probabilistic: changes in conditional probability.
  • Counterfactual: counterfactual conditions (conditionals with a false if-clause)
  • Mechanistic: mechanisms underlying causal relations
  • Manipulationist: invariance under intervention.

etc.

Example: Counterfactual analysis of causality:

According to the counterfactual view of causality, X causes Y iff without X, Y cannot be. It can be stated that X causes Y iff the two events (X,Y) are spatiotemporally conjoined, and X precedes Y. Causality seems to require not just a correlation, but a counterfactual dependence; that is, a conditional (if-then) statement with a false if-clause.

Causality can be predicted, not established, by a regression analysis, a method of analysis in which the potential causative variable is rendered into a regressor, an explanatory variable, apart and disparate from regressors representing variables other than the potential causative factor.
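A small sketch of what separating out regressors can look like in practice. The data-generating model, the variable names, and the helper `centered_sum` are assumptions made up for this example: Z drives both X and Y, and X has no effect on Y. Regressing Y on X alone gives a clearly non-zero slope; adding Z as a second regressor should push the coefficient on X toward zero. As the paragraph above says, this suggests rather than establishes causal structure, since it only controls for the variables one has thought to include.

```python
import random


def centered_sum(u, v):
    """Sum of products of deviations from the means (n times the sample covariance)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))


rng = random.Random(2)  # fixed seed so the run is repeatable
n = 20000

# Hypothetical data: Z drives both X and Y; X has no effect on Y at all.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [2 * zi + rng.gauss(0, 1) for zi in z]

sxx, szz, sxz = centered_sum(x, x), centered_sum(z, z), centered_sum(x, z)
sxy, szy = centered_sum(x, y), centered_sum(z, y)

# Simple regression of Y on X alone.
slope_x_only = sxy / sxx

# Multiple regression of Y on X and Z (normal equations for two regressors).
det = sxx * szz - sxz ** 2
coef_x = (sxy * szz - szy * sxz) / det
coef_z = (szy * sxx - sxy * sxz) / det

print(f"Y ~ X      : slope on X = {slope_x_only:.3f}")
print(f"Y ~ X + Z  : coef on X  = {coef_x:.3f}, coef on Z = {coef_z:.3f}")
```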