The answer is generally "events with zero probability do not happen," so observing one must be a contradiction, but that's speaking in layman's terms. If you want to be more precise, you have to be very cautious with your definitions. It turns out you can observe one without a contradiction, but it is a corner case in the definitions of random variables.
One common line of argument goes like this: if we draw from a continuous distribution such as U(0, 1) and get, as an example, 0.31415, we can show mathematically that the probability of getting exactly 0.31415 was 0, because we take an integral of the probability density function from 0.31415 to 0.31415, which is always zero unless you include strange objects like the Dirac delta function in your distributions. However, that argument involves a bit of circular logic: we first made our observation, and then we calculated a probability afterwards.
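To make that integral concrete, here is a minimal sketch in Python (the function name and the use of SciPy's `quad` are my own illustration, not part of the question): the probability of drawing exactly 0.31415 from U(0, 1) is the integral of its density over a zero-width interval, which comes out to 0.

```python
from scipy.integrate import quad

def uniform_pdf(x):
    # Density of U(0, 1): 1 on [0, 1], 0 elsewhere.
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

# Integral of the density over the degenerate interval [0.31415, 0.31415].
prob, _ = quad(uniform_pdf, 0.31415, 0.31415)
print(prob)  # 0.0 -- the probability of drawing exactly 0.31415
```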
Probably the most useful construction I can think of to answer the question would be a random variable X whose probability density function is a piecewise function:
- 1 if x = -1
- 1 if x is in [0, 1] (inclusive of both endpoints)
- 0 elsewhere
What I've created is basically a uniform random variable between 0 and 1, but with a funny discontinuity added at -1. Now we can ask about the probability of a draw from X being negative. This is the integral of the probability density function from -infinity to 0, which is clearly 0 by the rules of integration, since a single isolated point contributes nothing to an integral. However, from the definition, it is clear that there is a negative number which can indeed be drawn. Thus observing a negative number does not contradict the definition of X.
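A rough numerical sketch of that claim (the density below and the choice to use SciPy are my own illustration, not a proof): integrating this piecewise density over the negative reals gives 0, because the isolated point at -1 has no width, while integrating over [0, 1] gives 1.

```python
from scipy.integrate import quad

def pdf(x):
    # 1 at the isolated point -1, 1 on [0, 1], 0 elsewhere.
    if x == -1:
        return 1.0
    if 0.0 <= x <= 1.0:
        return 1.0
    return 0.0

# Probability of a negative draw: the integral from -infinity to 0.
p_negative, _ = quad(pdf, float("-inf"), 0)
# Probability of a draw in [0, 1].
p_unit, _ = quad(pdf, 0, 1)
print(p_negative)  # 0.0 -- the single point at -1 has measure zero
print(p_unit)      # 1.0
```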
Now one could explore the likelihood of this event occurring. We can use tests like z-tests to show that events like this are infinitely unlikely, but that they do not contradict the original definition of X.
Now that's the math. The follow-on question would be whether any particular philosopher who talks about "zero probability" is choosing to use the formal definitions from mathematics, or whether they are using the words to describe something that's just a hair different. In general, I find philosophers who talk about "impossible" events are not intending to use those specific formal wordings. However, if they elected to use the precise words "zero probability," those are odd enough word choices to suggest that they intended the formal mathematical definitions.
In the meta question related to this answer, Philip Klöcking brings up a few good points. One is that we have to consider alternate number systems besides the real numbers. However, if we consider other systems, we have to be careful about how they define "zero." Probabilities are defined on a metric space, and in a metric space, if the distance between X and Y is "zero," then they are the same point.
The other is perhaps the more interesting philosophical question, which I glossed over by questioning definitions before focusing on the math. The question uses the phrasing "if one observes an event with zero probability." Within the language of probability, that phrase is well defined. Outside of that narrow scope, that phrase is tricky. Questions like "what is an event" and "can someone observe anything" are famously frustrating to answer without going in loops. The idea of an event having "zero probability" also involves mapping the real world into the mathematics of probability. How was that mapping done? I simply assumed it had been done meaningfully, given the phrasing of the question, but there may have been a contradiction in how it was done. The VSauce video How to Count Past Infinity explores a similar question around the 13 minute mark, regarding the many mathematical concepts of infinity that have been invented.
If I may leave with one of my favorite puzzles, a game:
In this game there are two people, labeled Dealer and Player. The Dealer writes two different numbers down on two slips of paper, and seals them in envelopes. It doesn't matter what the numbers are. They can be 0, 5, 4081922, -382.393193, pi, anything. They just have to be different numbers. The Dealer then hands the two envelopes to the Player in any order they please. The Player then selects one envelope to open. They then must decide whether the number in the other envelope is greater or less than the number in the envelope that was just opened.
Obviously it's easy to win 50% of the time. The challenge of the game is to come up with a strategy which wins more than 50% of the time. Can you think of a strategy?
The solution is:
Before you open an envelope, pick a random number from a distribution whose support covers all of the real numbers. Any such distribution will work, but a Gaussian distribution is the most usual choice. Then open one envelope at random and compare your random number against the number you see. If your random number is greater than the one you exposed, you say that the unseen number is the greater one. If your random number is smaller, you say the unseen number is the smaller one. If they're the same, pick randomly.
How does this work? Well, let's look at what can happen. If the number you pick is greater than both of the Dealer's numbers, or less than both, then you're effectively guessing, so you have a 50% chance. If you nail the number you see exactly, you have a 50% chance. However, if you choose a value which is strictly between the two numbers, you have a 100% chance of getting it right.
Because there's always an interval between any two distinct real numbers, and a Gaussian assigns positive probability to every interval, there is always a positive chance that your random value lands between the two numbers. Averaging over the cases, you therefore win strictly more than 50% of the time.
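If you want to see the effect numerically, here is a quick Monte Carlo sketch of the strategy. The Dealer's number range and the Gaussian parameters below are arbitrary choices of mine for the sake of simulation; in the actual puzzle the Dealer may pick any numbers at all.

```python
import random

def play_round():
    # The Dealer writes down two different numbers (any range will do for a demo).
    a, b = random.uniform(-100, 100), random.uniform(-100, 100)
    while a == b:
        b = random.uniform(-100, 100)
    # The Player opens one envelope at random.
    seen, hidden = (a, b) if random.random() < 0.5 else (b, a)
    # The Player's random threshold, drawn from a Gaussian distribution.
    g = random.gauss(0, 50)
    # Guess "greater" if the threshold exceeds the number we saw.
    guess_hidden_is_greater = g > seen
    return guess_hidden_is_greater == (hidden > seen)

trials = 100_000
wins = sum(play_round() for _ in range(trials))
print(wins / trials)  # reliably above 0.5
```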