
I am writing a publication in which I mention the fallacy of first executing a method and then claiming that the results are what was aimed for all along (regardless of what the results actually are).

This sounds very much like the story of the Texas sharpshooter fallacy:

"The name comes from a joke about a Texan who fires some gunshots at the side of a barn, then paints a target centered on the tightest cluster of hits and claims to be a sharpshooter."

but it does not fit with the description:

"The Texas sharpshooter fallacy is an informal fallacy which is committed when differences in data are ignored, but similarities are overemphasized. From this reasoning, a false conclusion is inferred."

I find this weird, because the story, and thus the naming, does not fit the description, but instead fits the fallacy that I described.

My questions:

  1. Why was the Texas sharpshooter fallacy named this way if the story does not fit the meaning at all? Am I missing something or misinterpreting the intent of the story?
  2. Is there a name for the fallacy that I am describing?
edited by Mauro ALLEGRANZA
asked by Make42

1 Answer


The fallacy is about picking a "target" in retrospect, after one already has the data (akin to drawing a target around the tightest cluster of bullet holes on the barn after one has already shot the gun to create them), and then calculating the probability that the data would all be so close to the target in the same way one would if the target had been predicted beforehand.

For example, if one wants to show that some food has a health benefit, one could take a sample of people who started eating that food and look at how they compare to a control group on a very large number of health variables. Even if the food has no causal effect on any of the variables, if one picks a large enough number of them to test, it may be fairly likely that there will be some statistically significant difference between the control group and the test group on some variable just by random chance (similar to the spurious correlations website, which charts a large number of different variables and then shows only the ones whose graphs happen to "match" fairly well). And if one then picks the variable with the largest difference between the test group and the control group--say, performance on a test of grip strength--and calculates in retrospect the "probability" that the two groups would differ so much on that variable under a null hypothesis, one may get a low probability and claim that this makes a case for rejecting the null hypothesis and saying the food was the cause of the difference.
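This multiple-comparisons mechanism is easy to demonstrate with a small simulation (a hypothetical sketch of my own, not part of the original answer): draw a "test" group and a "control" group from the same distribution, so that no variable has any real effect, then scan many variables and see how many clear the nominal significance threshold anyway.

```python
import math
import random

random.seed(42)

N_PEOPLE = 50       # people per group
N_VARIABLES = 200   # health variables scanned

def z_stat(a, b):
    """Two-sample z statistic; valid here because the data are drawn from N(0, 1)."""
    n = len(a)
    return (sum(a) / n - sum(b) / n) / math.sqrt(2.0 / n)

# Both groups are drawn from the SAME distribution: the "food" has no
# effect on any variable whatsoever.
test_group = [[random.gauss(0, 1) for _ in range(N_PEOPLE)]
              for _ in range(N_VARIABLES)]
control = [[random.gauss(0, 1) for _ in range(N_PEOPLE)]
           for _ in range(N_VARIABLES)]

# "Paint the target" in retrospect: keep whichever variables happen to
# clear the nominal p < 0.05 threshold (|z| > 1.96).
stats = [abs(z_stat(t, c)) for t, c in zip(test_group, control)]
significant = [s for s in stats if s > 1.96]

print(f"largest |z|: {max(stats):.2f}")
print(f"variables 'significant' at the 0.05 level: {len(significant)} of {N_VARIABLES}")
```

With 200 independent tests at the 0.05 level, roughly ten variables are expected to look "significant" by chance alone; reporting only the most extreme of them in retrospect, as if it had been the hypothesis all along, is exactly the painted target.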

I don't think the wiki's phrasing about "differences in data are ignored, but similarities are overemphasized" is very clear. But one could say that in my example the "similarity" that's overemphasized is the way the members of the test group are similar to one another in having a statistically significantly higher average grip strength, while the "differences" that are ignored are all the other variables on which members of the test group are no more strongly correlated with one another than they are with members of the control group.

The wiki gets that particular phrasing from a list of fallacies which it cites; that page gives examples of focusing on similarities while ignoring differences, like a dating site that tries to claim two people are a great match by highlighting a few questions they answered similarly while ignoring all the other questions they didn't.

Note that when these types of examples are analogized to the Texas sharpshooter, the important thing is that the target is chosen in retrospect; it's not important to the analogy that the person drawing the target is also the one who "performed a method", i.e. shot the gun. If one sees a friend's car whose windshield has a bunch of splattered bugs on it, draws a target around the greatest cluster, and then argues the bugs must be preferentially attracted to that part of the windshield, it would be the same fallacy. I don't think there's a name for a version of this fallacy where it's treated as important that the same person created the data by performing a method and then picked the target in light of the data, if that's what you're asking for.

answered by Hypnosifl
  • So it is not about the (dis-)similarities of samples, but of the (dis-)similarities of variables? Seems like it in both of your examples. Then the "painting of the target" would be the claim that "yes, we wanted to look for grip strength", even though you did not have that in mind when you started testing.
  • – Make42 Jun 15 '20 at 15:43
  • Your last paragraph reveals a misunderstanding; maybe I did not communicate well enough. I am not emphasizing the point of who develops or applies the method. My fallacy is about the connection between method (shooting) and result (target), in contrast to a connection between data (shooting) and a hypothesis (target). The point is that someone (anyone) develops a method (here clustering), then looks at what one gets, and during the evaluation says "what we get is what we wanted, so the method is great". Is this a Texas sharpshooter fallacy?
  • – Make42 Jun 15 '20 at 15:51
  • On 1), I'd say it's about the statistical relationship between the sample and the variables you test for in retrospect, like finding the test group are 'similar' in having better grip strength than the control group. On 2), I'm not clear what you are counting as a "method"--are you also talking about a method of generating data (akin to shooting the barn to produce bullet holes, or to setting up the test where one group eats a certain food while the control group doesn't), or just to a method of statistical analysis of preexisting data like clustering? – Hypnosifl Jun 15 '20 at 17:06
  • Also is it important to your question that the person actually claims the pattern they found was the one they were looking for all along? One could imagine a study like this where they just look at a bunch of variables and announce something like "eating this food was associated with better grip strength" without specifically claiming they were interested in grip strength from the beginning, but also not mentioning that they looked at a host of other variables when studying the data. – Hypnosifl Jun 15 '20 at 17:07
  • Regarding your question: I am talking about clustering (see the link in my comment). A clustering method assigns labels to previously unlabeled samples such that samples within the same cluster are more similar to each other and samples in different clusters are more dissimilar to each other. What that exactly entails depends on the definition of what constitutes a cluster and on the method. So clustering methods are a means to analyze data, but they also generate data, namely the labels. [...] – Make42 Jun 15 '20 at 17:27
  • [...] Many authors do not give a method-free definition before describing their method. They sometimes give only a definition which is based on the method. I want to argue that this is erroneous, because it amounts to defining the method first and then claiming that the method adheres to the definition (while the definition already uses the method). So the cluster definition should be the target and the clustering method should be the shooter. I am not interested in specific studies though, since I am doing basic method research. Is this also answering your second question? – Make42 Jun 15 '20 at 17:29
  • Would an example of what you mean by "labels" be something like the big 5 personality traits which were found through some type of cluster analysis on answers to personality-related questions? But what about a method like first doing exploratory factor analysis to look for patterns in data, but then if you think you've found a meaningful statistical correlation, doing another round of testing with confirmatory factor analysis? Would you consider that a type of fallacy too? – Hypnosifl Jun 15 '20 at 18:05
  • I think you can use the example with the big 5 personality traits. Usually samples (people) get put into only one group (trait) in clustering, but some methods allow fuzzy assignment (a person gets assigned to a certain percentage to each group). Let's assume nobody knows which traits exist, but the method assumes a specific shape for the groups. For example, the groups must be linearly separable. [...] – Make42 Jun 15 '20 at 18:55
  • [...] My fallacy is committed if a method developer does not state that he wants clusters that are linearly separable, but just says that the result we want is the result this method provides. After we use the method we see that groups are always linearly separable. Implicitly the developer now says "ah, yes, this is what we wanted", but he never stated that at the beginning. Your question regarding factor analysis does not apply: the fallacy is not about statements I make retrospectively about the data, it is about statements I make retrospectively about how my method works / its results. – Make42 Jun 15 '20 at 18:58
  • Do you still consider it a fallacy in the "fuzzy" case where everyone can be assigned a continuous value on a number of scales which were themselves decided on by factor analysis, like in the big 5 personality traits where a person may be given a number on a scale of some trait like introversion/extroversion? Or are you thinking more specifically of using cluster analysis to define a bunch of mutually exclusive categories and assigning every data point to one or another, ignoring the fact that the actual pattern of variation is more continuous? – Hypnosifl Jun 15 '20 at 19:43
  • The fallacy I described would be applicable to fuzzy cases. Maybe a different wording: when I say "how the method works" I am talking about the structure of the results. For outlier detection, look at the image at https://bit.ly/2YGSGhj - you can see that the results of each method have a specific structure per method (not per dataset). I believe one should state the structure one aims for before describing the method, and it is a fallacy to use the method in the description of this structure: "My method aims to produce a result structure that is the result structure of my method." – Make42 Jun 15 '20 at 20:21
  • But don't nearly all the categories we use in everyday life and in science in some sense originate from observations of correlated properties in experience? Maybe some sufficiently broad ones are a priori or hardwired into our biology, but for example the original basis for animal names would have probably something to do with noticing a bunch of individual animals that share a common cluster of features (like for a cat it'd be things like pointed ears and a short snout and vertical pupils etc.) As long as the categories are provisional & subject to continued testing, I don't see a fallacy. – Hypnosifl Jun 15 '20 at 23:02
  • The debate whether real categories exist sounds like the metaphysical question whether universals exist, but let's not go there :-). I do believe some real categories exist, but in the cases we are discussing here, they are definitely of the kind that are either man-made or at least man-discovered. There are three objects of discussion: data (input), method, results (output). It seems to me you are coming back to the relation data-results, while this is not my concern. My concern is the relation method-results, or, as the results depend on the data, more precisely, the relation method-"result structure". – Make42 Jun 16 '20 at 08:05
  • Maybe to make this clear: the fact that the term "clustering" appears in the Texas sharpshooter fallacy has nothing to do with the fact that I want to do cluster analysis. I am just a researcher of clustering methods. – Make42 Jun 16 '20 at 08:25
  • I guess I don't really understand what you mean by "method" outside of the specific question of using cluster analysis, or "results" outside of using cluster analysis to define categories. Can you give some other examples of method/result pairings of a different type, such that you'd say the claimed results are fallacious in some sense? – Hypnosifl Jun 16 '20 at 08:55
  • In order to get closer to answering this question I tried to analyse the texts about the Texas sharpshooter fallacy at https://philosophy.stackexchange.com/questions/73602/what-is-the-texas-sharpshooter-fallacy. There I do not discuss "my fallacy", because it seems to be a particularly complex fallacy (at least it seems so now), requiring its own post. So I first try to sort out the Texas sharpshooter fallacy and then, later, see how my fallacy fits into the picture. It seems to me that my fallacy is some sort of circular fallacy, though; possibly the Texas sharpshooter fallacy is not. – Make42 Jun 16 '20 at 12:04
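The point raised in the comments, that each clustering method has a characteristic result structure "per method, not per dataset", can be illustrated with a minimal sketch (my own example, not from the thread): run plain k-means (Lloyd's algorithm) on structureless uniform data. The output is still a Voronoi partition, so the clusters come out linearly separable; that is a property the method imposes, not one the data possessed.

```python
import random

random.seed(0)

# Structureless data: 300 points uniform on the unit square, no real clusters.
points = [(random.random(), random.random()) for _ in range(300)]

def sq_dist(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def assign(points, centers):
    """Put each point in the cluster of its nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        i = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
        clusters[i].append(p)
    return clusters

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm: alternate assignment and centroid update."""
    centers = random.sample(points, k)
    for _ in range(iters):
        clusters = assign(points, centers)
        centers = [(sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
                   if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, assign(points, centers)

centers, clusters = kmeans(points, k=3)

# By construction, every point is at least as close to its own center as to
# any other, so each pair of clusters is split by the perpendicular bisector
# of the two centers: a straight line. The clusters are linearly separable
# regardless of whether the data had any cluster structure at all.
for i, cluster in enumerate(clusters):
    for p in cluster:
        assert all(sq_dist(p, centers[i]) <= sq_dist(p, centers[j])
                   for j in range(3))

print("cluster sizes:", sorted(len(c) for c in clusters))
```

A library implementation such as sklearn.cluster.KMeans would behave the same way here: the linear boundaries come from the nearest-center assignment rule itself, not from anything in the data.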