Pick a card, any card…
…or sleight of hand with statistics
One of the critics of Blanchard’s work, Madeline H. Windzen, challenged his research by misrepresenting how statistics works in science. Many people are understandably suspicious of statistics, since few were exposed to it in school beyond, perhaps, learning to calculate simple odds. Goodness knows how often it is misused to mislead consumers, in commercials for example. However, Windzen misled her readers by playing to that suspicion, setting up a straw man to be knocked down:
On the linked page, she sets up a hypothetical data set that has easily differentiable clusters on a two-dimensional graph. After explaining how easy it would be to differentiate them in her hypothetical graph, she shows us Blanchard’s actual data and then, with a sleight of hand, asks us to draw the wrong conclusion: that there are no clusters in the data, and no differences between the categories!
“But, wait! Isn’t she right? That seems to make sense to me,” you might say.
No, it isn’t right.
Let’s set up another case… a real world case. Let’s consider a histogram, a graph of how many people there are at each height. If we were to plot this, we would find a classic “bell curve” centered on 5’7″ (at least for adults living in the U.S.). There are no obvious clusters in the graph, as only one peak is present. But consider that people come in two sexes. If we plot the two sexes separately on the same graph, we find that we now have two new bell curves, centered on 5’4″ for women and 5’10″ for men. These two curves also overlap a great deal. Consider, too, that if we add those two curves together, we get, by definition, the first curve. In other words, those two curves are “hidden” in the first. We can’t see them, but we know they are in there.
Knowing something about these hidden curves, we can make some statements using the first curve. For example, if we know that someone is 5’7″ tall, the probability that the person is a man is about half, 50% (assuming roughly equal numbers of men and women and a similar spread of heights in each group). If someone is taller than 5’7″, the probability that they are a man is greater than 50%; if shorter than 5’7″, it is less than 50%. The further from 5’7″, the greater the odds for one sex over the other.
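For readers who like to see the arithmetic, here is a minimal sketch of that reasoning. It uses the two centers given above (5’4″ and 5’10″, written as 64 and 70 inches). The 3-inch standard deviation, the equal numbers of men and women, and the function names are my own assumptions for illustration, not measured figures.

```python
import math

WOMEN_MEAN = 64.0  # 5'4" in inches, from the text
MEN_MEAN = 70.0    # 5'10" in inches, from the text
SD = 3.0           # ASSUMPTION: same spread for both sexes, illustrative only


def normal_pdf(x, mean, sd):
    """Height of a bell curve with the given mean and spread at point x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def prob_man(height_in, prior_man=0.5):
    """Bayes' rule: probability that a person of this height is a man.

    prior_man is the assumed share of men in the population (ASSUMPTION: 0.5).
    """
    men = prior_man * normal_pdf(height_in, MEN_MEAN, SD)
    women = (1.0 - prior_man) * normal_pdf(height_in, WOMEN_MEAN, SD)
    return men / (men + women)


for height in (62, 64, 67, 70, 72):
    print(f"{height} in: P(man) = {prob_man(height):.2f}")
```

Under these assumptions the probability is one half at 5’7″ and climbs steadily with height, even though the combined curve shows only a single peak.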
Thus, statistically, we can use the original curve to find correlations with other data, which can in turn be related to yet other data. For example, if we find that shorter human beings, as a group, live longer, then since shorter humans are more likely to be female, we can also say, with some statistical probability, that female humans are more likely to live longer. Thus, we can use a data set that appears to have no discernible clusters to learn something about two very real categories of people… even though we can’t draw neat little boxes around the groups.
This is the case with Blanchard’s data set. We actually have two very real and different groups that overlap in the graph. The scores of one group center on +12, -6 on the graph, while the other group is quite literally all over the map! The scores of the autogynephilic category are very noisy, while the feminine androphilic category is tightly clustered. So, if we draw an arbitrary grouping centered on +12, -6, we will find that individuals scoring there have a higher chance of being the feminine androphilic type than individuals in any other area of the graph. We know that we will get some individuals who are AGP inside the cluster, and we know we will leave some feminine androphilic transsexuals outside it. But we know that a given individual is more likely to be feminine androphilic inside the cluster, and more likely to be AGP outside it. Thus, we can now statistically learn things about the two groups and differentiate their characteristics.
From the differences between the groups thus determined, we learn that only 10% of the individuals scoring near +12, -6 reported erotic cross-dressing and autogynephilic ideation, while among those outside this cluster, in all three of the other quadrants, a much higher percentage reported autogynephilic arousal. We know that the instrument is not that good at separating the two types, just as height is not that good at separating the sexes. But both instruments, Blanchard’s Modified Androphilia & Gynephilia Scales and the yardstick, work well enough to let us say things statistically about the groups in question.
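The same idea can be sketched with made-up points. To be clear, this is a toy simulation, not Blanchard’s data: only the +12, -6 center comes from the discussion above. The group sizes, the spreads, the center of the scattered group, and the radius of the circle are all assumptions I picked for illustration.

```python
import math
import random

CENTER = (12.0, -6.0)  # cluster center named in the text
RADIUS = 4.0           # ASSUMPTION: arbitrary cutoff around the center


def simulate(n_per_group=500, seed=0):
    """Generate hypothetical (label, x, y) points for two overlapping groups."""
    rng = random.Random(seed)
    points = []
    for _ in range(n_per_group):
        # ASSUMPTION: tight group, small spread around the named center.
        points.append(("tight", rng.gauss(CENTER[0], 2.0), rng.gauss(CENTER[1], 2.0)))
        # ASSUMPTION: noisy group, scattered widely around the origin.
        points.append(("diffuse", rng.gauss(0.0, 8.0), rng.gauss(0.0, 8.0)))
    return points


def tight_share(points, inside):
    """Share of 'tight' points among those inside (or outside) the circle."""
    selected = [
        label
        for label, x, y in points
        if (math.hypot(x - CENTER[0], y - CENTER[1]) <= RADIUS) == inside
    ]
    return sum(1 for label in selected if label == "tight") / len(selected)


points = simulate()
share_inside = tight_share(points, inside=True)
share_outside = tight_share(points, inside=False)
print(f"tight-group share inside the circle:  {share_inside:.2f}")
print(f"tight-group share outside the circle: {share_outside:.2f}")
```

Some points from each group land on the “wrong” side of the circle, yet the mix inside differs sharply from the mix outside, which is all the argument requires.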
So, back to Windzen. She clearly is an intelligent and knowledgeable educator. So why did she not properly explain how statistics is used in general, and how Blanchard used it in particular, to learn about the two types? Instead, she questioned why he didn’t use another method, leaving the reader to wonder whether it was really Blanchard who was pulling the statistical sleight of hand. In fact, she knows, but failed to report, that Blanchard had actually used another, equally valid software tool, one designed to find such “latent clusters”, to find the best way to draw the lines around possible clusters in the data. He did the best he could with the data he had… but Windzen suggests otherwise. Why did she pull such an intellectual sleight of hand?
Perhaps it is because she thinks she can. How many people understand that their chances of being struck and killed by lightning are greater than their chances of winning their state’s lottery?
So, far from bringing Blanchard’s work into question, and certainly even further from “debunking” it as I’ve read some bloggers claim, Windzen has only made us question her.
(Addendum 11/21/2013: Blanchard’s observations discussed here have been replicated by several more recent studies, which I have written about elsewhere. You may wish to read further in my essay What is a Transsexual?)