February 13, 2002

From the desk of Jane Galt:

Sophismata criticizes my earlier post

Sophismata criticizes my earlier post for, among other things, generalizing from too small a sample size. In keeping with my commitment to let my readers know when I'm wrong, as well as my opponents, I post his response and my explanation.

Both of them haven't figured out what correlation means. Correlation "is a measure of the degree of linear relationship between two variables." For instance, when the price of oil goes up, the price of food also goes up. We say that oil prices and food prices are positively correlated. Similarly, when oil prices go up, the sales of SUVs go down; these two variables are negatively correlated. The price of oil affects transportation costs, which in turn affect the price of food. The price of oil also affects the cost of owning a vehicle. Thus, in this case, correlation is also causation. However, one can show that the crime rate in certain cities is highly correlated with bubble gum sales. (I had to do this in junior statistics.) But you would be wrong to conclude that bubble gum is the cause of crime. Rather, the population of the city increased over time, which affected both gum sales and crime. Comparing the voting pattern of two groups has nothing to do with correlation.

I wasn't comparing the voting patterns of two groups; I was theorizing about the correlation between the political makeup of the current staff, and the political makeup of the staff they hired, that being the subject of the article. I do know what correlation means, I do, I do!

But I may have confused people. The data, of course, refers only to the makeup of the current staff, so there is no way to run a regression showing the actual correlation. I got mixed up with Wilentz's words. I should have said that the deviation from the norm is massive, which it is.
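The bubble gum example quoted above is easy to reproduce with made-up numbers. The sketch below is only an illustration: the city, its growth rate, and the per-person gum and crime rates are all invented for the purpose. Gum sales and crimes each depend on population and nothing else, yet the raw series come out strongly correlated; dividing both by population removes the common driver.

```python
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical city: population grows by 10,000 a year for 30 years.
population = [100_000 + 10_000 * year for year in range(30)]

# Invented rates: gum sales and crimes each depend only on population,
# plus independent noise. Neither one influences the other.
gum_sales = [p * rng.gauss(2.0, 0.1) for p in population]
crimes = [p * rng.gauss(0.005, 0.0005) for p in population]

raw_r = pearson(gum_sales, crimes)

# Dividing out the common driver leaves only the independent noise.
gum_per_capita = [g / p for g, p in zip(gum_sales, population)]
crime_per_capita = [c / p for c, p in zip(crimes, population)]
per_capita_r = pearson(gum_per_capita, crime_per_capita)

print(f"raw correlation:        {raw_r:.2f}")
print(f"per-capita correlation: {per_capita_r:.2f}")
```

The raw correlation is high only because both series track the growing population, which is the point of the quoted example.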

Is the difference in voting patterns material? This takes us to the wonderful world of hypothesis testing. I think this is where Live from WTC was going, but how she is able to deduce that this is a three or four sigma event escapes me.

I didn't deduce it. As I said in my post, this was a wild-ass guess. For non-statistics people (of whom I am almost one), "sigma" is a technical term referring to the standard deviation from the mean. I'm not even going to try to explain how this works, but basically, it refers to the probability of a given event being in a certain range in the distribution. A 3 or 4 sigma event is one that is highly improbable, like, say, getting either a 1550 or a 150 on your SATs -- it happens to a very small percentage of the population being studied. So I was guessing that if you plotted the political makeup of academia, as a profession, against other professions, you would find them at the very far left of your spectrum, just as you would probably find, say, Southern Baptist ministers pretty far over there on the right. But it would be damn hard to design a study that would give you the data to plot, so this is, and will remain, a guess.
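For a rough sense of how improbable "3 or 4 sigma" is, here is a minimal sketch that assumes the quantity in question follows a normal (bell curve) distribution. That is an assumption made for illustration; nothing says the political makeup of professions is actually distributed that way. The helper name `two_sided_tail` is made up for this example.

```python
import math

def two_sided_tail(k):
    """Probability that a normally distributed value lands more than
    k standard deviations from its mean, in either direction."""
    return math.erfc(k / math.sqrt(2))

for k in (1, 2, 3, 4):
    p = two_sided_tail(k)
    print(f"{k} sigma: {p:.6f} (about 1 in {1 / p:,.0f})")
```

Under that assumption, a 3 sigma event turns up well under 1% of the time, and a 4 sigma event is rarer still by a wide margin.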

From the comments to the post, I think the poll only had 150 respondents. If 6% of the professors voted for Bush, then only 9 people in the survey voted for Bush. If you drew 150 people randomly from a population that voted for Bush 49% of the time, the odds of getting a group that had only 9 Bush voters is 5.9 x 10^(-42); that is pretty much zero. However, and this is a big HOWEVER, you are on very shaky ground to extrapolate any information from a poll this small.
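The calculation described above is a binomial probability, and it can be checked directly. This is a sketch under the same assumptions as the quoted passage (150 independent random draws, each a Bush voter with probability 0.49); the function and variable names are made up for the example.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 150, 0.49  # sample size and Bush share from the quoted passage

exactly_9 = binom_pmf(9, n, p)
at_most_9 = sum(binom_pmf(k, n, p) for k in range(10))

# For comparison: the chance that 141 specific people all voted against
# Bush, ignoring the 9 who voted for him and the ways to choose them.
non_bush_only = (1 - p) ** (n - 9)

print(f"P(exactly 9 Bush voters) = {exactly_9:.3g}")
print(f"P(at most 9 Bush voters) = {at_most_9:.3g}")
print(f"0.51^141 alone           = {non_bush_only:.3g}")
```

The quoted 5.9 x 10^(-42) appears to match the last line, 0.51^141 on its own. The full binomial term comes out many orders of magnitude larger, but it is still pretty much zero, so the conclusion does not change.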

Yup, the sample's small. Certainly not definitive. Either that's not in the article, or I missed it. Mea culpa.

Nonetheless, Wilentz's reaction is wacky. When you criticize study design, you suggest a better study. You don't say, "well, the sample's too small, so the effect isn't real, case closed." Wilentz looked at a suggestive study that was too small to be definitive, and said flat out "it's not true."

But I think that my post was a little jumbled and unclear about what I was saying, since the very clever Mr. Ramachandran couldn't make it out. So I apologize for any confusion that may have arisen, and urge everyone to go to Sophismata often to enjoy the fine mathematical and statistical insights to be gained there.

Posted by Jane Galt at February 13, 2002 07:40 AM