Wednesday, June 06, 2007


A recently published book, “Freakonomics”, is a joint effort by University of Chicago professor Steven Levitt and reporter Stephen Dubner. The authors pose six questions and then provide answers by mining data from various sources. Of course the problem with data mining is the same as with mining minerals: all that glitters is not gold, and data can be selected to support a desired conclusion just as ore samples can be selected to inflate the apparent value of the raw material. But the fact that this manipulation is possible doesn’t mean it was intentional or that the resulting conclusions are wrong; it just means that some skepticism is useful in evaluating the conclusions. Levitt’s position is that “teachers and criminals and real estate agents may lie, and politicians and even CIA analysts, but numbers don’t.” Of course anyone who has ever actually done data mining or statistical analysis knows full well that numbers CAN lie and frequently do, global warming (cooling) being a recent example of how data can be selected and interpreted to arrive at a desired conclusion.

Levitt and Dubner begin with incentives and their impact, which is frequently the opposite of what was intended. The first case they address is actually a negative incentive: a penalty for parents who pick up their child after the agreed-upon time. Before the penalty was enacted there were approximately 8 late pickups per week, but after it was instituted the number increased to 20 per week, the exact opposite of what was intended. As it turned out, the late fee was simply too small: it was less than the going cost of baby-sitting, so paying it was the cheaper choice. Had the penalty been substantial it probably would have worked.
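The reasoning behind the late fee can be put into a few lines of Python. This is only a sketch: the function name and the dollar amounts are made up for illustration and are not figures from the book, and it assumes a parent who decides purely on cost.

```python
def cheaper_option(late_fee: float, sitter_cost: float) -> str:
    """Return the choice a purely cost-minimizing parent would make.

    Both arguments are hypothetical dollar amounts for one late evening.
    """
    # If the fee is below the price of a sitter, paying the fee is the bargain.
    if late_fee < sitter_cost:
        return "pick up late"
    return "arrange a sitter"


# Hypothetical numbers, chosen only to show both outcomes.
print(cheaper_option(late_fee=3.0, sitter_cost=10.0))
print(cheaper_option(late_fee=25.0, sitter_cost=10.0))
```

With a small fee the penalty works like a price for extra child care, which is consistent with late pickups going up rather than down.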

Another example of social action and reaction cited by Levitt and Dubner is the sudden disappearance of seven million children. When the IRS began demanding a Social Security number for every child claimed as a dependent, the number of claimed children suddenly declined because the alleged parents had been cheating on their taxes. This of course is an example of a successful incentive. But Levitt and Dubner started their analysis by posing the question “What do schoolteachers and sumo wrestlers have in common?” The answer turned out to be that they both cheat; Levitt’s conclusion was that the incentives for teachers and sumo wrestlers are structured so that they are encouraged to cheat.

All three of these are examples of “The Law of Unintended Consequences”, but is the data used to perform the analysis correct? If we assume that the data is in fact correct and the conclusions accurate, then why was the analysis done on schoolteachers and not on taxation? The reality is that historically high taxes do not generate higher revenues for governments; they in fact result in less revenue, while tax reductions generate higher revenues. This has been demonstrated repeatedly since the Roman Empire, and as of this writing the unemployment rate is at an all-time low and the stock market at an all-time high. However, this type of analysis and logical conclusion would have run counter to the prevailing view in academia and the media that the Bush tax cuts were favors to his rich friends and not good for everyone else. In fact there seems to be a veiled subtext running throughout this book that supports the liberal view.

In the analysis of what the Ku Klux Klan and real estate brokers have in common, the objective is supposed to be to show how information and the control of it can and does influence decisions. This is really not very surprising, since sound decisions are based on having good information. However, in this case the authors show how adjectives and adverbs are used to sway opinion and how businessmen are essentially dishonest. For examples they use Enron, insurance companies, and real estate brokers, and compare them to the Ku Klux Klan. Naturally the Klan is more emotionally charged than, say, ABC, NBC, or CNN, who routinely slant their reports with emotionally laden words. This does not mean that the data presented is false or that the conclusions are wrong, but just as one robin does not make a spring, one dishonest individual or company does not mean that ALL business people are dishonest or that all insurance companies are crooked. This is the flaw throughout this book: it implies things that are not true and is actually a veiled attack on business and the free market.

Riding to the rescue is John Lott, another academic economist, whose career has included the University of Chicago, the current home of Steven Levitt. Lott takes the position in his book “FREEDOMnomics” that not only is FREAKonomics a pile of rubbish, it is also a veiled attack on the free market and business in general. Lott points out that the assertion that the Klan is like a group of real estate agents who use “fear” to take advantage of others is beyond the rhetorical boundaries of taste if not accuracy. In fact Lott dissects the example used by Levitt and Dubner and demonstrates that, while it might have been an actual example, their analysis of the motivation and result rests on a simplistic view of the data; a more detailed analysis would show their conclusions to be incorrect and the reality to be completely opposite.

Lott goes on to cite example after example of how the liberal tendency to root for the underdog and to blame corporations for the bad judgment of plaintiffs costs individuals and society a great deal more than the liberals realize. Liberal empathy is not rooted in hard analysis but is driven by emotion. As an example Lott cites the judgment against Wyeth Labs over a girl who contracted polio two weeks after receiving the vaccine, even though it was demonstrated in court that the girl had contracted polio PRIOR to receiving the vaccine. The upshot was that the court decided against Wyeth in sympathy for the little girl, but Wyeth had to raise its prices for the vaccine, thus causing other little girls to go without the vaccine because it was now too expensive. Lott cites multiple examples of such decisions, which reward an individual or a small group at the expense of everyone else.

In his examination of crime and the death penalty Levitt states, “It is hard to believe that fear of execution would be a driving force in a rational criminal’s calculus in modern America”. Perhaps this comparison is the most illustrative of the pitfalls of data mining and of the liberal’s tendency to substitute a superficial analysis that supports a desired outcome for a rigorous one. The initial comparison of murder rates between states that had capital punishment and those that didn’t suggested that it was ineffective in reducing murder. Actually 10 of the 12 states without capital punishment have homicide rates BELOW the national average. At first glance this would seem to bear out the proposition that the death penalty is an ineffective deterrent. What is NOT stated is that these states have historically had lower murder rates than the national average, so the statistics cited by Levitt and Dubner are not meaningful: they do not compare the states against a common baseline.

During the years 1968 to 1976, when the death penalty was suspended nationwide, these 12 states had a lower murder rate than the national average. However, when the death penalty was reinstated, those states that reintroduced it saw a 38% drop in their murder rates. This statistic paints a very different picture from the one painted by Levitt.
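The difference between the two comparisons can be sketched in Python. Everything here is an assumption made for illustration: the four states, their rates, and the function names are invented, and none of it comes from Levitt, Dubner, or Lott. The sketch only shows how comparing levels across states and comparing each state against its own earlier rate can point in opposite directions.

```python
from statistics import mean

# HYPOTHETICAL murder rates per 100,000 residents. NOT real data.
# state -> (reinstated death penalty?, rate before, rate after)
states = {
    "A": (False, 3.0, 2.9),
    "B": (False, 4.0, 3.9),
    "C": (True, 10.0, 7.0),
    "D": (True, 12.0, 8.0),
}


def mean_rate_after(data, has_penalty):
    """Average 'after' rate for one group: the cross-state level comparison."""
    return mean(after for flag, _, after in data.values() if flag == has_penalty)


def mean_pct_change(data, has_penalty):
    """Average percent change within each state: the before/after comparison."""
    return mean(
        100.0 * (after - before) / before
        for flag, before, after in data.values()
        if flag == has_penalty
    )


# Level comparison: the penalty states still look worse.
print(mean_rate_after(states, True), mean_rate_after(states, False))
# Change comparison: the penalty states improved far more against their own past.
print(mean_pct_change(states, True), mean_pct_change(states, False))
```

In these invented numbers the states with the penalty have the higher murder rate and also the larger drop, so which comparison is reported decides which conclusion the reader takes away. A before/after comparison is not proof of cause either, since other things may have changed over the same years.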

These are both very interesting books and worthwhile reading, but both are based on statistics, and anything based on statistics is highly suspect.
