Friday, November 11, 2011

Born to Run?

About a year ago, I read a book called "Born to Run," by Christopher McDougall, who last week wrote an article in the New York Times Magazine on the same subject.

McDougall's basic premise is that we were faster and less injury-prone before we started wearing all these fancy running shoes, and that the shoes are what's causing running injuries. For example, in the New York Times article:

"Back in the ’60s, Americans 'ran way more and way faster in the thinnest little shoes, and we never got hurt,' Amby Burfoot, a longtime Runner’s World editor and former Boston Marathon champion, said during a talk before the Lehigh Valley Half-Marathon I attended last year. 'I never even remember talking about injuries back then,' Burfoot said. 'So you’ve got to wonder what’s changed.'"

Statistics frowns on such anecdotal evidence, though it does make a good story. Did we really run faster? There are a lot of facts we can look at, though average times aren't among them. Marathon records (shown in Wikipedia) for men have indeed dropped only a little since the sixties. In 1970, Ron Hill of the UK (close enough, running-shoe-wise, to be considered American?) set a record of 2:09:29. This year, a new record of 2:03:38 was set (the most recent US record was 2:05:38 in 2002). Six minutes in 40 years doesn't seem like much, but is it because of the shoes or because the sport has matured? And are Americans seen less because running isn't really a big competitive sport here?

When you look at women's times, the changes are much more dramatic. Women began running marathons more recently, and fewer participated in the sport in general until relatively recently. In 1970, the women's marathon record was 3:02:53 (set by an American). In 2003, Paula Radcliffe (England) ran it in 2:15:25. That's a 47-minute improvement, or nearly 2 minutes per mile. In the 2011 New York Marathon, 40 women from the US bested the 1970 record time (see marathon site here for results).
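A quick way to check the arithmetic on these record times is sketched below in Python. The times are the ones quoted above; the helper names and the use of the standard marathon distance (26 miles 385 yards) are assumptions added for illustration.

```python
MARATHON_MILES = 26 + 385 / 1760  # 26 miles 385 yards (assumed standard distance)

def to_seconds(hms):
    """Convert an 'H:MM:SS' string to total seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

def improvement(old, new):
    """Return (seconds gained, minutes gained per mile) between two marathon times."""
    gain = to_seconds(old) - to_seconds(new)
    return gain, gain / 60 / MARATHON_MILES

# Record times as quoted in the post.
men_gain, men_per_mile = improvement("2:09:29", "2:03:38")
women_gain, women_per_mile = improvement("3:02:53", "2:15:25")

print(f"Men:   {men_gain // 60}:{men_gain % 60:02d} faster, {men_per_mile:.2f} min/mile")
print(f"Women: {women_gain // 60}:{women_gain % 60:02d} faster, {women_per_mile:.2f} min/mile")
```

On these figures the women's record improved by roughly 47 and a half minutes (about 1.8 minutes per mile), against just under 6 minutes for the men.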

So, I can't agree that we ran "way faster" 40 years ago. This doesn't mean that barefoot runners are slower than shod runners, because changes over the last forty years in the level of competition, and improvements in training and fitness, rather than shoes, might have been the factors contributing to improved times.

How about injury rates? Do people get more injuries with running shoes than without? Unfortunately, any data on injury rates is tainted by the changes in the makeup of the population that runs (from a small, highly fit population to a large population more varied in fitness--think of the then-overweight President Clinton running with a stop at McDonalds post-jog), and there haven't been any studies that directly compare injuries over time for barefoot running against running in shoes. A good summary article is here.

A recent article in Nature, while not looking at historical data, supports McDougall's contention that running shoes can be more harmful than bare feet when running. The article is lead-authored by Daniel Lieberman, a big advocate of barefoot running, so his bias may have been to look at things he believed were helpful about barefoot running and not at aspects of barefoot running that may be harmful. The article looks at impact forces and not at injuries, and doesn't consider that runners with shoes may be able to change their stride to reduce the impact forces (McDougall says this is hard to do with running shoes, and, from my own experience, I tend to agree, though I don't think it is impossible).

The statistical net-net is that there is no direct evidence either way right now. I admit some bias, but I would say that the lack of evidence, given the power and money behind the shoe industry, tends to make me believe that, at best, fancy shoes are no better than bare feet, because if there were an effect in favor of shoes, I would certainly think we'd have seen a study by now (this is something correctly pointed out by McDougall and other advocates of barefoot running). Therefore, don't be surprised if you see me running with feet au naturel someday soon.

Wednesday, March 30, 2011

Detecting cheating

In my professional work, I like being the statistical sleuth, trying to figure out whether a person or company cheated, and how much they cheated. Thus it was with a lot of interest that I read a recent article in USA Today describing suspicious activity on some standardized tests in DC schools.

It seems that standardized test scores at certain DC schools have improved dramatically. For example, the article says, "in 2008, 84% of fourth-grade math students were listed as proficient or advanced, up from 22% for the previous fourth-grade class." Of course, this could just be part of the amazing turnaround.

However, the review found that this dramatic change corresponded with another interesting statistic: the school had a very high number of erased answers that were changed from wrong answers to right answers (WTR erasures). Again, here's what the article said: "On the 2009 reading test, for example, seventh-graders in one Noyes classroom averaged 12.7 wrong-to-right erasures per student on answer sheets; the average for seventh-graders in all D.C. schools on that test was less than 1. The odds are better for winning the Powerball grand prize than having that many erasures by chance."

Here's my problem with this logic: the calculation of the chances assumes that each student is acting independently and erasing much more than usual. In other words, the chances are calculated assuming that the students are randomly grouped by school with respect to the number of WTR erasures they have, and thus no school should have a particularly high or low number of erasures: number of erasures and the associated school would be statistically independent.

This statistical independence assumption falls apart if there is cheating, wherein teachers erase wrong answers and change them to correct answers after the test is completed. However, the statistical independence assumption could also fall apart for innocuous reasons.

Suppose the students at this school were instructed to arbitrarily fill in the last 10 questions immediately upon beginning the exam (this might be a good strategy if there is no penalty for guessing and if many students do not finish the exam). Then those who get to the end of the test erase most of their guesses. This is a completely legitimate strategy, but it would raise the number of WTR erasures a great deal. A lot of more complicated test-taking strategies would also lead to more erasures, and if this school in particular taught those strategies, there would be a very high chance of far more erasures at this school than at others. Some of the people interviewed cited strategies that may have led to more erasures.

Thus, the high erasure rate, even WTR erasures, may have a relatively simple explanation: this school effectively coached the kids in test taking while other schools did not or coached the children differently.
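To get a feel for how much a prefill-and-correct strategy could raise WTR counts, here is a minimal sketch of the expected number of WTR erasures per student. Every input is hypothetical (the number of answer choices, the share of prefilled questions a student reaches, and the chance of answering correctly once there); none of them come from the article or the DC data.

```python
def expected_wtr_erasures(prefilled, n_choices, reach_rate, p_correct):
    """Expected wrong-to-right erasures per student from prefilled guesses.

    A prefilled guess becomes a WTR erasure when the guess was wrong,
    the student later reaches the question, and then answers it correctly.
    """
    p_guess_wrong = 1 - 1 / n_choices
    return prefilled * p_guess_wrong * reach_rate * p_correct

# Hypothetical inputs: 10 prefilled questions, 4 answer choices,
# 80% of prefilled questions reached, 70% answered correctly once reached.
wtr = expected_wtr_erasures(prefilled=10, n_choices=4, reach_rate=0.8, p_correct=0.7)
print(f"Expected WTR erasures per student: {wtr:.1f}")
```

Under these made-up inputs the strategy alone would produce several WTR erasures per student, which is the point of the argument: a school-wide coaching policy can shift the whole distribution without anyone cheating.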

The article provides a link to several documents summarizing the results of the analysis. What I find interesting is that the worst school in terms of WTR erasures, BS Monroe ES, also has a lot of WTW (wrong-to-wrong) erasures. On average, this school has between 2 and 3 WTW erasures per student, or about 1 WTW for every 5 WTR erasures. A more interesting, and I think more revealing, analysis would be to see how this ratio compares to the normal ratio. If the normal ratio is 1 WTW to 5 WTR, it indicates cheating may not have been the reason for the erasures (unless the cheaters were purposefully erasing some answers and changing them to wrong answers--which seems unlikely since there is no indication potential cheaters realized erasures could be detected at all). If the general ratio is far from 5 to 1, it would be another indicator of a different process going on at BS Monroe ES, perhaps involving cheating, though it is still hard to rule out other, innocuous explanations that involve test-taking strategy.

Another analysis would be to look at the WTR vs. WTW erasures student by student. Presumably, students who answered a higher percentage of un-erased problems correctly would have a better ratio of WTR to WTW erasures. If that were not true, then it would lead more clearly to the conclusion that someone else was doing the erasing.

The research revealed in the article shows the correlation of two things: a dramatic increase in test scores and a dramatic number of WTR erasures. Cheating is one explanation for these increases. Another, however, is the implementation of a smart test-taking strategy at the school, which might well be part of an overall program to increase the test scores and improve the school. A statistical test can have a seemingly dramatic result (less likely than winning the lottery), but while defeating a specific hypothesis (independence of erasures by school), it doesn't necessarily prove another hypothesis (cheating).




Tuesday, September 28, 2010

Throw away your cold medicine again?

A couple years ago, I wrote about a study that looked at the effect of a seawater nasal spray on the health of children (see that post).
Yesterday's New York Times explored a very similar claim. Anahad O'Connor's column, "Really? The Claim: Gargling With Salt Water Can Ease Cold Symptoms," looks at a study of 387 Japanese adults aged 18 to 65 (see this page for an abstract). Treatment groups gargled with PLAIN water or a "povidone-iodine" solution. Those gargling with plain water did the best, with 0.17 URTIs (upper respiratory tract infections) every 30 person-days, meaning about 1 in 6 get a URTI per month if they gargle with water. The control group had a rate of 0.26, meaning about 1 in 4 got a URTI. The iodine group had a rate of 0.24, also meaning about 1 in 4 got a URTI.
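The "1 in N" conversions can be checked with a short sketch. The three rates are the ones quoted above; treating a rate per 30 person-days as a simple monthly chance is an approximation, and the function names are illustrative only.

```python
# Rates quoted in the post: URTIs per 30 person-days.
rates = {"control": 0.26, "water": 0.17, "povidone-iodine": 0.24}

def one_in(rate):
    """Express a monthly rate as the N in 'about 1 in N'."""
    return 1 / rate

def relative_reduction(treated, control):
    """Fractional drop in the rate relative to the control group."""
    return (control - treated) / control

for group, rate in rates.items():
    print(f"{group}: about 1 in {one_in(rate):.1f} per month")

water_drop = relative_reduction(rates["water"], rates["control"])
print(f"water vs control: {water_drop:.0%} lower")
```

The water group's rate works out to roughly a third lower than the control group's, which is why the self-reporting caveat matters so much.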

So water looks pretty good. The only caveat, and it is the same as the issue I mentioned in the earlier post, is that the outcomes were self-measured. The people doing the gargling reported whether or not they had a URTI. In Japan, where the study was performed, there is a strong bias toward water gargling, at least according to the abstract of the study, which says: "Gargling to wash the throat is commonly performed in Japan, and people believe that such hygienic routine, especially with gargle medicine, prevents upper respiratory tract infections (URTIs)." In fact, the article reports that those in the control group gargled one time a day on average as well (but those in the treated group gargled around 3 times a day). This affinity for water gargling and the belief that it stops infection may result in water garglers reporting fewer infections, thus throwing the results of the study into question.

The New York Times, by the way, gives recommendations based on an upcoming book by Philip Hagen, to gargle with *salt* water, but cites this study, which is referring to *plain* water only.

My conclusion? If you THINK it is going to work, it's fairly likely water gargling will be effective, and it is a lot cheaper than buying some kind of preventative medicine. If you don't think it will work, this study provides little help in deciding whether it actually will work.

Monday, March 15, 2010

You asked for it, you got it. Toyota!

I think that's how the ad line went. When? Maybe 25 years ago.

Well, it seems to apply now. Sudden acceleration. Mention a problem with a car, any problem with any car, and people will start crawling out of the woodwork with the same complaint. Why? It's a numbers game. There were more than 100,000 pri-i(?) sold in the US in 2005-9. With that many people driving them around, any tiny problem that is reported is going to be "substantiated" by others. Those of us old enough to remember the Audi 5000 found the high correlation between those Audis with sudden acceleration and those sold to 85-year-old ladies inexplicable (studies mostly concluded it was driver error--see a recent article here in Wired).

The latest, after the brake-related Prius recall, is the claim of sudden acceleration. A guy in California managed to call 911 while it was happening--pretty amazing, huh? Unless, of course, you made it all up. Here's what the current thoughts about it are (from Wikipedia):
"On March 8, 2010, a 2008 Prius allegedly uncontrollably accelerated to 94 miles per hour on a California Highway (US), and the Prius had to be stopped with the verbal assistance of the California Highway Patrol as news cameras watched [86]. Subsequent to the event, media investigations uncovered suspicious information about the alleged runaway Prius driver, 61-year old James Sikes, including false police reports, suspect insurance claims, theft and fraud allegations, television aspirations, and bankruptcy.[87][88] Sikes was found to be US$19,000 behind in his Prius car payments and had $US700,000 in accumulated debt.[87] Sikes stated he wanted a new car as compensation for the incident.[87][89] Analyses by Edmunds.com and Forbes found Sikes' acceleration claims and fears of shifting to neutral implausible, with Edmunds concluding that "in other words, this is BS",[90] and Forbes comparing it to the balloon boy hoax.[88]"

Notwithstanding the apparent CA tale above, the reality is that the rare problem is a tough nut to crack statistically. Suppose there is an issue in 1 in 10,000 Priuses and that this issue only crops up on 1 in 10,000 rides in those cars. Thus, it shows up on 1 in 100 million Prius rides. Even among those, it may be a very short-lived problem and not cause any injury or accident. Such a rare problem might be drowned out by other driver-error problems, such as accidentally hitting the gas instead of the brake, perceiving that the car is accelerating when it is not, or hitting both the gas and the brake simultaneously in an attempt to hit the brake. Each of these things can be exceedingly rare (1 in a million) and still be 100 times as common as the real problem.
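The arithmetic in this supposition can be sketched with exact fractions. The probabilities are the hypothetical ones from the paragraph above, not measured rates, and the variable names are illustrative.

```python
from fractions import Fraction

# Hypothetical figures from the text, not real defect rates.
p_car_affected = Fraction(1, 10_000)         # share of cars with the issue
p_ride_given_affected = Fraction(1, 10_000)  # share of rides on which it appears
p_driver_error = Fraction(1, 1_000_000)      # one kind of driver error, per ride

# Chance that a randomly chosen ride shows the real problem.
p_real_problem = p_car_affected * p_ride_given_affected

# How many driver-error incidents to expect per genuine incident.
ratio = p_driver_error / p_real_problem

print(p_real_problem)  # 1/100000000
print(ratio)           # 100
```

With several distinct kinds of driver error, each at this level, the genuine incidents would be an even smaller share of the reports.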

There are other ways to go about teasing out rare events. In the lab, a machine could possibly simulate the conditions that were occurring when the supposed sudden acceleration took place and see if it is repeatable. Yet these conditions are hard to figure out, as they are determined with the imperfect information of the person reporting the incident. As might be the case with the recent report, that person could be lying, but even if not, they are likely shaken up enough that they cannot remember the exact conditions very well. Consider airline crashes, where we often have very objective information (the black box), but it is still very difficult to figure out what happened and why.

One thing seems certain to be true: we won't know whether or not Prius cars are at fault for a long time to come, and far fewer of them will be bought in the next couple years.

Wednesday, December 9, 2009

More germs = less disease?

So says an article in today's Science Daily, which reports on a recent study at Northwestern of children from the Philippines. The study finds that children from the Philippines have much lower levels of C-reactive protein (CRP), which indicates better resistance to disease. Exposure to germs was much higher for the children in the Philippines.

So what's wrong with this study? It's a very tenuous association, and from what I can gather in the articles, no attempt was made to ensure the children in the U.S. that were compared to the children in the Philippines were similar in other ways. They might be different in CRP due to other environmental or hereditary factors. Perhaps it's the weather? The diet? One of any number of things could account for the difference.

In addition, the study appears to ignore the much higher infant mortality rate and much lower life expectancy in the Philippines (you can try www.indexmundi.com for life expectancy and other information by country). In other words, even if higher germ exposure does mean lower CRP, does it actually mean less disease and longer life? The broad indication is that it does not.

In order for the study to be valid, it needs to adjust for whatever inherent differences (in addition to germ exposure) exist between Filipino and US children, and then see if CRP levels are still different. An even better way to do such a study would be to study children living in similar environments (same place, socio-economic situation, etc.) and determine if the ones exposed to more germs had lower levels of CRP when they reached adulthood.

I've seen articles (see this for example, but I can't find a more definitive one at this time) that indicate that children with early exposure to farm animals have fewer allergies, but nothing showing exposure to more serious germs is good. And some of the germs that we are exposed to are more than just common germs--they are deadly. It might be that those who are exposed to these deadly germs early, and live, are much better off later in life, but that is no reason to expose them to those germs unnecessarily. Of course, you wouldn't give your child a deadly disease so that, if they survived, they'd be resistant to it later in life.

We live in a society that is sometimes alarmist concerning germs, and I have written about this. Yet this doesn't mean that, on the whole, a clean environment does not promote good health, and the article cited above seems to have only the most tenuous of indications that it may not.

Thursday, October 29, 2009

Why Swine Flu is not a bunch of hogwash

This updates my previous blog: "Why Swine flu is a bunch of hogwash?"

Things have changed a bit in the months since that blog, and the hysteria I cited has leveled off. President Obama did declare a swine flu emergency a couple days ago, but I think that was a good idea.

Here is what has changed:
1) Swine flu deaths have been at epidemic levels the last three weeks. The chart below (from the CDC) shows flu and pneumonia deaths as a percentage of all deaths. The upper black line indicates epidemic level, and the red line is the current level. The graph shows four years of weekly figures. While this graph doesn't look too serious, and 2008 levels were much further above the threshold at their peak, the scary thing here is that it is so early in the season. This graph serves as a reminder, too, that every year the flu kills thousands of people, and the flu vaccine could prevent a large number of those deaths.

2) Hospitals are already getting crowded. One of the big problems with a real epidemic is the overcrowding of hospitals. This means that the really sick people cannot get treatment, and that is part of the reason the emergency was declared. See this article in USA Today about overcrowding. OK, so it's USA Today, a paper that loves hyperbole, but, again, it's early in the season and any indication of overcrowding at this point is scary.

3) The vaccine is not yet fully available. The regular flu vaccine has been out for weeks. Unfortunately, almost none of the flu this year seems to be covered by that vaccine. The majority seems to be 2009 H1N1 (the swine flu). See this chart for a breakdown. Note the orange/brown is 2009 H1N1, and note the yellow means it is not tested for sub-type, so almost all typed flu is swine flu.
That's why I am worried. The other concern is that, even when the vaccine does come out, people won't take it. See my brother's blog about why you should and the crazies who say you should not.

Thursday, October 15, 2009

Redskins are lucky to play bad teams, but how lucky?

A recent article in Yahoo Sports pointed out that the Washington Redskins are the first team in history to play six winless teams in a row. Here is their schedule so far (also according to the article cited above):

Week 1 -- at New York Giants (0-0)

Week 2 -- vs. St. Louis Rams (0-1)

Week 3 -- at Detroit Lions (0-2)

Week 4 -- vs. Tampa Bay Buccaneers (0-3)

Week 5 -- at Carolina Panthers (0-3)

Week 6 -- vs. Kansas City Chiefs (0-5)

The author of the article, Chris Chase (or, as he notes, his dad--let's call him Mr. Chase), calculates the odds of this as 1 in 32,768. This calculation is incorrect, and it makes the streak look far less likely than it really is, for several reasons, which I get to below. But first, let me explain how the calculation was likely performed.

The calculation assumes, plausibly, that the Redskins have the same chance of playing any given team (unlike some college teams, who purposely make their schedules easy, this is not possible in the NFL).

The calculation also assumes, not plausibly, that teams that have thus far won no games have a 50-50 chance of winning each game. The implicit assumption there is that all NFL teams are evenly matched. The fact is that there are a few really good teams, a few really bad teams, and a bunch of teams in the middle. Thus, there are likely to be a bunch of winless teams after 5 games, and not, as the incorrect calculation below implies, only 1 winless team of 32 after 5 games.

Finally, the calculation, apparently in a careless error, assumes the chances of playing a winless team the first week are 50-50, when, of course, all teams are winless the first week.

So Mr. Chase's (incorrect) calculation is:
Week 1 chances: 50% (1 in 2)
Week 2 chances: 50% (1 in 2)
Week 3 chances: 50%*50%=25% (1 in 4)
Week 4 chances: 50%*50%*50%=12.5% (1 in 8)
Week 5 chances: 50%*50%*50%=12.5% (1 in 8; same as week 4 because the team they played had only played three games)
Week 6 chances: 50%*50%*50%*50%*50%=3.125% (1 in 32)

A law of probability is that the chance of two unrelated events happening is the product of their individual chances. Thus, if the chance of rain today is 50% and the chance of rain tomorrow is 50%, the chance of rain both days is 25%, if those chances are unrelated (which, by the way, they probably aren't). This is why the chances for multiple losses are multiplied together.

But back to the football schedule. To calculate the chances of 6 straight games against winless teams, Mr. Chase reasonably multiplied the 6 individual chances (again, this assumes the 6 matchups are unrelated):
50% * 50% * 25% * 12.5% * 12.5% * 3.125% = .003%, or 1 in 32,768.
So, 32,768 is the number reported in the article.

The easy correction is that the chance of playing a winless team in the first game is 100%, so the calculation should be:
100% * 50% * 25% * 12.5% * 12.5% * 3.125% = .006%, or 1 in 16,384.
This error has been pointed out in comments on the article.

In addition, other comments point out the other major flaw: teams do not have equal probability of losing. Thus the chance that a team will be, say, 0-2 is not 25% (50%*50%) but something else, depending on the quality of the teams. At the extreme, half the teams lose every game and half win every game (this of course assumes losing teams only play against winning teams, but it is possible).

The reality is certainly not this extreme, which would imply a 50-50 chance each week of playing all losing teams (and thus a 1 in 32 chance of playing 6 in a row). So, how do we figure out the reality?

The easiest way is to look at, each week, the percent of teams that are winless. If we assume the Redskins have an equal chance of playing each team, then we can compute the odds each week (click on the week to see the linked source). Note that everything is out of 31 teams instead of 32 because the Redskins can't play themselves.

Week 1: 31 out of 31 teams winless. Chances: 31/31 = 100%
Week 2: 15 out of 31 teams winless. Chances: 15/31 = 48% (I am assuming no byes the first week, and I know the Redskins lost their first game).
Week 3: 8 out of 31 teams winless. Chances: 8/31 = 26%
Week 4: 6 out of 31 teams winless. Chances: 6/31 = 19%
Week 5: 6 out of 31 teams winless. Chances: 6/31 = 19%
Week 6: 4 out of 31 teams winless. Chances: 4/31 = 13%

So the actual chances, assuming the Redskins have an equal chance of playing each team each week and cannot play themselves, are: 100%*48%*26%*19%*19%*13% = 0.06%, or 1 in about 1,700. Much more likely than 1 in 32,000 but still pretty unlikely.
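The same product can be computed from the unrounded weekly fractions. The winless counts are the ones listed above; the equal-chance-of-any-opponent assumption is the one stated in the text, and the names are illustrative.

```python
from fractions import Fraction
from math import prod

OTHER_TEAMS = 31  # the Redskins cannot play themselves

# Number of winless teams (excluding the Redskins) in weeks 1-6, as listed above.
winless_by_week = [31, 15, 8, 6, 6, 4]

# Chance each week of drawing a winless opponent, assuming every opponent is equally likely.
weekly = [Fraction(n, OTHER_TEAMS) for n in winless_by_week]
streak = prod(weekly)

print(f"{float(streak):.2%}, or about 1 in {round(1 / streak):,}")
```

Working from the unrounded fractions gives a figure slightly under 1 in 1,700, consistent with the estimate from the rounded percentages.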

And after all these easy games, how are they doing? Unluckily for Redskins fans, not too well...they're 2-3 going into Sunday's game against the winless Chiefs.