What are the chances?

2020-07-17T13:38:00.004-07:00

Reminder to check my website (https://salthillstatistics.com/blog ) for recent posts.
However, here's a preview of the most recent one (https://salthillstatistics.com/posts/80 ):
CUOMO IS NOT YOUR COVID HERO
July 16, 2020 By Alan Salzberg

Our dear leader in orange is clearly a villain with respect to COVID, whose response has ranged from nothing to blaming others. So it is natural that we need heroes. Anthony Fauci has proven worthy over and over, for decades, actually. There are many other scientists who sounded the alarm, also.

When it comes to political leaders, Cuomo has gotten a lot of press recently. This article today (https://www.cnn.com/2020/07/15/health/coronavirus-under-control-states/index.html) lauds New York's turnaround from "worst to first" and Cuomo says "this wasn't only about what government did. This was about what people did." Cuomo put that false modesty aside when he created a ridiculous piece of self-congratulatory graphic art: https://www.nytimes.com/2020/07/14/arts/design/cuomo-covid-poster-new-york.html.

New York has won only this title thus far in the fight against COVID: Worst in the world.

... (see full post here: https://salthillstatistics.com/posts/80 )

Vaccines are good, but this article about them isn't!

2020-01-15T13:10:00.000-08:00

Note: most of my posts to this blog can also be found at my website: https://salthillstatistics.com/blog

The evidence showing that vaccines are safe and that they save lives is generally overwhelming, so I'm always pleased to see another article reviewing the data behind them. I figure such articles will lead to even more people being vaccinated and more lives saved.

However, I was disappointed that a recent New York Times article did the statistics so poorly. The article compares 10,000 people who got various diseases with 10,000 people who were vaccinated. This comparison is inappropriate, because most people who do not get vaccinated do not get the disease they are being vaccinated for, and, especially for diseases like the flu, many people who do get vaccinated get the disease they were vaccinated for. A proper comparison would compare some number of people who were vaccinated against the same number who were not vaccinated.

I use the CDC figures and the figures presented in the NY Times article to do just that, focusing on the flu, since it is by far the most common illness mentioned in the article. Millions of people in the US get the flu every year and typically tens of thousands die from it.

It is known that the flu vaccine reduces the chances of getting the flu by about 50% (see https://www.cdc.gov/flu/vaccines-work/vaccineeffect.htm). The percentage of people who get the flu varies quite a bit from year to year. In 2017-2018, an estimated 45 million people got the flu (more than 10% of the population) but in 2011-2012, only about 9 million people got the flu (see https://www.cdc.gov/flu/about/burden/index.html, which also has hospitalization and death rates). So, we'll assume a year that is somewhere between the best and worst years, where about 1 in 13 people get the flu. Given a vaccination rate of 50% (sadly it has been below this recently), that means you have about a 1 in 20 chance of getting the flu if you get the vaccine and a 1 in 10 chance if you do not.
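As a rough check on those two figures, here is a minimal Python sketch of the arithmetic. The function name and argument names are mine, purely for illustration; the inputs are the assumptions stated above (1 in 13 overall, half the population vaccinated, vaccine halves the risk).

```python
def flu_risk(overall_rate, vaccinated_share, efficacy):
    """Split an overall attack rate into unvaccinated and vaccinated risks.

    Solves: overall = (1 - share) * p_unvax + share * p_unvax * (1 - efficacy)
    """
    weight = (1 - vaccinated_share) + vaccinated_share * (1 - efficacy)
    p_unvax = overall_rate / weight
    p_vax = p_unvax * (1 - efficacy)
    return p_unvax, p_vax


# Assumed inputs from the paragraph above
p_unvax, p_vax = flu_risk(overall_rate=1 / 13, vaccinated_share=0.5, efficacy=0.5)
print(f"unvaccinated: about 1 in {1 / p_unvax:.1f}")
print(f"vaccinated:   about 1 in {1 / p_vax:.1f}")
```

The exact answers are 1 in 9.75 and 1 in 19.5, which I have rounded to 1 in 10 and 1 in 20.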

Now let's consider the effect of everyone in the US (about 300 million people) not getting vaccinated versus getting vaccinated. The following table summarizes the results (using the per 10,000 people figures found in the NY Times article but projecting them to the full population).

This shows that vaccinating everyone would mean fewer than 270,000 hospitalizations (versus 540,000 if no one was vaccinated) and fewer than 21,000 deaths (versus 42,000). The only area where the vaccine is worse is "other bad effects," where I am grouping allergic reactions and Guillain-Barré Syndrome. In this case, about 800 more people might suffer these (sometimes) very serious side effects. However, this pales in comparison to the more than 20,000 lives saved by the vaccine annually.

Even these huge benefits are likely understated. If everyone were to get the flu vaccine, the flu would likely spread less, since many vaccinated people who currently catch the flu get it from people who are unvaccinated (in 2017-2018 only about 37% of adults were vaccinated: https://www.cdc.gov/flu/fluvaxview/coverage-1718estimates.htm). Also, vaccinations have been shown to reduce the severity of the flu for those who get it (see here again).

So overall, despite the poor comparisons in the article, which seem to imply no one would die from the flu if vaccinated, the benefit of the flu vaccine is still overwhelming.

check my website salthillstatistics.com for more recent posts

2016-11-30T16:03:00.000-08:00

I am posting blogs on my website now. www.salthillstatistics.com

bridge splits re-visited

2015-10-21T07:09:00.001-07:00

A couple years back, I wrote on the chances of various "splits" in bridge (and explained why this is something bridge players care about) in this post, which also explains the math behind the chances.

However, in that post, I failed to include the possibility of 7 trumps being out, because it is fairly rare. Due to some poor bidding on my part, I found myself playing 4 spades last night, and my partner and I only had 6 trumps between us. Here are the chances of the different splits of 7 trumps that are out, between the other two players.

4-3 split: 62.2%
5-2 split: 30.5%
6-1 split: 6.8%
7-0 split: 0.5%

For completeness, here are splits with 6 and fewer (from the prior post).
For hands with 6 trumps out:
3-3 split: 35.5%
4-2 split: 48.4%
5-1 split: 14.5%
6-0 split: 1.5%

For hands with 5 trumps out, we get:
3-2 split: 67.8%
4-1 split: 28.3%
5-0 split: 3.9%

For hands with 4 trumps out:
2-2 split: 40.7%
3-1 split: 49.7%
4-0 split: 9.6%

For hands with 3 trumps out:
2-1 split: 78%
3-0 split: 22%

For hands with 2 trumps out:
1-1 split: 52%
2-0 split: 48%
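
For anyone who wants to reproduce these tables, here is a short Python sketch. It uses the same combinations formula as the prior post (which used R's choose); math.comb is the Python counterpart. The function name split_probability is just my label for this example.

```python
from math import comb


def split_probability(trumps_out, a, b):
    """Chance that the missing trumps divide a-b between the two opponents."""
    if a + b != trumps_out:
        raise ValueError("a and b must add up to trumps_out")
    ways = comb(13, a) * comb(13, b)
    if a != b:
        ways *= 2  # either opponent can be the one holding the longer side
    return ways / comb(26, trumps_out)


for n in range(7, 1, -1):
    print(f"For hands with {n} trumps out:")
    for b in range(n // 2, -1, -1):
        a = n - b
        print(f"  {a}-{b} split: {split_probability(n, a, b):.1%}")
```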

It's worth mentioning that these probabilities are unconditional. Since the bidding that precedes playing any given hand gives some information, it is typically true that some splits can be ruled out or downplayed. For example, in the 4 spade hand I played last night, a 5-2 or (especially) worse split seemed unlikely, because there was no double from the other side, so I would've put the chances of a 4-3 split far higher than the unconditional 62%.

See my new posts on my web site

2015-09-04T10:22:00.002-07:00

My newer posts (and some of the old ones) are now on my website:
http://salthillstatistics.com/blog.php Salt Hill Blog

Ultimate Frisbee: to Huck or not to Huck?

2015-02-15T12:05:00.001-08:00

I play a lot of Ultimate Frisbee, a game akin to football in that there are end zones, but akin to soccer in that there is constant action until someone scores. In Ultimate, you can only advance by throwing the disc (so-called because we generally do not use Wham-O branded discs, which are called Frisbees). An incomplete pass or a pass out of bounds is a turnover, as is a "stall," where the offense holds the disc without throwing for more than 10 seconds.

In other words, in order for the offense to score, you need to complete passes until someone catches the disc in the end zone. The accepted method of doing this is to complete shorter, high-percentage passes. On a non-windy day, it seems fairly simple for at least one of your six teammates to get open and thus you can march down the field. Of course, one long pass, or "huck," can shortcut the process and give your team the quick score. Much like football, the huck is not typically done except in desperation (game almost over due to time or thrower almost stalled).

However, I am not at all sure this logic makes sense. Suppose you need six short passes to advance to a score. If your team completes short passes with a probability of 90%, you will score about 53% of the time (90% to the sixth power gives the chances of completing six passes in a row). In other words, as long as the chance of completing the huck is more than 53%, you would have a better chance of scoring with a huck.

Thus, the relative chances of scoring via the two methods depend on three things: 1) chance of completing a short pass, 2) chance of completing a huck, and 3) number of short passes needed for a score. The graph below shows the threshold huck completion rate (the rate at which it makes more sense to huck) for different short pass completion rates and always assuming 6 short passes is enough for a score and one huck is enough for a score.
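
The threshold itself is easy to compute. Here is a small Python sketch under the same simplifying assumptions (independent passes, a fixed number of short passes per score, one huck per score); the function name and the list of rates are mine, chosen only for illustration.

```python
def huck_threshold(short_pass_rate, passes_needed=6):
    """Huck completion rate above which one long throw beats a chain of short passes."""
    return short_pass_rate ** passes_needed


for rate in (0.80, 0.85, 0.90, 0.95, 0.99):
    threshold = huck_threshold(rate)
    print(f"short-pass rate {rate:.0%}: huck if its completion chance exceeds {threshold:.0%}")
```

Passing a different passes_needed shows how quickly the threshold drops when more short passes are required.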

Of course, this simple analysis assumes 6 throws equals a score, and it also leaves out a number of other factors. For example, an incomplete huck confers a field advantage to the hucking team because the opposing team has to begin from the point of incompletion (as long as it was in-bounds). On the other hand, it may not take long for the opposing team to figure out the hucking strategy and play a zone style defense that will lower the hucking chances considerably.

What is a p value and why doesn't anyone understand it?

2014-09-30T15:11:00.001-07:00

I feel like I've written this too many times, but here we go again.
There was a splendid article in the New York Times today concerning Bayesian statistics, except that, as usual, it had some errors.

Lest you think me overly pedantic, I will note that Andrew Gelman, the Columbia professor profiled in much of the article, has already posted his own blog entry highlighting a bunch of the errors (including the one I focus on) here.

Concerning p-values the article states:
"accepting everything with a p-value of 5 percent means that one in 20 “statistically significant” results are nothing but random noise." This is nonsense. I found this nonsense particularly interesting because I recently read almost the same line in a work written by an MIT professor.

P-value explained in brief

Before I get to explaining why the Times is wrong, I need to explain what a p-value is. A p-value is a probability calculation, first of all. Second of all, it has an inherent assumption behind it (technically speaking, it is a conditional probability calculation). Thus, it calculates a probability assuming a certain state of the world. If that state of the world does not exist, then the probability is inapplicable.

An example: I declare: "The probability you will drown if you fall into water is 99%." "Not true," you say, "I am a great swimmer." "I forgot to mention," I explain, "that you fall from a boat, which continues without you to the nearest land 25 miles away...and the water is 40 degrees." The p-value is a probability like that -- it is totally rigged.

The assumption behind the p-value is often called a Null Hypothesis. The p-value is the chance of obtaining your particular favorable research result, under the "Null Hypothesis" assumption that the research is garbage. It is the chance that, given your research is useless, you obtained a result at least as positive as the one you did. But, you say, "my research may not be totally useless!" The p-value doesn't care about that one bit.

More detail using an SAT prep course example
Suppose we are trying to determine whether an SAT prep course results in a better score for the SAT. The Null Hypothesis would be characterized as follows:
H0: the average change in score after the course is 0 points or even negative. In shorthand, we could call the average change in score D (for difference) and say H0: D<=0. Of course, we are hoping the course results in a higher score, so there is also a research hypothesis: D>0. For the purposes of this example, we will assume the change that occurs is wholly due to the course and not to other factors, such as the students becoming more mature with or without the course, the later test being easier, etc.

Now suppose we have an experiment where we randomly selected 100 students who took the SAT and gave them the course before they re-took the exam. We measure each student's change and thus calculate the average d for the sample (I am using a small d to denote the sample average while the large D is the average if we were to measure it across the universe of all students who ever existed or will exist). Suppose that this average for the 100 students is a score increase of 40 points. We would like to know, given the average difference, d, in the sample, is the universe average D greater than 0? Classical statistics neither tells us the answer to this question nor does it even give the probability that the answer to this question is "yes."

Instead, classical statistics allows us only to calculate the p-value: P(d>=40| D<=0). In words, the p-value for this example is the probability that the average difference in our sample is 40 or more, given that the Universe average difference is 0 or less (Null Hypothesis is true). If this probability is less than 5%, we usually conclude the Null Hypothesis is FALSE, and if the Null Hypothesis were in fact true, we would be incorrectly concluding statistical significance. This incorrect conclusion is often called a false positive. The chance of a false positive can be written in shorthand as P(FP|H0), where FP is false positive, "|" means given, and H0 means Null hypothesis. (Technically, but not important here, we calculate the probability at D=0 even though the Null Hypothesis covers values less than zero, because that gives the highest (most conservative) value.) If the threshold for statistical significance is set at a p-value of 5%, that means P(FP|H0)=5%.
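
To make P(d>=40| D<=0) concrete, here is a minimal Python sketch. It needs one number I have not given anywhere in this post: the standard deviation of the individual score changes. The value of 200 points below is purely an assumption for illustration, as is the use of a normal approximation for the sample average; the function name is mine as well.

```python
from math import erfc, sqrt


def one_sided_p_value(sample_mean, null_mean, sd, n):
    """P(sample average >= sample_mean | universe average = null_mean), normal approximation."""
    standard_error = sd / sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    return 0.5 * erfc(z / sqrt(2))  # upper-tail area of the standard normal


# sd=200 is an assumed value, not something measured in the example
p = one_sided_p_value(sample_mean=40, null_mean=0, sd=200, n=100)
print(f"p-value: {p:.4f}")
```

With that assumed spread, the p-value comes out a bit above 2%, below the 5% threshold. A larger assumed standard deviation would push it back above 5%.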

A more general way of defining the p-value is that the p-value is the chance of obtaining a result at least as extreme as our sample result under the condition/assumption that the Null Hypothesis is true. If the Null Hypothesis is false (in our example if the universe difference is more than 0), the p-value is meaningless.

So why do we even use the p-value? The idea is that if the p-value is extremely small, it indicates that our underlying Null Hypothesis is false. In fact, it says either we got really lucky or we were just assuming the wrong thing. Thus, if it is low enough, we assume we couldn't have been that lucky and instead decide that the Null Hypothesis must have been false. BINGO--we then have a statistically significant result.

If we set the level for statistical significance at 5% (sometimes it is set at 1% or 10%), p-values at or below 5% result in rejection of the Null Hypothesis and a declaration of a statistically significant difference. This mode of analysis leads to four possibilities:
False Positive (FP), False Negative (FN), True Positive (TP), and True Negative (TN).
False Positives occur when the research is useless but we nonetheless get a result that leads us to conclude it is useful.
False Negatives occur when the research is useful but we nonetheless get a result that leads us to conclude that it is useless.
True Positives occur when the research is useful and we get a result that leads us to conclude that it is useful.
True Negatives occur when the research is useless and we get a result that leads us to conclude that it is useless.
We only know if the result was positive (statistically significant) or negative (not statistically significant)--we never know if the result was TRUE (correct) or FALSE (incorrect). The p-value limits the *chance* of a false positive to 5%. It does not explicitly deal with FN, TP, or TN.

Back to the Question of how many published studies are garbage, but it gets a little technical
Now, back to the quote in the article: "accepting everything with a p-value of 5 percent means that one in 20 “statistically significant” results are nothing but random noise."
Let's consider a journal that publishes 100 statistically significant results regarding SAT courses that improve scores and statistical significance is based on p-values of 5% or below. In other words, this journal published 100 articles with research showing that 100 different courses were helpful. What number of these courses actually are helpful?
Given what we have just learned about the p-value, I hope your answer is 'we have no idea.' There is no way to answer this question without more information. It may be that all 100 courses are helpful and it may be that none of them are. Why? Because we do not know if these are all FPs or all TPs or something in-between--we only know that they are positive, statistically significant results.

To figure out the breakdown, let's do some math. First, create an equation, using some of the terminology from earlier in the post.
The number of statistically significant results = False positives (FP) plus True positives (TP). This is simple enough.

We can go one step further and define the probability of a false positive given the Null hypothesis is true and the probability of a true positive given the alternative hypothesis is true -- P(FP|H0) and P(TP|HA). We know that P(FP|H0) is 5% -- we set this by only considering a result statistically significant when the p-value is 5% or below. However, we do not know P(TP|HA), the chances of getting a true positive when the alternative hypothesis is true. The absolute best case scenario is that it is 100%--that is, any time a course is useful, we get a statistically significant result.

Suppose that we know that a fraction B of courses are bad and the remaining fraction (1-B) are helpful. Bad courses do not improve scores and helpful courses do. Further, let's suppose that N courses in total were considered, in order to get the 100 with statistically significant results. In other words, a total of N studies were performed on courses and those with statistically significant results were published by the journal. Let's further assume the extreme concept above that ALL good courses will be found to be good (no False Negatives), so that P(TP|HA)=100%. Now we have the components to figure out how many bad courses are among the 100 publications regarding helpful courses.

The number of statistically significant results is:
100 = B*N*P(FP|H0) + (1-B)*N*P(TP|HA)
The first term just multiplies the (unknown) fraction of courses that are bad by the total studies performed by the fraction of studies on bad courses that will give the false positive result that says the course is good. The second term is analogous, but for good courses that achieve true positive results. These reduce to:
100 = N(B*5% + (1-B)*100%) [because the FP chances are 5% and TP chances are 100%]
= N(.05B + 1 - B) [algebra]
= N(1 - .95B) [more algebra]
==> B = (20/19)*(1 - 100/N) [more algebra]
The number of bad courses published equals B*N*P(FP|H0), which in turn equals (1/19)*(N-100) [using more algebra].

If you skipped the algebra, what this comes down to is that the number of bad courses published depends on N, the total number of different courses that were researched.
If N were 100, then 0 of the publications were garbage and all 100 were useful.
If N were 1,000, then about 947 were garbage, about 47 of which were FPs and thus among the 100 publications. So 47 garbage courses were among the 100 published.
If the total courses reviewed were 500, then about 421 were garbage, about 21 of which were FPs and thus among the 100 publications.
You might notice that, given our assumptions, N cannot be below 100, the point at which no studies published are garbage.
Also, N cannot be above 2000, the point at which all studies published are garbage.
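
If you would rather check these cases than do the algebra, here is a small Python sketch of the same equation. The function name is mine, and the power argument (the chance a good course tests as significant) is an extra knob I added for illustration; the post's numbers all use power=1.0.

```python
def published_breakdown(total_studies, published=100, alpha=0.05, power=1.0):
    """Solve published = N*(B*alpha + (1-B)*power) for B, then count false positives.

    Returns (fraction of courses that are bad, number of bad courses,
    number of bad courses that were published as false positives).
    """
    n = total_studies
    bad_fraction = (power - published / n) / (power - alpha)
    if bad_fraction < -1e-12 or bad_fraction > 1 + 1e-12:
        raise ValueError("these inputs are inconsistent with the assumptions")
    bad_courses = bad_fraction * n
    false_positives = bad_courses * alpha
    return bad_fraction, bad_courses, false_positives


for n in (100, 500, 1000, 2000):
    frac, bad, fps = published_breakdown(n)
    print(f"N={n}: {bad:.0f} bad courses, {fps:.0f} of them among the 100 published")
```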

You might be thinking--we have no idea how many studies are done for each journal article accepted for publication though, and thus knowing that 100 studies are published tells us nothing about how many are garbage--it could be anything from 0 to 100% of all studies! Correct. We need more information to crack this problem. However, 5% garbage may not be so terrible anyway.

While it might seem obvious that 0 FPs is the goal, such a stringent goal, even if possible, would almost certainly lead to many more FNs, meaning good and important research would be ignored because its statistical significance did not meet a more stringent standard. In other words, if standards were raised to 1% or 0.1%, then some TPs under the 5% standard would become FNs under the more stringent standard, important research--thought to be garbage--would be ignored, and scientific progress would be delayed.

Another perspective on the admissions game--early admission

2014-05-05T08:18:00.002-07:00

One thing I failed to consider in my previous blog is early admissions.

By admitting many or most of their students early, a college can appear to be very selective when, in fact, it is only selective for people who do not apply early. Applying early decision is the equivalent of ranking a school first, and schools thus know it will improve their matriculation rate by admitting students early. Also, students who really wanted to attend a particular school will perhaps be better than students who may have chosen the school 2nd or 3rd or worse.

A summary of actual acceptance rates at Ivy League schools, early and otherwise, appears here. To understand what is happening here, take Harvard, with the lowest overall acceptance rate of 5.8%. If you apply there through regular admissions, you have a 3.8% chance (less than 1 in 25) of being admitted. However, if you apply early decision, your chances increase to 18.4% (about 1 in 5 or 6). Of course, the quality of the students is likely different between the group that applies for regular admission and the group that applies early, so that the difference between two equally qualified students is likely lower. However, it seems doubtful that the entire difference is in quality of the application pool.
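
The three Harvard rates also imply roughly what share of applicants and of admitted students came through the early pool. Here is a rough Python sketch; it assumes the overall rate is simply the applicant-weighted mix of the early and regular rates (my assumption, which ignores things like deferred early applicants being reconsidered), and the function name is mine.

```python
def early_shares(overall, regular, early):
    """Back out the early pool's share of applicants and of admits.

    Assumes: overall = e * early + (1 - e) * regular, where e is the early applicant share.
    """
    early_applicant_share = (overall - regular) / (early - regular)
    early_admit_share = early_applicant_share * early / overall
    return early_applicant_share, early_admit_share


applicants, admits = early_shares(overall=0.058, regular=0.038, early=0.184)
print(f"early applicants: {applicants:.1%} of all applicants")
print(f"early admits:     {admits:.1%} of all admits")
```

Under that assumption, a fairly small fraction of applicants accounts for a much larger fraction of the admitted class.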

At a recent presentation I heard from an admissions officer at a local college, he stated outright that the standards change between early and later admissions even for "rolling" admissions schools. Put simply, early applicants get priority and are more likely to be accepted.

So what's the strategy? Apply early, but you only have one shot at early decision (typically you can only apply to one school). Therefore, apply to a top choice, but one where you have a decent chance of getting in, according to that school's average SATs, grades, etc. If you reach too high, you will be rejected and relegated to the regular application pool, where the chances of getting into top schools are far lower.

Getting into College

2014-04-28T08:56:00.001-07:00

Now that I have a 9th-grader, I am starting to think about college admissions. The urban myth is: "If you were applying to college now, you'd never get into the (great) college you went to (in the 1980s or 1990s)."
This belief is driven by lower acceptance rates at many elite colleges, as well as the parents and peers of those who went to elite schools. This Washington Post article debunks this myth. It refers to an article about a study at the Center for Public Education, which has more detail. On the other hand, this paper shows that while overall selectivity fell, the top schools are more difficult to get into, at least as measured by SAT/ACT scores.

Here are some factors that could be at play:
1) Regression to the mean. People who went to great schools are, on average, high achievers compared to the general public. However, if you take the cohort who were accepted to these schools, some fraction will have gotten in by chance, scoring better or doing better just by chance. The next generation will regress to the mean, and this means it will appear as if colleges are more selective among those who went to more selective colleges (by the same token, among those who went to the least selective schools, there will be the opposite effect)...all else being equal of course. This is the same effect that results in the children of the tallest people being shorter than their parents, even though they still may be taller than the average person.

2) People apply to more schools. When your average person applies to 10 schools, whereas the average person used to apply to 3, acceptance rates can go down, resulting in higher perceived selectivity. This article shows the number of people applying to four or more schools more than doubling since the 70s. The increase in applications might also imply that students that never would have applied to, say, Harvard, are now applying. This is why a lower acceptance rate doesn't actually mean it is more difficult to get admitted, once you adjust for the quality of the student.

3) Slight increase in actual selectivity at a few schools. The New York Times had an interesting article regarding the changes in selectivity, which focused on the number of spots per 100,000 population (rather than the number admitted). Harvard, with the greatest drop in selectivity, had a drop of 27% (the article focused only on US student rates) in the last 20 years. While this might seem large, keep in mind that their admissions rate has dropped about two-thirds, from 18% to 6%, a much larger change.

4) Student quality improved. There is certainly room in the equation for a true increase in student quality. As the article above implies, the top schools did have moderate increases in test scores.

No matter whether college is the same or more difficult to get into, it certainly appears that it is more stressful. One solution for this is the med school solution (and NYC schools solution): a ranking and matching program. This is fairly simple and goes as follows: each student ranks each school he/she applies to in order of preference. Colleges rank the students that apply in order. Colleges are matched with students that are highest on their list, beginning with students who ranked them first. Students are required to go to the college they are matched with, or enter a second consolation round.
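
To give a feel for how such a program can work, here is a minimal Python sketch of student-proposing "deferred acceptance," a standard textbook matching procedure. I am not claiming it is exactly what the med school or NYC programs run; it is one well-known way to implement a ranking-and-matching scheme. All student names, college names, and capacities below are made up for illustration.

```python
def deferred_acceptance(student_prefs, college_prefs, capacity):
    """Student-proposing deferred acceptance with college capacities."""
    # rank[c][s] = position of student s on college c's list (lower is better)
    rank = {c: {s: i for i, s in enumerate(prefs)} for c, prefs in college_prefs.items()}
    next_choice = {s: 0 for s in student_prefs}  # index of the next college to try
    held = {c: [] for c in college_prefs}        # tentative admits per college
    free = list(student_prefs)

    while free:
        s = free.pop()
        if next_choice[s] >= len(student_prefs[s]):
            continue  # list exhausted: student stays unmatched
        c = student_prefs[s][next_choice[s]]
        next_choice[s] += 1
        if s not in rank[c]:
            free.append(s)  # this college did not rank the student; try the next one
            continue
        held[c].append(s)
        held[c].sort(key=rank[c].get)
        if len(held[c]) > capacity[c]:
            free.append(held[c].pop())  # bump the lowest-ranked tentative admit
    return held


# Hypothetical example data
students = {"Ann": ["Alpha", "Beta"], "Bob": ["Alpha", "Beta"], "Cal": ["Beta", "Alpha"]}
colleges = {"Alpha": ["Bob", "Ann", "Cal"], "Beta": ["Ann", "Cal", "Bob"]}
seats = {"Alpha": 1, "Beta": 2}
match = deferred_acceptance(students, colleges, seats)
print(match)
```

In this toy example Ann loses the single Alpha seat to Bob, whom Alpha ranks higher, and lands at her second choice.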

CitiBike share--what are the chances?

2013-12-29T05:14:00.002-08:00

I have been working with Joe Jansen on the Citibike data in the R Language. Citibike is New York's bike sharing program, which started in May and currently has more than 80,000 annual members. The R Language is a freely available programming language for statistics; it grew out of the S language, which was originally designed at Bell Labs.

Joe has downloaded all the data and done an extensive analysis, which you can find here. I did a simpler analysis predicting trips using a statistical regression model and graphed it using the ggplot2 package in R. I found maximum temperature, humidity, wind, and amount of sunshine to be significant factors in predicting the number of trips that will be taken on any given day. While rain was not a significant factor, it is likely confounded with sunshine, so it is only not a factor after accounting for amount of sunshine. Also, keep in mind that a number of days with rain, especially in the summer, are generally sunny days with an hour or two of rain or thunderstorms. The day of the week, surprisingly, was not an important factor influencing number of trips. The R-squared, which is a typical measure of predictive power and is on a scale from 0 to 100%, was more than 70%.

Here is a graph of the results that shows the predicted number of trips per 1,000 members versus the actual number of trips. The day of the week is indicated by the color of the point.

I am an amateur with the function ggplot, and so the legend for day of the week has the days of the week in alphabetical order rather than Monday, Tuesday, etc. Help on that and other aspects of ggplot for this graph would be welcome (please comment accordingly).

If day of the week made a difference, for any given point on the x-axis (predicted trips) you would have more of a certain color that is high on the y-axis than other colors. For example, if more trips occurred on weekends, you would have more of the green colors (Saturday and Sunday) on top. However, no such effect seems to exist. I guess people are enjoying Citibike every day of the week, or casual riders on the weekends are roughly making up for weekday commuting riders.

Highest property taxes in America?

2013-11-25T14:51:00.002-08:00

I read on CNN's money website today that Westchester County, NY has the highest property taxes in America (see Nov 25 Money website). Moreover, the New York area in general seems to have the highest taxes. That surprises me, because, as an owner of a co-op in Brooklyn, I know that my property taxes, and property taxes in general in the city, are extremely low.

So what's the problem? If you click on the "interactive graph" you find that you can display results in two ways. The headline and accompanying map refer to the taxes in dollars. This type of information is little more than a graph of housing prices in the US, because expensive houses have higher taxes than cheaper houses. Sure, tax rate comes into play, but the owners of a $10 million mansion in a low tax district still generally pay more property taxes per year than the owners of a $200,000 house in a high tax district.

Here's an example. Click on Brooklyn on the interactive map and you will see taxes of $3,050. Click on Richland County, South Carolina (where my parents live), and you will see that taxes average $1,129, a little over one-third the "high" taxes of Brooklyn. Yet this comparison ignores the fact that housing prices are much higher in Brooklyn.

How much higher? Well, to see this, go to the interactive map that shows taxes as a percentage of home prices. This map accounts for different housing costs and shows taxes in the familiar manner, as a rate. In this map, you can see that Brooklyn property taxes are 0.53% of housing prices and Richland County's are 0.75%. (By the way Westchester County is 1.76%, which is high but certainly not the highest).

Thus, while taxes on the map shown in the headline are nearly three times higher in Brooklyn than in Richland County, S.C., taxes are actually 30% lower in Brooklyn, when looked at as a percentage of home prices.
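
The comparison is simple enough to check in a few lines of Python. The home values below are only what the two maps' figures imply when you divide dollars by rate; they are not numbers reported on the site, and the function name is mine.

```python
def implied_home_value(annual_tax, tax_rate):
    """Home price implied by a tax bill and a tax rate (tax = rate * price)."""
    return annual_tax / tax_rate


brooklyn_tax, brooklyn_rate = 3050, 0.0053
richland_tax, richland_rate = 1129, 0.0075

dollar_ratio = brooklyn_tax / richland_tax    # taxes in dollars, Brooklyn vs Richland
rate_gap = 1 - brooklyn_rate / richland_rate  # how much lower Brooklyn's rate is

print(f"Brooklyn taxes in dollars: {dollar_ratio:.1f} times Richland County's")
print(f"Brooklyn tax rate: {rate_gap:.0%} lower than Richland County's")
print(f"Implied home value, Brooklyn: ${implied_home_value(brooklyn_tax, brooklyn_rate):,.0f}")
print(f"Implied home value, Richland: ${implied_home_value(richland_tax, richland_rate):,.0f}")
```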

What are the chances of different "splits" in bridge?

2013-08-12T19:52:00.000-07:00

If you know how to play bridge, skip to the fourth paragraph!
In bridge, 13 cards are dealt to each of 4 players (so all 52 cards are dealt). Players sitting across from each other are partners, so we could think of the two teams' positions as North and South and East and West on a compass. A process of "bidding" ensues, in which the team with the highest bid has selected a "trump" suit and a number of rounds, or "tricks" that they have contracted to take.

Suppose North-South had the highest bid and North is playing the hand. Then East "leads" a card, meaning East places a card (any card he/she wants) face up on the table. The play goes clockwise, East-> South-> West -> North. South, West and North must play a card of the same suit that East played. When four cards are down, the highest one wins the "trick" and that winner puts any card of his/hers down, in order to begin a new trick. Play continues until 13 rounds of 4 cards each have been played.

Suppose that West wins a trick and thus gets to play a card. He plays the Ace of Hearts. North, who is next and otherwise required to play hearts, is out of hearts. North can play any other suit, but if he chooses to play the "trump" suit (say Spades are trump), then he automatically wins the trick unless East or South is also out of hearts and plays a higher card in Spades (the trump suit). In other words, trumps are very valuable. In the bidding process, the teams try to bid in such a way that the trump suit is one in which they have a lot of cards. Generally, the team with the winning bid (the "contract") will have at least 7 of the 13 trumps between the two of them, meaning the other team will have 6 or fewer. Whatever the number the opponents have, it is generally advantageous to the contract winners if they have the same number each rather than them being skewed to one or the other opponent.

Bridge players begin here:
So here is the probability piece. Suppose you and your partner hold 7 trumps between you. What are the chances the opponents each have 3? Have 4 and 2? Have 5 and 1? Have 6 and 0? To solve this sort of problem, we use combinations. See my earlier post for some detail (and more odds of bridge hands).

The opponents have 26 cards altogether and we want to know the number of different groups of six among those 26 cards. Think of this process as a process of picking six cards from the 26. You have 26 choices for the first card, 25 for the second, and so on, and thus there are 26*25*24*23*22*21 total 'permutations' of size 6. However, we do not care what order they are in so for each first card, there are 6 possible positions, for each second card, 5, etc., and thus we need to divide these permutations by 6*5*4*3*2*1, in order to get the number of unique sets when order does not matter. Again, see my earlier post for a more detailed explanation of this concept.

The R language allows for calculation of this combination of 6 out of 26 with the command choose(26,6). This is the denominator when we calculate probabilities, because it gives the total number of equally likely combinations of 6 cards. For the numerator, we split the cards into the two opponents' hands of 13 cards each. The number of combinations with an even 3-3 split is "13 choose 3" for one hand multiplied by "13 choose 3" for the other.
To calculate that probability in R, we write: choose(13,3)*choose(13,3)/choose(26,6) and get 35.5%.

How about hands with a 4-2 split? We count the number of ways Opponent 1's hand can hold 4 trumps multiplied by the number of ways Opponent 2's hand can hold 2 trumps, PLUS the number of ways Opponent 2's hand can hold 4 trumps multiplied by the number of ways Opponent 1's hand can hold 2. Since these two counts are the same, we can just double the first. We get: choose(13,4)*choose(13,2)*2/choose(26,6) = 48.4% chance of one opponent having 4 trumps and the other having 2.
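
For readers without R, the same two calculations can be sketched in Python, where math.comb plays the role of R's choose. This is only a rough check of the arithmetic above, not a different method; the variable names are my own.

```python
from math import comb

# Total number of equally likely ways to place the 6 missing trumps
# among the opponents' 26 cards (same as choose(26,6) in R).
total = comb(26, 6)

# 3-3 split: 3 of the 13 cards in each opponent's hand are trumps.
p_33 = comb(13, 3) * comb(13, 3) / total

# 4-2 split: doubled, because either opponent can be the one holding 4.
p_42 = 2 * comb(13, 4) * comb(13, 2) / total

print(f"3-3 split: {p_33:.1%}")
print(f"4-2 split: {p_42:.1%}")
```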

Continuing this calculation, we get the following chances for hands with 6 trumps in the opponents' hands (6 trumps "out"):
3-3 split : 35.5%
4-2 split: 48.4%
5-1 split: 14.5%
6-0 split: 1.5%

For hands with 5 trumps out, we get:
3-2 split: 67.8%
4-1 split: 28.3%
5-0 split: 3.9%

For hands with 4 trumps out:
2-2 split: 40.7%
3-1 split: 49.7%
4-0 split: 9.6%

For hands with 3 trumps out:
2-1 split: 78%
3-0 split: 22%

For hands with 2 trumps out:
1-1 split: 52%
2-0 split: 48%

I find it interesting that the even split (for 2, 4, or 6 trumps out) is only the most likely scenario when 2 trumps are out. When 4 trumps are out, a 3-1 split is more likely. When 6 are out, a 4-2 split is more likely.
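
The full set of tables above can be reproduced with one small function. The sketch below is in Python rather than R, and the function name split_probabilities is just something I made up for illustration.

```python
from math import comb

def split_probabilities(trumps_out):
    """Return {(a, b): probability} for every a-b split with a >= b."""
    total = comb(26, trumps_out)
    result = {}
    for a in range(trumps_out, -1, -1):
        b = trumps_out - a
        if a < b:
            break  # remaining splits are mirror images already counted
        ways = comb(13, a) * comb(13, b)
        if a != b:
            ways *= 2  # either opponent can hold the longer holding
        result[(a, b)] = ways / total
    return result

for n in range(6, 1, -1):
    print(f"{n} trumps out:")
    for (a, b), p in sorted(split_probabilities(n).items()):
        print(f"  {a}-{b} split: {p:.1%}")
```

Because every split is listed once, the probabilities for a given number of trumps out should add up to 100%, which is a handy sanity check.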

Simpson's Paradox

2013-04-29T13:29:00.000-07:00

A North Slope real estate broker (named North) is trying to convince you that North Slope is a more affluent neighborhood than South Slope. To prove it, he explains that professionals in North Slope earn a median income of $150,000, versus only $100,000 in South Slope. Working class folks fare better in North Slope also, with hourly workers making $30,000 a year to South Slope's $25,000.

The South Slope real estate broker (named South) explains that North is crazy. South Slope is much more affluent. The median income in South Slope is $80,000 versus the North Slope median of $40,000.

Question: Who is lying, North or South?
Answer: It could be neither.
Consider the breakdown of income shown below.

We can see that North is not lying. Half the hourly South Slope workers earn $20K and half $30K, for a median of 25K. A similar calculation for the North Slope workers yields an hourly median of 30K. For professionals in the South Slope, the median is $100K, with half earning $80K and half earning $120K. In the North Slope, a similar calculation yields the median of $150,000.

South is not lying either. For the South Slope, the median is $80,000, since at least half of the workers make $80,000 or less and at least half make $80,000 or more (by the definition of the median, at least half of the values must be at or below it and at least half must be at or above it). For the North Slope, the median is $40,000.

What happened here? The problem, and the reason for the conflict between the wages according to type of work and the overall wages, is that the percentage of residents in each category does not match. Though professionals and hourly workers each make more in the North Slope, hourly workers make up a far larger share of residents in the North Slope than in the South Slope. Thus, the overall median (or mean) income is lower in the North Slope.
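
To see the effect with actual numbers, here is a minimal sketch in Python. The income lists are hypothetical: I invented them so that they reproduce the medians quoted above, and the real neighborhoods (which are made up anyway) could look quite different.

```python
from statistics import median

# Hypothetical incomes in $ thousands, invented to match the medians quoted above.
south = {"hourly": [20, 30], "professional": [80, 80, 80, 120, 120, 120]}
north = {"hourly": [20, 20, 20, 40, 40, 40], "professional": [100, 200]}

def summarize(name, groups):
    """Print the median for each job type and return the overall median."""
    everyone = [x for incomes in groups.values() for x in incomes]
    for job, incomes in groups.items():
        print(f"{name} {job} median: {median(incomes)}K")
    overall = median(everyone)
    print(f"{name} overall median: {overall}K")
    return overall

south_overall = summarize("South Slope", south)
north_overall = summarize("North Slope", north)
```

In this made-up data, three quarters of South Slope residents are professionals, while three quarters of North Slope residents are hourly workers, and that difference in mix is what flips the overall comparison.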

While Wikipedia has an entry for Simpson's Paradox, a specific example of which I described above, it seems that most people are unaware of it. My motivation for writing about it is not the made-up example I present above but the fact that I encounter it so much in my everyday work. I either make my clients very happy by explaining that the 'bad' effect they have found may well be spurious, or anger them when I explain that the interesting relationship they have found is a mere statistical anomaly.

The Worst Graph

2012-11-12T06:51:00.000-08:00

One reason for quotes like "there are lies, damn lies, and statistics" is graphs like these:

This was on the front of money.com this morning with the caption: "Huge US Oil Boom ahead: The U.S. will overtake Saudi Arabia to become the world's biggest oil producer before 2020."

I was shocked at first glance, because I thought oil production was going to go up 10 or 20-fold from the tiny amount in 2011 to the huge amount in 2015. That does indeed sound huge. Then I looked at the left y-axis, where I can see it is only going from 8 million to 10 million barrels a day, an increase of about 25%.

Fine, you say, but you can still easily see that the light blue bar is above the dark blue bar starting in 2025, showing the US overtakes Saudi Arabia.

I'm afraid not. The two bars are not Saudi Arabia versus US production but oil versus gas production, and it is not even clear whose production is depicted. Is it the whole world, the US, Saudi Arabia? The article puts US production at 5.8 million barrels a day in 2011, so it appears not to be US production, but other sources put it at closer to 9 million, so maybe it is the US.

Ok, you say, despite the poor caption, at least you can clearly see that gas production begins to top oil production (in whatever country the graph is depicting) around 2025.

Not really. Since oil is in millions of barrels per day and gas is in billions of cubic meters (per day, per month, per year, who knows?), this is actually not the case. The year 2030 shows oil at about 10 million barrels per day and gas at nearly 800 billion cubic meters. Which is more? Maybe the readers of Money can quickly translate these figures into BTUs or some useful measure of production output, but I sure can't tell you.

Fine, you say, but since they start at about the same level, we at least know that gas increases more than oil over the time period.

Sorry, even that is incorrect. Look at the scale on the left axis (oil), which starts at 8 and goes to 12, a 50% increase. The right axis starts at 600 and goes to 800, a 33% increase. Thus, oil goes from 8 to just over 10 (more than a 25% increase) while gas goes from a little over 600 to just a little under 800 (less than a 33% increase--maybe a little more than oil but maybe not).
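
The axis arithmetic is easy to check. Here is a small sketch in Python, using only the axis endpoints as I read them off the graph (the helper name pct_increase is mine). Note that 600 to 800 works out to about a 33% increase.

```python
def pct_increase(start, end):
    """Percent increase going from start to end."""
    return (end - start) / start * 100

# Axis ranges as read off the graph described above.
oil_axis = pct_increase(8, 12)      # left axis, million barrels per day
gas_axis = pct_increase(600, 800)   # right axis, billion cubic meters

print(f"Left (oil) axis spans a {oil_axis:.0f}% increase")
print(f"Right (gas) axis spans a {gas_axis:.0f}% increase")
print(f"Oil going from 8 to 10 is a {pct_increase(8, 10):.0f}% increase")
```

Because the two axes cover different percentage ranges, bars that rise by the same height do not represent the same percentage growth.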

The only thing that appears to be correct about this graph is the year, until you realize that in the first period, there are only four years (2011-2015) while in the other periods, there are five year differences.

Election Polls

2012-09-21T13:13:00.001-07:00

With the upcoming election, I have been following my favorite prediction site: electoral-vote.com. That site has a big map showing current predictions state-by-state as well as the overall electoral vote prediction. It also shows the Senate predictions. It has been amazingly accurate in the past (though, of course, this doesn't mean that the site's predictions won't change considerably between now and the election). The predictions are all based on some sort of averaging of polls, and the site shows the results of each poll. What I have found interesting (and it has been noted on the site) is that some polls appear to lean toward Obama while others lean toward Romney. In other words, the polls appear to have biases.

Why? Theories abound about this, and much of it comes down to the polling methodology. The most compelling reason I have seen comes from Nate Silver's blog on the New York Times site. Silver's blog compares traditional polls, which call only land-line phones, with more modern polls, which call cell phones along with land-lines.

As shown by a chart in Silver's blog, there is a clear and consistent difference in every swing state between the two types of polls, with modern polls leaning toward Obama. This is consistent with the idea that younger people are both more likely to vote for Obama and more likely not to have landlines. This issue has been pointed out before, and a Pew Research Report in 2010 noted substantial differences in party affiliation between voters who had a landline and those who only had a cell phone.

There is no doubt that the percentage of homes without landlines is rising rapidly. See, for example, the CDC Report from last year, showing that about 30% of adults did not have a landline in 2011, about twice the percentage in 2008. This increase in wireless-only homes does not necessarily mean an increase in bias (more and more Republicans may be shedding their landlines, and thus the bias could fall even as wireless-only usage increases). Still, the divergence in the polls indicates that a bias persists.

Born to Run?

2011-11-11T09:32:00.000-08:00

About a year ago, I read a book called "Born to Run," by Christopher McDougall, who last week wrote an article in the New York Times Magazine on the same subject.

McDougall's basic premise is that we were faster and less injury-prone before we started wearing all these fancy running shoes and that they are what's causing running injuries. For example, in the New York Times article:

"Back in the ’60s, Americans 'ran way more and way faster in the thinnest little shoes, and we never got hurt,' Amby Burfoot, a longtime Runner’s World editor and former Boston Marathon champion, said during a talk before the Lehigh Valley Half-Marathon I attended last year. 'I never even remember talking about injuries back then,' Burfoot said. 'So you’ve got to wonder what’s changed.'"

Statistics frowns on such anecdotal evidence, though it does make a good story. Did we really run faster? There are a lot of facts that we can look at, though average times aren't among them. Marathon records (shown in Wikipedia) for men have indeed only ticked down a little since the sixties. In 1970, Ron Hill of the UK (close enough, running-shoe-wise, to be considered American?) set a record of 2:09:29. This year, a new record of 2:03:38 was set (the most recent US record was 2:05:38 in 2002). Six minutes in 40 years doesn't seem like much, but is it because of the shoes or because the sport has matured? And are Americans seen less because running isn't really a big competitive sport here?

When you look at women's times, the changes are much more dramatic. Women began running marathons more recently, and fewer participated in the sport in general until relatively recently. In 1970, the women's marathon record was 3:02:53 (set by an American). In 2003, Paula Radcliffe (England) ran it in 2:15:25. That's a 47-minute improvement, or nearly 2 minutes per mile. In the 2011 New York Marathon, 40 women from the US bested the 1970 record time (see marathon site here for results).
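
For anyone who wants to check the improvements quoted above, here is a minimal sketch in Python. The helper names are my own, and I am assuming a marathon distance of 26.2 miles.

```python
MARATHON_MILES = 26.2  # approximate marathon distance (my assumption)

def to_seconds(hms):
    """Convert an 'h:mm:ss' string to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

def improvement(old, new):
    """Return (total minutes gained, minutes gained per mile)."""
    minutes = (to_seconds(old) - to_seconds(new)) / 60
    return minutes, minutes / MARATHON_MILES

men_total, men_per_mile = improvement("2:09:29", "2:03:38")
women_total, women_per_mile = improvement("3:02:53", "2:15:25")

print(f"Men:   {men_total:.1f} minutes, {men_per_mile:.2f} per mile")
print(f"Women: {women_total:.1f} minutes, {women_per_mile:.2f} per mile")
```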

So, I can't agree that we ran "way faster" 40 years ago. This doesn't mean that barefoot runners are slower than shod runners, because changes over the last forty years in the level of competition, and improvements in training and fitness, rather than shoes, might have been the factors contributing to improved times.

How about injury rates? Do people get more injuries with running shoes than without? Unfortunately, any data on injury rates is tainted by the changes in the makeup of the population that runs (from a small, highly fit population to a large population more varied in fitness--think of the then-overweight President Clinton running with a stop at McDonald's post-jog), and there haven't been any studies that directly compare injuries over time for barefoot running against running-shoe running. A good summary article is here.

A recent article in Nature, while not looking at historical data, supports McDougall's contention that running shoes can be more harmful than bare feet when running. The article is lead-authored by Daniel Lieberman, a big advocate of barefoot running, so his bias may have been to look at things he believed were helpful about barefoot running and not at aspects of barefoot running that may be harmful. The article looks at impact forces and not at injuries, and doesn't consider that runners with shoes may be able to change their stride to reduce the impact forces (McDougall says this is hard to do with running shoes, and, from my own experience, I tend to agree, though I don't think it is impossible).

The statistical net-net is that there is no direct evidence either way right now. I admit some bias but I would say that the lack of evidence, given the power and money behind the shoe industry, tends to make me believe that, at best, fancy shoes are no better than bare feet, because if there were an effect in favor of shoes, I would certainly think we'd have seen a study by now (this is something correctly pointed out by McDougall and other advocates of barefoot running). Therefore, don't be surprised if you see me running with feet au-naturel someday soon.

Detecting cheating

2011-03-30T06:32:00.001-07:00

In my professional work, I like being the statistical sleuth, trying to figure out whether a person or company cheated, and how much they cheated. Thus it was with a lot of interest that I read a recent article in USA Today describing suspicious activity on some standardized tests in DC schools.

It seems that standardized test scores at certain DC schools have improved dramatically. For example, the article says, "in 2008, 84% of fourth-grade math students were listed as proficient or advanced, up from 22% for the previous fourth-grade class." Of course, this could just be part of the amazing turnaround.

However, the review found that this dramatic change corresponded with another interesting statistic: the school had a very high number of erased answers that were changed from wrong answers to right answers (WTR erasures). Again, here's what the article said: "On the 2009 reading test, for example, seventh-graders in one Noyes classroom averaged 12.7 wrong-to-right erasures per student on answer sheets; the average for seventh-graders in all D.C. schools on that test was less than 1. The odds are better for winning the Powerball grand prize than having that many erasures by chance."

Here's my problem with this logic: the calculation of the chances assumes that each student is acting independently and erasing much more than usual. In other words, the chances are calculated assuming that the students are randomly grouped by school with respect to the number of WTR erasures they have, and thus no school should have a particularly high or low number of erasures: number of erasures and the associated school would be statistically independent.

This statistical independence assumption falls apart if there is cheating, wherein teachers erase wrong answers and change them to correct answers after the test is completed. However, the statistical independence assumption could also fall apart for innocuous reasons.

Suppose the students at this school were instructed to arbitrarily fill in the last 10 questions immediately upon beginning the exam (this might be a good strategy if there is no penalty for guessing and if many students do not finish the exam). Then, the ones who get to the end of the test are erasing most of their guesses. This is a completely legitimate strategy, but it would raise the number of WTR erasures a great deal. A lot of more complicated test-taking strategies would also lead to more erasures, and if this school in particular taught those strategies, there would be a very high chance of far more erasures at this school than at others. Some of the people interviewed cited strategies that may have led to more erasures.

Thus, the high erasure rate, even WTR erasures, may have a relatively simple explanation: this school effectively coached the kids in test taking while other schools did not or coached the children differently.

The article provides a link to several documents summarizing the results of the analysis. What I find interesting is that the worst school in terms of WTR erasures, BS Monroe ES, also has a lot of WTW (wrong-to-wrong) erasures. On average, this school has between 2 and 3 WTW erasures per student, or about 1 WTW for every 5 WTR erasures. A more interesting, and I think more revealing, analysis would be to see how this ratio compares to the normal ratio. If the normal ratio is 1 WTW to 5 WTR, it indicates cheating may not have been the reason for the erasures (unless the cheaters were purposefully erasing some and changing them to wrong answers--which seems unlikely since there is no indication potential cheaters realized erasures could be detected at all). If the general ratio is far from 5 to 1, it would be another indicator of a different process going on at BS Monroe ES, perhaps involving cheating, though it is still hard to rule out other, innocuous explanations that involve test-taking strategy.

Another analysis would be to look at the WTR vs. WTW erasures student by student. Presumably, students who answered a higher percentage of un-erased problems correctly would have a better ratio of WTR to WTW erasures. If that were not true, then it would lead more clearly to the conclusion that someone else was doing the erasing.

The research revealed in the article shows the correlation of two things: a dramatic increase in test scores and a dramatic number of WTR erasures. Cheating is one explanation for these increases. Another, however, is the implementation of a smart test-taking strategy at the school, which might well be part of an overall program to increase the test scores and improve the school. A statistical test can have a seemingly dramatic result (less likely than winning the lottery), but while defeating a specific hypothesis (independence of erasures by school), it doesn't necessarily prove another hypothesis (cheating).

Throw away your cold medicine again?

2010-09-28T14:09:00.000-07:00

A couple years ago, I wrote about a study that looked at the effect of a seawater nasal spray on the health of children (see that post).
Yesterday's New York Times explored a very similar claim. Anahad O'Connor's column, "Really? The Claim: Gargling With Salt Water Can Ease Cold Symptoms," looks at a study of 387 Japanese adults aged 18 to 65 (see this page for an abstract). Treatment groups gargled with PLAIN water or a "povidone-iodine" solution. Those gargling with plain water did the best, with 0.17 URTIs (upper respiratory tract infections) every 30 person-days, meaning about 1 in 6 get a URTI per month if they gargle with water. The control group had a rate of 0.26, meaning about 1 in 4 got a URTI. The iodine group had a rate of 0.24, also meaning about 1 in 4 got a URTI.
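
The "1 in 6" and "1 in 4" figures are just the reciprocals of the quoted rates. A quick sketch in Python, treating each rate as a rough per-person monthly risk (a simplification, since one person could in principle have more than one infection in a month):

```python
# URTI rates per 30 person-days, as quoted above.
rates = {"water": 0.17, "control": 0.26, "iodine": 0.24}

# Reciprocal of the rate gives the rough "1 in N per month" figure.
one_in = {group: 1 / rate for group, rate in rates.items()}
for group, n in one_in.items():
    print(f"{group}: about 1 in {n:.1f} per month")

# Relative reduction in the rate for water gargling versus control.
reduction = 1 - rates["water"] / rates["control"]
print(f"water vs. control: {reduction:.0%} lower rate")
```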

So water looks pretty good. The only caveat, and it is the same as the issue I mentioned in the earlier post, is that the outcomes were self-measured. The people doing the gargling reported whether or not they had a URTI. In Japan, where the study was performed, there is a strong bias toward water gargling, at least according to the abstract of the study, which says: "Gargling to wash the throat is commonly performed in Japan, and people believe that such hygienic routine, especially with gargle medicine, prevents upper respiratory tract infections (URTIs)." In fact, the article reports that those in the control group gargled one time a day on average as well (but those in the treated group gargled around 3 times a day). This affinity for water gargling and the belief that it stops infection may result in water garglers reporting fewer infections, thus throwing the results of the study into question.

The New York Times, by the way, gives recommendations based on an upcoming book by Philip Hagen, to gargle with *salt* water, but cites this study, which is referring to *plain* water only.

My conclusion? If you THINK it is going to work, it's fairly likely water gargling will be effective, and it is a lot cheaper than buying some kind of preventative medicine. If you don't think it will work, this study provides little help in deciding whether it actually will work.

You asked for it, you got it. Toyota!

2010-03-15T15:21:00.000-07:00

I think that's how the ad line went. When? Maybe 25 years ago.

Well, it seems to apply now. Sudden acceleration. Mention a problem with a car, any problem with any car, and people will start crawling out of the woodwork with the complaint. Why? It's a numbers game. There were more than 100,000 pri-i(?) sold in the US in 2005-9. With that many people driving them around, any tiny problem that is reported is going to be "substantiated" by others. Those of us old enough to remember the Audi 5000 found the high correlation between those Audis with sudden acceleration and those sold to 85-year-old ladies inexplicable (studies mostly concluded it was driver error--see a recent article here in Wired).

The latest, after the brake-related Prius recall, is the claim of sudden acceleration. A guy in California managed to call 911 while it was happening--pretty amazing, huh? Unless, of course, you made it all up. Here's what the current thoughts about it are (from Wikipedia):
"On March 8, 2010, a 2008 Prius allegedly uncontrollably accelerated to 94 miles per hour on a California Highway (US), and the Prius had to be stopped with the verbal assistance of the California Highway Patrol as news cameras watched. Subsequent to the event, media investigations uncovered suspicious information about the alleged runaway Prius driver, 61-year old James Sikes, including false police reports, suspect insurance claims, theft and fraud allegations, television aspirations, and bankruptcy. Sikes was found to be US$19,000 behind in his Prius car payments and had US$700,000 in accumulated debt. Sikes stated he wanted a new car as compensation for the incident. Analyses by Edmunds.com and Forbes found Sikes' acceleration claims and fears of shifting to neutral implausible, with Edmunds concluding that 'in other words, this is BS', and Forbes comparing it to the balloon boy hoax."

Notwithstanding the apparent CA tale above, the reality is that the rare problem is a tough nut to crack statistically. Suppose there is an issue in 1 in 10,000 Priuses and that this issue only crops up on 1 in 10,000 rides in those cars. Thus, the problem shows up in 1 in 100 million Prius rides. Even among those, it may be a very short-lived problem and not cause any injury or accident. Such a rare problem might be drowned out by other driver-error problems, such as accidentally hitting the gas instead of the brake, perceiving that the car is accelerating when it is not, or hitting both the gas and the brake simultaneously in an attempt to hit the brake. Each of these things can be exceedingly rare (1 in a million) and still be 100 times as common as the real problem.
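
The arithmetic in that paragraph can be laid out as a short sketch in Python. All three rates are the hypothetical ones from the paragraph above, not measured values, and the variable names are mine.

```python
from fractions import Fraction

# Hypothetical rates from the paragraph above, not measured values.
faulty_car = Fraction(1, 10_000)        # share of cars with the issue
faulty_ride = Fraction(1, 10_000)       # share of rides in those cars where it shows up
driver_error = Fraction(1, 1_000_000)   # chance of one kind of driver error per ride

# Multiply the two rates to get the chance per ride across all cars.
real_problem = faulty_car * faulty_ride
ratio = driver_error / real_problem

print(f"Real problem: 1 in {real_problem.denominator:,} rides")
print(f"Driver error is {ratio} times as common")
```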

There are other ways to go about teasing out rare events. In the lab, a machine could possibly simulate conditions that were occurring when the supposed sudden acceleration took place and see if it is repeatable. Yet these conditions are hard to figure out, as they are determined with the imperfect information of the person reporting the incident. As might be the case with the recent report, that person could be lying, but even if not, they are likely shaken up enough that they cannot remember the exact conditions very well. Consider airline crashes, where we often have very objective information (the black box), but it is still very difficult to figure out what happened and why.

One thing seems certain to be true: we won't know whether or not Prius cars are at fault for a long time to come, and far fewer of them will be bought in the next couple years.

More germs = less disease?

2009-12-09T15:57:00.000-08:00

So says an article in today's Science Daily, which reports on a recent study at Northwestern of children from the Philippines. The study finds that children from the Philippines have much lower levels of C-reactive protein (CRP), which indicates better resistance to disease. Exposure to germs was much higher for the children in the Philippines.

So what's wrong with this study? It's a very tenuous association, and from what I can gather in the articles, no attempt was made to ensure the children in the U.S. that were compared to the children in the Philippines were similar in other ways. They might be different in CRP due to other environmental or hereditary factors. Perhaps it's the weather? The diet? One of any number of things could account for the difference.

In addition, the study appears to ignore the much higher infant mortality rate and much lower life expectancy in the Philippines (you can try www.indexmundi.com for life expectancy and other information by country). In other words, even if higher germ exposure does mean lower CRP, does it actually mean less disease and longer life? The broad indication is that it does not.

In order for the study to be valid, it needs to adjust for whatever inherent differences (in addition to germ exposure) exist between Filipino and US children, and then see if CRP levels are still different. An even better way to do such a study would be to study children living in similar environments (same place, socio-economic situation, etc.) and determine if the ones exposed to more germs had lower levels of CRP when they reached adulthood.

I've seen articles (see this for example, but I can't find a more definitive one at this time) that indicate that children with early exposure to farm animals have fewer allergies, but nothing showing exposure to more serious germs is good. And some of the germs that we are exposed to are more than just common germs--they are deadly. It might be that those who are exposed to these deadly germs early, and live, are much better off later in life, but that is no reason to expose them to those germs unnecessarily. Of course, you wouldn't give your child a deadly disease so that, if they survived, they'd be resistant to it later in life.

We live in a society that is sometimes alarmist concerning germs, and I have written about this. Yet this doesn't mean that, on the whole, a clean environment does not promote good health, and the article cited above seems to have only the most tenuous of indications that it may not.

Why Swine Flu is not a bunch of hogwash

2009-10-29T07:47:00.000-07:00

This updates my previous blog: "Why Swine flu is a bunch of hogwash?"

Things have changed a bit in the months since that blog, and the hysteria I cited has leveled off. President Obama did declare a swine flu emergency a couple days ago, but I think that was a good idea.

Here is what has changed:
1) Swine flu deaths have been at epidemic levels the last three weeks. The chart below (from the CDC) shows flu and pneumonia deaths as a percentage of all deaths. The upper black line indicates epidemic level, and the red line is the current level. The graph shows four years of weekly figures. While this graph doesn't look too serious, and 2008 levels were much further above the threshold at their peak, the scary thing here is that it is so early in the season. This graph serves as a reminder, too, that every year the flu kills thousands of people, and the flu vaccine could prevent a large number of those deaths.

2) Hospitals are already getting crowded. One of the big problems with a real epidemic is the overcrowding of hospitals. This means that the really sick people cannot get treatment, and that is part of the reason the emergency was declared. See this article in USA Today about overcrowding. OK, so it's USA Today, a paper that loves hyperbole, but, again, it's early in the season and any indication of overcrowding at this point is scary.

3) The vaccine is not yet fully available. The regular flu vaccine has been out for weeks. Unfortunately, almost none of the flu this year seems to be covered by that vaccine. The majority seems to be 2009 H1N1 (the swine flu). See this chart for a breakdown. Note the orange/brown is 2009 H1N1, and note the yellow means it is not tested for sub-type, so almost all typed flu is swine flu.
That's why I am worried. The other concern is that, even when the vaccine does come out, people won't take it. See my brother's blog about why you should and the crazies who say you should not.

Redskins are lucky to play bad teams, but how lucky?

2009-10-15T13:07:00.000-07:00

A recent article in Yahoo Sports pointed out that the Washington Redskins are the first team in history to play six winless teams in a row. Here is their schedule so far (also according to the article cited above):

Week 1 -- at New York Giants (0-0)

Week 2 -- vs. St. Louis Rams (0-1)

Week 3 -- at Detroit Lions (0-2)

Week 4 -- vs. Tampa Bay Buccaneers (0-3)

Week 5 -- at Carolina Panthers (0-3)

Week 6 -- vs. Kansas City Chiefs (0-5)

The author of the article, Chris Chase (or, as he notes, his dad; let's call him Mr. Chase), calculates the odds of this as 1 in 32,768. This calculation is incorrect, and it makes the streak look far less likely than it really is, for several reasons, which I get to below. But first, let me explain how the calculation was likely performed.

The calculation assumes, plausibly, that the Redskins have the same chance of playing any given team (unlike some college teams, who purposely make their schedules easy, this is not possible in the NFL).

The calculation also assumes, not plausibly, that teams that have thus far won no games have a 50-50 chance of winning each game. The implicit assumption there is that all NFL teams are evenly matched. The fact is that there are a few really good teams, a few really bad teams, and a bunch of teams in the middle. Thus, there are likely to be a bunch of winless teams after 5 games, and not, as the incorrect calculation below implies, only 1 winless team of 32 after 5 games.

Finally, the calculation, apparently in a careless error, assumes the chances of playing a winless team the first week are 50-50, when, of course, all teams are winless the first week.

So Mr. Chase's (incorrect) calculation is:
Week 1 chances: 50% (1 in 2)
Week 2 chances: 50% (1 in 2)
Week 3 chances: 50%*50%=25% (1 in 4)
Week 4 chances: 50%*50%*50%=12.5% (1 in 8)
Week 5 chances: 50%*50%*50%=12.5% (same as week 4 because the team they played had only played three games)
Week 6 chances: 50%*50%*50%*50%*50%=3.125% (1 in 32)

A law of probability is that the chance of two unrelated events happening is the product of their individual chances. Thus, if the chance of rain today is 50% and the chance of rain tomorrow is 50%, the chance of rain both days is 25%, if those chances are unrelated (which, by the way, they probably aren't). This is why the chances for multiple losses are multiplied together.

But back to the football schedule. To calculate the chances of 6 straight games against winless teams, Mr. Chase reasonably multiplied the 6 individual chances (again, this assumes the 6 matchups were unrelated):
50% * 50% * 25% * 12.5% * 12.5% * 3.125% = .003%, or 1 in 32,768.
So, the 32,768 is the number reported in the article.

The easy correction is that the chance of playing a winless team in the first game is 100%, so the calculation should be:
100% * 50% * 25% * 12.5% * 12.5% * 3.125% = .006%, or 1 in 16,384.
This error has been pointed out in comments on the article.
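The two products above can be checked with a short Python sketch. It treats every game an opponent has already played as a coin flip, as the weekly list above does (1 in 32 for week 6). The function name and the lists of games played are my own illustration, not anything from the article.

```python
from fractions import Fraction
from math import prod


def chance_all_winless(games_played):
    """Multiply the weekly chances, treating each past game as a 50-50 loss."""
    return prod(Fraction(1, 2) ** games for games in games_played)


# Games assumed played by each week's opponent. Mr. Chase's version counts
# week 1 as a coin flip; the corrected version counts it as a sure thing.
naive = chance_all_winless([1, 1, 2, 3, 3, 5])
corrected = chance_all_winless([0, 1, 2, 3, 3, 5])

print(f"naive: 1 in {naive.denominator:,}")          # 1 in 32,768
print(f"corrected: 1 in {corrected.denominator:,}")  # 1 in 16,384
```

Exact fractions are used instead of floating point so that the "1 in N" figures come out as whole numbers.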

In addition, other comments point out the other major flaw: teams do not have equal probability of losing. Thus the chance that a team will be, say, 0-2 is not 25% (50%*50%) but something else, depending on the quality of the teams. At the extreme, half the teams lose every game and half win every game (this of course assumes losing teams only play against winning teams, but it is possible).

The reality is certainly not this extreme, which would imply a 50-50 chance each week after the first of playing a winless team (and thus a 1 in 32 chance of playing 6 in a row). So, how do we figure out the reality?

The easiest way is to look, each week, at the percent of teams that are winless. If we assume the Redskins have an equal chance of playing each team, then we can compute the odds each week (click on the week to see the linked source). Note that everything is out of 31 teams instead of 32 because the Redskins can't play themselves.

Week 1: 31 out of 31 teams winless. Chances: 31/31=100%
Week 2: 15 out of 31 teams winless. Chances: 15/31=48% (I am assuming no byes the first week, and I know the Redskins lost their first game).
Week 3: 8 out of 31 teams winless. Chances: 8/31=26%
Week 4: 6 out of 31 teams winless. Chances: 6/31=19%
Week 5: 6 out of 31 teams winless. Chances: 6/31=19%
Week 6: 4 out of 31 teams winless. Chances: 4/31=13%

So the actual chances, assuming the Redskins have an equal chance of playing each team each week and cannot play themselves, are: 100%*48%*26%*19%*19%*13% = 0.06%, or 1 in about 1,700. Much more likely than 1 in 32,000 but still pretty unlikely.
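As a check on that product, here is a minimal Python sketch that multiplies the unrounded weekly fractions. The winless counts are the ones listed above; the variable names are mine.

```python
from fractions import Fraction
from math import prod

OPPONENTS = 31                        # 32 teams minus the Redskins
WINLESS_BY_WEEK = [31, 15, 8, 6, 6, 4]  # winless teams, weeks 1 through 6

# Chance that all six opponents are winless, assuming each week's opponent
# is equally likely to be any of the other 31 teams.
chance = prod(Fraction(winless, OPPONENTS) for winless in WINLESS_BY_WEEK)
one_in = float(1 / chance)

print(f"{float(chance):.2%}, or about 1 in {one_in:,.0f}")
```

Using the unrounded fractions gives roughly 1 in 1,657, consistent with the "about 1,700" figure obtained from the rounded percentages.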

And after all these easy games, how are they doing? Unluckily for Redskins fans, not too well...they're 2-3 going into Sunday's game against the winless Chiefs.

Unemployment down but joblessness is up?

2009-08-12T08:14:00.001-07:00

There was a bit of interesting news that came out Friday--the nation's unemployment rate actually declined, from 9.5% to 9.4%. This is true despite the fact that there was a net loss of 247,000 jobs (see the NY Times article). How could this happen?

Well, the unemployment rate is calculated by taking the number unemployed and dividing by the labor force: Unemployment Rate= Number Unemployed / Labor Force.

The numerator in the equation, Number Unemployed, is defined as the number of people not employed minus anyone who hasn't looked for a job in the last 4 weeks. The denominator of the equation, Labor Force, is defined as the Number Unemployed plus the number of people currently working (either full or part-time).

Thus, if people give up (and giving up is defined as not looking for the last 4 weeks), they are no longer counted in either the numerator or denominator of the equation. And that is exactly what happened between June and July of this year. According to the BLS (Bureau of Labor Statistics), 637,000 people left the labor force between June and July. Thus, even though the number of people employed fell (by a seasonally adjusted 155,000), the unemployment rate also fell, because the number of people looking for work fell as well (by 267,000). The net result was a drop in unemployment even though fewer people were working and more people lost jobs than found jobs.
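To see the mechanism with small numbers, here is a Python sketch of the formula above. The counts are hypothetical round numbers chosen for illustration, not BLS data, and the function name is mine.

```python
def unemployment_rate(unemployed, employed):
    """Number Unemployed / Labor Force, where Labor Force = unemployed + employed."""
    return unemployed / (unemployed + employed)


# Hypothetical starting point: 100 people looking for work, 900 working.
before = unemployment_rate(unemployed=100, employed=900)

# Then 10 workers lose their jobs while 20 job seekers stop looking,
# so the 20 drop out of both the numerator and the denominator.
after = unemployment_rate(unemployed=100 + 10 - 20, employed=900 - 10)

print(f"before: {before:.1%}, after: {after:.1%}")
```

In this made-up case fewer people are working, yet the rate falls from 10.0% to about 9.2%.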

A note about the math. At first blush, you may wonder whether it matters, since the people not looking are removed both from the numerator (Number Unemployed) and denominator (Labor Force). But mathematically, it does matter. Suppose we have a ratio 2/10, which equals 20%. Subtract 1 from the numerator and 1 from the denominator and you have 1/9, which equals 11.1%. Thus we subtracted the same number from the numerator and denominator but we did not end up with the same 20%. Instead we ended up with far less (11.1%).

The general rule is that the ratio falls when subtracting the same number from the numerator and denominator as long as the ratio is less than 1. So, 2/10>1/9 but 20/10<19/9, for example. What this means for the unemployment rate (which is always less than 1 since 1 is 100% unemployment) is that when people leave the work force, the unemployment rate is somewhat artificially reduced. This is why we had more people losing their jobs but a decline in unemployment last month.
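The rule follows from a line of algebra: a/b - (a-k)/(b-k) = k(b-a) / (b(b-k)), which is positive exactly when a is less than b (for positive k smaller than b). A short Python sketch, with a helper name of my own choosing, confirms the two examples:

```python
from fractions import Fraction


def shift(numerator, denominator, k=1):
    """Return the ratio before and after subtracting k from top and bottom."""
    before = Fraction(numerator, denominator)
    after = Fraction(numerator - k, denominator - k)
    return before, after


below_before, below_after = shift(2, 10)   # ratio below 1: 1/5 becomes 1/9
above_before, above_after = shift(20, 10)  # ratio above 1: 2 becomes 19/9

print(below_before, "->", below_after)  # falls
print(above_before, "->", above_after)  # rises
```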

I would guess that the labor force drop-offs would be far higher during deeper recessions where many despair of getting work or decide to take a break from their search, and this guess is borne out by recent information on the BLS site, which cites the increase in discouraged workers this last year: "Among the marginally attached, there were 796,000 discouraged workers in July, up by 335,000 over the past 12 months. (The data are not seasonally adjusted.) Discouraged workers are persons not currently looking for work because they believe no jobs are available for them."

This NY Times chart of unemployment uses a more reasonable definition and shows unemployment far higher than the official 9.4% rate. It includes all those who have looked for a job in the past year as well as part-time workers who want full-time work as part of the unemployed, and the unemployment rate is between 10 and 20%, depending on the state.

Riding a bike? Wear a helmet.

2009-06-12T12:34:00.000-07:00

Now that the sun has finally come out in NYC today after what seems like weeks of rain and cold weather, it seems an appropriate time to talk about one of my favorite summer recreational activities--riding a bike.

Growing up in the 1970s, I don't think I ever saw a helmet, much less wore one. However, in the same way we've figured out that seatbelts (and airbags) save lives, we also now know that biking with a helmet makes you safer. The Consumer Product Safety Commission reported that wearing a helmet can decrease risk (of head injury) by as much as 85%.

Sadly, there are still a lot of enthusiasts out there who have a take-no-prisoners attitude about wearing helmets, even implying that helmets are less safe (see for instance the helmet section of this web page in bicycle universe). Yet I think anyone who understands the statistics will see that the "freedom" of riding without a helmet is far outweighed by the risk.

The Insurance Institute for Highway Safety (IIHS) has long been a great source for safety information. They've got the same goal that hopefully most of us do: reducing deaths and injuries. In a 2003 report, the IIHS reports that child bicycle deaths have declined by more than 50% since 1975 (despite increased biking and presumably because most children wear helmets now). In addition, about 92% of all bicycle deaths were cyclists not wearing helmets (see this report). The same report also shows that while child bicycle deaths have declined precipitously (from 675 in 1975 to 106 in 2007), adult deaths have increased since 1975 (from 323 to 583).

Helmet usage is harder to figure out, but most sources put overall use around 50%, with children's use higher. This means that, given that 92% of deaths are cyclists not wearing helmets, you stand about 11 times the chance of getting killed if you don't wear a helmet. This number can be played with a little and whittled down if you assume, say, that cyclists not wearing helmets bike more dangerously, but there would have to be enormous differences for helmets to be shown to be ineffective. Moreover, all the major scientific studies show large positive effects from helmet usage (see this ANTI-helmet site for a summary of the case-control studies).
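The "about 11 times" figure can be reproduced with a small Python sketch. It assumes helmeted and unhelmeted cyclists ride similar amounts, which is my simplification; the function name is also mine.

```python
def relative_risk(share_deaths_no_helmet, share_riders_no_helmet):
    """Death risk of unhelmeted riders relative to helmeted riders.

    Each group's risk is its share of deaths divided by its share of riders.
    """
    risk_no_helmet = share_deaths_no_helmet / share_riders_no_helmet
    risk_helmet = (1 - share_deaths_no_helmet) / (1 - share_riders_no_helmet)
    return risk_no_helmet / risk_helmet


# 92% of deaths are unhelmeted riders, and about 50% of riders go without.
print(round(relative_risk(0.92, 0.50), 1))  # 11.5

# Sensitivity check: suppose 70% of riders actually go without a helmet.
print(round(relative_risk(0.92, 0.70), 1))  # 4.9
```

Even if helmet use were much lower than the 50% most sources report, the ratio stays well above 1 in this simple model.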

So why, when you search the internet for helmet effectiveness, or read through the literature of a number of pro-cycling organizations, do they cast aspersions upon helmet use? This one, for me, is an enigma. I understood why the auto industry was against airbags and seatbelts (they cost money) and why the cigarette and gun manufacturers are against regulation, but why do people care so much about us not wearing helmets? I can think of only a couple of things: a) cyclists want bike lanes and other safety measures without committing to anything on their own, and b) some are too lazy/cool to bother with a helmet. Of course, I'm a cyclist and clearly, I'm all for helmets (and yes, laws requiring them). I also think that if we want state and city governments to take us seriously about increasing cyclist safety through new bike lanes, changing traffic patterns, and building of greenways, we need to do our part, too.

Why Swine Flu is a bunch of hogwash.

2009-05-18T07:24:00.000-07:00

I first thought of writing about this a couple of weeks ago, when the nationwide hysteria concerning swine flu was just beginning, but then, as quickly as it came, it went. Now, with the first death from swine flu in NY, the front pages of the major newspapers have returned to the topic. The New York Times article and headline were, as always, something close to languid. However, the NY Post's article and photos are, also as usual, a bit hysterical. My son's school, apparently readers of the Post, has covered all the water fountains with plastic bags, perhaps unaware that the CDC clearly states there seems to be little or no chance of infection through drinking water.

What's more, this flu has so far been a very minor flu, with about 5,000 documented cases and 6 deaths. The blog of record relays that the "regular" flu has already killed something like 13,000 people in the US this year (it's not clear whether this is derived from the CDC's annual estimate of 36,000). This amounts to about 100 people a day.

While one CDC scientist estimates the number of people with the swine flu at 50,000 or so, this estimate assumes that under-reporting of swine flu is the same as under-reporting of flu in general. Given the focus on swine flu, I expect that under-reporting of it is far lower than of general flu, and thus, the true number with the swine flu is far fewer than 50,000. The CDC's current weekly flu report shows about one-third of the 1,286 new cases as swine flu (novel H1N1). The same report has a great graph, showing an irregular spike in flu diagnosis, just at the time when reported flu usually falls.

There are three pieces of good news, despite the scary spiked graph. First, with spring, flu cases quickly fall, because flu spreads less when people are farther away from each other (i.e., outside instead of inside). Second, cases are already falling (though it's only two weeks of data). Third, all types of flu diagnosis increased in the last two weeks (versus the several weeks leading up to May), implying that one of the reasons (perhaps the only reason) for the spike is that we are testing much more than usual, due to the swine flu outbreak.

Thus, swine flu has so far killed a documented 6 people in the U.S. out of more than 5,000 confirmed cases.
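For scale, the figures quoted in this post work out as follows in a quick Python sketch. The variable names are mine, and the inputs are only the rounded numbers given above.

```python
# Documented swine flu deaths and confirmed cases, as quoted in this post.
swine_deaths = 6
swine_confirmed = 5_000

# Since there are "more than" 5,000 confirmed cases, this is an upper bound
# on the death rate among confirmed cases.
swine_fatality = swine_deaths / swine_confirmed

# The CDC's annual estimate for regular flu deaths, spread over a year.
regular_flu_per_day = 36_000 / 365

print(f"swine flu deaths per confirmed case: {swine_fatality:.2%}")
print(f"regular flu deaths per day: {regular_flu_per_day:.0f}")
```

That is about 0.12% of confirmed cases, against roughly 99 deaths a day from the regular flu.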

In conclusion, though our own hysteria may drive documented cases up some, and lead to my children having to bring a water bottle to school, the swine flu does not appear to be particularly dangerous or deadly.