<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-4274782648018949782</id><updated>2012-02-07T14:25:38.342-08:00</updated><title type='text'>What are the chances?</title><subtitle type='html'>Musings on everyday probability.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>36</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-3772640057111008480</id><published>2011-11-11T09:32:00.000-08:00</published><updated>2011-11-14T06:33:52.241-08:00</updated><title type='text'>Born to Run?</title><content type='html'>About a year ago, I read a book called "Born to Run," by Christopher McDougall, who last week wrote&lt;a 
href="http://www.nytimes.com/2011/11/06/magazine/running-christopher-mcdougall.html?pagewanted=1&amp;amp;sq=born%20to%20run&amp;amp;st=cse&amp;amp;scp=1"&gt; an article in the New York Times Magazine&lt;/a&gt; on the same subject.  &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;McDougall's basic premise is that we were faster and less injury-prone before we started wearing all these fancy running shoes and that they are what's causing running injuries.  For example, in the New York Times article:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;span class="Apple-style-span" style="font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px; background-color: rgb(255, 255, 255); "&gt;&lt;span class="Apple-style-span" style="white-space: pre;"&gt;"&lt;/span&gt;Back in the ’60s, Americans 'ran way more and way faster in the thinnest little shoes, and we&lt;/span&gt;&lt;span class="Apple-tab-span" style="font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px; background-color: rgb(255, 255, 255); white-space: pre; "&gt; &lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px; background-color: rgb(255, 255, 255); "&gt;never got hurt,' Amby Burfoot, a longtime Runner’s World editor and former Boston Marathon&lt;/span&gt;&lt;span class="Apple-tab-span" style="font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px; background-color: rgb(255, 255, 255); white-space: pre; "&gt; &lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px; background-color: rgb(255, 255, 255); "&gt;champion, said during a talk before the Lehigh Valley Half-Marathon I attended last year. 
'I&lt;/span&gt;&lt;span class="Apple-tab-span" style="font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px; background-color: rgb(255, 255, 255); white-space: pre; "&gt; &lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px; background-color: rgb(255, 255, 255); "&gt;never even remember talking about injuries back then,' Burfoot said. 'So you’ve got to wonder&lt;/span&gt;&lt;span class="Apple-tab-span" style="font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px; background-color: rgb(255, 255, 255); white-space: pre; "&gt; &lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px; background-color: rgb(255, 255, 255); "&gt;what’s changed.'"&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px; background-color: rgb(255, 255, 255); "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px; background-color: rgb(255, 255, 255); "&gt;Statistics frowns on such anecdotal evidence, though it does make a good story.  Did we really run faster?  There are a lot of facts that we can look at, though average times aren't among them.  Marathon records (&lt;a href="http://en.wikipedia.org/wiki/Marathon_world_record_progression"&gt;shown in Wikipedia&lt;/a&gt;) for men have indeed ticked down only a little since the sixties.  In 1970, Ron Hill of the UK (close enough, running-shoe-wise, to be considered American?) set a record of 2:09:29.   This year, a new record of 2:03:38 was set (the most recent US record was 2:05:38 in 2002).  
Six minutes in 40 years doesn't seem like much, but is it because of the shoes or because the sport has matured?  And are Americans seen less because running isn't really a big competitive sport here?&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px; background-color: rgb(255, 255, 255); "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px; background-color: rgb(255, 255, 255); "&gt;When you look at women's times, the changes are much more dramatic.  Women began running marathons more recently, and fewer participated in the sport in general until relatively recently.  In 1970, the women's marathon record was 3:02:53 (set by an American).  In 2003, Paula Radcliffe (England) ran it in 2:15:25.  That's a 47-minute improvement, or nearly 2 minutes per mile.  In the 2011 New York Marathon, 40 women from the US bested the 1970 record time (see marathon site &lt;a href="http://www.ingnycmarathon.org/Results.htm"&gt;here&lt;/a&gt; for results).&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px; background-color: rgb(255, 255, 255); "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px; background-color: rgb(255, 255, 255); "&gt;So, I can't agree that we ran "way faster" 40 years ago.  This doesn't mean that barefoot runners are slower than shod runners, because changes over the last forty years in the level of competition, and improvements in training and fitness, rather than shoes, might have been the factors contributing to improved times.  
&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;span class="Apple-style-span" style="font-size: 15px; line-height: 22px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px; background-color: rgb(255, 255, 255); "&gt;How about injury rates?  Do people get more injuries with running shoes than without?  Unfortunately, any data on injury rates is tainted by the changes in the makeup of the population that runs (from a small, highly fit population to a larger population more varied in fitness--think of the then-overweight President Clinton running with a stop at McDonald's post-jog), and there haven't been any studies that directly compare injuries over time for barefoot running against running in shoes.  A good summary article is &lt;a href="http://www.sportsscientists.com/2011/06/barefoot-running-shoes-and-born-to-run.html"&gt;here.&lt;/a&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px; background-color: rgb(255, 255, 255); "&gt; A recent article in &lt;a href="http://www.nature.com/nature/journal/v463/n7280/full/nature08723.html"&gt;Nature&lt;/a&gt;, while not looking at historical data, supports McDougall's contention that running shoes can be more harmful than bare feet when running.   The article is lead-authored by Daniel Lieberman, a big advocate of barefoot running, so his bias may have been to look at things he believed were helpful about barefoot running and not at aspects of barefoot running that may be harmful.  
The article looks at impact forces and not at injuries, and doesn't consider that runners with shoes may be able to change their stride to reduce the impact forces (McDougall says this is hard to do with running shoes, and, from my own experience, I tend to agree, though I don't think it is impossible).  &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px; background-color: rgb(255, 255, 255); "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;span class="Apple-style-span" style="font-size: 15px; line-height: 22px;"&gt;The statistical net-net is that there is no direct evidence either way right now.  I admit some bias but I would say that the lack of evidence, given the power and money behind the shoe industry, tends to make me believe that, at best, fancy shoes are no better than bare feet, because if there were an effect in favor of shoes, I would certainly think we'd have seen a study by now (this is something correctly pointed out by McDougall and other advocates of barefoot running).  
Therefore, don't be surprised if you see me running with feet &lt;/span&gt;&lt;i style="font-size: 15px; line-height: 22px; "&gt;au-naturel &lt;/i&gt;&lt;span class="Apple-style-span" style="font-size: 15px; line-height: 22px; "&gt;someday soon.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: georgia, 'times new roman', times, serif; font-size: 15px; line-height: 22px; background-color: rgb(255, 255, 255); "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-3772640057111008480?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/3772640057111008480/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=3772640057111008480' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/3772640057111008480'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/3772640057111008480'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2011/11/born-to-run.html' title='Born to Run?'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-2139084367184738765</id><published>2011-03-30T06:32:00.001-07:00</published><updated>2011-03-30T08:37:56.528-07:00</updated><title type='text'>Detecting cheating</title><content type='html'>In my professional work, I like being the statistical sleuth, trying to figure out whether a person or company cheated, and how much they cheated.  Thus it was with a lot of interest that I read a recent article in &lt;a href="http://www.usatoday.com/news/education/2011-03-28-1Aschooltesting28_CV_N.htm?csp=hf"&gt;USA Today&lt;/a&gt; describing suspicious activity that went on with some standardized tests in DC schools.&lt;div&gt;&lt;br /&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;It seems that standardized test scores at certain DC schools have improved dramatically.  For example, the article says, "i&lt;span class="Apple-style-span" style="line-height: 22px; "&gt;n 2008, 84% of fourth-grade math students were listed as proficient or advanced, up from 22% for the previous fourth-grade class."  Of course, this could just be part of the amazing turnaround.  &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;&lt;span class="Apple-style-span" style="line-height: 22px; "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;&lt;span class="Apple-style-span" style="line-height: 22px; "&gt;However, the review found that this dramatic change corresponded with another interesting statistic: the school had a very high number of erased answers that were changed from wrong answers to right answers (WTR erasures).  
&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: 'times new roman'; line-height: 22px; font-size: large; "&gt;Again, here's what the article said: "On the 2009 reading test, for example, seventh-graders in one Noyes  classroom averaged 12.7 wrong-to-right erasures per student on answer  sheets; the average for seventh-graders in all D.C. schools on that test  was less than 1. The odds are better for winning the Powerball grand  prize than having that many erasures by chance."&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;&lt;span class="Apple-style-span" style="line-height: 22px; "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;&lt;span class="Apple-style-span" style="line-height: 22px;"&gt;Here's my problem with this logic: the calculation of the chances assumes that each student is acting &lt;/span&gt;&lt;i style="line-height: 22px; "&gt;independently &lt;/i&gt;&lt;span class="Apple-style-span" style="line-height: 22px;"&gt;and erasing much more than usual.  In other words, the chances are calculated assuming that the students are randomly grouped by school with respect to the number of WTR erasures they have, and thus no school should have a particularly high or low number of erasures: number of erasures and the associated school would be statistically independent.  &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;&lt;span class="Apple-style-span" style="line-height: 22px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;&lt;span class="Apple-style-span" style="line-height: 22px;"&gt;This statistical independence assumption falls apart if there is cheating, wherein teachers erase wrong answers and change them to correct answers after the test is completed.  However, the statistical independence assumption could also fall apart for innocuous reasons.  
&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;&lt;span class="Apple-style-span" style="line-height: 22px; "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;&lt;span class="Apple-style-span" style="line-height: 22px;"&gt;Suppose the students at this school were instructed to arbitrarily fill in the &lt;/span&gt;&lt;i style="line-height: 22px; "&gt;last &lt;/i&gt;&lt;span class="Apple-style-span" style="line-height: 22px;"&gt;10 questions immediately upon beginning the exam (this might be a good strategy if there is no penalty for guessing and if many students do not finish the exam).  Then the students who get to the end of the test would erase most of their guesses.  This is a completely legitimate strategy, but it would raise the number of WTR erasures a great deal.  A lot of more complicated test-taking strategies would also lead to more erasures, and if this school in particular taught those strategies, there would be a very high chance that there would be far more erasures at this school than at others.  Indeed, some of the people interviewed cited strategies that may have led to more erasures.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;&lt;span class="Apple-style-span" style="line-height: 22px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;&lt;span class="Apple-style-span" style="line-height: 22px;"&gt;Thus, the high erasure rate, even WTR erasures, may have a relatively simple explanation: this school effectively coached the kids in test taking while other schools did not or coached the children differently.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;&lt;span class="Apple-style-span" style="line-height: 22px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  
&gt;&lt;span class="Apple-style-span" style="line-height: 22px;"&gt;The article provides a &lt;/span&gt;&lt;a href="http://www.documentcloud.org/documents/73991-day-three-documents?loc=interstitialskip" style="line-height: 22px; "&gt;link to several documents&lt;/a&gt;&lt;span class="Apple-style-span" style="line-height: 22px;"&gt; summarizing the results of the analysis. What I find interesting is that the worst school in terms of WTR erasures, BS Monroe ES, also has a lot of WTW (wrong-to-wrong) erasures.  On average, this school has between 2 and 3 WTW erasures per student, or about 1 WTW for every 5 WTR erasures.  A more interesting, and I think more revealing, analysis would be to see how this ratio compares to the normal ratio.  If the normal ratio is 1 WTW to 5 WTR, it indicates cheating may not have been the reason for the erasures (unless the cheaters were purposefully erasing some and changing them to wrong answers--which seems unlikely since there is no indication potential cheaters realized erasures could be detected at all).  If the general ratio is far from 1 to 5, it would be another indicator of a different process going on at BS Monroe ES, perhaps involving cheating, though it is still hard to rule out other, innocuous explanations that involve test-taking strategy.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;&lt;span class="Apple-style-span" style="line-height: 22px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;&lt;span class="Apple-style-span" style="line-height: 22px;"&gt;Another analysis would be to look at the WTR vs. WTW erasures student by student.  Presumably, students who answered a higher percentage of un-erased problems correctly would have a better ratio of WTR to WTW erasures.  
If that were not true, then it would lead more clearly to the conclusion that someone else was doing the erasing.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;&lt;span class="Apple-style-span" style="line-height: 22px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;&lt;span class="Apple-style-span" style="line-height: 22px;"&gt;The research revealed in the article shows the correlation of two things: a dramatic increase in test scores and a dramatic number of WTR erasures.  Cheating is one explanation for these increases.  Another, however, is the implementation of a smart test-taking strategy at the school, which might well be part of an overall program to increase the test scores and improve the school.  A statistical test can have a seemingly dramatic result (less likely than winning the lottery), but while defeating a specific hypothesis (independence of erasures by school), it doesn't necessarily prove another hypothesis (cheating).&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;&lt;span class="Apple-style-span" style="line-height: 22px; "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;&lt;span class="Apple-style-span" style="line-height: 22px; "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;&lt;span class="Apple-style-span" style="line-height: 22px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"  &gt;&lt;span class="Apple-style-span" style="line-height: 22px; "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-2139084367184738765?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link 
rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/2139084367184738765/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=2139084367184738765' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/2139084367184738765'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/2139084367184738765'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2011/03/detecting-cheating.html' title='Detecting cheating'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-5172231994596503027</id><published>2010-09-28T14:09:00.000-07:00</published><updated>2010-09-28T17:21:06.529-07:00</updated><title type='text'>Throw away your cold medicine again?</title><content type='html'>A couple years ago, I wrote about a study that looked at the effect of a seawater nasal spray on the health of children (see &lt;a href="http://what-are-the-chances.blogspot.com/2008/01/throw-away-your-cold-medicine.html"&gt;that post&lt;/a&gt;).&lt;br /&gt;Yesterday's New York Times, explored a very similar claim.  Anahad O'Connor's &lt;a href="http://www.nytimes.com/2010/09/28/health/28real.html?ref=health"&gt;column, "Really?  
The Claim: Gargling With Salt Water Can Ease Cold Symptoms," &lt;/a&gt;looks at a study of 387 Japanese adults aged 18 to 65 (see &lt;a href="http://www.ajpm-online.net/article/S0749-3797%2805%2900258-8/abstract"&gt;this page for an abstract)&lt;/a&gt;.  Treatment groups gargled with PLAIN water or a "povidone-iodine" solution.  Those gargling with plain water did the best, with 0.17 URTIs (upper respiratory tract infections) every 30 person-days, meaning about 1 in 6 get a URTI per month if they gargle with water.  The control group had a rate of 0.26, meaning about 1 in 4 got a URTI.  The iodine group had a rate of 0.24, also meaning about 1 in 4 got a URTI.&lt;br /&gt;&lt;br /&gt;So water looks pretty good.  The only caveat, and it is the same as the issue I mentioned in the earlier post, is that the outcomes were self-measured.  The people doing the gargling reported whether or not they had a URTI.  In Japan, where the study was performed, there is a strong bias toward water gargling, at least according to the abstract of the study, which says: &lt;a href="http://www.ajpm-online.net/article/S0749-3797%2805%2900258-8/abstract"&gt;"Gargling to wash the throat is commonly performed in Japan, and people  believe that such hygienic routine, especially with gargle medicine,  prevents upper respiratory tract infections (URTIs)."&lt;/a&gt;  In fact, the article reports that those in the control group gargled one time a day on average as well (but those in the treated group gargled around 3 times a day).  This affinity for water gargling and the belief that it stops infection may result in water garglers reporting fewer infections, thus throwing the results of the study into question.&lt;br /&gt;&lt;br /&gt;The New York Times, by the way, gives recommendations based on an upcoming book by Philip Hagen, to gargle with *salt* water, but cites this study, which is referring to *plain* water only.&lt;br /&gt;&lt;br /&gt;My conclusion?  
If you THINK it is going to work, it's fairly likely water gargling will be effective, and it is a lot cheaper than buying some kind of preventative medicine.  If you don't think it will work, this study provides little help in deciding whether it actually will work.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-5172231994596503027?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/5172231994596503027/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=5172231994596503027' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/5172231994596503027'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/5172231994596503027'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2010/09/throw-away-your-cold-medicine-again.html' title='Throw away your cold medicine again?'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-7020769658230908283</id><published>2010-03-15T15:21:00.000-07:00</published><updated>2010-03-15T16:01:27.274-07:00</updated><title type='text'>You asked for it, you got it. Toyota!</title><content type='html'>I think that's how the ad line went.  When?  
Maybe 25 years ago.&lt;br /&gt;&lt;br /&gt;Well, it seems to apply now.  Sudden acceleration.  Mention a problem with a car, any problem with any car, and people will start crawling out of the woodwork with the complaint.  Why?  It's a numbers game.  There were more than 100,0&lt;span style="color: rgb(0, 0, 0);"&gt;00 &lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);" class="blsp-spelling-error" id="SPELLING_ERROR_0"&gt;pri&lt;/span&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;-i(?) &lt;/span&gt;&lt;a href="http://www2.toyota.co.jp/en/news/08/0515.html"&gt;sold in the US in 2005-9&lt;/a&gt;.  With that many people driving them around, any tiny problem that is reported is going to be "substantiated" by others.  Those of us old enough to remember the Audi 5000 found the high correlation between those Audis with sudden acceleration and those sold to 85-year-old ladies inexplicable (studies mostly concluded it was driver error--see a &lt;a href="http://www.wired.com/autopia/2010/03/unintended-acceleration/"&gt;recent article here in Wired&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;The latest, after the brake-related &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_1"&gt;Prius&lt;/span&gt; recall, is the claim of sudden &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_2"&gt;acceleration&lt;/span&gt;.  A guy in California managed to call 911 while it was happening--pretty amazing, huh?  Unless, of course, he made it all up.  
&lt;a href="http://en.wikipedia.org/wiki/Toyota_Prius"&gt;Here's what the current thoughts about it are (from &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_3"&gt;wikipedia&lt;/span&gt;):&lt;/a&gt;&lt;br /&gt;   &lt;span style="font-size:85%;"&gt;"On March 8, 2010, a 2008 &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_4"&gt;Prius&lt;/span&gt; allegedly uncontrollably accelerated to 94 miles per hour on a California Highway (US), and the &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_5"&gt;Prius&lt;/span&gt; had to be stopped with the verbal assistance of the California Highway Patrol as news cameras watched &lt;sup id="cite_ref-85" class="reference"&gt;&lt;a href="http://en.wikipedia.org/wiki/Toyota_Prius#cite_note-85"&gt;&lt;span&gt;[&lt;/span&gt;86&lt;span&gt;]&lt;/span&gt;&lt;/a&gt;&lt;/sup&gt;. Subsequent to the event, media investigations uncovered suspicious information about the alleged runaway &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_6"&gt;Prius&lt;/span&gt; driver, 61-year old James &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_7"&gt;Sikes&lt;/span&gt;, including false police reports, suspect insurance claims, theft and fraud allegations, television aspirations, and bankruptcy.&lt;sup id="cite_ref-fox40_86-0" class="reference"&gt;&lt;a href="http://en.wikipedia.org/wiki/Toyota_Prius#cite_note-fox40-86"&gt;&lt;span&gt;[&lt;/span&gt;87&lt;span&gt;]&lt;/span&gt;&lt;/a&gt;&lt;/sup&gt;&lt;sup id="cite_ref-hoax_87-0" class="reference"&gt;&lt;a href="http://en.wikipedia.org/wiki/Toyota_Prius#cite_note-hoax-87"&gt;&lt;span&gt;[&lt;/span&gt;88&lt;span&gt;]&lt;/span&gt;&lt;/a&gt;&lt;/sup&gt; &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_8"&gt;Sikes&lt;/span&gt; was found to be US$19,000 behind in his &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_9"&gt;Prius&lt;/span&gt; car payments and had $US700,000 in accumulated &lt;a href="http://en.wikipedia.org/wiki/Debt" 
title="Debt"&gt;debt&lt;/a&gt;.&lt;sup id="cite_ref-fox40_86-1" class="reference"&gt;&lt;a href="http://en.wikipedia.org/wiki/Toyota_Prius#cite_note-fox40-86"&gt;&lt;span&gt;[&lt;/span&gt;87&lt;span&gt;]&lt;/span&gt;&lt;/a&gt;&lt;/sup&gt; &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_10"&gt;Sikes&lt;/span&gt; stated he wanted a new car as compensation for the incident.&lt;sup id="cite_ref-fox40_86-2" class="reference"&gt;&lt;a href="http://en.wikipedia.org/wiki/Toyota_Prius#cite_note-fox40-86"&gt;&lt;span&gt;[&lt;/span&gt;87&lt;span&gt;]&lt;/span&gt;&lt;/a&gt;&lt;/sup&gt;&lt;sup id="cite_ref-88" class="reference"&gt;&lt;a href="http://en.wikipedia.org/wiki/Toyota_Prius#cite_note-88"&gt;&lt;span&gt;[&lt;/span&gt;89&lt;span&gt;]&lt;/span&gt;&lt;/a&gt;&lt;/sup&gt; Analyses by &lt;a href="http://en.wikipedia.org/wiki/Edmunds.com" title="Edmunds.com"&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_11"&gt;Edmunds&lt;/span&gt;.com&lt;/a&gt; and &lt;i&gt;&lt;a href="http://en.wikipedia.org/wiki/Forbes" title="Forbes"&gt;Forbes&lt;/a&gt;&lt;/i&gt; found &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_12"&gt;Sikes&lt;/span&gt;' acceleration claims and fears of shifting to neutral implausible, with &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_13"&gt;Edmunds&lt;/span&gt; concluding that "in other words, this is BS",&lt;sup id="cite_ref-il_89-0" class="reference"&gt;&lt;a href="http://en.wikipedia.org/wiki/Toyota_Prius#cite_note-il-89"&gt;&lt;span&gt;[&lt;/span&gt;90&lt;span&gt;]&lt;/span&gt;&lt;/a&gt;&lt;/sup&gt; and &lt;i&gt;Forbes&lt;/i&gt; comparing it to the &lt;a href="http://en.wikipedia.org/wiki/Balloon_boy_hoax" title="Balloon boy hoax"&gt;balloon boy hoax&lt;/a&gt;.&lt;sup id="cite_ref-hoax_87-1" class="reference"&gt;&lt;a href="http://en.wikipedia.org/wiki/Toyota_Prius#cite_note-hoax-87"&gt;&lt;span&gt;[&lt;/span&gt;88&lt;span&gt;]&lt;/span&gt;&lt;/a&gt;"&lt;br /&gt;&lt;br /&gt;&lt;span 
style="font-size:100%;"&gt;Notwithstanding the apparent California tale above, the reality is that a rare problem is a tough nut to crack statistically.  Suppose there is an issue in 1 in 10,000 Priuses and that this issue only crops up on 1 in 10,000 rides in those cars.  Thus, the problem occurs on just 1 in 100 million Prius rides (1/10,000 times 1/10,000).  Even among those, it may be a very short-lived problem and not cause any injury or accident.  Such a rare problem might be drowned out by driver-error problems, such as accidentally hitting the gas instead of the brake, perceiving that the car is accelerating when it is not, or hitting both the gas and the brake simultaneously in an attempt to hit the brake.  Each of these things can be exceedingly rare (1 in a million) and still be 100 times as common as the real problem.&lt;br /&gt;&lt;br /&gt;There are other ways to go about teasing out rare events.  In the lab, a machine could possibly simulate the conditions that were occurring when the supposed sudden acceleration took place and see if the problem is repeatable.  Yet these conditions are hard to figure out, as they are determined from the imperfect information of the person reporting the incident.  As might be the case with the recent report, that person could be lying, but even if not, they are likely shaken up enough that they cannot remember the exact conditions very well.  
Consider airline crashes, where we often have very objective information (the black box), but it is still very difficult to figure out what happened and why.&lt;br /&gt;&lt;br /&gt;One thing seems certain to be true: we won't know whether or not Prius cars are at fault for a long time to come, and far fewer of them will be bought in the next couple years.&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;/sup&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-7020769658230908283?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/7020769658230908283/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=7020769658230908283' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/7020769658230908283'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/7020769658230908283'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2010/03/you-asked-for-it-you-got-it-toyota.html' title='You asked for it, you got it. 
Toyota!'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-9118519962082752827</id><published>2009-12-09T15:57:00.000-08:00</published><updated>2009-12-09T16:37:50.380-08:00</updated><title type='text'>More germs = less disease?</title><content type='html'>So says an article in today's &lt;a href="http://www.sciencedaily.com/releases/2009/12/091208192005.htm"&gt;Science Daily&lt;/a&gt;, which reports on a recent study at Northwestern of children from the Philippines.  The study finds that children from the Philippines have much lower levels of C-reactive protein (CRP), which indicates better resistance to disease.  Exposure to germs was much higher for the children in the Philippines. &lt;br /&gt;&lt;br /&gt;So what's wrong with this study?  It's a very tenuous association,  and from what I can gather in the articles, no attempt was made to ensure the children in the U.S. that were compared to the children in the Philippines were similar in other ways.  They might be different in CRP due to other environmental or hereditary factors.  Perhaps it's the weather?  The diet?  One of any number of things could account for the difference. &lt;br /&gt;&lt;br /&gt;In addition, the study appears to ignore the much higher infant mortality rate and much lower life expectancy in the Philippines (you can try www.indexmundi.com for life expectancy and other information by country).  In other words, even if higher germ exposure does mean lower CRP, does it actually mean less disease and longer life?  The broad indication is that it does not. 
&lt;br /&gt;&lt;br /&gt;In order for the study to be valid, it needs to adjust for whatever inherent differences (in addition to germ exposure) exist between Filipino and U.S. children, and then see if CRP levels are still different.  An even better way to do such a study would be to follow children living in similar environments (same place, socio-economic situation, etc.) and determine if the ones exposed to more germs had lower levels of CRP when they reached adulthood.&lt;br /&gt;&lt;br /&gt;I've seen articles (see &lt;a href="http://www.britannica.com/bps/additionalcontent/18/36399953/But--dogs-farm-animals-and-laboratory-rats-are-good-news%C3%84"&gt;this &lt;/a&gt;for example, but I can't find a more definitive one at this time) that indicate that children with early exposure to farm animals have fewer allergies, but nothing showing that exposure to more serious germs is good.  And some of the germs that we are exposed to are more than just common germs--they are deadly.  It might be that those who are exposed to these deadly germs early, and live, are much better off later in life, but that is no reason to expose them to those germs unnecessarily.  Of course, you wouldn't give your child a deadly disease so that, if they survived, they'd be resistant to it later in life. &lt;br /&gt;&lt;br /&gt;We live in a society that is sometimes alarmist concerning germs, and I have &lt;a href="http://what-are-the-chances.blogspot.com/2009/05/why-swine-flu-is-bunch-of-hogwash.html"&gt;written about this.  
&lt;/a&gt;Yet this doesn't mean that, on the whole, a clean environment does not promote good health, and the article cited above seems to have only the most tenuous of indications that it may not.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-9118519962082752827?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/9118519962082752827/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=9118519962082752827' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/9118519962082752827'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/9118519962082752827'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2009/12/more-germs-less-disease.html' title='More germs = less disease?'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-7311522485595195506</id><published>2009-10-29T07:47:00.000-07:00</published><updated>2009-10-29T08:08:33.707-07:00</updated><title type='text'>Why Swine Flu is not a bunch of hogwash</title><content type='html'>This updates my previous blog: &lt;a href="http://what-are-the-chances.blogspot.com/2009/05/why-swine-flu-is-bunch-of-hogwash.html"&gt;"Why Swine flu is a bunch of hogwash?"&lt;/a&gt;&lt;br 
/&gt;&lt;br /&gt;Things have changed a bit in the months since that blog, and the hysteria I cited has leveled off. President Obama did declare a &lt;a href="http://www.msnbc.msn.com/id/33459423/"&gt;swine flu emergency a couple days ago&lt;/a&gt;, but I think that was a good idea.&lt;br /&gt;&lt;br /&gt;Here is what has changed:&lt;br /&gt;1) Swine flu deaths have been at epidemic levels the last three weeks. The chart below (&lt;a href="http://www.cdc.gov/flu/weekly/"&gt;from the CDC&lt;/a&gt;)  shows flu and pneumonia deaths as a percentage of all deaths. The upper black line indicates epidemic level, and the red line is the current level. The graph shows four years of weekly figures.&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_fp158_J456I/Sums0DoJOSI/AAAAAAAAACY/YcFeZhNh1U0/s1600-h/Flumortality_2009-10.gif"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 240px;" src="http://1.bp.blogspot.com/_fp158_J456I/Sums0DoJOSI/AAAAAAAAACY/YcFeZhNh1U0/s320/Flumortality_2009-10.gif" alt="" id="BLOGGER_PHOTO_ID_5398035638707108130" border="0" /&gt;&lt;/a&gt;While this graph doesn't look too serious, and 2008 levels were much further above the threshold at their peak, the scary thing here is that it is so early in the season.  This graph serves as a reminder, too, that every year the flu kills thousands of people, and the flu vaccine could prevent a large number of those deaths.&lt;br /&gt;&lt;br /&gt;2) Hospitals are already getting crowded.  One of the big problems with a real epidemic is the overcrowding of hospitals.  This means that the really sick people cannot get treatment, and that is part of the reason the emergency was declared.  See &lt;a href="http://www.usatoday.com/news/health/2009-10-26-swine-flu-hospitals_N.htm"&gt;this article&lt;/a&gt; in USA Today about over-crowding.  
OK, so it's USA Today, a paper that loves hyperbole, but, again, it's early in the season and any indication of overcrowding at this point is scary.&lt;br /&gt;&lt;br /&gt;3) The vaccine is not yet fully available.  The regular flu vaccine has been out for weeks.  Unfortunately, almost none of the flu this year seems to be covered by that vaccine.  The majority seems to be 2009 H1N1 (the swine flu).  See this chart for a breakdown.  Note that the orange/brown is 2009 H1N1 and that the yellow means the sample was not tested for sub-type, so almost all typed flu is swine flu.&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_fp158_J456I/SumuFFuMXGI/AAAAAAAAACg/ouPAPVEctxI/s1600-h/virus_strains_2009_10.gif"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 240px;" src="http://4.bp.blogspot.com/_fp158_J456I/SumuFFuMXGI/AAAAAAAAACg/ouPAPVEctxI/s320/virus_strains_2009_10.gif" alt="" id="BLOGGER_PHOTO_ID_5398037030838754402" border="0" /&gt;&lt;/a&gt;That's why I am worried.  The other concern is that, even when the vaccine does come out, people won't take it.  
See my &lt;a href="http://genome.fieldofscience.com/2009/10/more-misinformation-on-flu-from-mercola.html"&gt;brother's blog &lt;/a&gt;about why you should and the crazies who say you should not.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-7311522485595195506?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/7311522485595195506/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=7311522485595195506' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/7311522485595195506'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/7311522485595195506'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2009/10/why-swine-flu-is-not-bunch-of-hogwash.html' title='Why Swine Flu is not a bunch of hogwash'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_fp158_J456I/Sums0DoJOSI/AAAAAAAAACY/YcFeZhNh1U0/s72-c/Flumortality_2009-10.gif' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-6170468664256002727</id><published>2009-10-15T13:07:00.000-07:00</published><updated>2009-10-15T14:23:37.783-07:00</updated><title type='text'>Redskins are lucky to play bad teams, but how 
lucky?</title><content type='html'>&lt;a href="http://sports.yahoo.com/nfl/blog/shutdown_corner/post/Redskins-are-first-in-history-to-play-six-straig?urn=nfl,195479"&gt;A recent article&lt;/a&gt; in Yahoo Sports pointed out that the Washington Redskins are the first team in history to play six winless teams in a row.  Here is their schedule so far (also according to the article cited above):&lt;br /&gt;&lt;p&gt;Week 1 -- at &lt;a href="http://sports.yahoo.com/nfl/teams/nyg/"&gt;New York Giants&lt;/a&gt; (0-0)&lt;/p&gt;&lt;p&gt;Week 2 -- vs. &lt;a href="http://sports.yahoo.com/nfl/teams/stl/"&gt;St. Louis Rams&lt;/a&gt; (0-1)&lt;/p&gt;&lt;p&gt;Week 3 -- at &lt;a href="http://sports.yahoo.com/nfl/teams/det/"&gt;Detroit Lions&lt;/a&gt; (0-2)&lt;/p&gt;&lt;p&gt;Week 4 -- vs. &lt;a href="http://sports.yahoo.com/nfl/teams/tam/"&gt;Tampa Bay Buccaneers&lt;/a&gt; (0-3)&lt;/p&gt;&lt;p&gt;Week 5 -- at &lt;a href="http://sports.yahoo.com/nfl/teams/car/"&gt;Carolina Panthers&lt;/a&gt; (0-3)&lt;em&gt;&lt;/em&gt; &lt;/p&gt;&lt;p&gt;Week 6 -- vs. Kansas City Chiefs (0-5)&lt;/p&gt;The author of the article, Chris Chase (or, as he notes, his dad--let's call him Mr. Chase), calculates the odds of this as 1 in 32,768.  This calculation is incorrect, and the odds it gives are far too long, for several reasons, which I get to below.  But first, let me explain how the calculation was likely performed. &lt;br /&gt;&lt;br /&gt;The calculation assumes, plausibly, that the Redskins have the same chance of playing any given team (some college teams purposely make their schedules easy, but this is not possible in the NFL). &lt;br /&gt;&lt;br /&gt;The calculation also assumes, implausibly, that teams that have thus far won no games have a 50-50 chance of winning each game.  The implicit assumption there is that all NFL teams are evenly matched.  The fact is that there are a few really good teams, a few really bad teams, and a bunch of teams in the middle.  
Thus, there are likely to be a bunch of winless teams after 5 games, and not, as the incorrect calculation below implies, only 1 winless team of 32 after 5 games. &lt;br /&gt;&lt;br /&gt;Finally, the calculation, apparently in a careless error, assumes the chances of playing a winless team the first week are 50-50, when, of course, all teams are winless the first week.&lt;br /&gt;&lt;br /&gt;So Mr. Chase's (incorrect) calculation is:&lt;br /&gt;Week 1 chances: 50% (1 in 2)&lt;br /&gt;Week 2 chances: 50% (1 in 2)&lt;br /&gt;Week 3 chances: 50%*50%=25% (1 in 4)&lt;br /&gt;Week 4 chances: 50%*50%*50%=12.5% (1 in 8)&lt;br /&gt;Week 5 chances: 50%*50%*50%=12.5% (same as week 4 because the team they played had only played three games)&lt;br /&gt;Week 6 chances: 50%*50%*50%*50%*50%=3.125% (1 in 32)&lt;br /&gt;&lt;br /&gt;A law of probability is that the chance of two unrelated (independent) events both happening is the product of their individual chances.  Thus, if the chance of rain today is 50% and the chance of rain tomorrow is 50%, the chance of rain on both days is 25%, if those chances are unrelated (which, by the way, they probably aren't).  This is why the chances for multiple losses are multiplied together.&lt;br /&gt;&lt;br /&gt;But back to the football schedule.  To calculate the chances of 6 straight games against winless teams, Mr. 
Chase reasonably multiplied the 6 individual chances (again assuming the 6 matchups were unrelated):&lt;br /&gt;50% * 50% * 25% * 12.5% * 12.5% * 3.125% = .003%, or 1 in 32,768.&lt;br /&gt;So 32,768 is the number reported in the article.&lt;br /&gt;&lt;br /&gt;The easy correction is that the chance of playing a winless team in the first game is 100%, so the calculation should be:&lt;br /&gt;100% * 50% * 25% * 12.5% * 12.5% * 3.125% = .006%, or 1 in 16,384.&lt;br /&gt;This error has been pointed out in comments on the article.&lt;br /&gt;&lt;br /&gt;In addition, other comments point out the other major flaw: teams do not have equal probability of losing.  Thus the chance that a team will be, say, 0-2 is not 25% (50%*50%) but something else, depending on the quality of the teams.  At the extreme, half the teams lose every game and half win every game (this of course assumes losing teams only play against winning teams, but it is possible).&lt;br /&gt;&lt;br /&gt;The reality is certainly not this extreme, which would imply a 50-50 chance each week of playing a winless team (and thus a 1 in 32 chance of playing 6 in a row).  So, how do we figure out the reality?&lt;br /&gt;&lt;br /&gt;The easiest way is to look, each week, at the percentage of teams that are winless.  If we assume the Redskins have an equal chance of playing each team, then we can compute the odds each week (click on the week to see the linked source).  Note that everything is out of 31 teams instead of 32 because the Redskins can't play themselves.&lt;br /&gt;&lt;br /&gt;Week 1: 31 out of 31 teams winless.  Chances: 31/31=100%&lt;br /&gt;Week 2: 15 out of 31 teams winless. Chances: 15/31=48%  (I am assuming no byes the first week, and I know the Redskins lost their first game).&lt;br /&gt;&lt;a href="http://www.nowpublic.com/sports/nfl-standings-week-3-complete-nfl-standings"&gt;Week 3:&lt;/a&gt; 8 out of 31 teams winless.  
Chances: 8/31=26%&lt;br /&gt;&lt;a href="http://bleacherreport.com/articles/268430-2009-nfl-power-rankings-week-4/page/3"&gt;Week 4:&lt;/a&gt; 6 out of 31 teams winless. Chances: 6/31 = 19%&lt;br /&gt;&lt;a href="http://www.nowpublic.com/sports/week-5-nfl-power-rankings-2009"&gt;Week 5&lt;/a&gt;: 6 out of 31 teams winless. Chances 6/31 = 19%&lt;br /&gt;&lt;a href="http://espn.go.com/nfl/standings"&gt;Week 6&lt;/a&gt;: 4 of 31 teams winless.  Chances: 4/31 = 13%&lt;br /&gt;&lt;br /&gt;So the actual chances, assuming the Redskins have an equal chance of playing each team each week and cannot play themselves, are: 100%*48%*26%*19%*19%*13% = 0.06%, or 1 in about 1,700.  Much more likely than 1 in 32,000 but still pretty unlikely.&lt;br /&gt;&lt;br /&gt;And after all these easy games, how are they doing?  Unluckily for Redskins fans, not too well...they're 2-3 going into Sunday's game against the winless Chiefs.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-6170468664256002727?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/6170468664256002727/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=6170468664256002727' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/6170468664256002727'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/6170468664256002727'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2009/10/redskins-are-lucky-to-play-bad-teams.html' title='Redskins are lucky to play bad teams, but how lucky?'/><author><name>Alan 
Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-6303513907125154748</id><published>2009-08-12T08:14:00.001-07:00</published><updated>2009-08-25T18:38:05.193-07:00</updated><title type='text'>Unemployment down but joblessness is up?</title><content type='html'>There was a bit of interesting news that came out Friday--the nation's unemployment rate actually declined, from 9.5% to 9.4%.  This is true despite the fact that there was a net loss of 247,000 jobs (see the &lt;a href="http://www.nytimes.com/2009/08/08/business/economy/08jobs.html?_r=1"&gt;NY Times article&lt;/a&gt;).  How could this happen?&lt;br /&gt;&lt;br /&gt;Well, the unemployment rate is calculated by taking the number unemployed and dividing by the labor force: Unemployment Rate = Number Unemployed / Labor Force.&lt;br /&gt;&lt;br /&gt;The numerator in the equation, Number Unemployed, is defined as the number of people not employed &lt;span style="font-style: italic;"&gt;minus&lt;/span&gt; anyone who hasn't looked for a job in the last 4 weeks.  The denominator of the equation, Labor Force, is defined as the Number Unemployed plus the number of people currently working (either full- or part-time). &lt;br /&gt;&lt;br /&gt;Thus, if people give up (and giving up is defined as not looking for the last 4 weeks), they are no longer counted in either the numerator or denominator of the equation.  And that is exactly what happened between June and July of this year.  According to the BLS (&lt;a href="http://www.bls.gov/news.release/empsit.nr0.htm"&gt;Bureau of Labor Statistics&lt;/a&gt;), 637,000 people left the labor force between June and July.  
Thus, even though the number of people employed fell (by a seasonally adjusted 155,000), the unemployment rate also fell, because the number of people looking for work also fell (by 267,000).  The net result was a drop in the unemployment rate even though fewer people were working and more people lost jobs than found jobs.&lt;br /&gt;&lt;br /&gt;A note about the math.  At first blush, you may wonder whether it matters, since the people not looking are removed from both the numerator (Number Unemployed) and the denominator (Labor Force).  But mathematically, it does matter.  Suppose we have a ratio 2/10, which equals 20%.  Subtract 1 from the numerator and 1 from the denominator and you have 1/9, which equals 11.1%. Thus we subtracted the same number from the numerator and denominator but we did not end up with the same 20%.  Instead we ended up with far less (11.1%).&lt;br /&gt;&lt;br /&gt;The general rule is that the ratio falls when the same positive number is subtracted from both the numerator and the denominator, as long as the ratio is less than 1.  So, 2/10&gt;1/9 but 20/10&lt;19/9, for example.   What this means for the unemployment rate (which is always less than 1 since 1 is 100% unemployment) is that when people leave the labor force, the unemployment rate is somewhat artificially reduced.  This is why we had more people losing their jobs but a decline in the unemployment rate last month. &lt;br /&gt;&lt;br /&gt;I would guess that labor force drop-offs would be far higher during deeper recessions, when many despair of getting work or decide to take a break from their search, and this guess is borne out by recent information on the BLS site, which cites the increase in discouraged workers this last year: "Among the marginally attached, there were 796,000 discouraged workers in July, up by 335,000 over the past 12 months. (The data are not seasonally adjusted.) 
Discouraged workers are persons not currently looking for work because they believe no jobs are available for them."&lt;br /&gt;&lt;br /&gt;This &lt;a href="http://www.nytimes.com/interactive/2009/07/15/business/economy/20090715-leonhardt-graphic.html?scp=10&amp;amp;sq=unemployment&amp;amp;st=cse"&gt;NY Times chart&lt;/a&gt; of unemployment uses a more reasonable definition and shows unemployment far higher than the official 9.4% rate.  It includes all those who have looked for a job in the past year as well as part-time workers who want full-time work as part of the unemployed, and the unemployment rate is between 10 and 20%, depending on the state.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-6303513907125154748?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/6303513907125154748/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=6303513907125154748' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/6303513907125154748'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/6303513907125154748'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2009/08/unemployment-down-but-joblessness-is-up.html' title='Unemployment down but joblessness is up?'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-6880239828351643996</id><published>2009-06-12T12:34:00.000-07:00</published><updated>2009-06-12T13:38:35.904-07:00</updated><title type='text'>Riding a bike?  Wear a helmet.</title><content type='html'>Now that the sun has finally come out in NYC today after what seems like weeks of rain and cold weather, it seems an appropriate time to talk about one of my favorite summer recreational activities--riding a bike.&lt;br /&gt;&lt;br /&gt;Growing up in the 1970s, I don't think I ever saw a helmet, much less wore one.  However, in the same way we've figured out that seatbelts (and airbags) save lives, we also now know that biking with a helmet makes you safer.  The &lt;a href="http://www.cpsc.gov/cpscpub/prerel/prhtml98/98062.html"&gt;Consumer Product Safety Commission reported&lt;/a&gt; that wearing a helmet can decrease the risk of head injury by as much as 85%.&lt;br /&gt;&lt;br /&gt;Sadly, there are still a lot of enthusiasts out there who have a take-no-prisoners attitude about wearing helmets, even implying that helmets make riders less safe (see for instance the helmet section of &lt;a href="http://bicycleuniverse.info/transpo/almanac-safety.html"&gt;this web page&lt;/a&gt; on Bicycle Universe).  Yet I think anyone who understands the statistics will see that the "freedom" of riding without a helmet is far outweighed by the risk.&lt;br /&gt;&lt;br /&gt;The Insurance Institute for Highway Safety (IIHS) has long been a great source for safety information.  They've got the same goal that, hopefully, most of us do: reducing deaths and injuries.  
In a &lt;a href="http://www.iihs.org/externaldata/srdata/docs/sr3802.pdf"&gt;2003 report&lt;/a&gt;, the IIHS reports that child bicycle deaths have declined by more than 50% since 1975 (despite increased biking and presumably because most children wear helmets now).  In addition, about 92% of all bicycle deaths were cyclists not wearing helmets (&lt;a href="http://www.iihs.org/research/fatality_facts_2007/bicycles.html"&gt;see this report&lt;/a&gt;).  The same report also shows that while child bicycle deaths have declined precipitously (from 675 in 1975 to 106 in 2007), adult deaths have increased since 1975 (from 323 to 583).&lt;br /&gt;&lt;br /&gt;Helmet usage is harder to figure out, but most sources put overall use around 50%, with children's use higher.  This means that, given that 92% of deaths are cyclists not wearing helmets, you stand about 11 times the chance of getting killed if you don't wear a helmet (92% of deaths versus 8%, from two groups of roughly equal size).  This number can be played with a little and whittled down if you assume, say, that cyclists not wearing helmets bike more dangerously, but there would have to be enormous differences for helmets to be shown to be ineffective.  Moreover, all the major scientific studies show large positive effects from helmet usage (see &lt;a href="http://www.cyclehelmets.org/1139.html"&gt;this ANTI-helmet site for a summary of the case-control studies).&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;So why, when you search the internet for helmet effectiveness, or read through the literature of a number of pro-cycling organizations, do they cast aspersions upon helmet use?  This one, for me, is an enigma.  I understood why the auto industry was against airbags and seatbelts (they cost money) and why the cigarette and gun manufacturers are against regulation, but why do people care so much about us not wearing helmets?  
I can think of only a couple of things: a) cyclists want bike lanes and other safety measures without committing to anything on their own, and b) some are too lazy/cool to bother with a helmet.  Of course, I'm a cyclist and clearly, I'm all for helmets (and yes, laws requiring them).  I also think that if we want state and city governments to take us seriously about increasing cyclist safety through new bike lanes, changing traffic patterns, and building of greenways, we need to do our part, too.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-6880239828351643996?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/6880239828351643996/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=6880239828351643996' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/6880239828351643996'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/6880239828351643996'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2009/06/riding-bike-wear-helmet.html' title='Riding a bike?  
Wear a helmet.'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-7533345005525603211</id><published>2009-05-18T07:24:00.000-07:00</published><updated>2009-05-18T08:35:51.355-07:00</updated><title type='text'>Why Swine Flu is a bunch of hogwash.</title><content type='html'>I first thought of writing about this a couple of weeks ago, when the nationwide hysteria concerning swine flu was just beginning, but then, as quickly as it came, it went.  Now, with the first death from swine flu in NY, the front pages of the major newspapers have returned to the topic.  The &lt;a href="http://www.nytimes.com/indexes/2009/05/18/pageone/scan/index.html"&gt;New York Times&lt;/a&gt; article and headline were, as always, something close to languid.  However, the &lt;a href="http://www.nypost.com/seven/05182009/news/regionalnews/swine_outbreak_now_fatal_169815.htm"&gt;NY Post's article and photos&lt;/a&gt; are, also as usual, a bit hysterical.  My son's school, &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_0"&gt;apparent&lt;/span&gt; readers of the Post, have covered all the water fountains with plastic bags, perhaps unaware that the &lt;a href="http://www.cdc.gov/h1n1flu/qa.htm"&gt;CDC clearly states&lt;/a&gt; there seems to be little or no chance of infection through drinking water.&lt;br /&gt;&lt;br /&gt;What's more, so far this flu has been a very minor one, with about 5,000 documented cases and 6 deaths.   
The &lt;a href="http://www.theblogofrecord.com/tag/annual-influenza-deaths/"&gt;blog of record relays&lt;/a&gt; that the "regular" flu has already killed something like 13,000 people in the US this year (it's not clear whether this is derived from the &lt;a href="http://www.cdc.gov/flu/about/disease/us_flu-related_deaths.htm"&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_1"&gt;CDC's&lt;/span&gt; annual estimate of 36,000&lt;/a&gt;).  This amounts to about 100 people a day.&lt;br /&gt;&lt;br /&gt;While &lt;a href="http://www.webmd.com/cold-and-flu/news/20090515/cdc-100,000plus-in-us-have-swine-flu-half-swine-flu"&gt;one CDC scientist estimates&lt;/a&gt; that the number of people with swine flu is 50,000 or so, this estimate assumes that under-reporting of swine flu is the same as under-reporting &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_2"&gt;of&lt;/span&gt; flu in general.  Given the focus on swine flu, I expect that under-reporting of it is far lower than of general flu, and thus the true number with swine flu is far lower than 50,000.  The &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_3"&gt;CDC's&lt;/span&gt; current weekly flu report shows about &lt;a href="http://www.cdc.gov/flu/weekly/"&gt;one-third of the 1,286 new cases as swine flu (novel H1N1)&lt;/a&gt;.  
The same report has a great graph, showing an irregular spike in flu diagnoses just at the time when reported flu usually falls.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_fp158_J456I/ShF7uJ-JfAI/AAAAAAAAACQ/aJ1xxPLW2W8/s1600-h/image181.gif"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer; width: 320px; height: 240px;" src="http://1.bp.blogspot.com/_fp158_J456I/ShF7uJ-JfAI/AAAAAAAAACQ/aJ1xxPLW2W8/s320/image181.gif" alt="" id="BLOGGER_PHOTO_ID_5337183066291534850" border="0" /&gt;&lt;/a&gt;There are three pieces of good news, despite the scary spiked graph.  First, with spring, flu cases quickly fall, because flu spreads less when people are farther apart (i.e., outside instead of inside).  Second, cases are already falling (though it's only two weeks of data).  Third, diagnoses of all types of flu increased in the last two weeks (versus the several weeks leading up to May), implying that one of the reasons (perhaps the only reason) for the spike is that we are testing much more than usual, due to the swine flu outbreak. &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Thus, swine flu has so far killed a documented 6 people in the U.S. out of more than 5,000 confirmed cases, a fatality rate of roughly 0.1%. 
&lt;br /&gt;&lt;br /&gt;In conclusion, though our own hysteria may drive documented cases up some, and lead to my children having to bring a water bottle to school, the swine flu does not appear to be particularly dangerous or deadly.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-7533345005525603211?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/7533345005525603211/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=7533345005525603211' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/7533345005525603211'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/7533345005525603211'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2009/05/why-swine-flu-is-bunch-of-hogwash.html' title='Why Swine Flu is a bunch of hogwash.'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_fp158_J456I/ShF7uJ-JfAI/AAAAAAAAACQ/aJ1xxPLW2W8/s72-c/image181.gif' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-5573431544175694544</id><published>2009-04-27T18:59:00.000-07:00</published><updated>2009-04-27T19:02:25.131-07:00</updated><title type='text'>Facebook and 
grades</title><content type='html'>I don't have a long post for today, but I want to briefly comment on the debate over a study on Facebook and grades.  It was the subject of the Wall Street Journal's Numbers Guy blog last week: &lt;a href="http://blogs.wsj.com/numbersguy/facebook-studys-caveats-are-lost-in-the-news-feed-672/"&gt;http://blogs.wsj.com/numbersguy/&lt;/a&gt; .&lt;br /&gt;&lt;br /&gt;The basic question is: under what conditions should we publicize results, and should we wait for peer review?&lt;br /&gt;&lt;br /&gt;Here was my comment:&lt;br /&gt;I think if the caveats were printed along with the study results, then the publication is reasonable. Otherwise, we are being a bit paternalistic by implying that the general public cannot understand the caveats but we researchers can.&lt;div class="commentContent"&gt; &lt;p&gt;Suppose instead this was a study linking domestic air travel through a particular city to a new and deadly virus (say, swine flu?). Then there might be more reason to be cautious (and paternalistic), because the cost of being wrong is very high. Still, there would be the counter-argument that not publishing could endanger people’s lives. We always have this trade-off, I believe, between unintentionally leading people to believe that a study is correct when it is not, and vice versa. 
&lt;/p&gt; &lt;p&gt;In this open era, especially, I think the balance leans towards publishing, where the blogging/commenting public will quickly crucify the poor research and find supporting evidence for good research.&lt;/p&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-5573431544175694544?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/5573431544175694544/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=5573431544175694544' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/5573431544175694544'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/5573431544175694544'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2009/04/facebook-and-grades.html' title='Facebook and grades'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-9090754785129771369</id><published>2009-03-23T14:22:00.000-07:00</published><updated>2009-03-23T20:24:34.407-07:00</updated><title type='text'>How big a sample?</title><content type='html'>Suppose we want to figure out what percentage of BIGbank's 1,000,000 loans are bad.  
We also want to look at &lt;span style="font-size:78%;"&gt;small&lt;/span&gt;bank, with 100,000 loans.  Many people seem to think you'd need to look at 10 times as many loans from BIGbank as you would for &lt;span style="font-size:78%;"&gt;small&lt;/span&gt;bank.&lt;br /&gt;&lt;br /&gt;The fact is that you would use the same size sample, in almost all practical circumstances, for the two populations above.  Ditto if the population were 100,000,000 or 1,000.&lt;br /&gt;&lt;br /&gt;The reasons for this, and the concept behind it, go back to the early part of the 20th century when modern experimental methods were developed by (Sir) Ronald A. Fisher.  Though Wikipedia correctly cites Fisher in its &lt;a href="http://en.wikipedia.org/wiki/Design_of_experiments"&gt;entry&lt;/a&gt; on experimental design, the seminal &lt;a href="http://www.amazon.com/s/ref=nb_ss_b?url=search-alias%3Dstripbooks&amp;amp;field-keywords=ronald+a.+fisher+design+of+experiments&amp;amp;x=0&amp;amp;y=0"&gt;book, Design of Experiments, is out of stock at Amazon &lt;/a&gt;(for $157.50, you can get a re-print of this and two other texts together in a single book).  Luckily, for a mere $15.30, you can get David Salsburg's (no relation and he spells my name wrong! ;-) ) &lt;a href="http://www.amazon.com/Lady-Tasting-Tea-Statistics-Revolutionized/dp/0805071342/ref=sr_1_1?ie=UTF8&amp;amp;s=books&amp;amp;qid=1237844065&amp;amp;sr=1-1"&gt;&lt;span style="font-style: italic;"&gt;A Lady Tasting Tea&lt;/span&gt;&lt;/a&gt;, which talks about Fisher's work.  Maybe this is why no one knows this important fact about sample size--because we statisticians have bought up all the books that you would otherwise be breaking down the doors (or clogging the internet) to buy.  Fisher developed the idea of using randomization to create a mathematical and probability framework around making inferences from data.  In English?  
He figured out a great way to do experiments, and this idea, of randomization, is what allows us to make statistical inferences about all sorts of things (and the lack of randomization is what sometimes makes it very difficult to prove otherwise obvious things).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Why doesn't (population) size matter?&lt;/span&gt;&lt;br /&gt;To answer this question, we have to use the concept of randomization, as developed by Fisher. First, let's think about the million loans we want to know about at BIGbank.  Each of them is no doubt very different, and we could probably group them into thousands of different categories.  Yet, let's ignore that and just look at the two categories we care about: 1) good loan or 2) bad loan.  Now, with enough time studying a given loan, suppose we can reasonably make a determination about which category it falls into.  Thus, if we had enough time, we could look at the million loans and figure out that G% are good and B% (100% - G%) are bad.&lt;br /&gt;&lt;br /&gt;Now suppose that we took BIGbank's loan database (ok, we need to assume they know who they loaned money to), and &lt;span style="font-style: italic;"&gt;randomly &lt;/span&gt;sampled 100 loans from it.   Now, stop for a second.  Take a deep breath.  You have just entered probability bliss -- all with that one word, &lt;span style="font-style: italic;"&gt;randomly&lt;/span&gt;.  The beauty of what we've just done is that we've taken a million disparate loans and, with them, formed a set of 100 "good"s and "bad"s that are identical in their probability distribution.  This means that each of the 100 sampled loans that we are about to draw has exactly a G% chance of being a good one and a B% chance of being a bad one, corresponding to the actual proportions in the population of 1,000,000.&lt;br /&gt;&lt;br /&gt;If this makes sense so far, skip this paragraph.  
Otherwise, envision the million loans as quarters lying on a football field.    Quarters heads up denote good loans and quarters tails up denote bad loans.    We randomly select a single coin.  What chance does it have of being heads up?  G%, of course, because exactly G% of the million are heads up and we had an equal chance of selecting each one.&lt;br /&gt;&lt;br /&gt;Now, once we actually select (and look at) one of the coins, the chances for the second selection change slightly: where we had G% exactly, there is now one fewer quarter to choose from, so we have to adjust accordingly.  However, that adjustment is very slight.  Suppose G were 90%.  Then, we'd have, for the second selection, if the first were a good coin, an 899999/999999 chance of selecting another good one (that's an 89.99999% chance instead of a 90% chance).  For  &lt;span style="font-size:78%;"&gt;small&lt;/span&gt;bank, we'd be looking at a whopping reduction to an 89.9999% chance from a 90% chance.  This gives an inkling of why population size, as long as it is much bigger than sample size, doesn't much matter. &lt;br /&gt;&lt;br /&gt;So, now we have a sample set of 100 loans.  We find that 80 are good and 20 are bad.  Right off, we know that, whether dealing with the 100,000 population or the 1,000,000 population, our best guess for the percentage of good loans, G, is 80%.  That is because of how we selected our sample.  It doesn't matter one bit how different the loans are.  They are just quarters on a football field.  It follows from the fact that we selected them randomly.&lt;br /&gt;&lt;br /&gt;We also can calculate several other facts, based on this sample.  For example, if the actual proportion of good loans were 90% (900,000 out of 1,000,000), we'd get 80 or fewer in our sample of 100 only 0.1977% of the time.  The corresponding figure, if we had sampled from the population of 100,000 (and had 90,000 good loans), would be 0.1968%.  What does this lead us to conclude?  
Very likely, the proportion of "good" loans is less than 90%.  We can continue to do this calculation for different possible values of G:&lt;br /&gt;If G were 89%: .586% of the time would you get 80 or fewer.&lt;br /&gt;If G were 88%: 1.47% of the time would you get 80 or fewer.&lt;br /&gt;If G were 87%: 3.12% of the time would you get 80 or fewer.&lt;br /&gt;If G were 86.3%: 5.0% of the time would you get 80 or fewer.&lt;br /&gt;If G were 86%: 6.14% of the time would you get 80 or fewer.&lt;br /&gt;In each of the above cases, the difference between a population of 1,000,000 and 100,000 loans makes a difference only at the second decimal place, if that.&lt;br /&gt;&lt;br /&gt;Such a process allows us to create something called a confidence interval.  A confidence interval kind of turns this calculation on its head and says, "Hey, if we only get 80 or fewer in a sample 1.47% of the time when the population is 88% good, and I got only 80 good loans in my sample, it doesn't sound too likely that the population is 88% good."  The question then becomes, at what percentage would you start to worry?&lt;br /&gt;&lt;br /&gt;For absolutely no reason at all (and I mean that), people seem to like to limit this percent to 5%.  Thus, in the example above, most would allow that, if we estimated G such that 5% (or more) of the time, 80 or fewer of 100 loans would be good (where 80 is the number of good in our sample), then they would feel comfortable.   Thus, for the above, we would say, with "95% confidence, 86.3% or fewer of the loans in the population are good."  We could just as well have figured out the number that corresponded to 1% and stated the above in terms of 99% confidence, with the corresponding higher G or figured out the number that corresponds to 30% and stated the above in terms of 70% confidence.  However, everyone seems to love 5% and the 95% confidence that goes with it.&lt;br /&gt;&lt;br /&gt;Back to sample size versus population.  
As stated above, the population size, though 10 times bigger, makes almost no difference.  For a given probability above, we are using the hypergeometric distribution to calculate the exact figure (the mathematics behind it are discussed some in my&lt;a href="http://what-are-the-chances.blogspot.com/2008/02/7-letter-words-and-8-card-suits.html"&gt; earlier post&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;Here are the chances of getting 80 or fewer good loans in a sample of 100 when G is 85%, for various population sizes.&lt;br /&gt;Population  infinite      : 10.65443%&lt;br /&gt;Population 1,000,000: 10.65331%&lt;br /&gt;Population 100,000    : 10.64%&lt;br /&gt;Population 10,000      : 10.54%&lt;br /&gt;Population 1,000        :    9.49%&lt;br /&gt;Population 500           :    8.21%&lt;br /&gt;This example follows the rule of thumb:  you can ignore the population size unless the sample is at least 10% of the population.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-9090754785129771369?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/9090754785129771369/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=9090754785129771369' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/9090754785129771369'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/9090754785129771369'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2009/03/how-big-sample.html' title='How big a sample?'/><author><name>Alan 
Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-1498091934505655986</id><published>2009-03-18T08:13:00.000-07:00</published><updated>2009-03-18T08:36:22.305-07:00</updated><title type='text'>7 letter scrabble word redux</title><content type='html'>A recent &lt;a href="http://blogs.wsj.com/numbersguy/the-scrabble-statistics-637/"&gt;article &lt;/a&gt;by the Wall Street Journal's "Numbers Guy" has re-surfaced one of my old &lt;a href="http://what-are-the-chances.blogspot.com/2008/02/7-letter-words-and-8-card-suits.html"&gt;posts&lt;/a&gt; regarding scrabble.  In it I said that after the first turn, you must get an 8-letter word to use all your letters, because your seven letters need to connect to an existing word. &lt;br /&gt;&lt;br /&gt;This, of course, is not correct, as was pointed out in comments to the Numbers Guy's blog (this was also pointed out by my sister).  All you need to do to use all your letters with a 7-letter word is find a place to connect that is parallel to an existing word.  
For example, 'weather' could be connected parallel to a word ending in 'E", since 'we' is a word.&lt;br /&gt;&lt;br /&gt;Maybe that's why my sister won so many scrabble games against me when I was a kid.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-1498091934505655986?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/1498091934505655986/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=1498091934505655986' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/1498091934505655986'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/1498091934505655986'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2009/03/7-letter-scrabble-word-redux.html' title='7 letter scrabble word redux'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-6528899770889610477</id><published>2009-03-11T17:59:00.001-07:00</published><updated>2009-03-11T18:33:10.457-07:00</updated><title type='text'>Are same-sex classes better?</title><content type='html'>Yesterday's New York Times had an &lt;a href="http://www.nytimes.com/2009/03/11/education/11gender.html?_r=1&amp;amp;hp"&gt;article,&lt;/a&gt; "&lt;span style="font-size:100%;"&gt;Boys and Girls 
Together, Taught Separately in Public School," &lt;/span&gt;about same-sex classes in New York City.  In particular, the article focused on P.S. 140 in the Bronx.  The article looks upon such classes favorably, despite the fact that there is, as far as I can tell, no evidence that such classes lead to better achievement.&lt;br /&gt;&lt;br /&gt;In particular, the article states: "Students of both sexes in the co-ed fifth grade did better on last year’s state tests in math and English than their counterparts in the single-sex rooms, and this year’s co-ed class had the highest percentage of students passing the state social studies exam."&lt;br /&gt;&lt;br /&gt;In other words, the City is continuing this program, even though the evidence indicates that not only are students in same-sex classes doing no better, they are doing worse!  The principal, who has introduced some programs that have achieved material results, said: “We will do whatever works, however we can get there...we thought this would be another tool to try.”  This seems reasonable, but the article states, "...unlike other programs aimed at improving student performance, there is no extra cost."  There may not be a monetary cost, but making these students laboratory rats in someone's education research project doesn't help them, and, apparently in this case, hurts them.  Not to mention the opportunity cost of not exposing these children to other programs that might actually help.&lt;br /&gt;&lt;br /&gt;To be fair, the scholarly literature is not consistent in its conclusions about whether same-sex classes improve achievement.  However, many of the U.S. studies showed little or no improvement.  
See, for example:&lt;br /&gt;&lt;a href="http://www.jstor.org/pss/2668225"&gt;Singh and Vaught's study&lt;/a&gt;&lt;br /&gt;&lt;a href="http://aer.sagepub.com/cgi/content/abstract/34/3/485"&gt;LePore and Warren&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;On the other hand, some English and Australian studies indicate that, at least for girls, same-sex classes or schools may result in higher achievement (see, for example, &lt;a href="http://www.ingentaconnect.com/content/routledg/tsed/1999/00000021/00000004/art00001"&gt;Gillibrand E.; Robinson P.; Brawn R.; Osborn A.&lt;/a&gt;) while others indicate that there are no differences (see &lt;a href="http://www.jstor.org/pss/1393325"&gt;Harker&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;So the literature seems to be mixed, and I would imagine there are numerous confounding factors that make this something hard to measure--for example, typical single-sex classes in New York City consist of low-income minority students, where the boys are seen as being at-risk more than the girls.  Contrast with the British and other foreign studies, where the girls are the greater concern for under-achievement.&lt;br /&gt;&lt;br /&gt;Despite this, it's questionable how long it is ethical to continue a program, like the one at P.S. 
140, where the current known outcome is that boys and girls are doing worse in same-sex classes.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-6528899770889610477?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/6528899770889610477/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=6528899770889610477' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/6528899770889610477'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/6528899770889610477'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2009/03/are-same-sex-classes-better.html' title='Are same-sex classes better?'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-8677121087234800672</id><published>2009-02-03T06:31:00.000-08:00</published><updated>2009-02-03T07:12:28.142-08:00</updated><title type='text'>The age-old NY subway question--unlimited or pay-per-ride?</title><content type='html'>Before you get onto the &lt;a href="http://www.mta.info/nyct/maps/submap.htm"&gt;New York City subway&lt;/a&gt; these days, you have to purchase a metro card.  For us daily commuters, the choice would appear to be obvious--purchase an unlimited card.  
With it, you can get on and off the subway as many times as you like within some period of time.  Surely, the &lt;a href="http://www.mta.info/metrocard/mcgtreng.htm#payper"&gt;MTA prices it &lt;/a&gt;to make it worth the money.&lt;br /&gt;&lt;br /&gt;However, when I do the calculation for my own behavior, I never seem to get my money's worth from&lt;a href="http://www.mta.info/metrocard/mcgtreng.htm#unlimited"&gt; the unlimited&lt;/a&gt;.  This is because the unlimited card price is always higher than the cost of buying a per ride card if you are only using the card to commute to work during the week.  Even if you are also using it for one round trip on the weekend, you would still pay less by buying a "pay-per-ride" card unless you buy a 30-day card.&lt;br /&gt;&lt;br /&gt;The following table shows the cost of each unlimited ride card, followed by the number of trips that could be purchased for that same amount.  Because the MTA gives a 15% bonus for all "pay per ride" purchases over $7, the nominal value in the table shows the value that will be shown on your metro card if you purchase a pay per ride card.&lt;br /&gt;&lt;table str="" style="border-collapse: collapse; width: 457pt;" width="609" border="0" cellpadding="0" cellspacing="0"&gt;&lt;col style="width: 48pt;" span="2" width="64"&gt;  &lt;col style="width: 67pt;" width="89"&gt;  &lt;col style="width: 57pt;" width="76"&gt;  &lt;col style="width: 71pt;" width="95"&gt;  &lt;col style="width: 59pt;" width="78"&gt;  &lt;col style="width: 48pt;" width="64"&gt;  &lt;col style="width: 59pt;" width="79"&gt;  &lt;tbody&gt;&lt;tr style="height: 135.75pt;" height="181"&gt;   &lt;td class="xl30" style="height: 135.75pt; width: 48pt;" width="64" height="181"&gt;Unlimited   Days&lt;/td&gt;   &lt;td class="xl30" style="width: 48pt;" width="64"&gt;Unlimited Cost&lt;/td&gt;   &lt;td class="xl30" style="width: 67pt;" width="89"&gt;Nominal Value if purchased as a   "pay per ride" card&lt;/td&gt;   &lt;td class="xl30" style="width: 
57pt;" width="76"&gt;Trips if purchased per trip&lt;/td&gt;   &lt;td class="xl30" style="width: 71pt;" width="95"&gt;Trips used if going to and from   work only, 5 days a week&lt;/td&gt;   &lt;td class="xl30" style="width: 59pt;" width="78"&gt;Trips lost if buying unlimited   only for work versus purchasing regular card&lt;/td&gt;   &lt;td class="xl30" style="width: 48pt;" width="64"&gt;Also one weekend trip each week&lt;/td&gt;   &lt;td class="xl30" style="width: 59pt;" width="79"&gt;Trips lost if buying unlimited   versus purchasing regular card with 1 weekly fun trip&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td class="xl24" style="height: 12.75pt;" num="" height="17"&gt;1&lt;/td&gt;   &lt;td class="xl26" num="7.5"&gt;$7.50&lt;/td&gt;   &lt;td class="xl26" num="8.625" fmla="=+B2*1.15"&gt;$8.63&lt;/td&gt;   &lt;td class="xl25" num="4.3125" fmla="=+C2/2"&gt;4.3&lt;/td&gt;   &lt;td class="xl24"&gt;na&lt;/td&gt;   &lt;td class="xl24"&gt;na&lt;/td&gt;   &lt;td class="xl24"&gt;na&lt;/td&gt;   &lt;td class="xl24"&gt;&lt;br /&gt;&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td class="xl24" style="height: 12.75pt;" num="" height="17"&gt;7&lt;/td&gt;   &lt;td class="xl26" num="25"&gt;$25.00&lt;/td&gt;   &lt;td class="xl26" num="28.75" fmla="=+B3*1.15"&gt;$28.75&lt;/td&gt;   &lt;td class="xl25" num="14.375" fmla="=+C3/2"&gt;14.4&lt;/td&gt;   &lt;td class="xl24" num=""&gt;10&lt;/td&gt;   &lt;td class="xl28" num="4.375" fmla="=+D3-E3"&gt;4.4&lt;/td&gt;   &lt;td class="xl24" num="" fmla="=+E3+2"&gt;12&lt;/td&gt;   &lt;td class="xl27" num="2.375" fmla="=+D3-G3"&gt;2.4&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td class="xl24" style="height: 12.75pt;" num="" height="17"&gt;14&lt;/td&gt;   &lt;td class="xl26" num="47"&gt;$47.00&lt;/td&gt;   &lt;td class="xl26" num="54.05" fmla="=+B4*1.15"&gt;$54.05&lt;/td&gt;   &lt;td class="xl25" num="27.024999999999999" 
fmla="=+C4/2"&gt;27.0&lt;/td&gt;   &lt;td class="xl24" num=""&gt;20&lt;/td&gt;   &lt;td class="xl28" num="7.0250000000000004" fmla="=+D4-E4"&gt;7.0&lt;/td&gt;   &lt;td class="xl24" num="" fmla="=+E4+4"&gt;24&lt;/td&gt;   &lt;td class="xl27" num="3.0249999999999999" fmla="=+D4-G4"&gt;3.0&lt;/td&gt;  &lt;/tr&gt;  &lt;tr style="height: 12.75pt;" height="17"&gt;   &lt;td class="xl24" style="height: 12.75pt;" num="" height="17"&gt;30&lt;/td&gt;   &lt;td class="xl26" num="81"&gt;$81.00&lt;/td&gt;   &lt;td class="xl26" num="93.15" fmla="=+B5*1.15"&gt;$93.15&lt;/td&gt;   &lt;td class="xl25" num="46.575000000000003" fmla="=+C5/2"&gt;46.6&lt;/td&gt;   &lt;td class="xl24" num=""&gt;44&lt;/td&gt;   &lt;td class="xl28" num="2.5750000000000002" fmla="=+D5-E5"&gt;2.6&lt;/td&gt;   &lt;td class="xl24" num=""&gt;52&lt;/td&gt;   &lt;td class="xl29" num="-5.4249999999999998" fmla="=+D5-G5"&gt;-5.4&lt;/td&gt;  &lt;/tr&gt; &lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;The first line shows the one-day card, which can be purchased for $7.50.  You can use that same $7.50 to purchase $8.63 in value instead, which will be good for 4 trips plus $0.63.  Thus, you'd only want to get an unlimited one-day card if you were making more than 2 round trips (that is, at least 5 trips).&lt;br /&gt;&lt;br /&gt;The 7 day unlimited costs $25.  If you use that same $25 to instead purchase a pay-per-ride card, you get $28.75 of value, entitling you to 14 trips (plus $0.75 additional of stored value).  If you go to work every weekday during the 7 day period, you'd use just 10 trips (5 round trips).  If you also use the card for one round trip during the weekend, you are up to 12 trips, still 2.4 trips short of what you could have purchased with the $25 for a pay-per-ride.&lt;br /&gt;&lt;br /&gt;As the table shows, you are always better off purchasing pay-per-ride cards instead of unlimited cards if you are just using your metro card for commuting.   
Even if you take one round trip in addition to work every week, only the 30-day unlimited would be worth it, and this only if you go to work every weekday during the period &lt;span style="font-style: italic;"&gt;and&lt;/span&gt; use the card once each weekend.    Many people work at home from time to time and there is typically a federal holiday each month,  so the 30-day figures are optimistic.&lt;br /&gt;&lt;br /&gt;The other issue with the unlimited is a psychological one: I get upset if I forget my unlimited card or end up not taking the subway on a couple of days when I could have used the card.  With the pay-per-ride, you only pay for what you use.  Perhaps more annoyingly, the pay-per-ride cards display the amount left each time you enter the subway, but the unlimited cards do not tell you the number of days left on your card when you enter the subway.  Thus, if you don't keep track of it yourself, you will be jammed in the legs by a locked turnstile at least once a month when you purchase an unlimited card.&lt;br /&gt;&lt;br /&gt;I realize there are some who not only commute to work but also very frequently take subway trips to go out or run errands.  For those, the unlimited cards may be worth it.  
For others, stick with the pay-per-ride.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-8677121087234800672?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/8677121087234800672/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=8677121087234800672' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/8677121087234800672'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/8677121087234800672'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2009/02/age-old-ny-subway-question-unlimited-or.html' title='The age-old NY subway question--unlimited or pay-per-ride?'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-6348455102877637814</id><published>2009-01-20T11:41:00.000-08:00</published><updated>2009-01-20T12:16:17.945-08:00</updated><title type='text'>Nutty about Peanuts.</title><content type='html'>Visiting South Carolina this weekend, I picked up an old Southern favorite from &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_0"&gt;Publix&lt;/span&gt;: peanut butter cookies (I didn't see my real favorite: boiled peanuts).  
No sooner had I returned home than my Mom admonished me for buying them, because they were unsafe, and possibly tainted with Salmonella.&lt;br /&gt;&lt;br /&gt;Sure enough, there was an article in &lt;a href="http://www.thestate.com/747/story/657299.html"&gt;The State&lt;/a&gt; confirming the outbreak.  So far, about 500 people around the country have been sickened (and possibly 6 have died) by what is believed to be contaminated peanuts.  &lt;a href="http://www.usatoday.com/news/health/2009-01-19-peanut-butter-problems_N.htm"&gt;USA Today confirms&lt;/a&gt; the continuing "epidemic" today.   While these figures seem high, 500 people sickened with food poisoning in a period of four months, across the entire U.S., is hardly a risk worth mentioning.  According to &lt;a href="http://www.wrongdiagnosis.com/f/food_poisoning/prevalence.htm"&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_1"&gt;wrongdiagnosis&lt;/span&gt;&lt;/a&gt;&lt;a href="http://www.wrongdiagnosis.com/f/food_poisoning/prevalence.htm"&gt;.com&lt;/a&gt;, the number of incidents of food poisoning or sickness is 200,000 a day.    OK, you might say, but Salmonella is pretty serious and if you don't take antibiotics you might be laid up for several days.  Fine, but the same site says that there are &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_2"&gt;about&lt;/span&gt; 1.4 million cases of Salmonella annually, or about 3,835 a day (the &lt;a href="http://www.cdc.gov/nczved/dfbmd/disease_listing/salmonellosis_gi.html"&gt;CDC says about 40,000 cases are reported annually&lt;/a&gt;, but that there are many more unreported).&lt;br /&gt;&lt;br /&gt;So why are we getting &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_3"&gt;exercised&lt;/span&gt; about a mere 4 cases a day, as with the current outbreak?  
My best answer is that 1) it makes for interesting news, 2) any problem that affects so broad a population, even with &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_4"&gt;minuscule&lt;/span&gt; or infinitesimal risk, is seen by reporters as being important, and 3) people cannot easily assess their relative risk.&lt;br /&gt;&lt;br /&gt;As for me, I explained to my Mom that I'm not too concerned, and quickly had a peanut butter cookie before she could run back to the store.  After waiting a day or so to make sure I was Salmonella-&lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_5"&gt;free&lt;/span&gt;, the rest of the family followed. ;-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-6348455102877637814?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/6348455102877637814/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=6348455102877637814' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/6348455102877637814'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/6348455102877637814'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2009/01/nutty-about-peanuts.html' title='Nutty about Peanuts.'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-1807160788874306151</id><published>2008-12-05T08:15:00.000-08:00</published><updated>2008-12-05T09:08:21.976-08:00</updated><title type='text'>Are we entering "unprecedented" territory?</title><content type='html'>&lt;div style="text-align: center;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_fp158_J456I/STlbX_RSPmI/AAAAAAAAABs/_uA_cUAOYzI/s1600-h/DJIA1928to2008.JPG"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 265px;" src="http://1.bp.blogspot.com/_fp158_J456I/STlbX_RSPmI/AAAAAAAAABs/_uA_cUAOYzI/s320/DJIA1928to2008.JPG" alt="" id="BLOGGER_PHOTO_ID_5276348906120298082" border="0" /&gt;&lt;/a&gt;(click graph for greater resolution)&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;The reports of gloom and doom are abounding, and, I must admit, I believe most of them.&lt;br /&gt;&lt;br /&gt;I am going to focus purely on the stock market, because the data is readily available, and because I believe the broader economic problems are only just beginning.  My &lt;a href="http://what-are-the-chances.blogspot.com/2008_03_01_archive.html"&gt;March blog &lt;/a&gt;pointed out that in the stocks versus bonds 20-year view, stocks almost always won, but the results are much more mixed over shorter periods.  I also need to point out that I overestimated the results for stocks by assuming dividends were not included in the indices.  
For the Dow indices, the subject of much of that discussion, dividends are included (see the &lt;a href="http://www.djindexes.com/mdsidx/index.cfm?event=showAvgMethod"&gt;Dow Jones&lt;/a&gt; site), so the graphs in that blog are correct, but the numbers should not be adjusted further for dividends, meaning that stocks' edge over bonds is less impressive.&lt;br /&gt;&lt;br /&gt;Today's post, though, is really about the graph above, showing 1-, 10-, and 20-year returns on the Dow since 1928 (from December to December).  From December 3, 2007 through December 1, 2008, the Dow lost 37% of its value.  This horrible run is beaten only once, from December 1930 to December 1931, when the Dow lost 53%.  The years 1930, 1937,  and 1974 (again, December to December) were the only other years where the 12-month loss was more than 20%.&lt;br /&gt;&lt;br /&gt;Thus, historically, though not unprecedented, the yearly drop in the Dow is, well, statistically "improbable" (that is, if you base your probabilities only on history).  While the 10- and 20-year numbers are much more in line with history, they are still on the low end of the distribution.  The last time the 10-year change was negative, as it is now, was 30 years ago, in 1978, in the waning years of very tough economic times. &lt;br /&gt;&lt;br /&gt;The next few months will start to indicate how deep an economic hole we've dug for ourselves, but the stock market numbers are not encouraging, and the extent to which the economy is dependent on the market (in the sense that assets are tied to it) seems much more like the 20s and 30s than like the 70s.  
Let's hope I'm wrong.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-1807160788874306151?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/1807160788874306151/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=1807160788874306151' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/1807160788874306151'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/1807160788874306151'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2008/12/are-we-entering-unprecedented-territory.html' title='Are we entering &quot;unprecedented&quot; territory?'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_fp158_J456I/STlbX_RSPmI/AAAAAAAAABs/_uA_cUAOYzI/s72-c/DJIA1928to2008.JPG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-2360467506371547997</id><published>2008-10-31T05:48:00.000-07:00</published><updated>2008-10-31T06:10:03.730-07:00</updated><title type='text'>Election Prediction Explained</title><content type='html'>&lt;p class="MsoNormal"&gt;So here's the explanation.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p 
class="MsoNormal"&gt;I am following 3 major websites now:&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;a href="http://electoral-vote.com/"&gt;www.electoral-vote.com&lt;/a&gt; – This consolidates polls by state to predict the count.&lt;span style=""&gt;  &lt;/span&gt;Electoral-vote apparently uses simple averaging to consolidate its data.&lt;span style=""&gt;  &lt;/span&gt;I prefer this method because it requires little interpretation on their part.&lt;span style=""&gt;  &lt;/span&gt;Interpretation involves assumptions about bias in the polls, and I believe it is hard to figure out the exact impact of the bias or even the direction. Electoral-vote has Obama at 364. At this time in 2004, they had Kerry at 283 (see &lt;a href="http://www.electoral-vote.com/evp2004/Pres/Maps/Oct31.html"&gt;this page&lt;/a&gt;), whereas his election day total was 252, with the main difference being Florida. More telling, the "strong" Obama States total 264 votes, as opposed to 95 for Kerry at this point.&lt;br /&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;a href="http://www.fivethirtyeight.com/"&gt;www.fivethirtyeight.com&lt;/a&gt; – This consolidates polls by state to predict the count using some complex weighting system. It’s a neat idea but its end result is about the same as averaging, and I am not at all convinced it is better.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;They’&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_0"&gt;ve&lt;/span&gt; got Obama at 346.5, much more than the 270 needed to win.&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;br /&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;a href="http://www.gallup.com/"&gt;www.gallup.com&lt;/a&gt; – This well-established survey company is different from the two above in that they actually conduct the polls. 
&lt;st1:city st="on"&gt;Gallup&lt;/st1:city&gt; is showing primarily national results, and has Obama significantly up, both in raw percentages and when adjusting for “likely” voters—people &lt;st1:city st="on"&gt;&lt;st1:place st="on"&gt;Gallup&lt;/st1:place&gt;&lt;/st1:city&gt; has determined are likely to vote, based on two different models.  Gallup's daily tracking poll has &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_1"&gt;Obama's&lt;/span&gt; lead almost unchanged since the start of October (never more than the statistical error).&lt;br /&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;My conclusion from the above---Obama will be the next &lt;st1:place st="on"&gt;&lt;st1:country-region st="on"&gt;US&lt;/st1:country-region&gt;&lt;/st1:place&gt; President.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;So why the change from before, when I said polls are difficult to trust and spoke of biases?&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Three reasons:&lt;/p&gt;  &lt;p class="MsoNormal"&gt;1) the closer we get to the election, the better the correlation between intentions and actions&lt;/p&gt;  &lt;p class="MsoNormal"&gt;2) the closer we get to the election, the fewer undecided voters.&lt;span style=""&gt;  A recent &lt;a href="http://www.reuters.com/article/vcCandidateFeed2/idUSN3035791520081031"&gt;Reuters poll&lt;/a&gt; shows this at about 2%.  Even if it is 5% and the undecided break 4 to 1 for McCain, he's going to lose.&lt;br /&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;3) The biases appear to lean in Obama’s favor: more younger voters likely and more early voters.&lt;span style=""&gt;  &lt;/span&gt;Very biased reporting from Grandma in S.C. 
says that lots of young people were out voting early (she spent 2 hours on line to vote early, by the way).&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-2360467506371547997?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/2360467506371547997/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=2360467506371547997' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/2360467506371547997'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/2360467506371547997'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2008/10/election-prediction-explained.html' title='Election Prediction Explained'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-5035217343697413730</id><published>2008-10-30T18:43:00.000-07:00</published><updated>2008-10-30T18:44:28.396-07:00</updated><title type='text'>Election Prediction</title><content type='html'>Ok, sure I waited until nearly the end, but here's my prediction:&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Obama wins, with 401 Electoral votes.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I'll explain why tomorrow.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/4274782648018949782-5035217343697413730?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/5035217343697413730/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=5035217343697413730' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/5035217343697413730'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/5035217343697413730'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2008/10/election-prediction.html' title='Election Prediction'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-7545648826204378120</id><published>2008-10-08T13:30:00.000-07:00</published><updated>2008-10-08T13:41:30.260-07:00</updated><title type='text'>Election Polls</title><content type='html'>A short note about election polls, which I've been following somewhat religiously for the last few weeks.&lt;br /&gt;&lt;br /&gt;Election polls differ in at least four significant ways from actual voting. &lt;br /&gt;&lt;br /&gt;First, polls are typically of around 1,000 people or less, which means that at best, they are statistically precise to within plus or minus three percent.  
This means that a six-point difference between two candidates may be nothing more than sampling error (i.e., a statistical anomaly).&lt;br /&gt;&lt;br /&gt;Second, polls tend to be of the general population and not of likely Electoral College votes, which is how the election is counted (but see &lt;a href="http://www.electoral-vote.com/"&gt;electoral-vote.com&lt;/a&gt; for a count of Electoral votes, according to polls).  As we know from recent elections, the Electoral vote percentages frequently (and seemingly increasingly) do not correspond to popular vote percentages.&lt;br /&gt;&lt;br /&gt;Third, polls are snapshots of how people feel on a certain day.  Americans seem to be particularly fickle in their opinions recently, perhaps due to the economic turmoil, so don't trust that today's lead won't disappear tomorrow.&lt;br /&gt;&lt;br /&gt;Finally, many polls do not remove unlikely voters (though you do see some figures concerning "likely voters").  Polls of people who do not vote are fairly useless, but pollsters haven't been very successful in predicting who will actually vote.  Thus, the tendency is to include respondents who are registered and say they plan to vote, without looking at their demographics to see what they've done in the past.&lt;br /&gt;&lt;br /&gt;For all these reasons, if you're an Obama supporter, you should be worried and if you're a McCain supporter, you should have some hope.  
Either way, vote!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-7545648826204378120?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/7545648826204378120/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=7545648826204378120' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/7545648826204378120'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/7545648826204378120'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2008/10/election-polls.html' title='Election Polls'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-3898058487816914326</id><published>2008-08-28T08:13:00.000-07:00</published><updated>2008-08-28T09:16:07.571-07:00</updated><title type='text'>The Atlantic Monthly is criminally misusing statistics</title><content type='html'>I spent the last week vacationing in South Carolina, where my parents' house seems to have Atlantic &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_0"&gt;Monthlies&lt;/span&gt; and Harper's from the dawn of time.  What luck, then, that one of the most interesting articles (at least statistically) was in an issue as recent as the July/August 2008 issue of the Atlantic.  
The article is called &lt;a href="http://www.theatlantic.com/doc/200807/memphis-crime"&gt;"American Murder Mystery" and it's by Hanna Rosin&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The article talks of the recent increase in violent crime in mid-sized cities.  In many of these cities, government housing projects (called "Section 8" housing) have been torn down.  In their place, the government has provided the poor with rent subsidies so that they can move to private housing.  Rosin describes how Phyllis &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_1"&gt;Betts&lt;/span&gt; and Richard &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_2"&gt;Janikowski&lt;/span&gt;, of the University of Memphis, tie the increase in crime in these cities to the destruction of these projects.  A striking quote in the article is from the Memphis police chief: &lt;a href="http://www.theatlantic.com/doc/200807/memphis-crime"&gt;'“It used to be the criminal element was more confined,” said Larry Godwin, the police chief. “Now it’s all spread out."'  &lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The primary statistical evidence given in the article of an association between crime and former Section 8 residents is a map that shows areas with a high incidence of crime correspond to areas with a large number of people with Section 8 subsidies (i.e., former residents of housing projects).  As convincing as this might sound, it has a fatal flaw: the map looks at total incidents rather than crime rate.  This means that an area with 10,000 people and 100 crimes (and 100 Section 8 subsidy recipients) will look much worse than an area with 100 people and 1 crime (and 1 Section 8 subsidy recipient).  
However, both areas have the same rate of crime, and, presumably, the same odds of being a victim of crime (see my earlier blog about the &lt;a href="http://what-are-the-chances.blogspot.com/2008/01/where-is-safest-place-to-live.html"&gt;safest place to live&lt;/a&gt; for some explanation of the use of rates in measuring crime).   Yet in &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_3"&gt;Betts&lt;/span&gt; and &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_4"&gt;Janikowski's&lt;/span&gt; analysis, the area with 10,000 people has a higher number of Section 8 subsidy recipients and higher crime, thus "proving" their theory of association.&lt;br /&gt;&lt;br /&gt;Of course, there will be both a greater number of Section 8 subsidy recipients and a greater number of crimes in the area with 10,000 people than in the area with 100 people.  Thus, while the map presented in the Atlantic article does indeed seem to indicate that there is higher crime in areas where there are more Section 8 subsidies, this differential might be entirely an artifact of population density, and, in fact, the crime rate may be completely unrelated to where Section 8 subsidy recipients reside.   
Without an adjustment for population density, the inferences made from the association are statistically meaningless.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-3898058487816914326?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/3898058487816914326/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=3898058487816914326' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/3898058487816914326'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/3898058487816914326'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2008/08/atlantic-monthly-indicted-for-criminal.html' title='The Atlantic Monthly is criminally misusing statistics'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-2255639115786680252</id><published>2008-07-02T02:05:00.000-07:00</published><updated>2008-07-10T06:24:10.478-07:00</updated><title type='text'>Statistics in Politics - Lies and Damn Lies</title><content type='html'>The nice thing about politicians and the newspaper columnists that write about them is that they lie &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_0"&gt;a lot&lt;/span&gt; about statistics.  
That makes a blog that points out the errors easy to write.  This week's subject is David Brooks' latest &lt;a href="http://www.nytimes.com/2008/07/01/opinion/01brooks.html"&gt;New York Times column.&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Brooks takes issue with &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_1"&gt;Obama&lt;/span&gt;’s claim that his fundraising is from a broad base of small donors, and goes on to compare &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_2"&gt;Obama&lt;/span&gt; money raised to McCain money raised by special interest group.  I am not going to attack the actual dollar figures that Brooks gives.  He cites no sources whatsoever, so that makes them hard to attack anyway.  Instead, I am going to show how presenting raw numbers without proper context creates a biased picture.&lt;br /&gt;&lt;br /&gt;Let’s take Brooks’ first claim:  He says “lawyers account for the biggest chunk of Democratic donations” and have donated $18 million, as compared to $5 million for McCain.  This sounds like 1) &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_3"&gt;Obama&lt;/span&gt; is getting most (“biggest chunk”) of his donations from one big special interest group (lawyers) and 2) &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_4"&gt;Obama&lt;/span&gt; is getting 3 times as much of his donations from this group as McCain.&lt;br /&gt;&lt;br /&gt;Here’s the problem: &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_5"&gt;Obama&lt;/span&gt; has &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_6"&gt;outraised&lt;/span&gt; McCain by more than 2 to 1.  According to &lt;a href="http://www.cbsnews.com/blogs/2008/06/23/politics/horserace/entry4202380.shtml"&gt;CBS News&lt;/a&gt;, &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_7"&gt;Obama&lt;/span&gt;’s total amount raised is $295.5 million &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_8"&gt;compared&lt;/span&gt; to McCain’s $121.9 million.  
Thus, the $18 million raised from lawyers represents only 6% of the money raised.  Still a lot of money, but it puts the “biggest chunk” in context.   McCain’s $5 million raised from lawyers, on the other hand, represents 4% of the total money he raised.  Thus, &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_9"&gt;Obama&lt;/span&gt; &lt;em&gt;is&lt;/em&gt; getting more as a percentage from lawyers but instead of 18 to 5, or 3 times as much, it’s 6% to 4%, or 50% more.  Another issue is that there is a difference between individuals who are lawyers and public interest groups for lawyers.  Brooks is trying to blur those lines by grouping all their donations together (to be fair, he does not say "special interest groups").   Sure, a lot of lawyers certainly support some of the public interest groups, but others do not.  Also, these groups can be at odds with one another, so grouping all lawyers together gives you the bigger number but is inaccurate.&lt;br /&gt;&lt;br /&gt;Brooks goes on to compare several other groups of professions.  In each of these areas, &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_10"&gt;Obama&lt;/span&gt; receives more money in absolute dollars.  However, in terms of percentage of total donations, McCain is usually receiving more: from financial securities workers, McCain gets 36% more as a percent of his total; from real estate workers, McCain gets 94% more; from bank workers, McCain gets 82% more; from hedge fund workers, McCain gets 29% more; from medical/health care workers, McCain gets 4% more. &lt;br /&gt;&lt;br /&gt;There are two other areas (in addition to lawyers) where &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_11"&gt;Obama&lt;/span&gt; is receiving more in percentage terms.  The first is “communications and electronics”, where &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_12"&gt;Obama&lt;/span&gt; is getting 106% more in percentage terms.  
The second is “Professors and other people who work in education.”   In that area, &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_13"&gt;Obama&lt;/span&gt; gets a whopping 4 times as much as McCain as a percentage of total funds raised.  Brooks implies that these are not "part of a spontaneous movement of small-money enthusiasts," but he doesn't support that with any evidence showing that these groups are anything more than an unorganized group of individuals--all the polls have indicated that more educated people lean toward Obama, so why wouldn't they give more?&lt;br /&gt;&lt;br /&gt;The last thing that Brooks points out is that although, as &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_14"&gt;Obama&lt;/span&gt; claims, 90% of his donors gave less than $200, only 45% of his donated money comes from such small donors.  This is a good point, and &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_15"&gt;Obama&lt;/span&gt;, who has been &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_16"&gt;claiming&lt;/span&gt; this for a while, should be called to the mat on it. &lt;br /&gt;&lt;br /&gt;However, it would be more interesting to look at the percent of small donors and money from small donors in McCain’s campaign as a comparison.  You can bet that it’s less than 45% of donated money and less than 90% of donors.  Yet, a comment on Brooks’ article by a &lt;a href="http://blogs.tnr.com/tnr/blogs/the_plank/archive/2008/07/01/just-how-big-are-obama-s-small-donors.aspx"&gt;New Republic blogger&lt;/a&gt; puts it into context, pointing out that “31 percent of Bush's money in 2004 came from donations of $200 or less (compared to 16 percent in 2000). Kerry, meanwhile, raised 37 percent...”  (the blog cites &lt;a href="http://www.ncl.org/publications/ncr/95-3/fulltext2.pdf"&gt;this article&lt;/a&gt; on 2004 donations (by Joseph &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_17"&gt;Graf&lt;/span&gt;) as its source). 
Thus, 45% is a lot, but the number has been increasing for both parties, with the most obvious reason that the &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_18"&gt;Internet&lt;/span&gt; has allowed candidates to easily reach out to everyone, rather than raising most of their money through $1,000 a plate dinners and the like (campaign finance reform, which limits individual contributions, also had a role in bringing up the percentage raised through smaller donors).&lt;br /&gt;&lt;br /&gt;The lesson here is, of course: “don’t believe the numbers.” David Brooks is going to make them look good for McCain--he’s a columnist, not a reporter--just as other columnists are going to make them look good for &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_19"&gt;Obama&lt;/span&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-2255639115786680252?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/2255639115786680252/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=2255639115786680252' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/2255639115786680252'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/2255639115786680252'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2008/07/statistics-in-politics-lies-and-damn.html' title='Statistics in Politics - Lies and Damn Lies'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' 
width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-2596727796162402632</id><published>2008-06-12T01:45:00.000-07:00</published><updated>2008-06-26T04:36:30.542-07:00</updated><title type='text'></title><content type='html'>&lt;p class="MsoNormal"&gt;&lt;b&gt;Let’s make a deal problem&lt;o:p&gt;&lt;/o:p&gt;&lt;/b&gt;&lt;/p&gt;    &lt;p class="MsoNormal"&gt;&lt;o:p&gt; &lt;/o:p&gt;Those of us who grew up with the show Let’s Make a Deal can understand the gist of the Let’s Make a Deal problem right away.  For those of you too young (or old) to remember, here is a summary.&lt;br /&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span style=""&gt;            &lt;/span&gt;Monty Hall, the host, allows you to choose one of three curtains.&lt;span style=""&gt;  &lt;/span&gt;Behind one of the curtains is a new car or another big prize, while behind the other two is a year’s supply of shampoo or the equivalent.&lt;span style=""&gt;  &lt;/span&gt;You choose Curtain 1.&lt;span style=""&gt;  &lt;/span&gt;Monty opens Curtain 2 and shows you it has a year’s supply of the shampoo.&lt;span style=""&gt;  &lt;/span&gt;Then he gives you a choice:&lt;span style=""&gt;      &lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-left: 0.75in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=""&gt;a)&lt;span style=";font-family:&amp;quot;;font-size:7;"  &gt;     &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span dir="ltr"&gt;stick with your original decision, or &lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-left: 0.75in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=""&gt;b)&lt;span style=";font-family:&amp;quot;;font-size:7;"  &gt;     &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span dir="ltr"&gt;switch to Curtain 
3.&lt;/span&gt;&lt;/p&gt;    &lt;p class="MsoNormal"&gt;&lt;o:p&gt;&lt;/o:p&gt;The intuitive conclusion is that it doesn’t matter: there are two curtains remaining, and they are equally likely to contain the prize.&lt;span style=""&gt;  &lt;/span&gt;However, in this case, the intuition is wrong.&lt;span style=""&gt;  &lt;/span&gt;If we assume Monty&lt;/p&gt;  &lt;p class="MsoNormal" style="text-indent: 0.5in;"&gt;1) always shows a curtain with the shampoo behind it; &lt;/p&gt;  &lt;p class="MsoNormal" style="text-indent: 0.5in;"&gt;2) never reveals the curtain you chose; and &lt;/p&gt;  &lt;p class="MsoNormal" style="text-indent: 0.5in;"&gt;3) randomly decides which of the remaining two curtains to reveal if the curtain you chose contains the car, &lt;/p&gt;  &lt;p class="MsoNormal" style="text-indent: 0.5in;"&gt;then switching to Curtain 3 gives you a 2/3 chance of winning while sticking to Curtain 1 gives you a 1/3 chance of winning.&lt;/p&gt;    &lt;p class="MsoNormal"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;b&gt;&lt;i&gt;Why? 
&lt;o:p&gt;&lt;/o:p&gt;&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;This problem, like many probability problems, is one of information.&lt;span style=""&gt;  &lt;/span&gt;Initially, you have no information about any of the three curtains so each choice gives you a 1/3 chance of winning.&lt;span style=""&gt;  &lt;/span&gt;When Monty shows you the curtain with the shampoo, you learn nothing new about the curtain you originally chose, because there was no way, whether your curtain had the car or the shampoo, that Monty was going to show you what was behind your curtain.&lt;span style=""&gt;  &lt;/span&gt;Your curtain had, and still has (as far as you know), a 1/3 chance of containing the car.&lt;span style=""&gt;  &lt;/span&gt;However, you did get information about Curtain 3: Monty did not choose to reveal it.&lt;span style=""&gt;  &lt;/span&gt;This could mean one of two things: &lt;/p&gt;  &lt;p class="MsoNormal" style="margin-left: 0.75in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=""&gt;A.&lt;span style=";font-family:&amp;quot;;font-size:7;"  &gt;    &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span dir="ltr"&gt;Curtain 3 has the car, and therefore Monty had to show you Curtain 2, as he would never reveal the curtain with the car (1/3 chance, calculated by taking the 1/3 chance that Curtain 3 has the car and multiplying by the 100% chance that he reveals Curtain 2 when the car is behind Curtain 3); or&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-left: 0.75in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=""&gt;B.&lt;span style=";font-family:&amp;quot;;font-size:7;"  &gt;    &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span dir="ltr"&gt;Curtain 1 has the car, and Monty chose to reveal Curtain 2 (1/6 chance, calculated by taking the 1/3 chance that Curtain 1 has the car and multiplying by the ½ chance that Monty reveals Curtain 2 when the car is behind Curtain 
1).&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;These probabilities do not sum to 1, because we are excluding the outcomes, now impossible, where Monty reveals Curtain 3.&lt;span style=""&gt;  &lt;/span&gt;In order to revise the probabilities to take into account what was revealed by Monty, we need to divide the probabilities in A (1/3) and B (1/6) above by the total chance of the two possible remaining outcomes (1/3 plus 1/6 = 1/2).&lt;span style=""&gt;  &lt;/span&gt;Thus, outcome A (car is behind Curtain 3) has a probability of&lt;span style=""&gt;  &lt;/span&gt;(1/3) / (½)= 2/3, while outcome B (car is behind Curtain 1) has a probability of (1/6)/(1/2) = 1/3.&lt;span style=""&gt;  &lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;The intuition is as follows: Monty always reveals Curtain 2 or 3 when you choose 1, so you do not get any more information about whether the car is behind 1 by this revelation, but you do gain information about 2 and 3 from this revelation, since he never reveals 3 if the car is behind it but does sometimes reveal 3 if the car is not behind it.&lt;span style=""&gt;  &lt;/span&gt;Thus, the fact that Monty did not reveal Curtain 3 tells you something.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;[Note: this problem has been around for a while, but was made famous by Marilyn vos Savant’s discussion of it and the subsequent outcry by those who insisted her answer, the correct one, was wrong.&lt;span style=""&gt;  &lt;/span&gt;See, for example:&lt;span style=""&gt;  &lt;/span&gt;&lt;a href="http://www.letsmakeadeal.com/problem.htm"&gt;http://www.letsmakeadeal.com/problem.htm&lt;/a&gt;]&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;b&gt;&lt;i&gt;Technical Explanation&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;There is a whole class of problems in probability that involve updating the chances based on new 
information.&lt;span style=""&gt;  &lt;/span&gt;These problems are solved according to Bayes’ Rule, after a law in probability that specifies how to update probabilities with new information (for a full discussion, including discussion of whether the Reverend Bayes was actually the first to discover this theorem, see the Wikipedia entry: http://en.wikipedia.org/wiki/Bayes'_theorem).&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;To understand Bayes’ Rule, we need to first know the notation used for conditional probability.&lt;span style=""&gt;  &lt;/span&gt;We use the vertical line ( | ) to denote a condition and, as in prior blogs, P(A) is the probability that event A occurs.&lt;span style=""&gt;  &lt;/span&gt;Thus, P(A|R2) is the probability that A occurs, given that R2 already occurred.&lt;span style=""&gt;  &lt;/span&gt;Bayes' Rule is:&lt;/p&gt;  &lt;p class="MsoNormal"&gt;P(A|R2) = P(R2|A)*P(A) / P(R2)&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;So let:&lt;/p&gt;&lt;p class="MsoNormal"&gt; A=event that prize is under Curtain 3&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span style=""&gt;  &lt;/span&gt;&lt;span style=""&gt;        &lt;/span&gt;R2= event that Monty reveals the curtain 2 contents&lt;/p&gt;&lt;p class="MsoNormal"&gt;C=event that prize is under Curtain 1&lt;br /&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Now we can figure out the right side of the Bayes’ Rule equation, in order to figure out P(A| R2).&lt;span style=""&gt;  &lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;We know P(R2|A) = 1, because Monty won’t reveal curtain 3 when it contains the prize and he won’t reveal curtain 1 because you chose curtain 1.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;P(A) = P(C) = 1/3 ==&gt; remember, this one is unconditional, so given three curtains, there’s a 1/3 chance of the prize being behind each.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;o:p&gt; 
&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;To figure out P(R2), it is useful to note that for any events R2 and A,  P(R2 and A) = P(A) * P(R2|A)&lt;/p&gt;  &lt;p class="MsoNormal"&gt;In our case, the P(R2) is the sum of the probabilities of 2 exclusive events: &lt;/p&gt;  &lt;p class="MsoNormal" style="margin-left: 85.2pt; text-indent: -49.2pt;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=""&gt;1)&lt;span style=";font-family:&amp;quot;;font-size:7;"  &gt;                      &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span dir="ltr"&gt;prize is under curtain 3 (event A) and &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_0"&gt;Monty&lt;/span&gt; reveals curtain 2 (event R2): 1/3 * 1=1/3&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-left: 85.2pt; text-indent: -49.2pt;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=""&gt;2)&lt;span style=";font-family:&amp;quot;;font-size:7;"  &gt;                      &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span dir="ltr"&gt;prize is under curtain 1 (event C) and &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_1"&gt;Monty&lt;/span&gt; reveals curtain 2 (event R2) 1/3*1/2 = 1/6.&lt;span style=""&gt;  &lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;This sum, 1/3 plus 1/6 is ½=P(R2).&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Thus, by Bayes Rule, P(A| R2) = (1*1/3) / ½ = 2/3&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Just for fun, now you can compute P(C|R2) = P(prize is under Curtain 1 given that Curtain 2 is revealed) = 1/3 using Bayes’ Rule.&lt;/p&gt;    &lt;p class="MsoNormal"&gt;&lt;b&gt;&lt;i&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;br /&gt;False Positives in Cancer Diagnoses&lt;o:p&gt;&lt;/o:p&gt;&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;    &lt;p class="MsoNormal"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;br /&gt;The outcome of Bayes’ Rule can be very 
confusing, and is important to keep in mind in more important problems than the Let’s Make a Deal problem.&lt;span style=""&gt;  &lt;/span&gt;For example, suppose an MRI for breast cancer has a false negative rate of 1/100, meaning that the test will incorrectly indicate that you do not have cancer when you in fact do 1 in 100 times.&lt;span style=""&gt;  &lt;/span&gt;Similarly, the test might also have a false positive rate of 1 in 100, meaning that the test will incorrectly indicate that you do have cancer when in fact you do not 1 in 100 times (false positive rates for &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_2"&gt;MRIs&lt;/span&gt; over time can be much higher, because they are frequently done once or twice a year: see the &lt;a href="http://www.sciencedaily.com/releases/2008/03/080325192329.htm"&gt;recent article&lt;/a&gt; about a study of false positives in &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_3"&gt;MRIs&lt;/span&gt; for breast cancer screening, which were around 25% over time).&lt;/p&gt;    &lt;p class="MsoNormal"&gt;Suppose your MRI result just came out positive for breast cancer.&lt;span style=""&gt;  &lt;/span&gt;What are the chances you actually have breast cancer?&lt;/p&gt;  &lt;p class="MsoNormal"&gt;First, it’s useful to know that around 250,000 women a year get breast cancer (see &lt;a href="http://www.breastcancer.org/about_us/press_room/press_kit/cancer_facts.jsp?gclid=CKjg2ZO77pMCFQkYQgod3Q-pzA"&gt;this site&lt;/a&gt;)  and there are about 60 million women above the age of 40 (see &lt;a href="http://www.census.gov/population/www/socdemo/men_women_2006.html"&gt;census site&lt;/a&gt;), when most cases occur.&lt;span style=""&gt;  &lt;/span&gt;This represents an annual incidence rate of roughly 1 in 200.&lt;/p&gt;    &lt;p class="MsoNormal"&gt;&lt;o:p&gt;&lt;/o:p&gt;Let’s define the probabilities:&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span style=""&gt;            &lt;/span&gt;P(C) = Probability of breast cancer in 
a given year = 1/200 = 0.005&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span style=""&gt;            &lt;/span&gt;P(D| not C) = Probability that MRI diagnosed cancer given that you do NOT have cancer = false positive = 1/100 = 0.01&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span style=""&gt;            &lt;/span&gt;P(N|C) = Probability that MRI did not diagnose cancer given that you have cancer = false negative = 1/100 = 0.01&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span style=""&gt;            &lt;/span&gt;P(D|C) = 1-P(N|C) = Probability that MRI diagnosed cancer given that you have cancer = 99/100 = 0.99&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;We want P(C|D) = Probability of cancer, given a cancer diagnosis by MRI.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Before using Bayes’ Rule, we can first define P(D) as the sum of the probabilities of all exclusive events that include D.&lt;span style=""&gt;  &lt;/span&gt;In English, the chance of diagnosis is the sum of 1) the chance that you have cancer and are diagnosed and 2) the chance that you do not have cancer and are diagnosed.&lt;span style=""&gt;  &lt;/span&gt;Thus, P(D) = P(D|C)*P(C) + P(D|not C)* P(not C) = 0.99*0.005 + 0.01*0.995 = 0.0149&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Using Bayes’ Rule:&lt;/p&gt;  &lt;p class="MsoNormal"&gt;P(have cancer given the MRI result shows cancer) = P(C|D) = P(D|C)*P(C)/ P(D) = 0.99 * 0.005 / 0.0149 = 0.33 or about 1/3.&lt;/p&gt;    &lt;p class="MsoNormal"&gt;Thus, a very effective MRI test for cancer, which gives the wrong result only 1% of the time, is still suspect when it gives a result of cancer.&lt;span style=""&gt;  &lt;/span&gt;In fact, an MRI diagnosis of cancer indicates only a 1/3 probability of actually having cancer (keep in mind that the error rates I used here for the MRI are made up, though there are indications that false positive rates are at least in the 1% range).&lt;span style=""&gt;  &lt;/span&gt;&lt;/p&gt;    &lt;p 
class="MsoNormal"&gt;It’s easy to understand what happens logically when you imagine that 200 women come in for screening.&lt;span style=""&gt;  &lt;/span&gt;Only 1 will probably have cancer, since the cancer rate is about 1 in 200.&lt;span style=""&gt;  &lt;/span&gt;The MRI will almost surely diagnose her (99% chance).&lt;span style=""&gt;  &lt;/span&gt;For the other 199, the MRI will indicate no cancer for all but about 1%, which means it will indicate cancer for about 2 of them.&lt;span style=""&gt;  &lt;/span&gt;Thus, of the 3 cases where the MRI indicates cancer, 2 of them will be false indications.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-2596727796162402632?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/2596727796162402632/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=2596727796162402632' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/2596727796162402632'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/2596727796162402632'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2008/06/lets-make-deal-problem-those-of-us-who.html' title=''/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-1271665999864038220</id><published>2008-05-08T22:28:00.000-07:00</published><updated>2008-05-11T20:50:13.730-07:00</updated><title type='text'>Why are there too many boys in China?</title><content type='html'>For a long time now, the ratio of males to females in China has been increasing.  In fact, one of the most recent articles I could find on it was from 2004, where the ratio stood at around 120 boys to every 100 girls &lt;a href="http://www.msnbc.msn.com/id/5953508"&gt;(see the MSNBC article)&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;It's clear to most that the combination of the one-child law, preventing most Chinese couples from having more than one child, and the preference in China for boys, is driving this (though there are other explanations, including the possibilities of different effects of some diseases: &lt;a href="http://www.economics.harvard.edu/faculty/barro/files/bw05_02_28.pdf"&gt;see this Business Week article&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;There are two sinister mechanisms for ensuring that your only child is a boy: selective abortion or infanticide.   Yet there is another option: just have another baby if the first is a girl, and don't tell the government.  
I think this third option is more likely, because I do not think most families can afford an abortion (illegal for sex selection) and very few mothers would kill their babies.&lt;br /&gt;&lt;br /&gt;So how much does this non-reporting need to happen to change the ratio from the normal 106 to 100 male to female births to the abnormal 120 to 100?&lt;br /&gt;&lt;br /&gt;The answer to this is the combination of three things: 1) percent of births that are girls (with no intervention), 2) percent of families that have another baby (hoping for a boy), given the first is a girl, and 3) the percent of families that do not tell the government about the first baby.&lt;br /&gt;&lt;br /&gt;Let's call these percentages Pg, P2, and Ps (for girl, 2nd child, and secret).  Let's also call Pr the reported percent of girls, which is 100/220, or 45.45%.  We'll assume also for simplicity that families quit trying when they have a boy or have 2 children, whichever comes first.  Also, we'll assume families always report the first child if it is a boy or if they have no more children.&lt;br /&gt;&lt;br /&gt;Pg is known at around 100/206=48.54%&lt;br /&gt;P2 and Ps are unknown.&lt;br /&gt;&lt;br /&gt;We want to figure out what P2 and Ps could lead to the Pr being 45.45% when Pg is 48.54%.&lt;br /&gt;&lt;br /&gt;First, consider that, given the ground rules above, the following are the types of families that can exist (in birth order):&lt;br /&gt;B (boy, one child only)&lt;br /&gt;G (girl, one child only)&lt;br /&gt;GB (girl boy, two children)&lt;br /&gt;GG (girl girl, two children)&lt;br /&gt;&lt;br /&gt;To figure out the percent of girls reported, we need the total girls reported divided by the total children reported.  
This is easy to figure out for each combination above:&lt;br /&gt;B  = 0 girls / 1 child&lt;br /&gt;G = 1 girl / 1 child&lt;br /&gt;GB =  0 girls / 1 child (Ps percent of the time) and 1 girl / 2 children (1-Ps percent of the time)&lt;br /&gt;GG = 1 girl / 1 child (Ps percent of the time) and 2 girls / 2 children (1-Ps percent of the time)&lt;br /&gt;&lt;br /&gt;We are almost there.  Now we just need to sum the numerators multiplied by their probabilities and the denominators multiplied by their probabilities.  Here are the probabilities of each family combination:&lt;br /&gt;B = 1-Pg&lt;br /&gt;G = Pg*(1 - P2)  ==&gt; It's just Pg times the percent of families who do not have more children&lt;br /&gt;GB = Pg*(P2)*(1-Pg)  ==&gt; It's the chance of a girl, followed by the decision to have a 2&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_0"&gt;nd&lt;/span&gt;, followed by having a boy.&lt;br /&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_1"&gt;GG&lt;/span&gt; = Pg*(P2)*Pg=Pg^2*P2&lt;br /&gt;&lt;br /&gt;Thus the numerator (number of girls reported on average) is:&lt;br /&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_2"&gt;Num&lt;/span&gt; = (1-Pg)*0 +&lt;br /&gt;             Pg*(1-P2)*1 +&lt;br /&gt;             Pg*P2*(1-Pg)*&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_3"&gt;Ps&lt;/span&gt;*0 +&lt;br /&gt;             Pg*P2*(1-Pg)*(1-&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_4"&gt;Ps&lt;/span&gt;)*1 +&lt;br /&gt;             Pg^2*P2*&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_5"&gt;Ps&lt;/span&gt;*1 +&lt;br /&gt;             Pg^2*P2*(1-&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_6"&gt;Ps&lt;/span&gt;)*2&lt;br /&gt;&lt;br /&gt;and the denominator (number of children reported on average) is:&lt;br /&gt;Den = (1-Pg)*1 +&lt;br /&gt;             Pg*(1-P2)*1 +&lt;br /&gt;             Pg*P2*(1-Pg)*&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_7"&gt;Ps&lt;/span&gt;*1 +&lt;br /&gt;             
Pg*P2*(1-Pg)*(1-&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_8"&gt;Ps&lt;/span&gt;)*2 +&lt;br /&gt;             Pg^2*P2*&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_9"&gt;Ps&lt;/span&gt;*1 +&lt;br /&gt;             Pg^2*P2*(1-&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_10"&gt;Ps&lt;/span&gt;)*2&lt;br /&gt;&lt;br /&gt;We know that, in China, Pr= &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_11"&gt;Num&lt;/span&gt;/Den = 45.45% and that, in general, Pg=48.54%.  Thus, we can solve .4545=&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_12"&gt;Num&lt;/span&gt;/Den in terms of &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_13"&gt;Ps&lt;/span&gt; and P2.&lt;br /&gt;&lt;br /&gt;Since we have 1 equation and 2 unknowns, there are an infinite number of solutions, but here are a few possibilities:&lt;br /&gt;0% have a second child -- impossible&lt;br /&gt;10% have a second child -- impossible&lt;br /&gt;15% have a second child and 85% of those keep the first a secret from the government&lt;br /&gt;20% have a second child and 65% of those keep the first a secret from the government&lt;br /&gt;30% have a second child and 45% of those keep the first a secret from the government&lt;br /&gt;40% have a second child and 35% of those keep the first a secret from the government&lt;br /&gt;50% have a second child and 30% of those keep the first a secret from the government&lt;br /&gt;&lt;br /&gt;One thing to note (that is not &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_14"&gt;necessarily&lt;/span&gt; obvious in these calculations)  is that if  everyone reports all the children they have (&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_15"&gt;Ps&lt;/span&gt;=0), then the percent of girls will be exactly 48.54%, the same as if everyone had one child, as long as infanticide and selective abortion are not occurring.&lt;br /&gt;&lt;br /&gt;But the main point here is that a small number (15%) of couples 
having second children and not reporting the first girl leads to the warped percentages of baby girls, if there is high under-reporting of these first children.  You do not need to assume that infanticide or selective abortion plays a role at all.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-1271665999864038220?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/1271665999864038220/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=1271665999864038220' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/1271665999864038220'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/1271665999864038220'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2008/05/why-are-there-too-many-boys-in-china.html' title='Why are there too many boys in China?'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-6146243392913285807</id><published>2008-04-15T12:00:00.000-07:00</published><updated>2008-04-15T13:26:17.780-07:00</updated><title type='text'>What's the chance of rain?</title><content type='html'>Everyday probability barely shows up in weather forecasts these days.  
For example, Yahoo's weather will say something like "a few showers" to mean that there is a chance of showers.  However, if you are a purist, go to the National Weather Service (NWS) site, where they still make predictions using probabilities.  See the Chicago forecast for this week in &lt;a href="http://weather.yahoo.com/forecast/USIL0225.html#text"&gt;Yahoo&lt;/a&gt; and at the &lt;a href="http://forecast.weather.gov/MapClick.php?CityName=Chicago&amp;amp;state=IL&amp;amp;site=LOT&amp;amp;textField1=41.837&amp;amp;textField2=-87.685&amp;amp;e=1"&gt;NWS site&lt;/a&gt; (the online &lt;a href="http://www.nytimes.com/gst/weather.html?detail=143949"&gt;NY Times&lt;/a&gt; can barely be bothered to give any information at all, unless you can interpret their icons). &lt;br /&gt;&lt;br /&gt;When it comes to data, I've always felt more is more, and so if I am really interested in the weather, I go to the NWS site where I'll get more than just "chance of rain" or "a few showers."&lt;br /&gt;&lt;br /&gt;But what does it mean when the weather forecast says there is a 30% chance of rain Wednesday and a 50% chance of rain Thursday? If we focus on a single time period, say Thursday, the conclusion is pretty clear: there's a 50-50 chance of rain.  Put another way, when encountering conditions like this in the past, the NWS model data shows rain half the time and no rain half the time.&lt;br /&gt;&lt;br /&gt;The inference becomes more difficult when we want to ask a more complex question.  For example, suppose I'm going to Chicago Wednesday and returning Thursday night.  I want to know whether to bring an umbrella.  Since I hate lugging an umbrella along, I only want to bring one if there is at least a 75% chance of rain at some point while I'm there.&lt;br /&gt;&lt;br /&gt;It turns out that the answer to this question cannot be determined with the information given (don't you just love that choice on multiple choice tests?). 
&lt;br /&gt;&lt;br /&gt;Before we explain why, though, we need some definitions and notation.&lt;br /&gt;To do the math for this, we generally define each possible outcome as an &lt;em&gt;event&lt;/em&gt;.  In this case, we have the following events:&lt;br /&gt;Event A: Rains Wednesday&lt;br /&gt;Event B: Rains Thursday&lt;br /&gt;&lt;br /&gt;We are interested in the chance that either Event A or Event B occurs. We have a shorthand for expressing the probability of Event A: "P(A)".&lt;br /&gt;&lt;br /&gt;There is a simple probability formula that is very useful here:&lt;br /&gt;P(A or B) = P(A) + P(B) - P(A and B)&lt;br /&gt;This formula says that the probability of Event A or Event B happening is the probability of A plus the probability of B minus the probability that A and B both happened (the event that A and B occurred is called the &lt;em&gt;intersection&lt;/em&gt; of Events A and B).  This makes sense because if we just added them (as you might intuitively do) we would be double counting the times both events occur, and thus we need to subtract out the intersection once at the end.&lt;br /&gt;&lt;br /&gt;In some cases P(A and B)=0.  In other words, Events A and B never occur together.  You may have noticed this comes up when you toss a coin: it is never both heads and tails at the same time (except for that time I got the coin to stand on its side).   Events like these are called mutually exclusive.&lt;br /&gt;&lt;br /&gt;In other cases P(A and B)=P(A)*P(B).   This means the probability of A and B is the product of the probabilities of A and B.  In this case, the two events can both occur, but they have nothing to do with each other.  Events like these are called independent events.&lt;br /&gt;&lt;br /&gt;In still other cases, P(A and B) is neither 0 nor P(A)*P(B).&lt;br /&gt;&lt;br /&gt;If we assume the events A and B are mutually exclusive, then there's an 80% chance (50+30) of rain either Wednesday or Thursday.  
This seems unlikely though, because most storms could last more than an evening.&lt;br /&gt;&lt;br /&gt;If we assume the events A and B are independent, then there's a 65% chance of rain either Wednesday or Thursday. This is a little more complicated to calculate, because we need to figure out the chances of it raining both Wednesday and Thursday, which under the independence assumption is P(A)*P(B)=30%*50%=15%.  Thus:&lt;br /&gt; P(A or B)=P(A) + P(B) - P(A and B)=50% + 30% -15%=65%. &lt;br /&gt;&lt;span style="font-size:85%;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size:85%;"&gt;[We could also figure out the chances of not raining either night.  Since the chance of rain is independent, the chance of no rain is also independent.  Also, the chance of rain plus the chance of no rain must be one.  Thus P(no rain Wednesday)=1-P(A)=100%-30%=70%.  Similarly, P(not B)=100%-50%=50%.  Then, the chance of no rain either time period = P(not A and not B)=70%*50%=35%.  Thus, there is a 35% chance it will not rain either night, and we can conclude there would be a 65% chance of rain one of the nights, of course all hinging on the independence assumption].&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Okay.  So finally, we have two probabilities, 80% and 65%, based on two different and rather extreme assumptions.  On the high side, 80% is the most extreme.  We can see this by seeing that in order to get a larger number, we'd have to plug in a negative probability for the value of P(A and B) in the general formula (which does not assume independence or anything else):&lt;br /&gt;P(A or B) = P(A) + P(B) - P(A and B)&lt;br /&gt;&lt;br /&gt;Since probabilities must always be at least 0 and at most 100%, we cannot have a negative number for P(A and B).  So at most, the chance of rain Wednesday or Thursday is 80%.&lt;br /&gt;&lt;br /&gt;But what about the least the chance might be?  Independence seems a pretty extreme assumption in the other direction, but in fact it is not.  
What would lead to the smallest probability is if the two events A and B were highly related--in fact so related that B is certain to occur whenever A occurs.  This would mean that P(A and B)=30% (the smaller of P(A) and P(B)).  This would lead to a probability that it rains either Wednesday or Thursday of just 50%:&lt;br /&gt;P(A or B) = P(A) + P(B) - P(A and B) = 30% + 50% - 30% = 50%&lt;br /&gt;&lt;br /&gt;So now that we've got rain down, let me go back to the original impetus for this blog: it is easy to make the wrong inference when given information about the chances of a series of events. &lt;br /&gt;&lt;br /&gt;The recent Freakonomics blog about &lt;a href="http://freakonomics.blogs.nytimes.com/2008/04/09/medicine-and-statistics-dont-mix/#more-2483"&gt;chances of birth defects&lt;/a&gt; addresses this issue.  In it, Steven Levitt describes a couple who were told that there was a 1 in 10 chance that a test, which showed an embryo was not viable, was wrong.  The test was done twice on each of two embryos, and all four times the outcome was that the embryos were not viable.  Thus, the lab told them that the chances of having healthy twins from two such embryos were 1 in 10,000.  Of course, after reading about rain above, you recognize this as the application of the independence assumption (1/10 times 1/10 times 1/10 times 1/10 equals 1 in 10,000).  The couple didn't listen to the lab though, and, nine months later, two very viable and very healthy babies were born. &lt;br /&gt;&lt;br /&gt;Post hoc, it seems the lab should have (at least) said the chances were somewhere between 1 in 10 and 1 in 10,000.  In addition, the 1 in 10 seems like an awfully round number--could it have been rounded down from some larger probability (1 in 8, 1 in 5, 1 in 3, who knows?).  Levitt wonders whether the whole test is just nonsense in the first place.&lt;br /&gt;&lt;br /&gt;So what do you do when confronted with a critical medical problem and a slew of probabilities?  
There's no easy answer, of course, but I believe gathering as much hard data as possible is important.  Then make sure you distinguish between the inferences made by your doctor, nurse, or lab technician (which are more subject to error) and the underlying probabilities associated with the drug, the test, or the procedure (which are less subject to error).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-6146243392913285807?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/6146243392913285807/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=6146243392913285807' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/6146243392913285807'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/6146243392913285807'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2008/04/whats-chance-of-rain.html' title='What&apos;s the chance of rain?'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-669105388068909692</id><published>2008-03-19T12:22:00.000-07:00</published><updated>2008-03-20T04:41:25.576-07:00</updated><title type='text'>Stocks or Bonds?</title><content type='html'>A couple of posts ago, I talked about the question of 
whether to &lt;a href="http://what-are-the-chances.blogspot.com/2008/02/rent-or-buy.html"&gt;rent or buy&lt;/a&gt; (the answer: in the long run, buy; in the short run, rent). With all the turmoil in the financial markets, it seems a reasonable time to visit the question of whether stocks or bonds are a better investment.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;If you look at data from 1929-2007 for the Dow Jones Industrial Average (DJIA), a 20-year investment in stocks yielded an inflation-adjusted return of about 2.6% annually. This is before taking dividends into account, which add at least a couple of percentage points (recently, the dividend yield has been closer to 2%, while in the past 4-5% was the norm; see &lt;a href="http://seekingalpha.com/article/46288-stock-dividend-yields-vs-interest-rates-an-80-year-history"&gt;this article for relevant charts&lt;/a&gt;). The net return for stocks, after inflation, is around 6% annually.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;For bonds, the returns also generally beat inflation, but are not as good. Their average is around 2% annually, after subtracting out inflation.&lt;br /&gt;&lt;br /&gt;The following chart shows inflation-adjusted DJIA returns for 10- and 20-year investments (in blue and purple, downloaded from &lt;a href="http://finance.yahoo.com/q?s=%5EDJI"&gt;Yahoo Finance&lt;/a&gt;), versus Treasury bonds (in yellow). The US Treasury bond return is based on 20-year bonds when available from the &lt;a href="http://www.federalreserve.gov/releases/h15/data.htm"&gt;Fed &lt;/a&gt;and on 10-year bonds or estimates from the article cited above when 20-year returns were not available.&lt;br /&gt;&lt;br /&gt;The year shown is the final year of the investment. 
Thus, if you made a 20-year investment beginning in 1985, you can look at the points corresponding to 2005 to find out that you would have earned approximately 10% annually, whether it was in bonds (yellow) or stocks (purple for 20-year) by the start of 2005, and that is after inflation.&lt;br /&gt;&lt;br /&gt;[Click on the graph to see it in higher resolution]&lt;br /&gt;&lt;img id="BLOGGER_PHOTO_ID_5179772605962451714" style="DISPLAY: block; MARGIN: 0px auto 10px; WIDTH: 452px; CURSOR: hand; HEIGHT: 318px; TEXT-ALIGN: center" height="235" alt="" src="http://2.bp.blogspot.com/_fp158_J456I/R-I_vdOw-wI/AAAAAAAAABE/XDOzdcIlFAs/s320/dowhistoryandbonds.jpg" width="393" border="0" /&gt;&lt;br /&gt;&lt;p&gt;&lt;a href="http://2.bp.blogspot.com/_fp158_J456I/R-F54tOw-vI/AAAAAAAAAA8/n0qU3scMG-4/s1600-h/dowhistoryandbonds.jpg"&gt;&lt;/a&gt;&lt;/p&gt;&lt;p&gt;The yellow line, denoting Treasury bond returns, is mostly below the blue and purple lines, indicating that, for most years, a 10- or 20-year investment in the Dow Jones Index is better than a similar investment in Treasury bonds. In fact, for 57 of the 68 ten-year investments, stocks do better. For 20-year investments, the numbers are even more promising for stocks, which perform better for 56 out of 57 20-year investments.&lt;/p&gt;&lt;p&gt;Treasury bonds, on the other hand, are considered risk-free. The idea is that there is no risk that the U.S. Treasury will default on its debt (or, at least, if it does default, there are far more serious problems to worry about). With stocks, by contrast, there are no guarantees that they will not go down.&lt;br /&gt;&lt;br /&gt;This basic idea, that stocks have broadly outperformed bonds and beaten inflation in the long run, seems to be well understood. This fact, however, does not imply that individual stock investments will outperform bonds, or that a shorter-term investment will outperform. 
&lt;/p&gt;&lt;p&gt;The "truth" about stocks that is implied by this graph, however, is a little deceiving, for three reasons.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;1. The graph tells you little about what might happen to a 20-year investment beginning in a particular year&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;The graph shows that for most time periods, stocks do well, but there is an enormous amount of variation, even for the long-term investments considered here. If you happened to make some long-term investments in 1962, at the end of a recession and the beginning of a long boom, you'd still be out of luck if you needed that money in 1982, when its real value would have been far less than when you invested it (the pink point corresponding to 1982 shows a 1% annual loss for the 20 prior years, amounting to about an 18% total loss in real dollars).&lt;/p&gt;&lt;p&gt;[The "safe" Treasury bond portfolio, however, would have done far worse, losing about twice as much over the same period. This is not because the US defaulted on its bonds. Instead, it's because of inflation. The 20-year bond you bought in 1962, yielding 4%, did not keep up with inflation. This is the long-term, somewhat hidden risk for bonds.]&lt;/p&gt;&lt;p&gt;&lt;strong&gt;2. The graph averages over a portfolio of stocks&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;The other issue is that the graph's very clear implication that stocks are better, even in bad times, does not necessarily carry over to individual stocks. Returns on an individual stock, or even on a small portfolio of stocks, vary much more wildly than the Dow Jones average shown in the graph. Also, in recent years, the Treasury has issued inflation-indexed bonds, which guarantee a &lt;em&gt;real&lt;/em&gt; return above 0, thus ensuring the yellow line in a future graph will be more than 0 (&lt;a href="http://www.treasurydirect.gov/indiv/products/prod_tips_glance.htm"&gt;see information on inflation-indexed bonds&lt;/a&gt;). 
&lt;/p&gt;&lt;p&gt;&lt;strong&gt;3. The future is not now&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;While it seems convincing that 80 years of history show stocks in a very positive light, we only need to look at Japan for a counterexample: its Nikkei average, since 1985, has lost about 15% after adjusting for inflation (the loss is far more than 50% for an investment made near Japan's stock market peak). There are some indications that the current U.S. problems (real estate boom and bust, credit problems) are worse than Japan's. Much of U.S. investment in the last few decades has been fueled by the safety of the dollar. The rise of the euro and of globalization has already begun to change that, and a continuing fall in the dollar will almost certainly cause inflation, which was devastating to the stock market in the 1970s. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;Bottom Line&lt;/strong&gt;&lt;br /&gt;There's a lot of evidence that, in the past, a long-term stock investment paid off, relative to both risk-free bonds and inflation. However, there is no guarantee this party will continue, especially if there is a sea change in dollar investments. &lt;/p&gt;&lt;p&gt;Where's my retirement money? 
Almost all in stocks...but almost all foreign stocks.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-669105388068909692?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/669105388068909692/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=669105388068909692' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/669105388068909692'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/669105388068909692'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2008/03/stocks-or-bonds.html' title='Stocks or Bonds?'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_fp158_J456I/R-I_vdOw-wI/AAAAAAAAABE/XDOzdcIlFAs/s72-c/dowhistoryandbonds.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-3330755302526165326</id><published>2008-02-24T01:26:00.000-08:00</published><updated>2008-03-16T04:24:07.278-07:00</updated><title type='text'>7 letter words and 8 card suits</title><content type='html'>A recent commenter asked whether the following is true: "ETAERIO is the most likely seven-letter word to get in scrabble."&lt;br 
/&gt;&lt;br /&gt;&lt;br /&gt;As all Scrabble players know, if you use all 7 of your letters, you get a bonus. However, except for the first turn, you'd need an 8-letter word to achieve this--ETAERIO would not do. Thus, I am going to try to answer the question: "What are the chances of getting the letters ETAERIO on your initial turn in Scrabble (you also have to hope you are first)?" I have no hope of finding out whether this is the most likely seven-letter word, because I can't automatically check all letter combinations, but I will try to give some guidance there as well.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;To find the chances of getting ETAERIO, we need the number of combinations that produce these letters divided by the total number of combinations. In other words, we have to go back to 12th-grade math, where we all learned (or sort-of learned) permutations and combinations.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;There are 100 tiles in Scrabble and we are choosing 7. Thus, there are 100 ways to choose the first tile, 99 ways to choose the second, and so forth down to 94. If we chose them in order, we'd have 100*99*98*97*96*95*94 permutations. However, we don't care about the order, so we have to take the above product and divide by the number of ways we can permute the 7 tiles, which is 7*6*5*4*3*2*1. The shorthand way to express this number of combinations is "100 choose 7" or&lt;br /&gt;&lt;img id="BLOGGER_PHOTO_ID_5170493345048194034" style="DISPLAY: block; MARGIN: 0px auto 10px; CURSOR: hand; TEXT-ALIGN: center" height="65" alt="" src="http://3.bp.blogspot.com/_fp158_J456I/R8FITYkea_I/AAAAAAAAAAs/9M0olnWK_wA/s320/combo1.bmp" width="63" border="0" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;By dividing the two products above ((100*99*98*97*96*95*94) / (7*6*5*4*3*2*1)), we come up with 16,007,560,800. 
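&lt;br /&gt;&lt;br /&gt;As a quick sanity check on that figure, here is a minimal Python sketch. It is not part of the original calculation; it assumes Python 3.8 or later (for math.comb and math.prod), and the variable names are only illustrative. It counts the 7-tile hands two ways, once by the division described above and once with the built-in "choose" function:&lt;br /&gt;
&lt;pre&gt;
```python
import math

# Ordered draws of 7 tiles from 100: 100*99*98*97*96*95*94
ordered_draws = math.prod(range(94, 101))

# Ways to rearrange any 7 tiles: 7*6*5*4*3*2*1
orderings_of_seven = math.factorial(7)

# Unordered hands: divide out the orderings (the division is exact)
hands_by_division = ordered_draws // orderings_of_seven

# The same count straight from the standard library
hands_by_comb = math.comb(100, 7)

print(hands_by_division)
print(hands_by_comb)
```
&lt;/pre&gt;
Both lines should print 16007560800, matching the figure above. Like the hand calculation, the sketch treats all 100 tiles as distinguishable.&lt;br /&gt;&lt;br /&gt;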
Since most letters appear multiple times, the number of possible letter combinations is far smaller, and to know the chances of getting ETAERIO, I need to know how many times each letter appears.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Thus, I found the &lt;a href="http://en.wikipedia.org/wiki/Scrabble_letter_distributions"&gt;letter distributions on Wikipedia&lt;/a&gt; (counting our own Scrabble pieces would probably not do with a three-year-old distributing them around the house). The most common ones are as follows:&lt;br /&gt;E - 12 tiles&lt;br /&gt;A, I - 9 tiles&lt;br /&gt;O - 8 tiles&lt;br /&gt;N, R, T - 6 tiles&lt;br /&gt;D, L, S, and U - 4 tiles&lt;br /&gt;other letters - 3 or fewer, but not relevant here&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;To figure out the chances of getting ETAERIO, we need to know the number of combinations that produce it. We need 2 E's, 1 T, 1 A, 1 R, 1 I, and 1 O. It turns out that the number of ways is the product of the number of ways to draw each of these letters. Thus, it is "12 choose 2" (E's) times "6 choose 1" (T) times "9 choose 1" (A) and so forth. This comes out to 1,539,648 ways to get the letters in ETAERIO. If we divide this by the total number of combinations (16,007,560,800), we find that there is about a 1 in 10,000 chance of getting ETAERIO as your first 7 letters. Of course, from there, you have to know it is a word and figure out that you can make that word from those letters, since they are not likely to appear in that order.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;I could not find another word with a higher probability, but I did find TREASON and TRAINED (both about 1 in 20,000). 
However, it's clear from the distribution of letter tiles that in order to find a word that beats ETAERIO, you can only use letters appearing on 6 or more tiles.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;BRIDGE HANDS&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Now that we all remember the mechanics of combinations (or at least, we are on the subject of them), let's investigate another oft-asked question around here: what's the chance of being dealt a 7-card suit in bridge? This would be 4 (number of suits) times "13 choose 7" (ways to choose 7 from a suit) times "39 choose 6" (ways to choose the other 6 from the other 3 suits) divided by "52 choose 13" (ways to choose 13 cards from 52). This comes out to about 3.5%, or 3 or 4 times in every 100 hands.&lt;br /&gt;&lt;br /&gt;For an 8-card suit, it is 1 in about 200. For a 9-card suit, it is about 1 in 2,700. Of course, my kids are always asking about the chances of being dealt a 10-card suit or even a 13-card suit:&lt;br /&gt;10-card suit: 1 in 60,738&lt;br /&gt;11-card suit: 1 in 2,746,693 (less than 1 in 2 million)&lt;br /&gt;12-card suit: 1 in 313,123,057 (less than 1 in 300 million)&lt;br /&gt;13-card suit: 1 in 158,753,389,900 (less than 1 in 150 billion)&lt;br /&gt;&lt;br /&gt;The chances aren't too great, but with some really poor shuffling, they've managed the 13-card suit once or twice.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-3330755302526165326?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/3330755302526165326/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=3330755302526165326' title='7 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/4274782648018949782/posts/default/3330755302526165326'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/3330755302526165326'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2008/02/7-letter-words-and-8-card-suits.html' title='7 letter words and 8 card suits'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_fp158_J456I/R8FITYkea_I/AAAAAAAAAAs/9M0olnWK_wA/s72-c/combo1.bmp' height='72' width='72'/><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-5936533777804357252</id><published>2008-02-12T02:05:00.000-08:00</published><updated>2008-02-14T10:57:56.568-08:00</updated><title type='text'>Rent or Buy?</title><content type='html'>The answer to the age-old question, according to every grandmother out there, is "buy." But do the data really support this?&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Housing as an Investment&lt;br /&gt;&lt;/strong&gt;Forgetting for the moment about the psychological advantages and disadvantages of buying versus renting, let's look at the Economics, and, of course, the probabilities.&lt;br /&gt;&lt;br /&gt;One of the most fascinating pieces of information to answer this question is a chart put together by Robert Shiller (Yale Economist) as part of his book Irrational Exuberance (he also has an article on the housing market with similar charts in Economists' Voice, March, 2006). Shiller looks at inflation-adjusted housing prices from 1890 to the present. 
I got the chart from &lt;a href="http://www.investingintelligently.com/wp-content/uploads/2006/08/a_history_of_home_values.png"&gt;this site&lt;/a&gt;, and the whole article is free and downloadable&lt;a href="http://www.bepress.com/ev/"&gt; here &lt;/a&gt;(search on Shiller).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Shiller sets the price of a house in 1890 at 100, and shows how the value varies over time, adjusting for inflation. Thus, in 1947, soon after the war, the value is 110, 10% higher than in 1890. In 1989, the peak of the last boom, it is around 125. And now? Around 200!&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/_fp158_J456I/R7MN1okea9I/AAAAAAAAAAc/MWkgCEalXjk/s1600-h/a_history_of_home_values.png"&gt;&lt;img id="BLOGGER_PHOTO_ID_5166488412598725586" style="DISPLAY: block; MARGIN: 0px auto 10px; WIDTH: 423px; CURSOR: hand; HEIGHT: 306px; TEXT-ALIGN: center" height="290" alt="" src="http://2.bp.blogspot.com/_fp158_J456I/R7MN1okea9I/AAAAAAAAAAc/MWkgCEalXjk/s320/a_history_of_home_values.png" width="390" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;Besides the obvious "irrational exuberance" of the housing market that is indicated in this graph, another interesting fact comes out: Housing goes up and down over time and in any given 20 or 30 year period, can be either a good investment or a bad one. Sure, if you timed the last couple of booms correctly, you could have made a killing, but the fact is that a house bought in 1960 was basically the same price in 1995, after accounting for inflation. 
Of course, the person who bought that house could have lived there for 35 years, paying only the cost of upkeep, and, presumably, the mortgage.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Total Return to Renting Versus Buying&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;That brings us to the next topic: knowing that a house may or may not give you any real capital appreciation, is it better to buy or rent?&lt;br /&gt;&lt;br /&gt;One of the big arguments I hear from my mother-in-law against renting is that "you are just throwing your money away." Seems like a good point. With renting, you get absolutely nothing out of it, but with buying, after 30 years, you own a house. The problem with this argument is that it ignores two things: 1) the down payment, and 2) interest.&lt;br /&gt;&lt;br /&gt;When you buy a house, you put around 20% down. That money then cannot be invested elsewhere. In addition, you pay interest on a loan, whose proceeds are invested in the house. The good thing is that you are using the proceeds from your loan to buy an asset 5 times the value of what you invested in cash. For example, if you buy a $500,000 house, you only have to put down $100,000 in cash. Thus, if you are in one of the boom times in Shiller's graph, you get 5 times what his graph shows in return on your $100,000 investment. The flip side, of course, is that in the bust times, you get 5 times the losses.&lt;br /&gt;&lt;br /&gt;Compare this to renting. Here, you keep your $100,000, perhaps investing it in safe 5-year Treasury notes, where you can expect an inflation-adjusted return of about 2.5% (see the &lt;a href="http://www.federalreserve.gov/releases/h15/data/Annual/H15_TCMNOM_Y5.txt"&gt;Fed site&lt;/a&gt; for T-note rates and the &lt;a href="ftp://ftp.bls.gov/pub/special.requests/cpi/cpiai.txt"&gt;Bureau of Labor Statistics site&lt;/a&gt; for inflation rates--the real return is lower if you go back more than 40 years).&lt;br /&gt;&lt;br /&gt;Now, let's look at renting or buying with specific numbers. 
Suppose that you have a $500,000 home in mind. For buying the house, your total costs are your mortgage and your return is the amount you get back after the sale, minus the $100,000 you paid as a down payment. For renting, your costs are your rent and your return is the cash you get in interest from your $100,000 investment.&lt;br /&gt;&lt;br /&gt;The following table lays out 6 scenarios (click on the table to see a legible version).  I've adjusted for the tax benefits of the mortgage as well as inflation.&lt;br /&gt;&lt;br /&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;a href="http://2.bp.blogspot.com/_fp158_J456I/R7NH2okea-I/AAAAAAAAAAk/o254kPTlNrI/s1600-h/mortgageinteresttable.jpg"&gt;&lt;img id="BLOGGER_PHOTO_ID_5166552201453005794" style="DISPLAY: block; MARGIN: 0px auto 10px; WIDTH: 439px; CURSOR: hand; HEIGHT: 131px; TEXT-ALIGN: center" height="120" alt="" src="http://2.bp.blogspot.com/_fp158_J456I/R7NH2okea-I/AAAAAAAAAAk/o254kPTlNrI/s320/mortgageinteresttable.jpg" width="382" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;I varied the mortgage interest rate and the (annual) house appreciation. The two numbers at the bottom are your out of pocket monthly costs after 5 and 10 years for owning the house. Presumably, this is the amount in rent you should be prepared to pay. Thus, for a house that appreciates at the rate of inflation (roughly what has happened with housing since 1890--it is the 3% appreciation in the 2nd and 5th columns), your monthly costs are $633 if you have a 6% mortgage and sell after 5 years and $1,134 if you have an 8% mortgage. You do much better if you hold the house for 10 years. Not true for the house that does not appreciate. Then, you do better if you hold it for less time. &lt;/p&gt;&lt;p&gt;To really answer the question well, we'd need an accurate prediction of inflation and mortgage rates. In the short term, both of these are pretty easy. The second thing you need is an idea of how long you will hold the house. 
If the house appreciates at the rate of inflation, then you are better off holding onto it for longer. If it does not, then you are better off selling sooner.&lt;/p&gt;&lt;p&gt;What is apparent is how radically the value changes depending on the assumptions about how much the house will appreciate. Put in some negative numbers and it really gets scary. If the current boom results in a 2% annual nominal depreciation over the next 5 years, then your 5-year monthly cost goes to about $2,500. At 5% a year depreciation (something that occurred over several years with NYC apartments in the early 90s), then your out of pocket is $3,500 per month if you sell after 5 years.&lt;/p&gt;&lt;p&gt;So, should you rent or buy?&lt;/p&gt;&lt;p&gt;Well...it depends.&lt;/p&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;p&gt;&lt;br /&gt;&lt;br /&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-5936533777804357252?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/5936533777804357252/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=5936533777804357252' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/5936533777804357252'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/5936533777804357252'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2008/02/rent-or-buy.html' title='Rent or Buy?'/><author><name>Alan 
Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_fp158_J456I/R7MN1okea9I/AAAAAAAAAAc/MWkgCEalXjk/s72-c/a_history_of_home_values.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-5690873711679449146</id><published>2008-01-21T23:58:00.000-08:00</published><updated>2008-01-22T01:06:04.985-08:00</updated><title type='text'>Throw away your cold medicine?</title><content type='html'>A recent article in the New York times touts: "&lt;a href="http://www.nytimes.com/2008/01/22/health/research/22nost.html?ref=science"&gt;Seawater Seems to Beat Medicine in Fighting Colds&lt;/a&gt;."   The article goes on to describe a study where "scientists assigned 289 cold or &lt;a title="In-depth reference and news articles about Influenza." href="http://health.nytimes.com/health/guides/disease/the-flu/overview.html?inline=nyt-classifier"&gt;flu&lt;/a&gt; patients ages 6 to 10 to be given a nasal wash three times a day with water from the Atlantic Ocean that had been commercially processed but retained seawater’s trace elements and minerals. As comparison, a group of 101 children used ordinary over-the-counter &lt;a title="In-depth reference and news articles about Cough." href="http://health.nytimes.com/health/guides/symptoms/cough/overview.html?inline=nyt-classifier"&gt;cough&lt;/a&gt; and cold medicines."&lt;br /&gt;&lt;br /&gt;The Times gets this first part wrong.  The "seawater" group got both standard medications and seawater, as explained by the &lt;a href="http://archotol.ama-assn.org/cgi/content/abstract/134/1/67"&gt;journal article on the study&lt;/a&gt;.  
So, right away, we are not talking about throwing away our cold medicines.  Instead we might have to buy (for those of us not close to the ocean) seawater (more below about saline vs. seawater).&lt;br /&gt;&lt;br /&gt;But the study does have some interesting results.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Here's the good part&lt;/strong&gt;&lt;br /&gt;The results for the preventative success are most striking: at week 12, 25% of the children in the control group had reported illnesses that caused an absence from school versus only 8% in the treatment group.  This is statistically significant, meaning the difference is too large to be plausibly explained by chance alone.  This does not mean, however, that biases in the study could not have caused the difference (bias, if it exists, can make even a highly statistically significant difference spurious).&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Here's the bad part&lt;/strong&gt;&lt;br /&gt;1.  The study was not blind.  This means that the children (and physicians and parents) were aware of whether the kids were taking the saline solution or not, subjecting the study to a "placebo" effect: kids who were taking the saline might have 'felt' better, but had no lower incidence of colds.  The &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_0"&gt;study's&lt;/span&gt; authors make an error in the journal article by stating: "the large number of participants, &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_1"&gt;multi center&lt;/span&gt; design, and consistence of results between individual  parameters (assessed by physician, patient, and parent) lower the risk of bias." &lt;br /&gt;&lt;br /&gt;Bias is not mitigated by sample size -- that is, a large biased group is no better than a small biased group (imagine trying to figure out the average height of all men by taking an NBA team, then doing a second study with the average of all NBA teams, saying this lessens the bias). 
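&lt;br /&gt;&lt;br /&gt;To make the NBA example concrete, here is a small simulation sketch in Python. Every number in it is a made-up assumption for illustration (an average height of 175 cm for all men, 200 cm for NBA players, a spread of 8 cm, 15 players per team), not real data, and the function name is hypothetical. It suggests that measuring more players homes in on the NBA average, not on the average for all men:&lt;br /&gt;
&lt;pre&gt;
```python
import random
import statistics

# Hypothetical values, chosen only for illustration (not real data)
TRUE_MEAN_ALL_MEN_CM = 175.0
NBA_MEAN_CM = 200.0
NBA_SPREAD_CM = 8.0

rng = random.Random(0)  # fixed seed so the run is repeatable


def average_nba_height(num_players):
    """Average height of a sample drawn only from the (biased) NBA group."""
    heights = [rng.gauss(NBA_MEAN_CM, NBA_SPREAD_CM) for _ in range(num_players)]
    return statistics.mean(heights)


one_team = average_nba_height(15)    # roughly one roster
all_teams = average_nba_height(450)  # roughly thirty rosters

for label, estimate in [("one team", one_team), ("all teams", all_teams)]:
    error = estimate - TRUE_MEAN_ALL_MEN_CM
    print(f"{label}: estimate {estimate:.1f} cm, off by {error:.1f} cm")
```
&lt;/pre&gt;
Under these assumptions, both estimates land near 200 cm, roughly 25 cm too high. The larger sample is simply a more precise estimate of the wrong quantity.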
&lt;br /&gt;&lt;br /&gt;Similarly, having three biased parties (physician, patient, and parent) would only reduce the bias if we were comparing it to the bias of the most biased party (say, the parent).&lt;br /&gt;&lt;br /&gt;2. The treatment is no fun.&lt;br /&gt;Perhaps as important as the questionable effect of the study due to bias is the fact that the treatment involves the solution being squirted into the kid's nose 3 times a day for 12 weeks.  I was sort of amazed that the study had so few dropouts (only 11 out of 401 patients in the study dropped out).  The unpleasantness seems barely worth the avoidance of a cold or two.&lt;br /&gt;&lt;br /&gt;3.  Seawater, water or saline-- does it matter?  The study does not provide any comparison of the seawater spray to other sprays, or even a simple water spray.  To its credit, the journal article does not focus on the fact that the solution was seawater but instead on the comparison of nasal wash versus no nasal wash.  In the journal article, there is little indication that seawater would be any better than salt water (the word seawater is mentioned 10 times in the journal article as opposed to saline, which is mentioned 64 times).   BTW, the seawater in the study was processed (and presumably &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_2"&gt;sterilized&lt;/span&gt;), so don't take it literally and go to your neighborhood polluted beach for your solution.&lt;br /&gt;&lt;br /&gt;The Times article, however, focuses on the idea of seawater, as opposed to a simple saline (salt-water) solution. &lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Net Net&lt;/strong&gt;&lt;br /&gt;It does seem that washing your nose out with saline 3 times a day will make your kids feel better and they will miss less school.  
But it's unclear whether they are actually any less sick or they just think they are less sick (to that end, maybe giving them a sugar pill each day, telling them it was a special cold pill, would have the same effect).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-5690873711679449146?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/5690873711679449146/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=5690873711679449146' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/5690873711679449146'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/5690873711679449146'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2008/01/throw-away-your-cold-medicine.html' title='Throw away your cold medicine?'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-298150536138329664</id><published>2008-01-10T00:08:00.000-08:00</published><updated>2008-01-10T00:31:04.716-08:00</updated><title type='text'>Election Math - Update</title><content type='html'>So my guess was wrong. Clinton won. 
The question is, was it the various selection biases and undecided voters, the measurement error (polls were on &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_0"&gt;Saturday&lt;/span&gt; and &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_1"&gt;Sunday&lt;/span&gt; but the election is on Tuesday), or something else?&lt;br /&gt;&lt;br /&gt;Unfortunately, there is really no way to tell. However, there are two interesting things to note.&lt;br /&gt;&lt;br /&gt;One is that in &lt;a href="http://www.usaelectionpolls.com/2008/polls/pdfs/marist-college-new-hampshire-poll-jan05tojan06.pdf"&gt;one local poll&lt;/a&gt;, they show the percentages both including and excluding those leaning toward a candidate but are undecided. In this case, &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_2"&gt;Obama&lt;/span&gt; received 2% more of the vote if the &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_3"&gt;leaners&lt;/span&gt; are counted, indicating there is some bias in the method that most polls use, which is to count the &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_4"&gt;leaners&lt;/span&gt; (who are actually still undecided) rather than exclude them (the polls do exclude the truly undecided, which has hovered in the 5-10% range, but implicitly assume they will vote the same way as the decided voters).&lt;br /&gt;&lt;br /&gt;The second thing to note is that the polls very accurately predicted &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_5"&gt;Obama's &lt;/span&gt;percentage and inaccurately predicted Hillary's. &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_6"&gt;Obama's&lt;/span&gt; actual percentage was 36%, and the seven major polls predicted &lt;a href="http://www.presidentpolls2008.com/"&gt;36%, 35%, 39%,41%,34%,38%, and 39%&lt;/a&gt; (an average of 37%). 
Hillary's percentages (&lt;a href="http://www.presidentpolls2008.com/"&gt;same web page&lt;/a&gt;) were 28%, 34%,28%,28%,31%,29%, and 29% (average of 30%) versus her actual of 39% (a difference that is outside of the zone for mere statistical error). Hillary &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_7"&gt;apparently&lt;/span&gt; picked up votes from undecideds, people who were leaning for &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_8"&gt;Obama&lt;/span&gt;, and from the Edwards/Richardson camps.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-298150536138329664?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/298150536138329664/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=298150536138329664' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/298150536138329664'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/298150536138329664'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2008/01/election-math-update.html' title='Election Math - Update'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-5857511900054334830</id><published>2008-01-07T00:39:00.001-08:00</published><updated>2008-01-07T02:26:18.311-08:00</updated><title type='text'>Election Math</title><content type='html'>Today's CNN headline screams out &lt;a href="http://us.cnn.com/2008/POLITICS/01/06/nh.poll/index.html"&gt;"Obama opens double-digit lead over Clinton." &lt;/a&gt;When you read the article, you find that the poll, of 341 Democrats, showed Obama over Clinton, 39% to 29%. This compares with a Saturday poll showing them in a dead heat, at &lt;a href="http://www.presidentpolls2008.com/"&gt;33-33&lt;/a&gt; (&lt;a href="http://www.presidentpolls2008.com/"&gt;http://www.presidentpolls2008.com/&lt;/a&gt; is a nice polling site, because it shows the polls side by side and you can link to the details of the statistical error and actual questions asked).&lt;br /&gt;&lt;br /&gt;Did so many people change their minds in one day? If we ignore other polls, then the statistical evidence is not conclusive. Why? Because the margin of error in the 39-29 poll is 5%, meaning that Obama's percentage is likely somewhere between 34 and 44%. The Saturday poll, also with a 5% margin of error, indicates his percentage is between 28 and 38%. The overlap between these two ranges means the numbers might not have changed at all. Rather, the difference may be mere statistical error, which is an artifact of the sampling. By luck of the draw, the Sunday poll may have found more Obama supporters, even though no one changed their mind.&lt;br /&gt;&lt;br /&gt;When comparing two polls taken independently (as above), the error is more than the stated error (5% above) but less than the sum of the stated errors in the two polls (5%+5%=10% above). 
We can compute the error of the difference in these two polls as 7%, which means that Obama's numbers, which appeared to go up by 6% (from 33% to 39%), may have gone down by as much as 1% or up by as much as 13%.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;Some math:  The 7% is computed as 1.96 multiplied by the square root of the sum of the squared standard deviations of the polls.  The standard deviation of each poll is the error rate (5%) divided by 1.96, or about 2.55%.  In a standard probability distribution called a Normal Distribution, 95% of the data falls within 1.96 standard deviations of the mean.  Thus, in the latest poll, the mean for Obama was 39%, with a standard deviation of about 2.55%.  &lt;/em&gt;&lt;br /&gt;&lt;br /&gt;Several implicit assumptions are made in computing the error rate in these polls, primarily summarized as: 1) the Normal Distribution is appropriate, 2) the sample is a random sample of all who will vote in the Tuesday Democratic primary, and 3) the answers in this poll are reflective of the way the voters will actually vote come Tuesday.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Assumption Review&lt;/strong&gt;&lt;br /&gt;1) Normal Distribution.  This one is easy.  For a large, random sample and a multiple choice question (who will you vote for), this assumption is always close to reality except when the number polled is very small or the percentages (as are those for, say, Kucinich) are close to 0% or 100%.  For Clinton and Obama, there is no real issue here, since the sample size is moderately large and their percentages are in the 30-40% range (a neat demo that shows how close a distribution is to Normal, depending on the sample size and percentage, is at &lt;a href="http://www.ruf.rice.edu/~lane/stat_sim/binom_demo.html"&gt;http://www.ruf.rice.edu/~lane/stat_sim/binom_demo.html&lt;/a&gt;).&lt;br /&gt;&lt;p&gt;2) Random Sample.  This is more difficult.  
Suppose Clinton voters go to church on Sunday, followed by lunch, while the Obama voters are home-bodies.  This problem can be called selection bias.  If there is church-lunch/home-body selection bias, then, in the Sunday poll, a random dialing of phone numbers would have surfaced more Obama voters and would not have been a random sample of Tuesday's voters, as opposed to Saturday, where you might have gotten more equality.  [There is generally a second underlying issue of refusals--people who refuse to be polled.  If these voters are more likely to vote for Clinton, then Clinton's numbers will be under-stated, but this would be true for both polls that we are comparing.]&lt;/p&gt;&lt;p&gt;3) The difference between how people said they felt in today's poll versus how they will vote in Tuesday's election.  I find this difference, which can be called measurement error, the most troublesome.  Take, for example, the fact that, in the CNN Saturday poll, one-fourth of voters stated that they had not yet decided, another quarter were only leaning toward someone, and just half had definitely decided.  This is, of course, only what people are saying, and often people do not want to admit indecision, so the true number of undecided voters may be even higher.   Still, if the undecideds vote even 60-40 in favor of Hillary, it would erase the 10% lead of Obama.&lt;/p&gt;The 5% error rate (and 7% error of difference rate) does not take the above issues into account.  It implicitly assumes they will have no effect.  Thus, the true error rate in election polls is likely far higher. &lt;br /&gt;&lt;br /&gt;If we consider other polls, the Obama lead, and the change, seem clearer.  In the seven polls published the 6th of January, Obama has an average lead of about 2-3%.  In the 5 polls published Friday and Saturday, Clinton led in all of them, by around 5 points.  We can show this change, from pre-Sunday to Sunday, is statistically significant.  
However, because of the selection bias and measurement error issues above, it may not be indicative of the outcome on Tuesday.&lt;br /&gt;&lt;br /&gt;My personal guess?  Obama by a good margin...but a lot can happen in a day.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-5857511900054334830?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/5857511900054334830/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=5857511900054334830' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/5857511900054334830'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/5857511900054334830'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2008/01/election-math.html' title='Election Math'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-1050361122869826100</id><published>2008-01-01T00:44:00.000-08:00</published><updated>2008-01-01T02:03:09.305-08:00</updated><title type='text'>Where is the safest place to live?</title><content type='html'>If you figure out where to live by the crime rate, many of the safest places to live are outside the United States, where crime, though lower than it &lt;a 
href="http://www.infoplease.com/ipa/A0873729.html"&gt;used to be&lt;/a&gt;, is still high by Western standards.  Here in Israel, for example, &lt;a href="http://en.wikipedia.org/wiki/Murder#Israel"&gt;fewer than 200 people&lt;/a&gt; are murdered a year.  In the U.S., about 17,000 people are murdered a year.  Of course, these figures are not comparable because Israel has around 6 million people as compared to the U.S.'s 300 million.  &lt;br /&gt;&lt;br /&gt;Crime rates and murder rates are typically adjusted for population by reporting the number of crimes per 100,000 people.  In Israel this rate (for murders) is about 3, as compared to about 6 per 100,000 in the U.S.  Even including terror attacks, the rate has never been as high in Israel as in the U.S.  In &lt;a href="http://en.wikipedia.org/wiki/Murder"&gt;England, the rate is less than 2 per 100,000&lt;/a&gt;, which is in line with most of Western Europe.&lt;br /&gt;&lt;br /&gt;All this is well and good if you are willing to live abroad just to enjoy a lower crime rate, but, assuming you are looking for a nice place in the US to live, what city might suit your needs vis-a-vis lack of crime? &lt;br /&gt;&lt;br /&gt;I considered the four cities I have lived in (see the &lt;a href="http://www.fbi.gov/ucr/cius2006/offenses/violent_crime/index.html"&gt;FBI site&lt;/a&gt; for all the stats): Columbia, SC; Washington, DC; Philadelphia; and New York.  Of these, New York (sorry Mom) is clearly the safest, with a murder rate of 6 per 100,000 and a violent crime rate of about 0.7% (less than 1 in 100).  The worst is DC (murder: 29, violent crime 1.5%), but Philadelphia is a near tie (26, 1.5%).  Columbia, my home town and my parents' current and future town, is in the middle (13 murders per 100,000 and 1.1% violent crime rate).&lt;br /&gt;&lt;br /&gt;These stark differences mask three important facts.  
First, the chance of a random individual being a victim of a violent crime in any given year is very low, no matter what the city (though you are more likely to be killed by a person than by a car in Philly and DC, according to &lt;a href="http://www.nsc.org/lrs/statinfo/odds.htm"&gt;National Safety Council statistics&lt;/a&gt;). &lt;br /&gt;&lt;br /&gt;Second, &lt;a href="http://www.fbi.gov/ucr/cius_04/offenses_reported/violent_crime/murder.html"&gt;most violent crimes are committed by someone we know&lt;/a&gt;, and most of the people reading this blog do not have violent acquaintances, plus &lt;a href="http://www.fbi.gov/ucr/cius_04/offenses_reported/violent_crime/murder.html"&gt;most murders are committed with guns&lt;/a&gt;, and most people reading this blog do not have acquaintances with guns.&lt;br /&gt;&lt;br /&gt;Third, crime is highly localized, and no matter which city we live in, we are likely to live in a more affluent area, and &lt;a href="http://en.wikipedia.org/wiki/Crime_in_the_United_States"&gt;these areas have much lower crime&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;So, in conclusion: don't worry, Mom, New York is safer than Columbia, as long as I can keep making a living!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-1050361122869826100?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/1050361122869826100/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=1050361122869826100' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/1050361122869826100'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/4274782648018949782/posts/default/1050361122869826100'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2008/01/where-is-safest-place-to-live.html' title='Where is the safest place to live?'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-7556547390643638382</id><published>2007-12-13T02:54:00.000-08:00</published><updated>2007-12-13T04:44:46.888-08:00</updated><title type='text'>The lottery: a tax on stupidity?</title><content type='html'>It is often quipped that the lottery is a tax on stupidity (google "lottery: tax on stupidity" and you will see what I mean).  I've always been bothered by this for two reasons: 1) one of the first people who said it to me was an arrogant professor who I always enjoy proving wrong, and 2) someone in my family (who has an advanced degree in mathematics) used to (and may still) play the lottery.  So I figured there must be some fallacy in this statement. &lt;br /&gt;&lt;br /&gt;I use &lt;a href="http://www.nylottery.org/"&gt;New York's Lotto&lt;/a&gt; as an example, because it is a very simple "jackpot" lottery game.  In this game, you choose 6 numbers between 1 and 59 (inclusive).  If all 6 match, you win the jackpot.  The order of the numbers does not matter.  
To figure out the chances of a match, you must compute the number of equally likely possible combinations (or just go to the &lt;a href="http://www.nylottery.org/ny/nyStore/cgi-bin/ProdSubEV_Cat_401_SubCat_201671_NavRoot_320.htm"&gt;Lotto website&lt;/a&gt; and skip the next paragraph).&lt;br /&gt;&lt;br /&gt;To compute this chance, we first figure the total number of combinations of 6 numbers out of 59.  Suppose we choose the numbers in order.  Then, we have 59 choices for the first number, 58 for the second number, and so forth until we have 54 for the 6th number.  This gives 59*58*57*56*55*54=32,441,381,280 permutations in all.  Since we do not care about the order, however, we need to adjust this number (consider, for example, that 11,12,13,14,15,16 and 16,15,14,13,12,11 are the same set of numbers and considered the same in the Lotto drawing).  This adjustment is made by dividing by the number of possible orderings of six numbers, which is 6*5*4*3*2*1 = 720.  Thus, divide 32,441,381,280 by 720, and you get 45,057,474--the number of possible combinations, of which you choose two for each $1 Lotto ticket.  Your chances of winning, then, are about 1 in 22.5 million. The average jackpot is about $9 million (this jackpot amount is only &lt;a href="http://www.nylottery.org/ny/nyStore/cgi-bin/ProdSubEV_Cat_333603_NavRoot_306.htm?"&gt;obliquely referred to &lt;/a&gt;on the NY Lotto website, in that they say that 40% of revenues go to the jackpot).&lt;br /&gt;&lt;br /&gt;Given these odds, how much do you expect to win if you buy a single ticket (good for choosing two six-number combinations)?  Well, given the odds of 1 in 22.5 million, you would clearly expect to win absolutely nothing! &lt;br /&gt;&lt;br /&gt;But mathematicians don't think this way.  Instead, they compute the expectation as the long run average, and by long-run, I mean really really &lt;strong&gt;LONG-&lt;/strong&gt;run (actually infinite-run, but let's not split hairs).  
To give you some idea of this, you would need to play around 15 million times to have a 50% chance of winning at least one time--this would take 41,000 years or so if you played 1 ticket a day.  So, computing the average after a few thousand or even a few million games is likely to get you an average of 0, which is *not* the correct long-run average.&lt;br /&gt;&lt;br /&gt;Instead, this expectation is computed by taking the sum of the probabilities of winning multiplied by the amount won.  In the case of the Lotto, you win $0 in (22,499,999/22,500,000) games and $9,000,000 in (1/22,500,000) games.  So, the expected winnings are (22,499,999/22,500,000) *$0 + (1/22,500,000)*$9,000,000 = 40 cents.&lt;br /&gt;&lt;br /&gt;So, you pay a dollar, and "expect" to get 40 cents back.  This is why some people call the lottery a tax on stupidity.  When people say the lottery is a tax on stupidity they are implicitly and incorrectly assuming that utility (to throw in an economic term) is based purely on mathematical expectation, and that the utility from $9 million is 9 million times the utility from $1.  Yet I doubt that people are playing the lottery based on some misguided mathematical expectation calculation.  $1 or 40 cents.  Who cares?  Either way it's barely worth picking up off the ground. &lt;br /&gt;&lt;br /&gt;Smart people who play the lottery are valuing 2 things against each other --  $1 versus a minuscule chance of $9 million -- and deciding that the value of  $1 to them is less than the value of the chance at the $9 million.  Yes, poor people probably value $1 more than average, but they value a chance, even a small one, of forgetting about their financial woes even more.  &lt;br /&gt;&lt;br /&gt;Let's look at another game that shows the flip-side of this mathematical expectation conundrum.  For all you upper-middle class, non-lottery players out there, consider the following: Would you pay your entire net worth for a 1 in 1,000 chance to win $10 billion?  
If your net worth is less than $10 million, this is a game with positive expectation.  For those of us with less than $1 million hanging around the house, the expectation is more than $9 million, but I doubt you'd find any middle-class person willing to play this game. &lt;br /&gt;&lt;p&gt;Why?  Because the risk is too great, no matter what the reward.  It is widely recognized that people place different values on risk.  Risk averse people are willing to lose a small amount of money (or pleasure) to insure they will not lose a large amount of money (or pleasure), even when the mathematical expectation of their transaction is negative.  The best example is insurance &lt;a href="http://en.wikipedia.org/wiki/Lottery"&gt;(Wikipedia's lottery entry points this out)&lt;/a&gt;.  Insurance companies make money not on stupidity but on the fact that people do not want to take large financial risks. &lt;/p&gt;So next time you hear someone say the lottery is a tax on stupidity, tell them about the mathematician who plays, or about the people who turned down a game with an expectation of $9 million.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-7556547390643638382?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/7556547390643638382/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=7556547390643638382' title='15 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/7556547390643638382'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/7556547390643638382'/><link rel='alternate' type='text/html' 
href='http://what-are-the-chances.blogspot.com/2007/12/lottery-tax-on-stupidity.html' title='The lottery: a tax on stupidity?'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>15</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-7960391251399199838</id><published>2007-12-01T15:02:00.000-08:00</published><updated>2007-12-01T16:04:21.075-08:00</updated><title type='text'>Hospital infections-what are the chances?</title><content type='html'>For those of us with aging parents, visits to the hospital can become somewhat routine, unfortunately.  An overnight stay for minor surgery, as my father recently had, isn't particularly worrisome--except for all I've been hearing about hospital infections.  &lt;br /&gt;&lt;br /&gt;The &lt;a href="http://www.cdc.gov/ncidod/dhqp/hai.html"&gt;CDC estimates&lt;/a&gt; that about 100,000 people die from 1.7 million infections that they contract at hospitals each year (this is a 2002 estimate, but recent numbers from less authoritative sources are similar or higher).  This is an enormous number, and with 35 million people hospitalized annually, this translates to a 5% chance of getting an infection, and about a 1 in 350 chance of both getting an infection and dying from it. &lt;br /&gt;&lt;br /&gt;According to the non-profit organization &lt;a href="http://www.hospitalinfection.org/essentialfacts.shtml"&gt;RID&lt;/a&gt;, the 100,000 deaths are more than the deaths from breast cancer, AIDS, and car accidents combined.  And how do we prevent these infections?  Mostly simple cleanliness and sterilization procedures, such as washing hands and washing equipment properly.  
These are procedures we assumed were practiced by hospital staff across the board. &lt;br /&gt;&lt;br /&gt;So what's the good news?  Several states now are considering or have passed laws that require disclosure of hospital infection rates for each hospital.  Of course, some hospitals will have higher rates simply because they perform more acute care and their patients are sicker.  However, these bills would finally put the responsibility back on the hospital, and would allow more informed choices about where to schedule a procedure (when we have such a choice).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.bcbs.com/news/national/staph-infections-rampant.html"&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-7960391251399199838?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/7960391251399199838/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=7960391251399199838' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/7960391251399199838'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/7960391251399199838'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2007/12/hospital-infections-what-are-chances.html' title='Hospital infections-what are the chances?'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-7887563319052437709</id><published>2007-11-22T03:41:00.001-08:00</published><updated>2007-11-22T03:42:31.473-08:00</updated><title type='text'>Thanksgiving Brothers</title><content type='html'>I had some role in coming up with these &lt;a href="http://fifthdown.blogs.nytimes.com/2007/11/21/the-brothers-jones-1-in-1-billion/"&gt;statistics&lt;/a&gt; in the New York Times, though I thought it was more like 1 in 100 than 1 in a billion.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-7887563319052437709?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/7887563319052437709/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=7887563319052437709' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/7887563319052437709'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/7887563319052437709'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2007/11/thanksgiving-brothers.html' title='Thanksgiving Brothers'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4274782648018949782.post-3991316560916512969</id><published>2007-11-22T02:05:00.000-08:00</published><updated>2007-11-22T03:08:08.418-08:00</updated><title type='text'>Lost luggage - 1 in 138?</title><content type='html'>Thanksgiving is the beginning of peak travel season, and what did I read this morning? That &lt;a href="http://www.iht.com/articles/2007/11/20/business/baggage.php"&gt;1 in 138 checked bags gets lost&lt;/a&gt;. &lt;br /&gt;&lt;br /&gt;Now that seems like quite a bit.  It implies that someone on your plane is going to be sitting at their in-laws' formal Thanksgiving dinner in jeans and a sweatshirt.  Maybe it serves the in-laws right, but after a few days, it's not just the in-laws who care. &lt;br /&gt;&lt;br /&gt;Yet the statistic above is skewed for a number of reasons.  Most "lost" baggage actually gets found, and often quickly.  In all my flying and that of my family, we've never had a "lost" bag for more than 24 hours (and only once was it overnight).  &lt;a href="http://www.cheapflights.com/travel-tips/lostbaggage.html"&gt;Cheapflights.com&lt;/a&gt; (ok, they're biased) cites a Dept. of Transportation statistic that says only 1 in 20,000 bags is permanently lost.&lt;br /&gt;&lt;br /&gt;I would imagine there are other issues at play, also.  Connections are riskier, since it's pretty hard to screw it up on a direct flight, as you watch them put it on the conveyor and tag it, and they only have to get it on one plane.  Of course, the time you really want to check vs. carry on is when you are going to spend 3 hours waiting for your connection and shopping at some random airport. &lt;br /&gt;&lt;br /&gt;But let's get back to 1 in 138.  What does it mean to you?  
It means that if you are a typical air traveller (10 or fewer flights a year, 1 bag checked each time), you will have a bag delayed no more than about once every 14 years (138 bags at 10 bags a year).  Even if you took 50 flights a year, you would be unlikely ever to have one lost forever: at 1 in 20,000, that works out to about one bag every 400 years.&lt;br /&gt;&lt;br /&gt;This assumes all flights are the same.  International flights are clearly different, and there are at least some differences among carriers.  I'd imagine a flight from, say, Budapest to New York is more likely to have a lost bag than a flight from DC to New York.  When we went from Wenzhou to Guangzhou in China, they were so concerned about theft that they made us purchase a 10 yuan ($1.25) lock for every suitcase.  I guess it did the trick, because everything arrived, and on time!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/4274782648018949782-3991316560916512969?l=what-are-the-chances.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://what-are-the-chances.blogspot.com/feeds/3991316560916512969/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4274782648018949782&amp;postID=3991316560916512969' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/3991316560916512969'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4274782648018949782/posts/default/3991316560916512969'/><link rel='alternate' type='text/html' href='http://what-are-the-chances.blogspot.com/2007/11/lost-luggage-1-in-138.html' title='Lost luggage - 1 in 138?'/><author><name>Alan Salzberg</name><uri>http://www.blogger.com/profile/07028973293777181756</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://1.bp.blogspot.com/-HBD_lajWwWc/TsEmaePUxFI/AAAAAAAAAIY/qiXISjfKXXE/s220/IMG_4510.JPG'/></author><thr:total>6</thr:total></entry></feed>
