Marilyn Underestimates the Probability of Sharing Birthdays

Marilyn is Wrong Copyright © 1997-1998 Herb Weiner. All rights reserved.

Ask Marilyn ® by Marilyn vos Savant is a column in Parade Magazine, published by PARADE, 711 Third Avenue, New York, NY 10017, USA. According to Parade, Marilyn vos Savant is listed in the "Guinness Book of World Records Hall of Fame" for "Highest IQ."

In her Parade Magazine column of August 3, 1997, a reader asks Marilyn to explain why, in a randomly chosen group of 50 people, it is virtually certain that at least two will share the same birthday. Rather than explain why this is true, Marilyn asserts that this is "just plain wrong." In a follow-up column published November 23, 1997, Marilyn admits that in a randomly chosen group of 58 people, the chances for this being true are 99%.

Sorry, Marilyn

The answer is not "just plain wrong," but rather depends upon how you define "virtually certain." I'm not sure that there's a universally accepted definition of this term. At the very least, you should have revealed the actual probability, so readers could see how close to "virtually certain" the answer actually is.

Charlie Kluepfel <ChasKlu@aol.com> and Hugh Hoskins <hthoskins@earthlink.net> both wrote to point out that in any randomly chosen group of 50 people, the probability is slightly greater than 97% that at least two will have birthdays on the same date. Since the probability is above 97%, Marilyn was wrong to assume that this well-established fact is an "erroneous extrapolation."

For the mathematically inclined, the probability of two people not sharing the same birthday is 365/365 * 364/365. In other words, there are 364 out of 365 chances that two randomly selected people will not share the same birthday. (For the sake of simplicity, these numbers neglect people born on February 29.) For three people, the probability is 365/365 * 364/365 * 363/365. For four people, the probability is 365/365 * 364/365 * 363/365 * 362/365. For fifty people, the probability is 365/365 * 364/365 * 363/365 * 362/365 * ... * 316/365. Working out the math, the probability is slightly less than 3% that no two of these fifty people will share the same birthday, or slightly more than 97% that at least two of these fifty people will share the same birthday.
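The product above is easy to check numerically. The following is a minimal Python sketch, not part of the original correspondence; the function name shared_birthday_probability and its days parameter are illustrative choices, and it assumes, as the paragraph above does, that every birthday is equally likely and that February 29 is ignored.

```python
from fractions import Fraction


def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Probability that at least two of `people` share a birthday,
    assuming each of `days` birthdays is equally likely (an assumption
    carried over from the text, which neglects February 29)."""
    no_match = Fraction(1)
    # One factor per person: 365/365 * 364/365 * 363/365 * ...
    for k in range(people):
        no_match *= Fraction(days - k, days)
    # Exact rational arithmetic until the final conversion to float.
    return float(1 - no_match)


if __name__ == "__main__":
    for n in (2, 3, 4, 50):
        print(n, shared_birthday_probability(n))
```

Exact fractions are used here only to rule out rounding as a source of disagreement; ordinary floating-point arithmetic would serve equally well for groups of this size.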

Here's a Calculator Script

Readers with access to a Unix system may be interested in this bc (basic calculator) script by David Aldrich <dga@metronet.com>. bc is a standard Unix utility that can be used to perform arbitrary-precision calculations. The script defines a function p(x,y) that calculates the probability (between 0 and 1) that at least two people in a group of x share the same birthday, given y equally likely days in the year. You can run it using the bc command. The scale=50 line specifies that all calculations should be carried to 50 digits after the decimal point. In the session below, the user types the script and each p(...) call; the number following each call is bc's response.

% bc
scale=50
define p(x,y) {
auto i,j;
j=1;
for(i=y-1;i>(y-x);i--) {
j=j*i/y;
}
return(1-j);
}

p(1,365)
0
p(25,365)
.56869970396946388561788409084722390123865271939778
p(50,365)
.97037357957798839991865520436840386584171845099543
p(365,365)
1.00000000000000000000000000000000000000000000000000
quit
%

Why 58?

Charlie Kluepfel <ChasKlu@aol.com> wrote to point out that with only 57 people in a group, the chances are still more than 99% that at least two will share the same birthday. So what's the significance of 58 people? Could it be roundoff error in the program submitted by David Pleacher?

Here are the results for 57 and 58 people, as computed by David Aldrich's bc script:


57: .99012245934116997884528144909275185522219994236295
58: .99166497938926124242286763375497964769434954040491

Because there are 366 days in Leap Years

Adam Frank Nevraumont <Adam_Frank_Nevraumont@uwaterloo.ca> wrote to point out that, despite the fact that both the question and Marilyn's answer use the number 365 rather than 366, David Pleacher's program might have used 366 for the number of days per year. With 366 days per year, the results for 57 and 58 people, as computed by David Aldrich's bc script, would be:


57: .98998979806519874607615199917004994769423765735635
58: .99154876394029074463806275339766511977464326809594

Perhaps this is where the number 58 came from.
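The two thresholds can be found by a direct search. The sketch below is an illustration in Python rather than anything taken from David Pleacher's unpublished program; the function names and the 0.99 threshold parameter are assumptions made for this example, and every day of the year is treated as equally likely.

```python
def shared_birthday_probability(people: int, days: int) -> float:
    """Probability of at least one shared birthday among `people`,
    with `days` equally likely birthdays."""
    no_match = 1.0
    for k in range(people):
        no_match *= (days - k) / days
    return 1.0 - no_match


def smallest_group(days: int, threshold: float = 0.99) -> int:
    """Smallest group size whose match probability reaches `threshold`."""
    people = 1
    while shared_birthday_probability(people, days) < threshold:
        people += 1
    return people


if __name__ == "__main__":
    print(smallest_group(365))  # 57, consistent with the 365-day figures above
    print(smallest_group(366))  # 58, consistent with the 366-day figures above
```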

Still not correct

Charlie Kluepfel <ChasKlu@aol.com> observed that this might explain Marilyn's answer, but it doesn't make it right.

Since Marilyn didn't see fit to publish David Pleacher's program for calculating the probability of a birthday match, we can only surmise that Adam Frank Nevraumont's hypothesis is correct, and that 366 days per year were used, with each of these days taken to be equally likely. If so, that is an error, because February 29 occurs only once every four years. The following program considers any of the 1461 days in a four-year cycle to be equally likely:

DEFDBL A-Z
DO
  INPUT "# of people:", n
  tProb = 0 ' prob of No Match is first calculated
  ' calc for # of Feb 29's as 0 or 1:
  FOR nFeb29 = 0 TO 1 ' above this "No Match" is impossible as 2 Feb 29s would match
    pOver = (1460 / 1461) ^ (n - nFeb29)
    IF nFeb29 = 1 THEN pOver = n * pOver / 1461
    FOR i = 2 TO n - nFeb29
      pOver = pOver * (365 - i + 1) / 365
    NEXT
    tProb = tProb + pOver
  NEXT
  p = 1 - tProb ' reverse to make prob of at least 1 match
  PRINT USING "###.####"; p * 100
LOOP

This results in
# of people:56
 98.8264
# of people:57
 99.0062
# of people:58
 99.1612

This shows that even with a Feb. 29 birthday being only 1/4 as likely as any other, there is still over 99% probability of a match among 57 people. (There is about a 96% probability that no one in a group of 57 has a Feb. 29 birthday.)
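For readers without a BASIC interpreter, the same calculation can be sketched in Python. This is a port written for illustration, not Charlie Kluepfel's own code; the function name is an assumption, and the model is the one described above: 1461 equally likely days in a four-year cycle, exactly one of which is Feb. 29.

```python
def match_probability_with_leap_day(people: int) -> float:
    """Probability of at least one shared birthday when Feb. 29 is
    1/4 as likely as each of the other 365 dates."""
    leap = 1 / 1461  # chance that a given person was born on Feb. 29
    no_match = 0.0
    # Zero or one Feb. 29 birthdays; two or more would already be a match.
    for feb29 in (0, 1):
        others = people - feb29
        # Probability that exactly `feb29` people were born on Feb. 29.
        term = (1 - leap) ** others
        if feb29 == 1:
            term *= people * leap
        # Probability that the remaining people land on distinct ordinary dates.
        for k in range(others):
            term *= (365 - k) / 365
        no_match += term
    return 1 - no_match


if __name__ == "__main__":
    for n in (56, 57, 58):
        print(n, format(100 * match_probability_with_leap_day(n), ".4f"))
    # Chance that nobody in a group of 57 was born on Feb. 29.
    print(format((1460 / 1461) ** 57, ".4f"))
```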

A truly rigorous calculation would take into consideration seasonal variations in birthrates. How much the result would be affected is conjectural, but any departure from equally likely birthdays could only cause the probability of a match to go up, not down.
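The direction of that effect can be illustrated with an exact calculation for unequal birth rates. The Python sketch below is an assumption-laden example: the function name is invented for it, and the "seasonal" pattern (a birth rate that swings 10% above and below its mean over the year) is hypothetical, not real birth data. It relies on the fact that the probability of no match equals n! times the sum, over every choice of n distinct days, of the product of those days' probabilities.

```python
import math


def match_probability(day_weights, people):
    """Probability of at least one shared birthday when day i occurs
    with probability proportional to day_weights[i]."""
    total = sum(day_weights)
    # e[k] accumulates the sum, over all k-day subsets of the days seen
    # so far, of the product of their probabilities.
    e = [1.0] + [0.0] * people
    for w in day_weights:
        p = w / total
        for k in range(people, 0, -1):
            e[k] += p * e[k - 1]
    # Each set of distinct days can be assigned to the people in people! orders.
    return 1.0 - math.factorial(people) * e[people]


# Equally likely birthdays, as in the calculations above.
uniform = [1.0] * 365
# Hypothetical seasonal pattern, for illustration only (not real data).
seasonal = [1.0 + 0.1 * math.cos(2 * math.pi * d / 365) for d in range(365)]

if __name__ == "__main__":
    print(match_probability(uniform, 50))
    print(match_probability(seasonal, 50))
```

With the uniform weights this reproduces the 365-day figure for 50 people; with the skewed weights the match probability comes out slightly higher.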


http://www.wiskit.com/marilyn/birthdays.html last updated June 30, 1998 by herbw@wiskit.com