Thursday, May 8, 2008

Why are there too many boys in China?

For a long time now, the ratio of males to females in China has been increasing. One of the most recent articles I could find on it, from 2004, put the ratio at around 120 boys for every 100 girls (see the msnbc article).

It's clear to most that the combination of the one-child law, which prevents most Chinese couples from having more than one child, and the preference in China for boys is driving this (though there are other explanations, including the possibility that some diseases affect the sex ratio: see this BusinessWeek article).

There are two sinister mechanisms for ensuring that your only child is a boy: selective abortion or infanticide. Yet there is another option: just have another baby if the first is a girl, and don't tell the government. I think this third option is more likely, because I do not think most families can afford an abortion (illegal for sex selection) and very few mothers would kill their babies.

So how common does this non-reporting need to be to shift the ratio from the normal 106 boys per 100 girls at birth to the abnormal 120 per 100?

The answer depends on three things: 1) the percent of births that are girls (with no intervention), 2) the percent of families that have another baby (hoping for a boy), given the first is a girl, and 3) the percent of those families that do not tell the government about the first baby.

Let's call these percentages Pg, P2, and Ps (for girl, 2nd child, and secret). Let's also call Pr the reported percent of girls, which is 100/220, or 45.45%. We'll also assume for simplicity that families quit trying when they have a boy or have 2 children, whichever comes first. Finally, we'll assume families always report the first child if it is a boy or if they have no more children.

Pg is known at around 100/206=48.54%
P2 and Ps are unknown.
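
As a quick arithmetic check, here is a small Python sketch that converts a "boys per 100 girls" ratio into the percent of births that are girls. The helper name girl_share is made up for illustration.

```python
def girl_share(boys_per_100_girls):
    """Fraction of births that are girls, given the number of boys per 100 girls."""
    return 100 / (100 + boys_per_100_girls)

pg = girl_share(106)  # natural ratio: 106 boys per 100 girls
pr = girl_share(120)  # reported ratio: 120 boys per 100 girls
print(f"Pg = {pg:.2%}, Pr = {pr:.2%}")  # Pg = 48.54%, Pr = 45.45%
```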

We want to figure out which combinations of P2 and Ps lead to Pr being 45.45% when Pg is 48.54%.

First, consider that, given the ground rules above, the following are the types of families that can exist (in birth order):
B (boy, one child only)
G (girl, one child only)
GB (girl boy, two children)
GG (girl girl, two children)

To figure out the percent of girls reported, we need the total girls reported divided by the total children reported. This is easy to figure out for each combination above:
B = 0 girls / 1 child
G = 1 girl / 1 child
GB = 0 girls / 1 child Ps percent of the time, and 1 girl / 2 children (1-Ps) percent of the time
GG = 1 girl / 1 child Ps percent of the time, and 2 girls / 2 children (1-Ps) percent of the time

We are almost there. Now we just need to sum the numerators multiplied by their probabilities and the denominators multiplied by their probabilities. Here are the probabilities of each family combination:
B = 1-Pg
G = Pg*(1-P2) ==> the chance of a girl times the percent of families who do not have more children
GB = Pg*P2*(1-Pg) ==> the chance of a girl, followed by the decision to have a 2nd, followed by having a boy
GG = Pg*P2*Pg = Pg^2*P2 ==> the chance of a girl, followed by the decision to have a 2nd, followed by having another girl
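
Since these four family types are meant to cover every case, their probabilities should add up to 1. A minimal Python sketch of that check follows, using exact fractions so rounding does not get in the way. The function name family_probabilities and the example value P2 = 15% are only for illustration.

```python
from fractions import Fraction

def family_probabilities(pg, p2):
    """Probability of each family type under the ground rules above."""
    return {
        "B": 1 - pg,                # first child is a boy, family stops
        "G": pg * (1 - p2),         # girl, no second child
        "GB": pg * p2 * (1 - pg),   # girl, second child, boy
        "GG": pg * p2 * pg,         # girl, second child, girl
    }

pg = Fraction(100, 206)  # natural share of girls
p2 = Fraction(15, 100)   # example value only; P2 is unknown
probs = family_probabilities(pg, p2)
total = sum(probs.values())
print(total)  # 1
```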

Thus the numerator (number of girls reported on average) is:
Num = (1-Pg)*0 +
Pg*(1-P2)*1 +
Pg*P2*(1-Pg)*Ps*0 +
Pg*P2*(1-Pg)*(1-Ps)*1 +
Pg^2*P2*Ps*1 +
Pg^2*P2*(1-Ps)*2

and the denominator (number of children reported on average):
Den = (1-Pg)*1 +
Pg*(1-P2)*1 +
Pg*P2*(1-Pg)*Ps*1 +
Pg*P2*(1-Pg)*(1-Ps)*2 +
Pg^2*P2*Ps*1 +
Pg^2*P2*(1-Ps)*2
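
The two sums can be written as a short Python sketch, one row per term. The function name reported_counts is hypothetical. The simplified forms in the comments, Num = Pg*(1 + Pg*P2 - P2*Ps) and Den = 1 + Pg*P2*(1-Ps), are my own algebra on the sums above, so the code checks them against the term-by-term totals rather than taking them on faith.

```python
from fractions import Fraction

def reported_counts(pg, p2, ps):
    """Return (Num, Den): expected girls and children reported per family."""
    # Each row: (probability, girls reported, children reported)
    terms = [
        (1 - pg,                        0, 1),  # B
        (pg * (1 - p2),                 1, 1),  # G
        (pg * p2 * (1 - pg) * ps,       0, 1),  # GB, first girl kept secret
        (pg * p2 * (1 - pg) * (1 - ps), 1, 2),  # GB, both reported
        (pg**2 * p2 * ps,               1, 1),  # GG, first girl kept secret
        (pg**2 * p2 * (1 - ps),         2, 2),  # GG, both reported
    ]
    num = sum(p * girls for p, girls, _ in terms)
    den = sum(p * kids for p, _, kids in terms)
    return num, den

# Example values only: P2 = 15%, Ps = 85%
pg, p2, ps = Fraction(100, 206), Fraction(15, 100), Fraction(85, 100)
num, den = reported_counts(pg, p2, ps)

# Simplified forms (assumed algebra, verified here with exact fractions)
assert num == pg * (1 + pg * p2 - p2 * ps)
assert den == 1 + pg * p2 * (1 - ps)

print(f"reported share of girls: {float(num / den):.2%}")
```

With these example inputs the reported share of girls comes out close to the 45.45% in the reported data.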

We know that, in China, Pr = Num/Den = 45.45% and that, in general, Pg = 48.54%. Thus, we can solve 0.4545 = Num/Den for the combinations of P2 and Ps that satisfy it.

Since we have one equation and two unknowns, there are an infinite number of solutions, but here are a few possibilities (rounded):
0% have a second child -- impossible
10% have a second child -- impossible
15% have a second child and 85% of those keep the first a secret from the government
20% have a second child and 65% of those keep the first a secret from the government
30% have a second child and 45% of those keep the first a secret from the government
40% have a second child and 35% of those keep the first a secret from the government
50% have a second child and 30% of those keep the first a secret from the government
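
To reproduce this list, here is a minimal Python sketch that solves Pr = Num/Den for Ps at a given P2. It relies on a rearrangement I worked out from the sums above, Ps = (Pg - Pr)*(1 + Pg*P2) / (Pg*P2*(1 - Pr)), so treat that formula as an assumption to verify rather than something stated in the original calculation. The function name secret_share_needed is made up.

```python
def secret_share_needed(p2, pg=100 / 206, pr=100 / 220):
    """Ps needed to turn a true girl share pg into a reported share pr.

    Returns None when no Ps between 0 and 1 works ("impossible").
    """
    if p2 == 0:
        return None  # nobody has a second child, so nothing can be hidden
    ps = (pg - pr) * (1 + pg * p2) / (pg * p2 * (1 - pr))
    return ps if ps <= 1 else None

for p2 in (0.0, 0.10, 0.15, 0.20, 0.30, 0.40, 0.50):
    ps = secret_share_needed(p2)
    result = "impossible" if ps is None else f"Ps is about {ps:.0%}"
    print(f"P2 = {p2:.0%}: {result}")

# Smallest P2 that works at all (Ps = 100%), under the same assumptions
min_p2 = (100 / 206 - 100 / 220) / ((100 / 206) * (1 - 100 / 206))
print(f"minimum P2 is about {min_p2:.1%}")
```

The exact values land within a couple of points of the rounded figures in the list, and the minimum P2 comes out a little above 12%, which is why 10% is impossible but 15% is not.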

One thing to note (that is not necessarily obvious in these calculations) is that if everyone reports all the children they have (Ps=0), then the percent of girls will be exactly 48.54%, the same as if everyone had one child, as long as infanticide and selective abortion are not occurring.
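
One way to convince yourself of this is a quick simulation. The sketch below follows the ground rules above family by family. The function name, the number of families, and the seed are arbitrary choices for illustration, and the results are random estimates rather than exact values.

```python
import random

def simulate_reported_girl_share(p2, ps, pg=100 / 206, families=200_000, seed=0):
    """Simulate families and return the share of girls among reported children."""
    rng = random.Random(seed)
    girls = children = 0
    for _ in range(families):
        if rng.random() >= pg:       # first child is a boy: family stops
            children += 1
            continue
        if rng.random() >= p2:       # first child is a girl, no second child
            girls += 1
            children += 1
            continue
        second_is_girl = rng.random() < pg
        hidden = rng.random() < ps   # first girl kept secret from the government
        girls += (0 if hidden else 1) + (1 if second_is_girl else 0)
        children += 1 if hidden else 2
    return girls / children

# With full reporting (Ps = 0), the share of girls stays near Pg for any P2
for p2 in (0.0, 0.5, 1.0):
    share = simulate_reported_girl_share(p2, ps=0.0)
    print(f"P2 = {p2:.0%}: reported girls = {share:.2%}")
```

Setting ps above zero in the same function shows the reported share dropping below Pg.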

But the main point here is that if as few as 15% of couples whose first child is a girl have a second child, and most of them (about 85%) do not report the first girl, that alone produces the warped percentage of baby girls. You do not need to assume that infanticide or selective abortion plays a role at all.