What do the Barro-Lee data really say about educational attainment in India?
by Siddharth Vij
Yesterday, Nitin Pai linked to a blog post comparing educational attainment in India and Pakistan using numbers from the Barro-Lee (BL) dataset (available for free download on www.barrolee.com). Understandably, Nitin was mortified by some of the numbers for India. However, the way the post’s author interpreted the numbers made things seem worse than they are (which is not to say that they aren’t bad!). Having worked with the dataset, I’ll try to provide the correct interpretation of the numbers and also explain why we should probably take the BL numbers on India with a slightly larger pinch of salt than those for other countries.
Interpreting the Numbers
The table below shows the numbers for India from 1990 to 2010. What do these numbers mean? Focusing our attention on 2010, one can see that there are seven key numbers:
- No schooling – 32.7%
- Primary Total – 20.9%
- Primary Completed – 18.9%
- Secondary Total – 40.7%
- Secondary Completed – 1.3%
- Tertiary Total – 5.8%
- Tertiary Completed – 3.1%
If you add up the first, second, fourth and sixth numbers in the list (No schooling, Primary Total, Secondary Total and Tertiary Total), you reach 100%, give or take a rounding error. This is the entire universe: each and every Indian above the age of 15 is assigned to one and only one of these buckets. Of every 100 Indians above the age of 15 in 2010, 33 have had no formal schooling, 21 have been only to primary school, 41 reached as far as secondary school, and the rest made it all the way to college.
When Mr. Haq, the post’s author, says that India has a ‘secondary enrollment of 40.7%’, he is wrong. It is critical to note that BL says nothing about enrollment. The enrollment ratio is a flow measure, whereas BL measures a country’s existing stock of human capital through levels of educational attainment. All that BL tells us is that for 40.7% of Indians above the age of 15, the highest level of educational attainment is secondary schooling. If to this 40.7% you add the 5.8% who have some tertiary education, you find that 46.5% of Indians above the age of 15 have had some secondary schooling during their lifetime.
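As a quick check on the arithmetic above, here is a minimal Python sketch. The dictionary keys are names I have made up for illustration; the values are the 2010 figures listed earlier.

```python
# Barro-Lee figures for India, 2010 (percent of the population aged 15+),
# copied from the list above. Key names are illustrative only.
india_2010 = {
    "no_schooling": 32.7,
    "primary_total": 20.9,
    "primary_completed": 18.9,
    "secondary_total": 40.7,
    "secondary_completed": 1.3,
    "tertiary_total": 5.8,
    "tertiary_completed": 3.1,
}

# The four mutually exclusive "highest level attained" buckets.
buckets = ["no_schooling", "primary_total", "secondary_total", "tertiary_total"]
total = sum(india_2010[name] for name in buckets)

# Anyone with some tertiary education must have passed through secondary
# school, so the share that reached secondary is the sum of both buckets.
reached_secondary = india_2010["secondary_total"] + india_2010["tertiary_total"]

print(f"Sum of the four buckets: {total:.1f}%")
print(f"Reached secondary school or beyond: {reached_secondary:.1f}%")
```

The buckets sum to 100.1% rather than exactly 100% because each published figure is rounded to one decimal place.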
The next point of contention is the interpretation of the three other numbers: the completion figures. Mr. Haq adds up Primary Completed, Secondary Completed and Tertiary Completed to report that India has a ‘dismal’ completion rate of 23%. Again, this is meaningless. The 23% only means that out of 100 Indians, 23 completed a certain level of education and then did not go on to the next level. It does not take into account people who completed their primary (secondary) education and moved on to secondary (tertiary).
For secondary education, Mr. Haq uses the 0.9% figure (1.3% in the updated version) as is to claim that only 1% of India’s secondary school students complete the level. Nitin interpreted it as 1% of 40%, meaning that 4 out of every 1000 kids complete secondary school. Both these interpretations are flawed. We’ve already calculated that 46.5 out of every 100 Indians above the age of 15 reached secondary school. Out of these 46.5, 7.1 (1.3 + 5.8) completed their secondary schooling, i.e. about 15% of those who attended some secondary school managed to matriculate. That is higher than the earlier numbers, but it is still shockingly low.
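The difference between the naive sum and the actual secondary completion rate can be sketched in a few lines of Python. The variable names are mine; the one assumption carried over from the text is that everyone with some tertiary education completed secondary school.

```python
# 2010 figures for India (percent of the population aged 15+).
primary_completed = 18.9
secondary_total = 40.7
secondary_completed = 1.3
tertiary_total = 5.8
tertiary_completed = 3.1

# The misleading figure: people who completed a level and then stopped there.
naive_sum = primary_completed + secondary_completed + tertiary_completed

# Everyone who ever attended secondary school.
reached_secondary = secondary_total + tertiary_total

# Those who finished secondary school: the ones who stopped after completing
# it, plus everyone who went on to college (assumed to have completed it).
finished_secondary = secondary_completed + tertiary_total

secondary_completion_rate = finished_secondary / reached_secondary

print(f"Naive 'completion rate': {naive_sum:.1f}%")
print(f"Finished secondary: {finished_secondary:.1f} of {reached_secondary:.1f}")
print(f"Secondary completion rate: {secondary_completion_rate:.1%}")
```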
The following summarizes what the BL data for India in 2010 actually says:
- 327 out of every 1000 Indians above the age of 15 have never had any formal schooling
- Of the remaining 673, only 20 dropped out during primary school. Once we got kids into primary school, we managed to make sure that they completed it.
- In secondary school, however, the situation is markedly different. 465 out of every 1000 Indians made it to secondary school, but 394 of them dropped out without completing it.
- Only 58 made it to college, of whom a little more than half (31) graduated with a degree.
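The per-1000 figures in the summary above follow directly from the seven percentages. A minimal sketch, with a helper function and variable names that are mine rather than anything from the BL dataset:

```python
def per_thousand(percent):
    """Convert a percentage of the 15+ population to people per 1000."""
    return round(percent * 10)

# 2010 figures for India (percent of the population aged 15+).
no_schooling = 32.7
primary_total, primary_completed = 20.9, 18.9
secondary_total, secondary_completed = 40.7, 1.3
tertiary_total, tertiary_completed = 5.8, 3.1

never_schooled = per_thousand(no_schooling)
ever_schooled = 1000 - never_schooled

# Stopped at primary level without completing it.
primary_dropouts = per_thousand(primary_total - primary_completed)

# Reached secondary school, whether or not they went further.
reached_secondary = per_thousand(secondary_total + tertiary_total)

# Stopped at secondary level without completing it.
secondary_dropouts = per_thousand(secondary_total - secondary_completed)

reached_college = per_thousand(tertiary_total)
graduated = per_thousand(tertiary_completed)

print(never_schooled, ever_schooled, primary_dropouts)
print(reached_secondary, secondary_dropouts)
print(reached_college, graduated, f"{graduated / reached_college:.0%}")
```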
There are two major points to be borne in mind while evaluating the BL numbers. The first relates to the purpose for which the data was put together. The BL data is used to investigate how output relates to the existing stock of human capital in a country, and to calculate the returns to education. As such, BL is inherently backward-looking in terms of education. If a country undergoes a revolution in primary education, it would take almost 10 years for this to show up in BL (since the kids would need to reach at least the age of 15 to enter the data), and even then the effect of the new entrants would be dampened since they’d be only a small fraction of those in the 15+ grouping. It would take a generation to see the effects of education reform on BL data (and on the real world as well!).
Secondly, since BL use census data for their estimates, the raw data is relatively sparse, and they use forward and backward extrapolation to fill in the missing years. They split the population above the age of 15 into cohorts of 5 years each: ages 15-19, 20-24, 25-29 and so on. They assume that for ages above 25, educational attainment doesn’t change, so the 30-34 cohort will have the same educational attainment in a particular year as the 25-29 cohort had five years earlier. For the two cohorts below 25, they use the estimate from five years earlier and then adjust it for changes in enrollment ratios. For India, they have used data from only 4 censuses (1961-1991). Surprisingly, their data does not incorporate the 2001 census. For Pakistan, they have 6 censuses to rely on, including 1998 and 2006. Extrapolation picks up trends but would miss a structural change in educational attainment. Given that it’s been two decades since the last census they used for India, such a structural change is conceivable. The Barro-Lee numbers for India should be interpreted with this in mind.
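To make the cohort logic concrete, here is a rough Python sketch of the forward step as I have described it. It is not BL’s actual procedure: the shares and the enrollment adjustments are invented numbers, the function name is hypothetical, the adjustment is modelled as a simple addition, and the treatment of the 25-29 cohort (carried over from the 20-24 cohort five years earlier) is my own simplifying assumption.

```python
COHORTS = ["15-19", "20-24", "25-29", "30-34", "35-39", "40-44"]

def project_five_years(earlier, enrollment_adjustment):
    """Estimate attainment shares five years later (illustrative only).

    earlier: share (percent) of each cohort with some secondary schooling.
    enrollment_adjustment: percentage-point change applied to the two
        youngest cohorts, standing in for the enrollment-ratio adjustment.
    """
    later = {}
    for i, cohort in enumerate(COHORTS):
        if cohort in ("15-19", "20-24"):
            # Same estimate as five years earlier, adjusted for enrollment.
            later[cohort] = earlier[cohort] + enrollment_adjustment[cohort]
        else:
            # Attainment is assumed fixed, so each cohort inherits the value
            # of the cohort that was five years younger five years earlier.
            # (For 25-29 this is my assumption, not a rule stated above.)
            later[cohort] = earlier[COHORTS[i - 1]]
    return later

# Invented shares for a hypothetical census year.
census_year = {"15-19": 50.0, "20-24": 45.0, "25-29": 40.0,
               "30-34": 35.0, "35-39": 30.0, "40-44": 25.0}
adjustment = {"15-19": 4.0, "20-24": 3.0}

five_years_on = project_five_years(census_year, adjustment)
for cohort in COHORTS:
    print(cohort, five_years_on[cohort])
```

Repeating this step four times to cover two decades shows why the estimates lean so heavily on the last observed census: every cohort above 25 is a shifted copy of old data, and only the enrollment adjustment lets anything new in.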
Siddharth Vij can be followed on Twitter at @siddharthvij