# Estimation of migration from census data

## Description of the methods

Estimating migration from census data is not technically complicated. Provided that the census(es) gather the appropriate information and are reasonably accurate, it is possible to produce estimates of net immigration (i.e. immigration less emigration) of the foreign-born population (people born outside a particular country) and of internal migration between (to and from) sub-national regions of a country, over the period between two censuses.

To estimate net immigration of foreigners one essentially subtracts from the number of foreign-born people enumerated in a census, the number of foreigners expected to have survived since being enumerated in the previous census.

In a similar way, if the censuses record the sub-national region of birth one can estimate net in-migration (i.e. net in-migration of those born outside the region less net out-migration of those born in the region) between sub-national regions of a country. However, if the census asks of people where they were living at some prior point in time, say at the time of the previous census, one is able to estimate directly the number of surviving migrants (i.e. migrants still alive at the time of the latest census) into and out of each sub-national region of the country since that prior point in time.

In order to estimate the number of migrants from the number of surviving migrants at the time of the second census one needs to add to these figures an estimate of the number of migrants who are expected to have died between moving and the time of the latest census.

If the latest census records other information such as year in which the migrant moved to the place at which the person was counted in the census, it is possible also to establish a trend of migration over time.

Migration differs from fertility and mortality in two respects. First, migrating is not final in the sense that a birth or a death is. Second, we are concerned not only with the population of origin, from which the migrant moved (which corresponds to a population exposed to risk, from which rates of migration akin to those of fertility and mortality can be calculated), but also with the population to which the migrant moves, the destination population. In addition, in order to understand migration one is often interested in distinguishing between different types of migration (whether temporary or more permanent, whether circulatory or unidirectional, etc.). For these reasons there is a much wider range of measures and terminology associated with migration than there is with either fertility or mortality. It is not the purpose of this chapter to cover these issues, and the interested reader is referred to the standard texts on the subject such as the UN Manual VI (UN Population Division 1970), Shryock and Siegel (1976), and Siegel and Swanson (2004).

## Data requirements and assumptions

### Tabulations of data required

• To estimate net immigration of foreigners:
  ◦ The number of foreign-born females (males), in five-year age groups, and for an open age interval A+, at two points in time, typically two censuses.
  ◦ For the deaths: either a suitable model life table or the numbers of native-born females (males), in five-year age groups, and for an open age interval A+, at two points in time, typically two censuses. Failing these, the central crude death rate for the population.
• To estimate sub-national regional net in-migration from place of birth data:
  ◦ The number of females (males) by sub-national region and by sub-national region of birth, in five-year age groups, and for an open age interval A+, at two points in time, typically two censuses.
  ◦ For the deaths: either a suitable model life table, or the numbers of native-born females (males), in five-year age groups, and for an open age interval A+, at two points in time, typically two censuses, or numbers of deaths by region from the vital registration. Failing these, the central crude death rate for the population.
• To estimate internal migration between sub-national regions from place of residence at previous census data:
  ◦ The numbers of females (males) by sub-national region and by sub-national region at some prior date, typically that of the preceding census, in five-year age groups, and for an open age interval A+.
  ◦ If age-specific numbers are not available, aggregated data are still useful for estimating all-age migration.

### Important assumptions

• Estimating net immigration of foreigners:
  ◦ Censuses identify all foreign-born people accurately.
  ◦ One is able to estimate the mortality of the foreign-born population accurately (either that the life table used is appropriate, or that the mortality is the same as that implied by the censuses for the native-born (locally-born) national population).
  ◦ No return migration of locally born emigrants.
• Estimating sub-national regional net in-migration from place of birth data:
  ◦ Censuses count the population by sub-national region accurately and identify the region of birth accurately.
  ◦ One is able to estimate the mortality of people moving between two regions accurately (either that the life table used is appropriate, or that the mortality is the same as that implied by the censuses for the native-born national population).
• Estimating internal migration between sub-national regions from data on place of residence at previous census:
  ◦ Latest census identifies correctly all people who have moved from one region to another since the prior date (e.g. previous census).
  ◦ One is able to estimate the mortality of people moving between two regions accurately (either that the life table used is appropriate, or that the mortality is the same as that implied by the censuses for the native-born national population). Since one is estimating in- and out-migration separately (as opposed to net migration) this assumption is of less importance.

## Preparatory work and preliminary investigations

Before applying this method, you should investigate the quality of the data in at least the following dimensions:

• age structure of the population (by sub-national region as appropriate); and
• relative completeness of the census counts (by sub-national region as appropriate).

## Caveats and warnings

Estimating migration using place of birth data from two censuses requires not only that the censuses count the population reasonably completely, but also that the place of birth be accurately recorded. Often this is not the case, particularly when estimating immigration, where immigrants may wish to hide the fact that they are foreign, but also in the case of internal migration, where there may have been boundary changes or the respondent may not know the place of birth of the person.

Estimating migration by asking questions of migrants depends on the census identifying completely all those who have migrated, as well as correctly identifying the place from which they moved. To the extent that recent migrants are not yet established as residents of the region to which they have moved at the time of the census, they could be missed in the count.

Net migration, by definition, underestimates the flows of migrants into and out of a region or country. Thus, for example, a person who moved into a region and then returned within the period being considered contributes nothing to net in-migration and yet moved twice.

## Application of the method

### A: Estimating net immigration of foreigners using place of birth data

This method produces estimates of the net immigration of foreigners using place of birth data. It is important to stress that this method does not take into account or measure the immigration of returning native-born people who left the country prior to the previous census and returned before the second census. Thus this method is not recommended for the measurement of immigration where significant return migration of native-born people (for example, after exile or forced migration of refugees) is in progress.

#### Step 1: Decide on survival factors

If data on the number of foreign-born people in the population are available by age group for each census then one needs to estimate the survival factors to be applied to the numbers of foreign-born in the first census to estimate the numbers surviving to the time of the second census. The user can choose between years of life lived in five-yearly age groups (5Lx) based on the standard from the General family of United Nations model life tables or one of any of the four families of Princeton model life tables or a model life table of a population experiencing an AIDS epidemic (Timæus 2004) which appear in the Models spreadsheet of the associated workbook. This spreadsheet also allows the user to input years of life lived in five-yearly age groups of an alternative life table if there is reason to assume that the life table has a similar pattern of mortality to that of the population in question, or failing this, the survival factors can be derived from the proportion of each five-year age group of the native-born population surviving from the first to the second census (assumed to be n years apart, where n is a multiple of 5). Thus

${}_5S_{x,n}$, ${}_\infty S_{A-n,n}$ and $S_{B,n}$, the n-year survival factors for a group of people aged x to x + 5 at the previous census, aged A - n and older at the previous census, and born between the censuses, respectively, are estimated as follows:

$ {}_5S_{x,n} = \frac{{}_5L_{x+n}}{{}_5L_x} \ \text{or} \ \frac{{}_5N^{nb}_{x+n}(t+n)}{{}_5N^{nb}_x(t)} , $

$ {}_\infty S_{A-n,n} = \frac{T_A}{T_{A-n}} \ \text{or} \ \frac{{}_\infty N^{nb}_A(t+n)}{{}_\infty N^{nb}_{A-n}(t)} , $

and

$ S_{B,n} = \frac{{}_nL_0}{n \, l_0} \ \text{or} \ \frac{{}_nN^{nb}_0(t+n)}{B^{nb}} , $

where the superscript nb represents ‘native-born’, ${}_5N^{nb}_x(t)$ represents the native-born population aged between x and x + 5 in the census at time t, and $B^{nb}$ represents the number of native-born births between time t and t + n.

If the data are not available in five-year age groups, the net number of immigrants can still be estimated in total provided we have an estimate of the crude death rate for the population (which might, in the absence of any evidence to the contrary, be assumed to be that of the native-born population).
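The life-table route to the survival factors in Step 1 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the workbook's implementation: the function name `survival_factors`, its argument layout, and the toy life table in the usage lines are hypothetical, and the open interval is assumed to start at the age A where the list of closed five-year groups ends.

```python
def survival_factors(L, T_open, n=5, l0=1.0):
    """Return n-year survival factors from a life table (illustrative sketch).

    L      : list of person-years lived, 5Lx, for x = 0, 5, ..., A-5
    T_open : person-years lived above age A (T_A)
    n      : intercensal interval in years (a positive multiple of 5)
    l0     : radix of the life table
    """
    if n <= 0 or n % 5 != 0:
        raise ValueError("n must be a positive multiple of 5")
    k = n // 5                      # number of five-year groups spanned by n
    if k >= len(L):
        raise ValueError("life table too short for this interval")

    # Closed cohorts: 5S(x,n) = 5L(x+n) / 5L(x), for x = 0, ..., A-5-n
    closed = [L[i + k] / L[i] for i in range(len(L) - k)]

    # Open cohort aged A-n and over: S = T(A) / T(A-n),
    # where T(A-n) adds back the last k closed groups
    T_start = T_open + sum(L[len(L) - k:])
    open_ended = T_open / T_start

    # Births during the interval: S(B,n) = nL0 / (n * l0)
    births = sum(L[:k]) / (n * l0)
    return closed, open_ended, births


# Hypothetical toy life table (values invented purely for illustration)
closed, open_ended, births = survival_factors([4.8, 4.7, 4.6], T_open=10.0)
```

With the two values quoted in the worked example, `4.3382 / 4.4975` gives the factor 0.96458 for those aged 20 to 24 at the first census.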

#### Step 2: Estimate the number of deaths of the immigrants

If data on the number of foreign-born people in the population are available by age group for two censuses (n years apart) then one needs to estimate the number of deaths of foreign-born people (denoted by the superscript F) aged between x and x+5 at the first census (at time t), ${}_5D^F_x$, aged A-n and older at the first census, ${}_\infty D^F_{A-n}$, and those born between the censuses, $D^F_B$, as follows:

$ {}_5D^F_x = \frac{1}{2} \left( {}_5N^F_x(t) \cdot {}_5S_{x,n} + {}_5N^F_{x+n}(t+n) \right) \left( \frac{1}{{}_5S_{x,n}} - 1 \right) , $

$ {}_\infty D^F_{A-n} = \frac{1}{2} \left( {}_\infty N^F_{A-n}(t) \cdot {}_\infty S_{A-n,n} + {}_\infty N^F_A(t+n) \right) \left( \frac{1}{{}_\infty S_{A-n,n}} - 1 \right) , $

and

$ D^F_B = \frac{1}{2} \left( {}_nN^F_0(t+n) \right) \left( \frac{1}{S_{B,n}} - 1 \right) , $

where ${}_5N^F_x(t)$ represents the number of foreign-born people according to the census at time t who were aged between x and x+5.

If data and/or survival factors are not available by age group then one can estimate the total number of deaths of the foreign-born people as follows:

$ {}_\infty D^F_0 = \frac{n}{2} \left( {}_\infty N^F_0(t) + {}_\infty N^F_0(t+n) \right) {}_\infty m_0 $

where ${}_\infty m_0$ is an estimate of the crude mortality rate of the population in the country of the census.

However, if the age distribution of the foreign-born population is markedly different from that of the population in the country of the census, then this can produce a poor approximation to the true number of deaths.
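The two death formulas of Step 2 translate directly into code. The sketch below is illustrative only; the function names `cohort_deaths` and `crude_deaths` are assumptions made for this example, and the inputs in the usage lines are the South African figures quoted in the worked example later in this chapter.

```python
def cohort_deaths(n_first, n_second, s):
    """Deaths in a cohort between two censuses.

    n_first  : cohort size at the first census (0 for those born in between)
    n_second : cohort size at the second census
    s        : n-year survival factor for the cohort
    Implements 0.5 * (N1 * S + N2) * (1/S - 1).
    """
    return 0.5 * (n_first * s + n_second) * (1.0 / s - 1.0)


def crude_deaths(total_first, total_second, n, crude_rate):
    """All-age deaths from a crude death rate: (n/2) * (N1 + N2) * m."""
    return n / 2.0 * (total_first + total_second) * crude_rate


# Cohort aged 20-24 at the first census (25-29 at the second)
d_20 = cohort_deaths(69787, 95763, 0.96458)
# Open cohort aged 80+ at the first census (85+ at the second)
d_80 = cohort_deaths(7658 + 4455, 5305, 0.40912)
# Those born between the censuses (aged 0-4 at the second)
d_b = cohort_deaths(0, 12577, 0.94151)
# All ages, using a crude death rate of 14 per 1,000 over 5 years
d_all = crude_deaths(611423, 754608, 5, 14 / 1000)
```

Rounded to whole numbers, these calls reproduce the figures in the worked example (2,994, 7,410, 391 and 47,811 respectively).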

#### Step 3: Estimate the net number of immigrants (of foreigners)

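If data are available by age group for each census then age-specific net immigration can be estimated as follows:

$ {}_5\mathrm{Net}M^F_x = {}_5N^F_{x+n}(t+n) - {}_5N^F_x(t) + {}_5D^F_x $

for x = 0, 5, … , A-5-n, where ${}_5\mathrm{Net}M^F_x$ represents the net number of immigrants between times t and t+n who were aged between x and x + 5 at time t. For x > A - 5 - n (i.e. those aged A-n and older at time t):

$ {}_\infty\mathrm{Net}M^F_{A-n} = {}_\infty N^F_A(t+n) - {}_\infty N^F_{A-n}(t) + {}_\infty D^F_{A-n} . $

The net number of immigrants of those born between times t and t+n is estimated as follows:

$ \mathrm{Net}M^F_B = {}_nN^F_0(t+n) + D^F_B . $

If data and/or survival factors are not available by age group then one would estimate the total net number of immigrants as follows:

$ {}_\infty\mathrm{Net}M^F_0 = {}_\infty N^F_0(t+n) - {}_\infty N^F_0(t) + {}_\infty D^F_0 . $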

### B: Estimating net internal migration between sub-national regions from place of birth data

Net in-migration into a particular sub-national region from other regions in the country can be estimated in exactly the same way as the international immigration, described above, by replacing the foreign-born population with the population born outside the region.

In addition, applying the same method to data on the change in the numbers of population born in (rather than outside) and living outside the region of interest allows us to estimate the net out-migration of those born in the region to other regions in the country. Subtracting this from the net in-migration of those born outside the region gives an estimate of the overall net in-migration into the region of interest.

If there is reason to suspect that there is a material difference in the mortality experienced by those born outside who moved into the region and those born in the region who moved out, and one has appropriate survival factors, then one could apply different survival factors to each when estimating the net number of migrants. However, in practice inaccuracies in the census data on place of residence at previous census are likely to outweigh any increase in accuracy achieved by using differential mortality.
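The combination of the two place-of-birth estimates is a simple subtraction by age group. As a minimal sketch (the function name is an assumption for illustration; the inputs are the first three age rows and the totals of Tables 2 and 3 of the worked example):

```python
def overall_net_in_migration(net_in_born_outside, net_out_born_inside):
    """Overall net in-migration = net in-migration of those born outside
    the region minus net out-migration of those born in the region."""
    return [a - b for a, b in zip(net_in_born_outside, net_out_born_inside)]


# Ages 0-4, 5-9 and 10-14 at the second census (Western Cape)
by_age = overall_net_in_migration([19602, 12782, 6511], [12112, -9180, -10226])
# All ages combined
total = overall_net_in_migration([213911], [-19017])[0]
```

Note that a negative net out-migration of those born in the region (more returning than leaving) increases the overall net in-migration.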

### C: Estimating internal migration between sub-national regions from place of residence at previous census

Net sub-national inter-regional migration is estimated directly from the numbers of people in each region at the time of the census who moved since the previous census by place (e.g. region) they were in at a given prior date (e.g. at the time of the previous census). Confining the estimates to inter-regional flows the sum of the numbers of inter-regional in-migrants should be equal to the sum of inter-regional out-migrants; however, if the data include immigration to the sub-national regions from outside the country one can extend the estimates of in-migration to include international immigration into each region.

Since one of the major areas of interest is the magnitude of inter-regional flows of the population, one is as interested in the total numbers of migrants between regions as one is in the age distributions of particular flows.

The number of migrants is derived from the number of surviving in- and out-migrants as follows:

$ {}_5M_x = \left( {}_5I'_x - {}_5O'_x + \left( {}_5I'_x - {}_5O'_x \right) / {}_5S_x \right) / 2 , $

where the prime (') denotes numbers surviving, and ${}_5I'_x$ and ${}_5O'_x$ respectively represent the number of surviving in-migrants into, and the number of surviving out-migrants from, a particular region at the time of the second census who were aged between x and x+5 at the second census.
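The adjustment from surviving migrants to migrants averages the net number of survivors with that number divided by the survival factor. A minimal sketch follows; the function name is hypothetical, and the inputs are the Western Cape figures for those aged 25-29 at the Community Survey that are used in the worked example.

```python
def net_migrants_from_survivors(in_surviving, out_surviving, s):
    """Net migrants from surviving in- and out-migrants.

    Averages the net number of surviving migrants (I' - O') with the same
    number scaled up by 1/S, i.e. ((I' - O') + (I' - O') / S) / 2.
    """
    net_surviving = in_surviving - out_surviving
    return (net_surviving + net_surviving / s) / 2.0


m_25_29 = net_migrants_from_survivors(20675, 5649, 0.96458)
```

The result, about 15,302, agrees to within one person with the 15,301 shown in the worked example. Because the adjustment is applied to the difference, the same function applied to in-migrants alone (with `out_surviving=0`) gives the gross number of in-migrants.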

## Worked example

This example uses data on the numbers of males in the population from the South African Census in 2001 and a ‘census replacement survey’, the Community Survey in 2007. (Although the survey was conducted approximately 5.35 years after the night of the census in 2001, it is assumed for the purposes of presentation here to have been exactly five years after the census in 2001.) The examples appear in the Migration_South Africa_males.xlsx workbook.

### A: Estimating net immigration of foreigners using place of birth

#### Step 1: Decide on survival factors

The survival factors are shown in the fifth column of Table 1. The values are derived from (the years of life lived in each age group of) the alternative life table entered in the Models spreadsheet, for those aged 20 to 24 last birthday and those aged 80 and over at the time of the first census, and those born between the two censuses, as follows:

$ {}_5S_{20,5} = \frac{{}_5L_{25}}{{}_5L_{20}} = \frac{4.3382}{4.4975} = 0.96458 , $

$ {}_\infty S_{80,5} = \frac{T_{85}}{T_{80}} = \frac{0.75180}{1.19603} = 0.40912 $

and

$ S_{B,5} = \frac{{}_5L_0}{5 \, l_0} = \frac{4.707549}{5} = 0.94151 . $

Table 1 Estimation of deaths of foreign-born and the net number of immigrants by age group, South Africa, 2001-2006

| Age | 2001 | 2006 | x | ${}_5S_x$ | Age at 2nd census | $D^F$ | Net M |
|---|---|---|---|---|---|---|---|
| | | | B | 0.94151 | | | |
| 0-4 | 8,963 | 12,577 | 0 | 0.97896 | 0-4 | 391 | 12,968 |
| 5-9 | 10,390 | 13,724 | 5 | 0.99547 | 5-9 | 242 | 5,003 |
| 10-14 | 13,508 | 13,998 | 10 | 0.99427 | 10-14 | 55 | 3,664 |
| 15-19 | 27,835 | 27,943 | 15 | 0.98602 | 15-19 | 119 | 14,555 |
| 20-24 | 69,787 | 59,493 | 20 | 0.96458 | 20-24 | 616 | 32,275 |
| 25-29 | 87,381 | 95,763 | 25 | 0.93161 | 25-29 | 2,994 | 28,970 |
| 30-34 | 73,338 | 100,450 | 30 | 0.90960 | 30-34 | 6,675 | 19,743 |
| 35-39 | 66,663 | 85,490 | 35 | 0.89780 | 35-39 | 7,563 | 19,715 |
| 40-44 | 59,152 | 75,684 | 40 | 0.89092 | 40-44 | 7,701 | 16,721 |
| 45-49 | 45,184 | 66,113 | 45 | 0.88633 | 45-49 | 7,274 | 14,234 |
| 50-54 | 40,398 | 55,913 | 50 | 0.87224 | 50-54 | 6,154 | 16,883 |
| 55-59 | 30,640 | 42,833 | 55 | 0.84731 | 55-59 | 5,717 | 8,153 |
| 60-64 | 24,376 | 34,433 | 60 | 0.80885 | 60-64 | 5,442 | 9,234 |
| 65-69 | 17,895 | 25,588 | 65 | 0.75468 | 65-69 | 5,353 | 6,564 |
| 70-74 | 13,561 | 18,989 | 70 | 0.66991 | 70-74 | 5,281 | 6,375 |
| 75-79 | 10,238 | 12,850 | 75 | 0.56388 | 75-79 | 5,404 | 4,693 |
| 80-84 | 7,658 | 7,461 | 80+ | 0.40912 | 80-84 | 5,118 | 2,341 |
| 85+ | 4,455 | 5,305 | | | 85+ | 7,410 | 602 |
| Total | 611,423 | 754,608 | | | Total | 79,509 | 222,693 |

#### Step 2: Estimate the number of deaths

Since we have data on the number of foreign-born people in the population by age group for each census we can estimate the number of deaths of foreign-born people which occurred in the period between the two censuses by age group using the numbers of foreigners in each census given in the second and third columns of Table 1. For those aged 20 to 24 last birthday and those aged 80 and over at the time of the first census, and those born between the two censuses, the calculations are as follows:

$ {}_5D^F_{20} = \frac{1}{2} \left( {}_5N^F_{20}(2001) \cdot {}_5S_{20,5} + {}_5N^F_{25}(2006) \right) \left( \frac{1}{{}_5S_{20,5}} - 1 \right) = \frac{1}{2} \left( 69787 \cdot 0.96458 + 95763 \right) \left( \frac{1}{0.96458} - 1 \right) = 2994 , $

$ {}_\infty D^F_{80} = \frac{1}{2} \left( {}_\infty N^F_{80}(2001) \cdot {}_\infty S_{80,5} + {}_\infty N^F_{85}(2006) \right) \left( \frac{1}{{}_\infty S_{80,5}} - 1 \right) = \frac{1}{2} \left( (7658 + 4455) \cdot 0.40912 + 5305 \right) \left( \frac{1}{0.40912} - 1 \right) = 7410 $

and

$ D^F_B = \frac{1}{2} \left( {}_5N^F_0(2006) \right) \left( \frac{1}{S_{B,5}} - 1 \right) = \frac{1}{2} \cdot 12577 \left( \frac{1}{0.94151} - 1 \right) = 391 . $

If data and/or survival factors were not available by age group then one could estimate the total number of deaths of the foreign-born people as follows, given an estimate of the crude mortality rate in the population of 14 per 1,000:

$ {}_\infty D^F_0 = \frac{5}{2} \left( {}_\infty N^F_0(2001) + {}_\infty N^F_0(2006) \right) {}_\infty m_0 = \frac{5}{2} \left( 611423 + 754608 \right) \frac{14}{1000} = 47811 . $

#### Step 3: Estimate the net number of immigrants (of foreigners)

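Since data are available by age group for each census, age-specific net immigration of those born outside the country can be estimated as the number in the cohort at the second census, less the number at the first census, plus the estimated deaths. For those aged 20 to 24 last birthday and those aged 80 and over at the time of the first census, and those born between the two censuses, the figures in Table 1 give:

$ {}_5\mathrm{Net}M^F_{20} = 95763 - 69787 + 2994 = 28970 , $

$ {}_\infty\mathrm{Net}M^F_{80} = 5305 - (7658 + 4455) + 7410 = 602 $

and

$ \mathrm{Net}M^F_B = 12577 + 391 = 12968 . $

If data and/or survival factors were not available by age group then one could estimate the total net number of immigrants as follows:

$ {}_\infty\mathrm{Net}M^F_0 = 754608 - 611423 + 47811 = 190996 . $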

### B: Estimating sub-national regional net in-migration using place of birth

The second and third columns of Table 2 show the numbers of people living in the Western Cape province of South Africa who were born outside the province, as counted by the 2001 Census and the 2007 Community Survey, respectively. Although the same survival factors (column 5) have been used as were used in the example of Method A, this should not be the case if it was thought that the mortality experience of the native-born and that of immigrants were very different. The final column of Table 2 gives the net numbers of migrants into the Western Cape who were born in provinces other than the Western Cape for the different age groups. Thus in total a net 213,911 people born outside the Western Cape moved to the Western Cape (i.e. after excluding those who moved out).

Table 2 Estimation of the net number of in-migrants of those born outside by age group, Western Cape, South Africa, 2001-2006

| Age | 2001 | 2006 | x | ${}_5S_x$ | Age at 2nd census | $D^O$ | Net M (born out) |
|---|---|---|---|---|---|---|---|
| | | | B | 0.94151 | | | |
| 0-4 | 16,443 | 19,012 | 0 | 0.97896 | 0-4 | 591 | 19,602 |
| 5-9 | 24,406 | 28,743 | 5 | 0.99547 | 5-9 | 482 | 12,782 |
| 10-14 | 31,134 | 30,792 | 10 | 0.99427 | 10-14 | 125 | 6,511 |
| 15-19 | 44,478 | 53,933 | 15 | 0.98602 | 15-19 | 245 | 23,043 |
| 20-24 | 74,011 | 82,526 | 20 | 0.96458 | 20-24 | 896 | 38,944 |
| 25-29 | 80,187 | 89,522 | 25 | 0.93161 | 25-29 | 2,954 | 18,466 |
| 30-34 | 65,833 | 90,783 | 30 | 0.90960 | 30-34 | 6,074 | 16,670 |
| 35-39 | 56,393 | 76,475 | 35 | 0.89780 | 35-39 | 6,776 | 17,417 |
| 40-44 | 44,420 | 59,692 | 40 | 0.89092 | 40-44 | 6,268 | 9,567 |
| 45-49 | 32,862 | 47,612 | 45 | 0.88633 | 45-49 | 5,338 | 8,529 |
| 50-54 | 28,178 | 37,969 | 50 | 0.87224 | 50-54 | 4,303 | 9,409 |
| 55-59 | 19,983 | 30,205 | 55 | 0.84731 | 55-59 | 4,012 | 6,039 |
| 60-64 | 17,569 | 25,593 | 60 | 0.80885 | 60-64 | 3,832 | 9,442 |
| 65-69 | 11,216 | 20,802 | 65 | 0.75468 | 65-69 | 4,137 | 7,371 |
| 70-74 | 8,365 | 12,612 | 70 | 0.66991 | 70-74 | 3,426 | 4,822 |
| 75-79 | 5,919 | 8,434 | 75 | 0.56388 | 75-79 | 3,458 | 3,528 |
| 80-84 | 4,063 | 5,061 | 80+ | 0.40912 | 80-84 | 3,248 | 2,390 |
| 85+ | 2,152 | 2,183 | | | 85+ | 3,413 | -620 |
| Total | 567,613 | 721,949 | | | Total | 59,576 | 213,911 |

The second and third columns of Table 3 present the numbers of people living in provinces other than the Western Cape who were born in the Western Cape, as counted by the 2001 Census and the 2007 Community Survey, respectively. The net number of out-migrants of those born in the Western Cape (i.e. the number of people born in the Western Cape who moved out, less those who returned) is given in column 8. Negative numbers indicate negative net out-migration (i.e. the number of those born in the Western Cape who moved to other provinces in the period was smaller than the number born in the Western Cape and living outside it who returned during the period). Thus the total of -19,017 means that the number of people born in the Western Cape who returned to the province during the period, having lived in another province until 2001, exceeded by 19,017 the number who were born in the Western Cape and moved to another province in the period.

These estimates were derived using the same survival factors as were used for those born outside the Western Cape who moved into the province, but if there was reason to suppose that the mortality differed for those born in the Western Cape who moved out, then a different set of survival factors would be used to estimate the Net M (born in) numbers.

The overall net in-migration for the province is thus given in the final column of Table 3. Thus in total 232,928 more people moved into the Western Cape than left the Western Cape to live in another province.

In this example those born outside the province include those born outside the country and thus the overall net migration includes immigrants who settle in the province. Excluding the foreign-born from Table 2 would produce numbers of internal in-migrants net of internal out-migrants, and the sum of these numbers for all the provinces together would be zero.

Table 3 Estimation of the net number of out-migrants of those born inside by age group, Western Cape, South Africa, 2001-2006

| Age | 2001 | 2006 | x | ${}_5S_x$ | Age at 2nd census | $D^I$ | Net M (born in) | Overall Net M |
|---|---|---|---|---|---|---|---|---|
| | | | B | 0.94151 | | | | |
| 0-4 | 22,055 | 11,747 | 0 | 0.97896 | 0-4 | 365 | 12,112 | 7,490 |
| 5-9 | 21,895 | 12,509 | 5 | 0.99547 | 5-9 | 367 | -9,180 | 21,962 |
| 10-14 | 21,382 | 11,593 | 10 | 0.99427 | 10-14 | 76 | -10,226 | 16,737 |
| 15-19 | 18,265 | 13,455 | 15 | 0.98602 | 15-19 | 100 | -7,827 | 30,870 |
| 20-24 | 14,645 | 10,477 | 20 | 0.96458 | 20-24 | 202 | -7,587 | 46,531 |
| 25-29 | 13,501 | 9,534 | 25 | 0.93161 | 25-29 | 434 | -4,676 | 23,142 |
| 30-34 | 13,118 | 11,047 | 30 | 0.90960 | 30-34 | 867 | -1,587 | 18,257 |
| 35-39 | 12,121 | 14,614 | 35 | 0.89780 | 35-39 | 1,319 | 2,815 | 14,602 |
| 40-44 | 11,725 | 12,195 | 40 | 0.89092 | 40-44 | 1,311 | 1,384 | 8,183 |
| 45-49 | 10,335 | 10,538 | 45 | 0.88633 | 45-49 | 1,285 | 98 | 8,431 |
| 50-54 | 9,211 | 9,881 | 50 | 0.87224 | 50-54 | 1,221 | 768 | 8,642 |
| 55-59 | 7,264 | 10,568 | 55 | 0.84731 | 55-59 | 1,362 | 2,720 | 3,319 |
| 60-64 | 6,691 | 7,723 | 60 | 0.80885 | 60-64 | 1,250 | 1,710 | 7,732 |
| 65-69 | 4,643 | 5,297 | 65 | 0.75468 | 65-69 | 1,265 | -128 | 7,499 |
| 70-74 | 3,954 | 3,766 | 70 | 0.66991 | 70-74 | 1,182 | 304 | 4,517 |
| 75-79 | 2,331 | 2,384 | 75 | 0.56388 | 75-79 | 1,240 | -330 | 3,858 |
| 80-84 | 1,402 | 2,140 | 80+ | 0.40912 | 80-84 | 1,336 | 1,145 | 1,244 |
| 85+ | 707 | 555 | | | 85+ | 1,024 | -531 | -89 |
| Total | 195,246 | 160,023 | | | Total | 16,206 | -19,017 | 232,928 |

### C: Estimating internal migration between sub-national regions from data on place of residence at previous census

Table 4 presents the results of the answers to the question about place (province in this example) of residence at the time of the 2001 Census given by those counted in each of the provinces in the 2007 Community Survey. (In actual fact the question asked whether the person was staying at the same place at the time of the prior census and if not, where they were staying at the time they moved to the place at which they were counted in the Community Survey. However, work by Dorrington and Moultrie (2009) shows that using these data and the year of movement to back project the population in order to estimate the numbers by province of residence at the time of the previous survey suggests that the assumption that there was only one move in the five years since the previous census was reasonably accurate.)

By far the largest numbers of migrants are those who moved within each of the provinces; however, these have been excluded from Table 4 because one is usually more interested in interprovincial migration than in migration within a province.

Table 4 Interprovincial migration, South Africa, 2001-2006 (rows: previous residence (origin); columns: province where counted (destination))

| Previous residence (origin) | WC | EC | NC | FS | KZ | NW | GT | MP | LM | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| WC | | 12,173 | 4,060 | 1,745 | 3,221 | 2,113 | 16,400 | 1,405 | 874 | 41,992 |
| EC | 52,239 | | 1,120 | 7,187 | 25,209 | 14,430 | 28,633 | 4,693 | 2,116 | 135,626 |
| NC | 4,813 | 1,942 | | 3,480 | 908 | 3,728 | 4,956 | 1,062 | 357 | 21,246 |
| FS | 2,943 | 3,145 | 2,546 | | 2,352 | 12,733 | 19,920 | 4,293 | 1,963 | 49,896 |
| KZ | 6,762 | 7,015 | 631 | 2,358 | | 3,573 | 50,980 | 8,886 | 1,194 | 81,399 |
| NW | 1,478 | 907 | 9,811 | 5,555 | 2,329 | | 47,633 | 3,090 | 4,337 | 75,140 |
| GT | 24,891 | 12,948 | 3,962 | 11,437 | 18,145 | 32,433 | | 18,598 | 15,133 | 137,547 |
| MP | 2,134 | 1,317 | 280 | 1,724 | 4,546 | 5,767 | 42,941 | | 8,628 | 67,338 |
| LM | 2,754 | 1,583 | 255 | 1,709 | 2,209 | 9,773 | 81,394 | 24,211 | | 123,889 |
| OSA | 21,221 | 5,467 | 1,209 | 9,584 | 10,933 | 11,437 | 51,873 | 8,335 | 9,286 | 129,346 |
| DNK | 500 | 3 | 15 | 124 | 132 | 78 | 228 | 89 | 0 | 1,170 |
| UNS | 1,058 | 1,029 | 107 | 208 | 875 | 508 | 3,558 | 408 | 633 | 8,384 |
| Total | 120,794 | 47,528 | 23,996 | 45,111 | 70,860 | 96,573 | 348,516 | 75,070 | 44,524 | 872,973 |

WC = Western Cape, EC = Eastern Cape, NC = Northern Cape, FS = Free State, KZ = KwaZulu-Natal, NW = North West, GT = Gauteng, MP = Mpumalanga, LM = Limpopo, OSA = Outside SA, DNK = Do not know, UNS = Unspecified
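Gross in-migration, gross out-migration and net migration for each province follow from the column sums and row sums of the origin-destination matrix. The sketch below is an illustration only (the names `FLOWS` and `gross_and_net_flows` are assumptions). It uses the nine-by-nine interprovincial block of Table 4, leaving out the OSA, DNK and UNS rows so that every in-migrant is also some other province's out-migrant.

```python
PROVINCES = ["WC", "EC", "NC", "FS", "KZ", "NW", "GT", "MP", "LM"]

# FLOWS[i][j]: people resident in province i in 2001 and counted in
# province j at the survey (rows = origin, columns = destination).
# The diagonal (moves within a province) is set to zero.
FLOWS = [
    [0, 12173, 4060, 1745, 3221, 2113, 16400, 1405, 874],
    [52239, 0, 1120, 7187, 25209, 14430, 28633, 4693, 2116],
    [4813, 1942, 0, 3480, 908, 3728, 4956, 1062, 357],
    [2943, 3145, 2546, 0, 2352, 12733, 19920, 4293, 1963],
    [6762, 7015, 631, 2358, 0, 3573, 50980, 8886, 1194],
    [1478, 907, 9811, 5555, 2329, 0, 47633, 3090, 4337],
    [24891, 12948, 3962, 11437, 18145, 32433, 0, 18598, 15133],
    [2134, 1317, 280, 1724, 4546, 5767, 42941, 0, 8628],
    [2754, 1583, 255, 1709, 2209, 9773, 81394, 24211, 0],
]


def gross_and_net_flows(flows, names):
    """Return dicts of in-migrants, out-migrants and net in-migrants."""
    out_m = {name: sum(row) for name, row in zip(names, flows)}
    in_m = {name: sum(row[j] for row in flows) for j, name in enumerate(names)}
    net = {name: in_m[name] - out_m[name] for name in names}
    return in_m, out_m, net


in_m, out_m, net = gross_and_net_flows(FLOWS, PROVINCES)
```

Because the flows are confined to moves between provinces, the net figures sum to zero across provinces. Row sums computed from the cells can differ by one from the published totals (for example 41,991 against 41,992 for the Western Cape), presumably because the published cells are rounded.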

In addition to the all-age numbers in Table 4 (in actual fact these numbers exclude, as is often the case, migration of those born between the census and survey) one can also produce numbers of in- and out-migration by age groups as shown in Table 5. For completeness these numbers include estimates of the number of migrants who were born since the previous census. However, relative to the other migrants these numbers look implausibly high, and the reason for this is discussed below.

The net number of migrants is estimated for those aged 25-29 at the time of the Community Survey (i.e. those who were aged 20-24 at the time of the 2001 census), for example, as follows:

$ {}_5M_x = \left( 20675 - 5649 + \left( 20675 - 5649 \right) / 0.96458 \right) / 2 = 15301 . $

Table 5 Estimation of the net number of in-migrants by age group, Western Cape, South Africa, 2001-2006

| Age | Surviving in-migrants (I') | Surviving out-migrants (O') | x | ${}_5S_x$ | Net in-migrants |
|---|---|---|---|---|---|
| 0-4 | 20,846 | 11,747 | B | 0.94151 | 9,381 |
| 5-9 | 6,586 | 3,554 | 0 | 0.97896 | 3,065 |
| 10-14 | 6,685 | 2,882 | 5 | 0.99547 | 3,812 |
| 15-19 | 10,402 | 3,967 | 10 | 0.99427 | 6,454 |
| 20-24 | 21,266 | 4,488 | 15 | 0.98602 | 16,897 |
| 25-29 | 20,675 | 5,649 | 20 | 0.96458 | 15,301 |
| 30-34 | 15,584 | 6,008 | 25 | 0.93161 | 9,928 |
| 35-39 | 10,584 | 5,098 | 30 | 0.90960 | 5,758 |
| 40-44 | 7,264 | 3,045 | 35 | 0.89780 | 4,458 |
| 45-49 | 4,648 | 2,714 | 40 | 0.89092 | 2,053 |
| 50-54 | 3,095 | 1,500 | 45 | 0.88633 | 1,698 |
| 55-59 | 3,940 | 935 | 50 | 0.87224 | 3,225 |
| 60-64 | 3,776 | 527 | 55 | 0.84731 | 3,541 |
| 65-69 | 3,127 | 818 | 60 | 0.80885 | 2,582 |
| 70-74 | 1,540 | 437 | 65 | 0.75468 | 1,282 |
| 75-79 | 561 | 206 | 70 | 0.66991 | 442 |
| 80-84 | 797 | 116 | 75 | 0.56388 | 944 |
| 85+ | 264 | 47 | 80+ | 0.40912 | 374 |
| Total | 141,640 | 53,739 | | | 91,194 |

## Diagnostics, analysis and interpretation

### Checks and validation

Perhaps the simplest check, on the reasonableness of the ‘shape’ (i.e. distribution of the numbers by age) of the estimates but not the level, is to see if it conforms to the standard shape (or a variation thereof). Rogers and Castro (1981a; 1981b) point out that the distribution of the number (or rate) of in- and out-migrants tends to conform to standard patterns, with a peak in the young adult ages (usually associated with seeking employment), a second, usually less pronounced peak amongst very young children falling to a trough amongst young teenagers (the size depending on the extent to which it is families rather than individuals moving in the young to middle aged adults). Sometimes there is also a ‘hump’ (or trough) around retirement age if there is a strong flow of migrants moving to (or away from) the place to retire.

These patterns (not necessarily the same pattern) apply to in- and out-migration flows separately, but not necessarily to net migration (which is the difference between the two flows) unless one flow (either the in-migration or the out-migration) is much greater than the other.

Figure 1 illustrates this using some of the estimates calculated above, expressed as proportions of the total number in each case (to allow them to be presented on a single figure). From this we can see that in broad terms (with the exception in some cases, where the proportion of migrants at the very young ages looks implausibly high) each conforms to the expected shape.

The net out-migration of those born in the Western Cape (excluded from the figure for ease of illustration) does not conform to a standard model of migration, which could indicate that these numbers are not very reliable. However, they are small relative to the in-migration of those born outside the province, and thus such a deviation may be tolerated. In addition to this there are two other features to be noted from Figure 1. The first is that the out-migration from the Western Cape, as estimated from data on place of residence at previous census, suggests that adult out-migrants peak at a somewhat older age (and are possibly likely to represent family rather than individual migration). The second is the fact that the net immigration into the country follows the standard shape, which indicates that the flow into the country is much stronger than the return flow of those migrants.

Figure 1 Age distribution of selected migrant flows, South African males, 2001-2006

If the census asked place of birth and place of residence at the previous census then one can compare the two estimates of net in-migration into a specific sub-national region. If they are similar this gives one some confidence in the results. In the case of South Africa the place of birth data give a net number of in-migrants into the Western Cape of 232,928 (Table 3), while the data on place of residence at the time of the previous census produce an estimate of 91,194 (Table 5), which suggests that one or both of these sets of data are suspect.

The most basic check of the estimates of migration is to project the population (of the country or the province) at the first census to the time of the second census making use of the estimates of the number of migrants and compare that with the census estimates from the second, more recent, census to see how well the two match, especially in the age range in which migration is concentrated. In the case of the net in-migration into the Western Cape, projecting the population forward from 2001 using the estimates derived from the change in the numbers by place of birth produced a much closer fit to the population in the 20-29 year age range, suggesting that the data on place of birth are probably more complete than those on the place of residence at the date of the previous census. To some extent this is supported by a comparison of the change in the number of foreign-born in the country between the two censuses, 222,693 (Table 1) with the sum of the numbers who reported that they had moved from outside South Africa to one of the provinces since the previous census, 129,346 (Table 4).

Ideally, if one had independent estimates of the number of migrants one might compare those numbers against estimates using the above methods. Unfortunately, reliable independent estimates are rare. Although most countries try to record people entering and leaving the country, these data are often not reliable, particularly in developing countries with relatively porous borders. And unless the country is extremely well regulated and maintains a complete and accurate register of the population, the only other way to measure internal migration is through migration-specific surveys. These tend to be much more useful for understanding the type of migration (whether permanent, temporary, cyclical, etc.) than for producing reliable estimates of the number of migrants, given the often less structured situation in which (particularly recent) migrants find themselves living and an understandable reluctance to identify themselves as migrants.

### Interpretation

Considering the numbers of migrants estimated from the data on place of residence at the previous census given in Table 4 (and taking into account the suspicion that these probably underestimate the true migration), some 2-4% of the population changed province of residence in the 5 years between the 2001 Census and the Community Survey. Had we included the number who moved within, but did not change, province then between 7 and 15 per cent of the population moved in the 5‑year period.

The main provinces of destination are Gauteng (by a big margin) and Western Cape, which are predominantly urban and the wealthiest provinces. The main provinces of origin are Gauteng (inspection of the age distribution would show that this is mainly return migration of ‘retiring’ workers), Eastern Cape and Limpopo; the latter two are poor, mainly rural provinces, from which people seeking work migrate to the urban areas.

It appears that migration is predominantly of individuals (seeking work) rather than of families.

## Method-specific issues with interpretation

### Scanning errors

A particular feature of the data relying on province of birth is the apparently relatively high number of children born since the first census who have moved to another province. In all likelihood this is an artefact of the data capturing process. Scanning was used to capture the data from the questionnaires, on which Western Cape was coded as a “1”, written in the appropriate space by hand. It appears that in a small percentage of cases the scanner might have had trouble distinguishing a handwritten “1” from a handwritten “7” (the code for Gauteng). The result is, for example, that some children were coded as having been born outside the province in which they were counted, and thus appear to be migrants when they probably were not. Even though the percentage error in scanning is very small, the number of births can be large relative to the number of migrants, and thus the error can produce noticeable distortions. Since an increasing number of developing countries are using scanning to capture data, this sort of problem may be quite common.

Where scanning errors or other situations make it impossible to produce reliable estimates of the number of migrants among those born since the previous census, one can use child-woman ratios (CWR) from the second census as follows:

$\text{Net } {}_5M_0 = \frac{1}{4}\, CWR_0 \cdot \text{Net } {}_{30}M_{15}^{f}$

for those born in the most recent five years, and

$\text{Net } {}_5M_5 = \frac{3}{4}\, CWR_5 \cdot \text{Net } {}_{30}M_{20}^{f}$

for those born in the five years before that if the censuses are 10 years apart, where $CWR_x$ represents the ratio of the number of children aged between x and x+5 to the number of women aged between 15+x and 45+x in the population (regional or national) at the time of the second census, and

${}_{30}M_x^{f}$

represents the number of women migrants aged between x and x+30.

Applying this to the data for the Western Cape suggests that the number of migrants born since the previous census should be less than half the numbers being estimated from the data on place of birth.
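The child-woman ratio adjustment can be sketched in a few lines. This is an illustration only, assuming censuses 10 years apart; the function names are hypothetical and the counts are invented, not the Western Cape figures.

```python
def child_woman_ratio(children, women):
    """CWR_x: children aged x to x+5 per woman aged 15+x to 45+x."""
    return children / women


def net_child_migrants(cwr_0, cwr_5, net_women_15_44, net_women_20_49):
    """Net migrants aged 0-4 and 5-9 at the second census.

    Uses one quarter of CWR_0 times net female migrants aged 15-44,
    and three quarters of CWR_5 times net female migrants aged 20-49.
    """
    net_0_4 = 0.25 * cwr_0 * net_women_15_44
    net_5_9 = 0.75 * cwr_5 * net_women_20_49
    return net_0_4, net_5_9


# Hypothetical second-census counts and net female migrants.
cwr_0 = child_woman_ratio(40000.0, 100000.0)  # children 0-4, women 15-44
cwr_5 = child_woman_ratio(36000.0, 90000.0)   # children 5-9, women 20-49
net_0_4, net_5_9 = net_child_migrants(cwr_0, cwr_5, 10000.0, 8000.0)
```

With these invented inputs both ratios are 0.4, giving about 1,000 net migrants aged 0-4 and about 2,400 aged 5-9.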

## Detailed description of method

### Mathematical exposition

The indirect estimation of migration derives from the balance equation for two censuses n years apart, namely:

${}_5N_{x+n}(t+n) = {}_5N_x(t) - {}_5D_x + {}_5I'_x - {}_5O'_x = {}_5N_x(t) - {}_5D_x + {}_5M'_x$

where

${}_5M'_x = {}_5I'_x - {}_5O'_x$

is the net (i.e. in less out) number of in-migrants, aged x to x+5 at the time of the first census, surviving to the second census, and ${}_5D_x$, ${}_5I'_x$ and ${}_5O'_x$ represent the number of deaths, surviving in-migrants and out-migrants, aged x to x+5 at the time of the first census, who died or moved in the period between the censuses.

For those born after the first census the equation becomes:

${}_nN_0(t+n) = B - D_B + M'_B$

and those in the open age interval:

${}_\infty N_A(t+n) = {}_\infty N_{A-n}(t) - {}_\infty D_{A-n} + {}_\infty M'_{A-n}$

where $B$ represents the number of births in the population between the two censuses, $D_B$ the number of deaths of those births in the period between the censuses and $M'_B$ the net number of surviving migrants, born outside the country in the period between the two censuses, ${}_\infty D_{A-n}$ the number of deaths in the intercensal period aged A-n and older at the time of the first census, and ${}_\infty M'_{A-n}$ the net number of migrants aged A-n and older at the time of the first census.

Thus

${}_5M'_x = {}_5N_{x+n}(t+n) - {}_5N_x(t) + {}_5D_x$

$M'_B = {}_nN_0(t+n) - B + D_B$

${}_\infty M'_{A-n} = {}_\infty N_A(t+n) - {}_\infty N_{A-n}(t) + {}_\infty D_{A-n}$

or alternatively

${}_5M'_x = {}_5N_{x+n}(t+n) - {}_5N_x(t) \cdot {}_5S_x$

$M'_B = {}_nN_0(t+n) - B \cdot S_B$

${}_\infty M'_{A-n} = {}_\infty N_A(t+n) - {}_\infty N_{A-n}(t) \cdot {}_\infty S_{A-n}$

where ${}_5S_x$, $S_B$ and ${}_\infty S_{A-n}$ represent the proportion of the populations aged x to x+5 at the time of the first census, born between the censuses, and aged A-n and older at the time of the first census, respectively, surviving to the second census.
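The survival-ratio form of the calculation can be sketched as below. The function name and the cohort figures are hypothetical; the survival ratios are simply assumed to be known.

```python
def surviving_net_migrants(pop_first, pop_second, survival):
    """Net surviving migrants per cohort: N(t+n) - N(t) * S.

    pop_first[i]  : cohort size at the first census
    pop_second[i] : size of the same cohort, n years older, at the second census
    survival[i]   : proportion of the cohort surviving the intercensal period
    """
    return [n2 - n1 * s for n1, n2, s in zip(pop_first, pop_second, survival)]


# Hypothetical foreign-born cohorts (illustrative numbers only).
first_census = [2000.0, 3000.0]
second_census = [2400.0, 2900.0]
survival = [0.98, 0.95]

net_surviving = surviving_net_migrants(first_census, second_census, survival)
```

Here the first cohort gains about 440 surviving migrants and the second about 50, even though the second cohort shrank in absolute terms, because expected deaths exceed the observed decline.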

The net number of migrants can thus be estimated from the net number surviving to the second census as follows:

${}_5M_x = \left({}_5M'_x + {}_5M'_x / {}_5S_x\right)/2 = \frac{{}_5M'_x\left({}_5S_x + 1\right)}{2\,{}_5S_x}$

$M_B = \frac{M'_B\left(S_B + 1\right)}{2\,S_B}$

${}_\infty M_{A-n} = \frac{{}_\infty M'_{A-n}\left({}_\infty S_{A-n} + 1\right)}{2\,{}_\infty S_{A-n}}.$
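As a small numerical illustration of this adjustment (function name and numbers are hypothetical), the net number of migrants is the average of the surviving number and the surviving number divided by the survival ratio:

```python
def net_migrants_from_survivors(surviving, survival):
    """M = M' * (S + 1) / (2 * S), the average of M' and M' / S."""
    return surviving * (survival + 1.0) / (2.0 * survival)


# Hypothetical cohort: 440 net surviving migrants, survival ratio 0.8.
surviving = 440.0
survival = 0.8
adjusted = net_migrants_from_survivors(surviving, survival)

# The same result written as the explicit average of the two bounds.
average_form = (surviving + surviving / survival) / 2.0
```

With a survival ratio of 0.8 the 440 survivors correspond to about 495 migrants; the closer the ratio is to 1, the smaller the adjustment.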

Unfortunately, since the net number of migrants is usually small relative to the size of the population, age misstatement or errors in either or both census counts can lead to very poor estimates being produced. Better estimates of the net number of immigrants into a country can be produced by confining one’s attention to the population of foreigners (defined as those born outside the country) and assuming that return migration of emigrants from the country of interest is insignificant. Thus one replaces each of the symbols above by equivalents specific to the foreign-born population in the country. Since it is unlikely that one has an accurate record of the number of deaths among the foreign-born, these need to be estimated in one of the following ways:

• Option 1 (Life table survival ratios): Applying rates from a suitable model life table, then
• Option 2 (Census survival ratios): Assuming that emigration of the native-born population is insignificant and that the proportions surviving are the same as those in the native-born population, then

where the superscript “nb” designates native-born.

• Option 3 (Vital registration): Where one has access to numbers of births and deaths from another source such as vital registration (which is only likely to be the case, if at all, with internal migration), one could work with deaths and births corresponding to the migrant population directly instead of survival ratios to estimate the net number of surviving in-migrants. Alternatively the net number of migrants can be derived as above by setting

where the births and deaths are from the vital registration.

However, for most developing countries, particularly those in Africa, vital registration systems are too incomplete to be used in this way.
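Option 2 above can be sketched as follows, assuming the census survival ratio for a cohort is taken as the native-born count at the second census divided by the native-born count of the same cohort at the first census. The function names and all counts are hypothetical.

```python
def census_survival_ratios(native_first, native_second):
    """Native-born cohort at the second census divided by the first."""
    return [n2 / n1 for n1, n2 in zip(native_first, native_second)]


def surviving_net_immigrants(foreign_first, foreign_second, ratios):
    """Foreign-born at the second census less expected survivors."""
    return [f2 - f1 * s
            for f1, f2, s in zip(foreign_first, foreign_second, ratios)]


# Hypothetical cohorts (illustrative numbers only).
native_first = [10000.0, 8000.0]
native_second = [9700.0, 7600.0]
foreign_first = [500.0, 400.0]
foreign_second = [600.0, 390.0]

ratios = census_survival_ratios(native_first, native_second)
net_surviving = surviving_net_immigrants(foreign_first, foreign_second, ratios)
```

Because the ratios are computed from enumerated counts, they absorb differences in coverage between the two censuses as well as mortality, so a ratio above 1 is possible and would point to inconsistent coverage rather than to survival.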

### Internal migration

When it comes to internal migration one can estimate net in-migration (i.e. in-migration of those born outside the region less out-migration of those born outside the region who had previously moved into the region) into each sub-national region of those born outside the region by making use of place of birth information to identify the change in numbers of those born outside the region, in the same way as described above. However, since one also has the place of residence of those born in the region who have moved out of the region since birth (but not emigrated) one can also estimate the net out-migration of those born in the region (i.e. out-migration of those born in the region less those born in the region who have returned after having previously moved out of the region) by applying the method described above to the population born in the region (as opposed to those born outside the region).
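The two components described above can be combined as in the following sketch. For brevity it works with totals and a single survival ratio rather than by age, which is a simplifying assumption; the function name and the counts are hypothetical.

```python
def net_migration_by_birthplace(outside_born_first, outside_born_second,
                                living_elsewhere_first, living_elsewhere_second,
                                survival):
    """Net migration for one region from place-of-birth data.

    outside_born_*    : residents of the region born outside it
    living_elsewhere_*: people born in the region but living in another region
    survival          : assumed proportion surviving between the censuses
    Returns (net in-migration of those born outside,
             net out-migration of those born inside,
             overall net migration).
    """
    net_in = outside_born_second - outside_born_first * survival
    net_out = living_elsewhere_second - living_elsewhere_first * survival
    return net_in, net_out, net_in - net_out


# Hypothetical region (illustrative numbers only).
net_in, net_out, net_total = net_migration_by_birthplace(
    50000.0, 62000.0, 80000.0, 85000.0, 0.95)
```

In this invented case roughly 14,500 more people born elsewhere are present than expected, roughly 9,000 more people born in the region are living elsewhere than expected, and the net gain is about 5,500.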

When estimating the survival of those born in the various regions, the census survival ratios could have an advantage over the life table survival ratios in that any under- or overcount of the population by region may well be matched by a similar distortion in the national population and hence in the survival ratios, thus resulting in a more accurate estimate of the number of migrants than would be produced by using life table survival ratios.

Apart from place of birth, a census can ask those who moved since the previous census (or some other suitable date) where they were living at that date, which allows one to measure (gross) out-migration and (gross) in-migration separately for each sub-national region.
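A tabulation of place of residence at the previous census against current place of residence can be summarised as in the sketch below. The region labels "A", "B" and "C", the function name and the flows are all hypothetical.

```python
def migration_summary(flows):
    """Summarise an origin-destination table of surviving migrants.

    flows[origin][destination] is the number of people who lived in
    `origin` at the previous census and were counted in `destination`.
    Returns {region: (in_migrants, out_migrants, net)}.
    """
    regions = sorted(set(flows) | {d for row in flows.values() for d in row})
    summary = {}
    for r in regions:
        # Out-migrants: everyone who left r for a different region.
        out_mig = sum(v for d, v in flows.get(r, {}).items() if d != r)
        # In-migrants: everyone who arrived in r from a different region.
        in_mig = sum(row.get(r, 0) for o, row in flows.items() if o != r)
        summary[r] = (in_mig, out_mig, in_mig - out_mig)
    return summary


# Hypothetical flows between three regions.
flows = {
    "A": {"B": 100, "C": 20},
    "B": {"A": 30, "C": 10},
    "C": {"A": 60, "B": 200},
}
summary = migration_summary(flows)
```

Because every out-migrant from one region is an in-migrant to another, the net figures across regions sum to zero when only internal moves are tabulated, which is a useful check on the arithmetic.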

If the census asks for the year when the migrant moved (or how long the person has been living in the place where counted in the second census) one can get a sense of the timing of migration, and estimate yearly migration rates. This is a complicated process and is not covered here, but the interested reader is referred to the paper by Dorrington and Moultrie (2009).

### Working with total numbers only

If age-specific numbers are not available or the allocation to age is considered to be unreliable, one can still produce estimates by age by estimating the total number of migrants as described below, and then apportioning this total to the age groups using either an age distribution for the same population at a different time (since the age distribution of migration flows tends to be consistent over time) or, more likely, an appropriate standard model (Rogers and Castro 1981a; 1981b).

where

${}_\infty D_0^{F} = \frac{n}{2}\left({}_\infty N_0^{F}(t) + {}_\infty N_0^{F}(t+n)\right) {}_\infty m_0$

and ${}_\infty m_0$ is an estimate of the crude mortality rate of the population in the country of the census.
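A rough sketch of working with totals only is given below. It assumes that the total follows the same balance equation as the age-specific case (second count less first count plus deaths), which is an assumption made here for illustration; the function names, the age shares and all numbers are hypothetical.

```python
def deaths_from_crude_rate(pop_first, pop_second, n_years, crude_rate):
    """D = (n / 2) * (N(t) + N(t+n)) * m."""
    return n_years / 2.0 * (pop_first + pop_second) * crude_rate


def net_migrants_total(pop_first, pop_second, deaths):
    """Balance equation on totals (assumed form): N(t+n) - N(t) + D."""
    return pop_second - pop_first + deaths


def apportion(total, age_shares):
    """Split a total across age groups using shares that sum to 1."""
    return [total * share for share in age_shares]


# Hypothetical foreign-born totals, 5 years apart, crude rate of 10 per 1,000.
deaths = deaths_from_crude_rate(100000.0, 130000.0, 5, 0.01)
net_total = net_migrants_total(100000.0, 130000.0, deaths)

# Hypothetical age shares, e.g. read off a standard model schedule.
by_age = apportion(net_total, [0.2, 0.5, 0.3])
```

The age shares would in practice come from an earlier age distribution for the same population or from a model migration schedule; here they are invented.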

### Limitations

The primary limitation of using censuses to estimate immigration and net in-migration is the quality of the censuses, in particular the extent of undercount, both in general and, more significantly, in one census relative to the other. However, even if the census undercount is low, the census might not identify all the migrants. In general, recent migrants are often difficult to include in a census because they have yet to settle. More specifically, immigrants may not be keen to identify themselves as immigrants and may either avoid being counted or not admit to being foreign-born.

Apart from this, place of birth and/or place of residence at previous census, in the case of internal migrants, might be misreported due to boundary changes or ignorance (or even bias) on the part of the respondent.

The third drawback of census data is that it cannot be used to measure emigration from the country of the census. Emigration is particularly difficult to estimate for most countries, but one option is to apply the method for identifying net immigration of the foreigners described above to the censuses of the main countries of destination to which the emigrants move to estimate the change in the numbers of emigrants to those countries. Of course, this is only useful if the censuses of these countries identify the numbers of foreign-born by their countries of birth reasonably accurately.

Generally, statistics on immigrants and particularly emigrants that are collected at border posts provide quite poor estimates of the true numbers, unless the borders of the country are quite impenetrable and there are a few well-controlled ports of entry. Even then there may still be many ‘visitors’ who end up living in the country.

A final drawback occurs when working with data aggregated over all ages. In these cases one usually has to make use of the crude death rate for the population of the country of the census in order to estimate the number of deaths of the migrant population. However, since the distribution of the migrant population by age can differ from that of the population of the country of the census quite markedly, the estimated number of deaths can be quite inaccurate.
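The size of this problem can be illustrated with invented figures. The sketch below compares deaths obtained by applying a national crude rate to a migrant population with deaths obtained from age-specific rates; the rates, person-years and function name are all hypothetical.

```python
def expected_deaths(person_years, rates):
    """Sum of age-specific person-years multiplied by age-specific rates."""
    return sum(py * m for py, m in zip(person_years, rates))


# Hypothetical mortality rates for young, working-age and old age groups.
rates = [0.002, 0.005, 0.050]

# Hypothetical person-years: the migrant population is concentrated in
# the working ages compared with the national population.
national_py = [400.0, 450.0, 150.0]
migrant_py = [100.0, 850.0, 50.0]

crude_rate = expected_deaths(national_py, rates) / sum(national_py)
deaths_crude = crude_rate * sum(migrant_py)
deaths_by_age = expected_deaths(migrant_py, rates)
```

In this example the crude rate gives about 10.6 deaths against about 7.0 from the age-specific calculation, an overstatement of roughly half, purely because of the difference in age structure.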

## Extensions of the method

Some censuses ask additional questions which can be of use in interpreting the patterns of migration, if not improving the estimate of the level of migration. Most common of these is probably a question asking about when the migrant moved. These data allow one to estimate annual rates of migration; however, it is possible that respondents tend to report moves as occurring more recently than was actually the case (Dorrington and Moultrie 2009).

Where a census asks those who moved since the previous census where they moved from most recently and when they moved, rather than where they were at the time of the previous census (as the recent censuses in South Africa have done), it is possible to back-project the numbers of migrants by applying annual rates of migration between sub-national regions to estimate the number by place at the time of the previous census (Dorrington and Moultrie 2009). However, in the case of South Africa, at least, it appears that the assumption that most migrants moved only once in the past five years, and thus that the place of residence before the most recent move is the same as the place at the time of the previous census, is quite reasonable (Dorrington and Moultrie 2009).

Where one has data on both the sub-national region of birth and the place at the time of the previous census, one can cross-tabulate the place of residence data by the place of birth and thus be able to classify recent migrants into primary, secondary and return migrants.
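One way of coding this classification is sketched below. The definitions used (primary: left the region of birth during the period; return: moved back to the region of birth; secondary: moved between two regions neither of which is the region of birth) follow common usage but are stated here as assumptions, and the function name and region labels are hypothetical.

```python
def classify_migrant(birth, previous, current):
    """Classify a person from region of birth, region of residence at the
    previous census and region of residence at the latest census."""
    if previous == current:
        return "non-migrant"  # no change of region recorded over the period
    if current == birth:
        return "return"       # moved back to the region of birth
    if previous == birth:
        return "primary"      # left the region of birth during the period
    return "secondary"        # was already outside region of birth, moved on


# Hypothetical records: (region of birth, previous region, current region).
records = [
    ("A", "A", "A"),
    ("A", "A", "B"),
    ("A", "B", "A"),
    ("A", "B", "C"),
]
labels = [classify_migrant(*record) for record in records]
```

Counting the labels by region then gives the cross-tabulation described above.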

For general background to the topic of migration, definition of terms and detail on the analysis and interpretation of the data on internal migration the interested reader is referred to the excellent UN manual on the topic, Manual VI (UN Population Division 1970). The textbook by Shryock and Siegel (1976) and its modern replacement by Siegel and Swanson (2004) also provide an introduction to the topic of migration and cover, in particular, the estimation of international migration.

Those interested in the estimation of annual migration rates and the back-projection of migration to estimate the numbers by place of residence at the time of the previous census from data on place of residence before the most recent move and year of move are referred to the paper by Dorrington and Moultrie (2009).

Dorrington RE and TA Moultrie. 2009. "Making use of the consistency of patterns to estimate age-specific rates of interprovincial migration in South Africa," Paper presented at Annual conference of the Population Association of America. Detroit, US, 30 April - 2 May.

Rogers A and LJ Castro. 1981a. "Age patterns of migration: Cause-specific profiles," in Rogers, A (ed). Advances in Multiregional Demography (RR-81-006). Laxenburg, Austria: International Institute for Applied Systems Analysis, pp. 125-159. http://webarchive.iiasa.ac.at/Admin/PUB/Documents/RR-81-006.pdf

Rogers A and LJ Castro. 1981b. Model Migration Schedules (RR-81-030). Laxenburg, Austria: International Institute for Applied Systems Analysis. http://webarchive.iiasa.ac.at/Admin/PUB/Documents/RR-81-030.pdf

Shryock HS and JS Siegel. 1976. The Methods and Materials of Demography (Condensed Edition). San Diego: Academic Press.

Siegel JS and D Swanson. 2004. The Methods and Materials of Demography. Amsterdam: Elsevier.

Timæus IM. 2004. "Impact of HIV on mortality in Southern Africa: Evidence from demographic surveillance," Paper presented at Seminar of the IUSSP Committee "Emerging Health Threats" HIV, Resurgent Infections and Population Change in Africa. Ouagadougou, 12-14 February.

UN Population Division. 1970. Manual VI: Methods of Measuring Internal Migration. New York: United Nations, Department of Economic and Social Affairs, ST/SOA/Series A/47. http://www.un.org/esa/population/techcoop/IntMig/manual6/manual6.html