Estimation of migration from census data


Description of the methods

Estimating migration from census data is not technically complicated. Provided that the census(es) gather the appropriate information and are reasonably accurate, it is possible to produce estimates, over the period between two censuses, of net immigration (i.e. immigration less emigration) of the foreign-born population (people born outside a particular country) and of internal migration between (to and from) sub-national regions of a country.

To estimate net immigration of foreigners, one essentially subtracts the number of foreigners expected to have survived since being enumerated in the previous census from the number of foreign-born people enumerated in the latest census.

In a similar way, if the censuses record the sub-national region of birth one can estimate net in-migration (i.e. net in-migration of those born outside the region less net out-migration of those born in the region) between sub-national regions of a country. However, if the census asks of people where they were living at some prior point in time, say at the time of the previous census, one is able to estimate directly the number of surviving migrants (i.e. migrants still alive at the time of the latest census) into and out of each sub-national region of the country since that prior point in time.

In order to estimate the number of migrants from the number of surviving migrants at the time of the second census one needs to add to these figures an estimate of the number of migrants who are expected to have died between moving and the time of the latest census.

If the latest census records other information such as year in which the migrant moved to the place at which the person was counted in the census, it is possible also to establish a trend of migration over time.

Migration differs from fertility and mortality in two respects. First, migrating is not final in the way that a birth or a death is. Second, we are concerned not only with the population of origin, from which the migrant moved (which corresponds to the population exposed to risk, from which rates of migration akin to those of fertility and mortality can be calculated), but also with the destination population, to which the migrant moves. In addition, in order to understand migration one is often interested in distinguishing between different types of migration (whether temporary or more permanent, whether circulatory or unidirectional, etc.). For these reasons there is a much wider range of measures and terminology associated with migration than there is with either fertility or mortality. It is not the purpose of this chapter to cover these issues, and the interested reader is referred to the standard texts on the subject, such as the UN Manual VI (UN Population Division 1970), Shryock and Siegel (1976), and Siegel and Swanson (2004).

Data requirements and assumptions

Tabulations of data required

  • To estimate net immigration of foreigners:
    • the number of foreign-born females (males), in five-year age groups, and for an open age interval A+, at two points in time, typically two censuses
    • For the deaths: either a suitable model life table or the numbers of native-born females (males), in five-year age groups, and for an open age interval A+, at two points in time, typically two censuses. Failing these, the central crude death rate for the population
  • To estimate sub-national regional net in-migration from place of birth data:
    • the number of females (males) by sub-national region and by sub-national region of birth, in five-year age groups, and for an open age interval A+, at two points in time, typically two censuses
    • For the deaths: either a suitable model life table, the numbers of native-born females (males), in five-year age groups, and for an open age interval A+, at two points in time, typically two censuses or numbers of deaths by region from the vital registration. Failing these, the central crude death rate for the population
  • To estimate internal migration between sub-national regions from place of residence at previous census data:
    • The numbers of females (males) by sub-national region and by sub-national region at some prior date, typically that of the preceding census, in five-year age groups, and for an open age interval A+.
  • If age-specific numbers are not available, aggregated data are still useful for estimating all-age migration.

Important assumptions

  • Estimating net immigration of foreigners:
    • Censuses identify all foreign-born people accurately
    • One is able to estimate the mortality of the foreign-born population accurately (either that the life table used is appropriate, or that the mortality is the same as that implied by the censuses for the native-born (locally-born) national population)
    • No return migration of locally born emigrants
  • Estimating sub-national regional net in-migration from place of birth data:
    • Censuses count the population by sub-national region accurately and identify the region of birth accurately
    • One is able to estimate the mortality of people moving between two regions accurately (either that the life table used is appropriate, or that the mortality is the same as that implied by the censuses for the native-born national population).
  • Estimating internal migration between sub-national regions from data on place of residence at previous census:
    • Latest census identifies correctly all people who have moved from one region to another since the prior date (e.g. previous census)
    • One is able to estimate the mortality of people moving between two regions accurately (either that the life table used is appropriate, or that the mortality is the same as that implied by the censuses for the native-born national population). Since one is estimating in- and out-migration separately (as opposed to net migration) this assumption is of less importance.

Preparatory work and preliminary investigations

Before applying this method, you should investigate the quality of the data in at least the following dimensions:

  • age structure of the population (by sub-national region as appropriate); and
  • relative completeness of the census counts (by sub-national region as appropriate).

Caveats and warnings

Estimating migration using place of birth data from two censuses requires not only that the censuses count the population reasonably completely, but also that the place of birth be accurately recorded. Often this is not the case, particularly when estimating immigration, where immigrants may wish to hide the fact that they are foreign, but also in the case of internal migration, where there may have been boundary changes or the respondent may not know the place of birth of the person.

Estimating migration by asking questions of migrants depends heavily on the census identifying completely all those who have migrated, as well as identifying correctly the place from which they moved. To the extent that recent migrants are not yet established as residents of the region to which they have moved at the time of the census, they could be missed in the count.

Net migration, by definition, underestimates the flows of migrants into and out of a region or country. For example, people who moved into a region and then returned within the period being considered contribute nothing to net in-migration, even though each of them moved twice.

Application of the method

A: Estimating net immigration of foreigners using place of birth data

This method produces estimates of the net immigration of foreigners using place of birth data. It is important to stress that this method does not take into account or measure the immigration of returning native-born people who left the country prior to the previous census and returned before the second census. Thus this method is not recommended for the measurement of immigration where significant return migration of native-born people (for example, after exile or forced migration of refugees) is in progress.

Step 1: Decide on survival factors

If data on the number of foreign-born people in the population are available by age group for each census, then one needs survival factors to apply to the numbers of foreign-born in the first census in order to estimate the numbers surviving to the time of the second census. The user can choose years of life lived in five-year age groups (${}_5L_x$) from the standard of the General family of United Nations model life tables, from any one of the four families of Princeton model life tables, or from a model life table of a population experiencing an AIDS epidemic (Timæus 2007), all of which appear in the Models spreadsheet of the associated workbook. This spreadsheet also allows the user to input years of life lived in five-year age groups from an alternative life table if there is reason to assume that its pattern of mortality is similar to that of the population in question. Failing this, the survival factors can be derived from the proportion of each five-year age group of the native-born population surviving from the first to the second census (assumed to be n years apart, where n is a multiple of 5). Thus ${}_5S_{x,n}$, $S_{A-n,n}$ and $S_{B,n}$, the n-year survival factors for a group of people aged x to x + 5 at the previous census, aged A-n and older at the previous census, and born between the censuses, respectively, are estimated as follows:

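$$ {}_5S_{x,n} = \frac{{}_5L_{x+n}}{{}_5L_x} \ \text{or} \ \frac{{}_5N^{nb}_{x+n}(t+n)}{{}_5N^{nb}_x(t)}, $$

$$ S_{A-n,n} = \frac{T_A}{T_{A-n}} \ \text{or} \ \frac{N^{nb}_A(t+n)}{N^{nb}_{A-n}(t)}, $$

and

$$ S_{B,n} = \frac{{}_nL_0}{n \cdot l_0} \ \text{or} \ \frac{{}_nN^{nb}_0(t+n)}{B^{nb}}, $$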

where the superscript nb represents ‘native-born’, ${}_5N^{nb}_x(t)$ represents the native-born population aged between x and x + 5 in the census at time t, and $B^{nb}$ represents the number of native-born births between time t and t + n.
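As a rough illustration of Step 1, the ratios above can be written as small Python functions. This is a minimal sketch, not the workbook's implementation: the function names and the dictionary layout are assumptions made for the example, the life-table values are those quoted in the worked example later in this chapter (with a radix of l0 = 1), and the census counts passed to the last function in the usage note are hypothetical.

```python
def survival_from_life_table(L, x, n=5):
    """n-year survival factor for the cohort aged x to x+5: 5L(x+n) / 5L(x)."""
    return L[x + n] / L[x]


def survival_of_births(nL0, n=5, radix=1.0):
    """Survival factor for those born between the censuses: nL0 / (n * l0)."""
    return nL0 / (n * radix)


def survival_from_censuses(pop_first, pop_second):
    """Census survival ratio: a native-born cohort at t+n over the same cohort at t."""
    return pop_second / pop_first


# Years of life lived in five-year age groups, keyed by age x
# (values quoted in the worked example; only two ages are needed here).
L = {20: 4.4975, 25: 4.3382}

s_20 = survival_from_life_table(L, 20)   # cohort aged 20-24 at the first census
s_births = survival_of_births(4.707549)  # those born between the censuses
print(round(s_20, 5), round(s_births, 5))
```

If no suitable life table is available, `survival_from_censuses(1000, 950)` (hypothetical native-born counts for one cohort at the two censuses) would give a census survival ratio of 0.95 in the same way.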

If the data are not available in five-year age groups, the net number of immigrants can still be estimated in total provided we have an estimate of the crude death rate for the population (which might, in the absence of any evidence to the contrary, be assumed to be that of the native-born population).

Step 2: Estimate the number of deaths of the immigrants

If data on the number of foreign-born people in the population are available by age group for two censuses (n years apart) then one needs to estimate the number of deaths of foreign-born people (denoted by the superscript F) aged between x and x+5 at the first census (at time t), ${}_5D^F_x$, aged A-n and older at the first census, $D^F_{A-n}$, and those born between the censuses, $D^F_B$, as follows:

$$ {}_5D^F_x = \frac{1}{2}\left({}_5N^F_x(t)\cdot{}_5S_{x,n} + {}_5N^F_{x+n}(t+n)\right)\left(\frac{1}{{}_5S_{x,n}} - 1\right), $$

$$ D^F_{A-n} = \frac{1}{2}\left(N^F_{A-n}(t)\cdot S_{A-n,n} + N^F_A(t+n)\right)\left(\frac{1}{S_{A-n,n}} - 1\right), $$

and

$$ D^F_B = \frac{1}{2}\left({}_nN^F_0(t+n)\right)\left(\frac{1}{S_{B,n}} - 1\right), $$

where ${}_5N^F_x(t)$ represents the number of foreign-born people according to the census at time t who were aged between x and x+5.

If data and/or survival factors are not available by age group then one can estimate the total number of deaths of the foreign-born people as follows:

$$ D^F_0 = \frac{n}{2}\left(N^F_0(t) + N^F_0(t+n)\right) m_0 $$

where $m_0$ is an estimate of the crude mortality rate of the population in the country of the census.

However, if the age distribution of the foreign-born population is markedly different from that of the population in the country of the census, then this can produce a poor approximation to the true number of deaths.
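The Step 2 formulas can be sketched in Python as follows. This is an illustrative sketch only: the function names are assumptions, and the inputs are the foreign-born counts and survival factors used in the worked example later in this chapter (Table 1), together with the crude rate of 14 per 1,000 quoted there.

```python
def cohort_deaths(pop_first, pop_second, survival):
    """Deaths between the censuses in a cohort counted at both censuses.

    Works for a five-year age group and for the open-ended age group.
    """
    return 0.5 * (pop_first * survival + pop_second) * (1.0 / survival - 1.0)


def deaths_of_births(pop_second, survival):
    """Deaths among those born between the censuses (aged 0 to n at the second)."""
    return 0.5 * pop_second * (1.0 / survival - 1.0)


def deaths_all_ages(pop_first, pop_second, crude_rate, n=5):
    """All-age approximation using a crude death rate (per person per year)."""
    return n / 2.0 * (pop_first + pop_second) * crude_rate


# Aged 20-24 in 2001, hence 25-29 at the second count
d_20 = cohort_deaths(69787, 95763, 0.96458)
# Aged 80 and over in 2001, hence 85 and over at the second count
d_80 = cohort_deaths(7658 + 4455, 5305, 0.40912)
# Born between the two counts
d_births = deaths_of_births(12577, 0.94151)
# All ages combined, with a crude death rate of 14 per 1,000
d_total = deaths_all_ages(611423, 754608, 14 / 1000)

print(round(d_20), round(d_80), round(d_births), round(d_total))
```

Rounded to whole numbers, these reproduce the deaths quoted in the worked example (2,994, 7,410, 391 and 47,811).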

Step 3: Estimate the net number of immigrants (of foreigners)

If data are available by age group for each census then age-specific net immigration can be estimated as follows:

$$ \text{Net } {}_5M^F_x = {}_5N^F_{x+n}(t+n) - {}_5N^F_x(t) + {}_5D^F_x $$

for x = 0, 5, … , A-5-n, where Net ${}_5M^F_x$ represents the net number of immigrants between times t and t+n who were aged between x and x + 5 at time t. For x > A - 5 - n

$$ \text{Net } M^F_{A-n} = N^F_A(t+n) - N^F_{A-n}(t) + D^F_{A-n}. $$

The net number of immigrants of those born between times t and t+n is estimated as follows:

$$ \text{Net } M^F_B = {}_nN^F_0(t+n) + D^F_B. $$

If data and/or survival factors are not available by age group then one would estimate the total net number of immigrants as follows:

$$ \text{Net } M^F_0 = N^F_0(t+n) - N^F_0(t) + D^F_0. $$
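Step 3 is a simple balancing calculation, sketched below in Python. The function name is an assumption made for illustration; the inputs are the counts and rounded deaths from the worked example later in this chapter.

```python
def net_migrants(pop_first, pop_second, deaths):
    """Net immigrants in a cohort: change in numbers plus deaths in the interval."""
    return pop_second - pop_first + deaths


# Aged 20-24 at the first census (25-29 at the second)
net_20 = net_migrants(69787, 95763, 2994)
# Aged 80 and over at the first census (85 and over at the second)
net_80 = net_migrants(7658 + 4455, 5305, 7410)
# Born between the censuses: nobody was counted at the first census
net_births = net_migrants(0, 12577, 391)
# All ages combined, using the crude-rate estimate of deaths
net_total = net_migrants(611423, 754608, 47811)

print(net_20, net_80, net_births, net_total)
```

Treating those born between the censuses as a cohort with zero members at the first census lets one function cover all four cases.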

B: Estimating net internal migration between sub-national regions from place of birth data

Net in-migration into a particular sub-national region from other regions in the country can be estimated in exactly the same way as the international immigration, described above, by replacing the foreign-born population with the population born outside the region.

In addition, applying the same method to data on the change in the numbers of population born in (rather than outside) and living outside the region of interest allows us to estimate the net out-migration of those born in the region to other regions in the country. Subtracting this from the net in-migration of those born outside the region gives an estimate of the overall net in-migration into the region of interest.

If there is reason to suspect that there is a material difference in the mortality experienced by those born outside who moved into the region and those born in the region who moved out, and one has appropriate survival factors, then one could apply different survival factors to each when estimating the net number of migrants. However, in practice, inaccuracies in the census data on place of residence at previous census are likely to outweigh any increase in accuracy achieved by using differential mortality.
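The combination of the two place-of-birth estimates can be sketched as a single subtraction. The function name below is an assumption; the figures are the Western Cape totals and the values for those aged 0-4 at the second census from Tables 2 and 3 of the worked example.

```python
def overall_net_in_migration(net_in_born_outside, net_out_born_inside):
    """Overall net in-migration into a region from the two place-of-birth estimates."""
    return net_in_born_outside - net_out_born_inside


# All ages: net in-migrants born outside, net out-migrants born inside
overall_total = overall_net_in_migration(213911, -19017)
# Aged 0-4 at the second census
overall_0_4 = overall_net_in_migration(19602, 12112)

print(overall_total, overall_0_4)
```

Note that a negative net out-migration of those born in the region (more returning than leaving) increases the overall net in-migration.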

C: Estimating internal migration between sub-national regions from place of residence at previous census

Net sub-national inter-regional migration is estimated directly from the numbers of people in each region at the time of the census who moved since the previous census by place (e.g. region) they were in at a given prior date (e.g. at the time of the previous census). Confining the estimates to inter-regional flows the sum of the numbers of inter-regional in-migrants should be equal to the sum of inter-regional out-migrants; however, if the data include immigration to the sub-national regions from outside the country one can extend the estimates of in-migration to include international immigration into each region.

Since one of the major areas of interest is the magnitude of inter-regional flows of the population, one is as interested in the total numbers of migrants between regions as one is in the age distributions of particular flows.

The number of migrants is derived from the number of surviving in- and out-migrants as follows:

$$ {}_5M_x = \left( {}_5I'_x - {}_5O'_x + \frac{{}_5I'_x - {}_5O'_x}{{}_5S_x} \right) / 2 , $$

where the prime (’) denotes numbers surviving, and ${}_5I'_x$ and ${}_5O'_x$ respectively represent the number of surviving in-migrants into, and the number of surviving out-migrants from, a particular region at the time of the second census who were aged between x and x+5 at the second census.
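A minimal Python sketch of this adjustment follows. The function name is an assumption; the inputs are the Western Cape figures for those aged 25-29 at the time of the Community Survey, taken from Table 5 of the worked example.

```python
def net_migrants_from_survivors(surviving_in, surviving_out, survival):
    """Average of the surviving net migrants and the same number divided by the
    survival factor, which allows for migrants who died after moving."""
    surviving_net = surviving_in - surviving_out
    return (surviving_net + surviving_net / survival) / 2.0


net_25_29 = net_migrants_from_survivors(20675, 5649, 0.96458)
print(round(net_25_29, 1))
```

With these rounded inputs the result agrees with the 15,301 shown in Table 5 to within one person; small differences of this kind are to be expected when the published counts and survival factors have themselves been rounded.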

Worked example

This example uses data on the numbers of males in the population from the South African Census in 2001 and a ‘census replacement survey’, the Community Survey in 2007. (Although the survey was conducted approximately 5.35 years after the night of the census in 2001, it is assumed for the purposes of presentation here to have been exactly five years after the census in 2001.) The examples appear in the Migration_South Africa_males.xlsx workbook.

A: Estimating net immigration of foreigners using place of birth

Step 1: Decide on survival factors

The survival factors are shown in the fifth column of Table 1. The values are derived from (the years of life lived in each age group of) the alternative life table entered in the Models spreadsheet, for those aged 20 to 24 last birthday and those aged 80 and over at the time of the first census, and those born between the two censuses, as follows:

$$ {}_5S_{20,5} = \frac{{}_5L_{25}}{{}_5L_{20}} = \frac{4.3382}{4.4975} = 0.96458, $$

$$ S_{80,5} = \frac{T_{85}}{T_{80}} = \frac{0.75180}{1.19603} = 0.40912 $$

and

$$ S_{B,5} = \frac{{}_5L_0}{5 \cdot l_0} = \frac{4.707549}{5} = 0.94151. $$

Table 1 Estimation of deaths of foreign-born and the net number of immigrants by age group, South Africa, 2001-2006

 

| Age | 2001 | 2006 | x | 5Sx | Age at 2nd census | D^F | Net M |
|---|---|---|---|---|---|---|---|
| | | | B | 0.94151 | | | |
| 0-4 | 8,963 | 12,577 | 0 | 0.97896 | 0-4 | 391 | 12,968 |
| 5-9 | 10,390 | 13,724 | 5 | 0.99547 | 5-9 | 242 | 5,003 |
| 10-14 | 13,508 | 13,998 | 10 | 0.99427 | 10-14 | 55 | 3,664 |
| 15-19 | 27,835 | 27,943 | 15 | 0.98602 | 15-19 | 119 | 14,555 |
| 20-24 | 69,787 | 59,493 | 20 | 0.96458 | 20-24 | 616 | 32,275 |
| 25-29 | 87,381 | 95,763 | 25 | 0.93161 | 25-29 | 2,994 | 28,970 |
| 30-34 | 73,338 | 100,450 | 30 | 0.90960 | 30-34 | 6,675 | 19,743 |
| 35-39 | 66,663 | 85,490 | 35 | 0.89780 | 35-39 | 7,563 | 19,715 |
| 40-44 | 59,152 | 75,684 | 40 | 0.89092 | 40-44 | 7,701 | 16,721 |
| 45-49 | 45,184 | 66,113 | 45 | 0.88633 | 45-49 | 7,274 | 14,234 |
| 50-54 | 40,398 | 55,913 | 50 | 0.87224 | 50-54 | 6,154 | 16,883 |
| 55-59 | 30,640 | 42,833 | 55 | 0.84731 | 55-59 | 5,717 | 8,153 |
| 60-64 | 24,376 | 34,433 | 60 | 0.80885 | 60-64 | 5,442 | 9,234 |
| 65-69 | 17,895 | 25,588 | 65 | 0.75468 | 65-69 | 5,353 | 6,564 |
| 70-74 | 13,561 | 18,989 | 70 | 0.66991 | 70-74 | 5,281 | 6,375 |
| 75-79 | 10,238 | 12,850 | 75 | 0.56388 | 75-79 | 5,404 | 4,693 |
| 80-84 | 7,658 | 7,461 | 80+ | 0.40912 | 80-84 | 5,118 | 2,341 |
| 85+ | 4,455 | 5,305 | | | 85+ | 7,410 | 602 |
| Total | 611,423 | 754,608 | | | Total | 79,509 | 222,693 |

Step 2: Estimate the number of deaths

Since we have data on the number of foreign-born people in the population by age group for each census we can estimate the number of deaths of foreign-born people which occurred in the period between the two censuses by age group using the numbers of foreigners in each census given in the second and third columns of Table 1. For those aged 20 to 24 last birthday and those aged 80 and over at the time of the first census, and those born between the two censuses, the calculations are as follows:

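$$ {}_5D^F_{20} = \frac{1}{2}\left({}_5N^F_{20}(2001)\cdot{}_5S_{20,5} + {}_5N^F_{25}(2006)\right)\left(\frac{1}{{}_5S_{20,5}} - 1\right) = \frac{1}{2}\left(69787 \times 0.96458 + 95763\right)\left(\frac{1}{0.96458} - 1\right) = 2994, $$

$$ D^F_{80} = \frac{1}{2}\left(N^F_{80}(2001)\cdot S_{80,5} + N^F_{85}(2006)\right)\left(\frac{1}{S_{80,5}} - 1\right) = \frac{1}{2}\left((7658 + 4455) \times 0.40912 + 5305\right)\left(\frac{1}{0.40912} - 1\right) = 7410 $$

and

$$ D^F_B = \frac{1}{2}\left({}_5N^F_0(2006)\right)\left(\frac{1}{S_{B,5}} - 1\right) = \frac{1}{2} \times 12577 \times \left(\frac{1}{0.94151} - 1\right) = 391. $$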

If data and/or survival factors were not available by age group then one could estimate the total number of deaths of the foreign born people as follows, given an estimate of the crude mortality rate in the population of 14 per 1,000:

$$ D^F_0 = \frac{5}{2}\left(N^F_0(2001) + N^F_0(2006)\right) m_0 = \frac{5}{2}\left(611423 + 754608\right)\frac{14}{1000} = 47811. $$

Step 3: Estimate the net number of immigrants (of foreigners)

Since data are available by age group for each census, age-specific net immigration of those born outside the country can be estimated as follows:

$$ \text{Net } {}_5M^F_{20} = {}_5N^F_{25}(2006) - {}_5N^F_{20}(2001) + {}_5D^F_{20} = 95763 - 69787 + 2994 = 28970, $$

$$ \text{Net } M^F_{80} = N^F_{85}(2006) - N^F_{80}(2001) + D^F_{80} = 5305 - (7658 + 4455) + 7410 = 602, $$

$$ \text{Net } M^F_B = {}_5N^F_0(2006) + D^F_B = 12577 + 391 = 12968. $$

If data and/or survival factors were not available by age group then one could estimate the total net number of immigrants as follows:

$$ \text{Net } M^F_0 = N^F_0(2006) - N^F_0(2001) + D^F_0 = 754608 - 611423 + 47811 = 190996. $$
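The whole of Table 1 can be reproduced with a short script. The sketch below is illustrative rather than a copy of the workbook: the variable names and list layout are assumptions, and the inputs are the rounded counts and survival factors printed in Table 1, so individual results may differ from the table by a person or so.

```python
# Foreign-born males by age group (0-4, 5-9, ..., 80-84, 85+), from Table 1.
n_2001 = [8963, 10390, 13508, 27835, 69787, 87381, 73338, 66663, 59152,
          45184, 40398, 30640, 24376, 17895, 13561, 10238, 7658, 4455]
n_2006 = [12577, 13724, 13998, 27943, 59493, 95763, 100450, 85490, 75684,
          66113, 55913, 42833, 34433, 25588, 18989, 12850, 7461, 5305]

# Five-year survival factors: births, cohorts aged x = 0, 5, ..., 75, then 80+.
s_births = 0.94151
s_closed = [0.97896, 0.99547, 0.99427, 0.98602, 0.96458, 0.93161, 0.90960,
            0.89780, 0.89092, 0.88633, 0.87224, 0.84731, 0.80885, 0.75468,
            0.66991, 0.56388]
s_open = 0.40912


def cohort_deaths(pop_first, pop_second, survival):
    """Deaths in a cohort between the two counts (Step 2)."""
    return 0.5 * (pop_first * survival + pop_second) * (1.0 / survival - 1.0)


# Cohort sizes at the first count, ordered by age at the second count.
# Births have no members at the first count; 80-84 and 85+ are pooled.
first = [0] + n_2001[:16] + [n_2001[16] + n_2001[17]]
survival = [s_births] + s_closed + [s_open]

deaths = [cohort_deaths(f, second, s)
          for f, second, s in zip(first, n_2006, survival)]
net_m = [second - f + d for f, second, d in zip(first, n_2006, deaths)]

for age_index in (0, 1, 5, 17):  # 0-4, 5-9, 25-29 and 85+ at the second count
    print(age_index, round(deaths[age_index]), round(net_m[age_index]))
```

Setting the first-census count of those born between the censuses to zero makes the general cohort formula collapse to the expressions given for births in Steps 2 and 3.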

 

B: Estimating sub-national regional net in-migration using place of birth

The second and third columns of Table 2 show the numbers of people living in the Western Cape province of South Africa who were born outside the province, as counted by the 2001 Census and the 2007 Community Survey, respectively. Although the same survival factors (column 5) have been used as in the example of Method A, this should not be done if the mortality experience of the native-born and of immigrants is thought to be very different. The final column of Table 2 gives the net numbers of migrants into the Western Cape who were born in provinces other than the Western Cape, for the different age groups. Thus, in total, a net 213,911 people born outside the Western Cape moved to the Western Cape (i.e. after subtracting those who moved out).

Table 2 Estimation of the net number of in-migrants of those born outside by age group, Western Cape, South Africa, 2001-2006

 

| Age | 2001 | 2006 | x | 5Sx | Age at 2nd census | D^O | Net M (born out) |
|---|---|---|---|---|---|---|---|
| | | | B | 0.94151 | | | |
| 0-4 | 16,443 | 19,012 | 0 | 0.97896 | 0-4 | 591 | 19,602 |
| 5-9 | 24,406 | 28,743 | 5 | 0.99547 | 5-9 | 482 | 12,782 |
| 10-14 | 31,134 | 30,792 | 10 | 0.99427 | 10-14 | 125 | 6,511 |
| 15-19 | 44,478 | 53,933 | 15 | 0.98602 | 15-19 | 245 | 23,043 |
| 20-24 | 74,011 | 82,526 | 20 | 0.96458 | 20-24 | 896 | 38,944 |
| 25-29 | 80,187 | 89,522 | 25 | 0.93161 | 25-29 | 2,954 | 18,466 |
| 30-34 | 65,833 | 90,783 | 30 | 0.90960 | 30-34 | 6,074 | 16,670 |
| 35-39 | 56,393 | 76,475 | 35 | 0.89780 | 35-39 | 6,776 | 17,417 |
| 40-44 | 44,420 | 59,692 | 40 | 0.89092 | 40-44 | 6,268 | 9,567 |
| 45-49 | 32,862 | 47,612 | 45 | 0.88633 | 45-49 | 5,338 | 8,529 |
| 50-54 | 28,178 | 37,969 | 50 | 0.87224 | 50-54 | 4,303 | 9,409 |
| 55-59 | 19,983 | 30,205 | 55 | 0.84731 | 55-59 | 4,012 | 6,039 |
| 60-64 | 17,569 | 25,593 | 60 | 0.80885 | 60-64 | 3,832 | 9,442 |
| 65-69 | 11,216 | 20,802 | 65 | 0.75468 | 65-69 | 4,137 | 7,371 |
| 70-74 | 8,365 | 12,612 | 70 | 0.66991 | 70-74 | 3,426 | 4,822 |
| 75-79 | 5,919 | 8,434 | 75 | 0.56388 | 75-79 | 3,458 | 3,528 |
| 80-84 | 4,063 | 5,061 | 80+ | 0.40912 | 80-84 | 3,248 | 2,390 |
| 85+ | 2,152 | 2,183 | | | 85+ | 3,413 | -620 |
| Total | 567,613 | 721,949 | | | Total | 59,576 | 213,911 |

The second and third columns of Table 3 present the numbers of people living in provinces other than the Western Cape who were born in the Western Cape, as counted by the 2001 Census and the 2007 Community Survey, respectively. The net number of out-migrants of those born in the Western Cape (i.e. the number of people born in the Western Cape who moved out, less those who returned) is given in column 8. The negative numbers mean that there was negative net out-migration (i.e. the number of those born in the Western Cape who moved to other provinces in the period was less than the number born in the Western Cape and living outside it who returned during the period). Thus the total of -19,017 means that the number of people born in the Western Cape who returned to the Western Cape during the period, having lived in another province until 2001, exceeded the number who were born in the Western Cape and moved to another province in the period by 19,017.

These estimates were derived using the same survival factors as were used for those born outside the Western Cape who moved into the province, but if there was reason to suppose that the mortality differed for those born in the Western Cape who moved out, then a different set of survival factors would be used to estimate the Net M (born in) numbers.

The overall net in-migration for the province is thus given in the final column of Table 3. Thus in total 232,928 more people moved into the Western Cape than left the Western Cape to live in another province.

In this example those born outside the province include those born outside the country and thus the overall net migration includes immigrants who settle in the province. Excluding the foreign-born from Table 2 would produce numbers of internal in-migrants net of internal out-migrants, and the sum of these numbers for all the provinces together would be zero.

Table 3 Estimation of the net number of out-migrants of those born inside by age group, Western Cape, South Africa, 2001-2006

 

| Age | 2001 | 2006 | x | 5Sx | Age at 2nd census | D^I | Net M (born in) | Overall Net M |
|---|---|---|---|---|---|---|---|---|
| | | | B | 0.94151 | | | | |
| 0-4 | 22,055 | 11,747 | 0 | 0.97896 | 0-4 | 365 | 12,112 | 7,490 |
| 5-9 | 21,895 | 12,509 | 5 | 0.99547 | 5-9 | 367 | -9,180 | 21,962 |
| 10-14 | 21,382 | 11,593 | 10 | 0.99427 | 10-14 | 76 | -10,226 | 16,737 |
| 15-19 | 18,265 | 13,455 | 15 | 0.98602 | 15-19 | 100 | -7,827 | 30,870 |
| 20-24 | 14,645 | 10,477 | 20 | 0.96458 | 20-24 | 202 | -7,587 | 46,531 |
| 25-29 | 13,501 | 9,534 | 25 | 0.93161 | 25-29 | 434 | -4,676 | 23,142 |
| 30-34 | 13,118 | 11,047 | 30 | 0.90960 | 30-34 | 867 | -1,587 | 18,257 |
| 35-39 | 12,121 | 14,614 | 35 | 0.89780 | 35-39 | 1,319 | 2,815 | 14,602 |
| 40-44 | 11,725 | 12,195 | 40 | 0.89092 | 40-44 | 1,311 | 1,384 | 8,183 |
| 45-49 | 10,335 | 10,538 | 45 | 0.88633 | 45-49 | 1,285 | 98 | 8,431 |
| 50-54 | 9,211 | 9,881 | 50 | 0.87224 | 50-54 | 1,221 | 768 | 8,642 |
| 55-59 | 7,264 | 10,568 | 55 | 0.84731 | 55-59 | 1,362 | 2,720 | 3,319 |
| 60-64 | 6,691 | 7,723 | 60 | 0.80885 | 60-64 | 1,250 | 1,710 | 7,732 |
| 65-69 | 4,643 | 5,297 | 65 | 0.75468 | 65-69 | 1,265 | -128 | 7,499 |
| 70-74 | 3,954 | 3,766 | 70 | 0.66991 | 70-74 | 1,182 | 304 | 4,517 |
| 75-79 | 2,331 | 2,384 | 75 | 0.56388 | 75-79 | 1,240 | -330 | 3,858 |
| 80-84 | 1,402 | 2,140 | 80+ | 0.40912 | 80-84 | 1,336 | 1,145 | 1,244 |
| 85+ | 707 | 555 | | | 85+ | 1,024 | -531 | -89 |
| Total | 195,246 | 160,023 | | | Total | 16,206 | -19,017 | 232,928 |

C: Estimating internal migration between sub-national regions from data on place of residence at previous census

Table 4 presents the results of the answers to the question about place (province in this example) of residence at the time of the 2001 Census given by those counted in each of the provinces in the 2007 Community Survey. (In actual fact the question asked whether the person was staying at the same place at the time of the prior census and if not, where they were staying at the time they moved to the place at which they were counted in the Community Survey. However, work by Dorrington and Moultrie (2009) shows that using these data and the year of movement to back project the population in order to estimate the numbers by province of residence at the time of the previous survey suggests that the assumption that there was only one move in the five years since the previous census was reasonably accurate.)

By far the largest numbers of migrants are those that moved within each of the provinces; however, these have been excluded from Table 4 because one is usually more interested in interprovincial migration than in migration within a province.

Table 4 Interprovincial migration, South Africa, 2001-2006

 

 

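Columns give the province where counted (destination); rows give the previous residence (origin).

| Previous residence (origin) | WC | EC | NC | FS | KZ | NW | GT | MP | LM | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| WC | | 12,173 | 4,060 | 1,745 | 3,221 | 2,113 | 16,400 | 1,405 | 874 | 41,992 |
| EC | 52,239 | | 1,120 | 7,187 | 25,209 | 14,430 | 28,633 | 4,693 | 2,116 | 135,626 |
| NC | 4,813 | 1,942 | | 3,480 | 908 | 3,728 | 4,956 | 1,062 | 357 | 21,246 |
| FS | 2,943 | 3,145 | 2,546 | | 2,352 | 12,733 | 19,920 | 4,293 | 1,963 | 49,896 |
| KZ | 6,762 | 7,015 | 631 | 2,358 | | 3,573 | 50,980 | 8,886 | 1,194 | 81,399 |
| NW | 1,478 | 907 | 9,811 | 5,555 | 2,329 | | 47,633 | 3,090 | 4,337 | 75,140 |
| GT | 24,891 | 12,948 | 3,962 | 11,437 | 18,145 | 32,433 | | 18,598 | 15,133 | 137,547 |
| MP | 2,134 | 1,317 | 280 | 1,724 | 4,546 | 5,767 | 42,941 | | 8,628 | 67,338 |
| LM | 2,754 | 1,583 | 255 | 1,709 | 2,209 | 9,773 | 81,394 | 24,211 | | 123,889 |
| OSA | 21,221 | 5,467 | 1,209 | 9,584 | 10,933 | 11,437 | 51,873 | 8,335 | 9,286 | 129,346 |
| DNK | 500 | 3 | 15 | 124 | 132 | 78 | 228 | 89 | 0 | 1,170 |
| UNS | 1,058 | 1,029 | 107 | 208 | 875 | 508 | 3,558 | 408 | 633 | 8,384 |
| Total | 120,794 | 47,528 | 23,996 | 45,111 | 70,860 | 96,573 | 348,516 | 75,070 | 44,524 | 872,973 |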

WC = Western Cape, EC = Eastern Cape, NC = Northern Cape, FS = Free State, KZ = KwaZulu-Natal, NW = North West, GT = Gauteng, MP = Mpumalanga, LM = Limpopo, OSA = Outside SA, DNK = Do not know, UNS = Unspecified
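To show how in-, out- and net migration are read off an origin-destination table such as Table 4, the sketch below uses only the flows between three of the provinces (WC, EC and GT). Restricting the table in this way is purely for brevity, so the results are not the provincial totals of Table 4; the dictionary layout and variable names are assumptions made for the example.

```python
# Surviving migrants between three provinces, taken from Table 4.
# flows[origin][destination] = number counted in the destination province
flows = {
    "WC": {"EC": 12173, "GT": 16400},
    "EC": {"WC": 52239, "GT": 28633},
    "GT": {"WC": 24891, "EC": 12948},
}

# Out-migrants are row sums; in-migrants are column sums.
out_migrants = {origin: sum(dests.values()) for origin, dests in flows.items()}
in_migrants = {
    province: sum(dests.get(province, 0) for dests in flows.values())
    for province in flows
}
net_in = {p: in_migrants[p] - out_migrants[p] for p in flows}

for province in flows:
    print(province, in_migrants[province], out_migrants[province], net_in[province])
```

Because every move out of one of these provinces is a move into another, the net figures sum to zero, which is the consistency property of inter-regional flows noted in the description of Method C.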

In addition to the all-age numbers in Table 4 (in actual fact these numbers exclude, as is often the case, migration of those born between the census and survey) one can also produce numbers of in- and out-migration by age groups as shown in Table 5. For completeness these numbers include estimates of the number of migrants who were born since the previous census. However, relative to the other migrants these numbers look implausibly high, and the reason for this is discussed below.

The net number of migrants is estimated for those aged 25-29 at the time of the Community Survey (i.e. were aged 20-24 at the time of the 2001 census), for example, as follows:

$$ {}_5M_x = \left( 20675 - 5649 + \frac{20675 - 5649}{0.96458} \right) / 2 = 15301. $$

Table 5 Estimation of the net number of in-migrants by age group, Western Cape, South Africa, 2001-2006

 

| Age | Surviving in-migrants (I’) | Surviving out-migrants (O’) | x | 5Sx | Net in-migrants |
|---|---|---|---|---|---|
| 0-4 | 20,846 | 11,747 | B | 0.94151 | 9,381 |
| 5-9 | 6,586 | 3,554 | 0 | 0.97896 | 3,065 |
| 10-14 | 6,685 | 2,882 | 5 | 0.99547 | 3,812 |
| 15-19 | 10,402 | 3,967 | 10 | 0.99427 | 6,454 |
| 20-24 | 21,266 | 4,488 | 15 | 0.98602 | 16,897 |
| 25-29 | 20,675 | 5,649 | 20 | 0.96458 | 15,301 |
| 30-34 | 15,584 | 6,008 | 25 | 0.93161 | 9,928 |
| 35-39 | 10,584 | 5,098 | 30 | 0.90960 | 5,758 |
| 40-44 | 7,264 | 3,045 | 35 | 0.89780 | 4,458 |
| 45-49 | 4,648 | 2,714 | 40 | 0.89092 | 2,053 |
| 50-54 | 3,095 | 1,500 | 45 | 0.88633 | 1,698 |
| 55-59 | 3,940 | 935 | 50 | 0.87224 | 3,225 |
| 60-64 | 3,776 | 527 | 55 | 0.84731 | 3,541 |
| 65-69 | 3,127 | 818 | 60 | 0.80885 | 2,582 |
| 70-74 | 1,540 | 437 | 65 | 0.75468 | 1,282 |
| 75-79 | 561 | 206 | 70 | 0.66991 | 442 |
| 80-84 | 797 | 116 | 75 | 0.56388 | 944 |
| 85+ | 264 | 47 | 80+ | 0.40912 | 374 |
| Total | 141,640 | 53,739 | | | 91,194 |

Diagnostics, analysis and interpretation

Checks and validation

Perhaps the simplest check, on the reasonableness of the ‘shape’ (i.e. distribution of the numbers by age) of the estimates but not the level, is to see if it conforms to the standard shape (or a variation thereof). Rogers and Castro (1981a; 1981b) point out that the distribution of the number (or rate) of in- and out-migrants tends to conform to standard patterns: a peak in the young adult ages (usually associated with seeking employment), and a second, usually less pronounced, peak amongst very young children, falling to a trough amongst young teenagers (the size of this second peak depending on the extent to which it is families rather than individuals moving in the young to middle adult ages). Sometimes there is also a ‘hump’ (or trough) around retirement age if there is a strong flow of migrants moving to (or away from) the place to retire.

These patterns (not necessarily the same pattern) apply to in- and out-migration flows separately, but not necessarily to net migration (which is the difference between the two flows) unless one flow (either the in-migration or the out-migration) is much greater than the other.

Figure 1 illustrates this using some of the estimates calculated above, expressed as proportions of the total number in each case (to allow them to be presented on a single figure). From this we can see that in broad terms (with the exception in some cases, where the proportion of migrants at the very young ages looks implausibly high) each conforms to the expected shape.
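The conversion of estimates to proportions of the total, used to compare shapes on a common scale, can be sketched as follows. The variable names are assumptions; the numbers are the Net M column of Table 1 (net immigration of the foreign-born, by age at the second census).

```python
ages = ["0-4", "5-9", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39",
        "40-44", "45-49", "50-54", "55-59", "60-64", "65-69", "70-74",
        "75-79", "80-84", "85+"]
# Net M column of Table 1, by age at the second census.
net_m = [12968, 5003, 3664, 14555, 32275, 28970, 19743, 19715, 16721,
         14234, 16883, 8153, 9234, 6564, 6375, 4693, 2341, 602]

total = sum(net_m)
proportions = [m / total for m in net_m]

# Locate the adult peak and print a crude text profile of the age pattern.
peak_age = ages[proportions.index(max(proportions))]
for age, share in zip(ages, proportions):
    print(f"{age:>6} {share:6.3f} " + "#" * round(share * 100))
print("peak:", peak_age)
```

For these data the largest share falls in the 20-24 age group, with a smaller secondary peak at ages 0-4 and a trough at 10-14, which is the pattern the text describes.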

The net out-migration of those born in the Western Cape (excluded from the figure for ease of illustration) does not conform to a standard model of migration, which could indicate that these numbers are not very reliable. However, they are small relative to the in-migration of those born outside the province, and thus such a deviation may be tolerated. In addition to this there are two other features to be noted from Figure 1. The first is that the out-migration from the Western Cape, as estimated from data on place of residence at previous census, suggests that adult out-migrants peak at a somewhat older age (and are possibly more likely to represent family rather than individual migration). The second is that the net immigration into the country follows the standard shape, which indicates that the flow into the country is much stronger than the return flow of those migrants.

[Figure 1]

If the census asked both place of birth and place of residence at the previous census, then one can compare the two estimates of net in-migration into a specific sub-national region. If they are similar, this gives one some confidence in the results. In the case of South Africa, the place of birth data give a net number of in-migrants into the Western Cape of 232,928 (Table 3), while the data on place of residence at the time of the previous census give an estimate of 91,194 (Table 5), which suggests that one or both of these sets of data are suspect.

The most basic check of the estimates of migration is to project the population (of the country or the province) at the first census to the time of the second census, making use of the estimates of the number of migrants, and to compare the result with the population counted in the second, more recent, census to see how well the two match, especially in the age range in which migration is concentrated. In the case of the net in-migration into the Western Cape, projecting the population forward from 2001 using the estimates derived from the change in the numbers by place of birth produced a much closer fit to the population in the 20-29 year age range, suggesting that the data on place of birth are probably more complete than those on the place of residence at the date of the previous census. To some extent this is supported by a comparison of the net number of foreign-born immigrants estimated from the change in the number of foreign-born in the country between the two censuses, 222,693 (Table 1), with the sum of the numbers who reported that they had moved from outside South Africa to one of the provinces since the previous census, 129,346 (Table 4).

Ideally, if one had independent estimates of the number of migrants one might compare those numbers against estimates using the above methods. Unfortunately, reliable independent estimates are rare. Although most countries try to record people entering and leaving the country, these data are often not reliable, particularly in developing countries with relatively porous borders. And unless the country is extremely well regulated and maintains a complete and accurate register of the population, the only other way to measure internal migration is through migration-specific surveys. These tend to be much more useful for understanding the type of migration (whether permanent, temporary, cyclical, etc.) than for producing reliable estimates of the number of migrants, given the often less structured situation in which (particularly recent) migrants find themselves living and an understandable reluctance to identify themselves as migrants.

Interpretation

Considering the numbers of migrants estimated from the data on place of residence at the previous census given in Table 4 (and taking into account the suspicion that these probably underestimate the true migration), some 2-4% of the population changed province of residence in the 5 years between the 2001 Census and the Community Survey. Had we included the number who moved within, but did not change, province then between 7 and 15 per cent of the population moved in the 5‑year period.

The main provinces of destination are Gauteng (by a big margin) and Western Cape, which are predominantly urban and the wealthiest provinces. The main provinces of origin are Gauteng (inspection of the age distribution would show that this is mainly return migration of ‘retiring’ workers), Eastern Cape and Limpopo, the latter two being poor, mainly rural provinces from which people seeking work migrate to the urban areas.

It appears that migration is predominantly of individuals (seeking work) rather than of families.

Method-specific issues with interpretation

Scanning errors

A particular feature of the data relying on province of birth is the apparently relatively high number of children born since the first census who have moved to another province. In all likelihood this is an artefact of the data capturing process. Scanning was used to capture the data from the questionnaires, on which Western Cape was coded as a “1”, written in the appropriate space by hand. It appears that in a small percentage of cases the scanner might have had trouble distinguishing a handwritten “1” from a handwritten “7” (the code for Gauteng). The result is, for example, that some children were coded as having been born outside the province in which they were counted, and thus appear to be migrants, when they probably were not. Even though the percentage error in scanning is very small, the number of births can be large relative to the number of migrants, and thus the error can produce noticeable distortions. Since an increasing number of developing countries are using scanning to capture data, this sort of problem may be quite common.

Where scanning errors or other problems make it impossible to produce reliable estimates of the number of migrants among those born since the previous census, one can use the child-woman ratio (CWR) from the second census as follows:

$$
\text{Net } {}_5M_0 = \tfrac{1}{4}\, CWR_0 \cdot \text{Net } {}_{30}M^f_{15}
$$

for those born in the most recent five years, and

$$
\text{Net } {}_5M_5 = \tfrac{3}{4}\, CWR_5 \cdot \text{Net } {}_{30}M^f_{20}
$$

for those born in the five years before that, if the censuses are 10 years apart, where $CWR_x$ represents the ratio of the number of children aged between x and x+5 to the number of women aged between 15+x and 45+x in the population (regional or national) at the time of the second census, and

$$
{}_{30}M^f_x
$$

represents the number of women migrants aged between x and x+30.

Applying this to the data for the Western Cape suggests that the number of migrants born since the previous census should be less than half the numbers being estimated from the data on place of birth.
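The child-woman ratio adjustment above can be sketched in a few lines of Python. This is only an illustration of the two formulas for censuses 10 years apart; the function names and all input numbers are hypothetical and are not taken from the South African data.

```python
def child_woman_ratio(children, women):
    """CWR_x: children aged x to x+5 per woman aged 15+x to 45+x,
    both counted at the second census."""
    return children / women


def net_child_migrants(cwr0, cwr5, net_women_15_44, net_women_20_49):
    """Net migrants born since the previous census (censuses 10 years apart).

    net_women_15_44 is the net number of women migrants aged 15 to 45,
    net_women_20_49 the net number of women migrants aged 20 to 50.
    """
    net_0_4 = 0.25 * cwr0 * net_women_15_44   # born in the most recent five years
    net_5_9 = 0.75 * cwr5 * net_women_20_49   # born in the five years before that
    return net_0_4, net_5_9


# Illustrative (made-up) inputs
cwr0 = child_woman_ratio(children=400_000, women=1_000_000)
cwr5 = child_woman_ratio(children=360_000, women=900_000)
net_0_4, net_5_9 = net_child_migrants(cwr0, cwr5, 20_000, 18_000)
print(round(net_0_4), round(net_5_9))
```

With these made-up inputs both ratios are 0.4, so the estimates are roughly 2,000 children aged 0-4 and 5,400 aged 5-9. These figures could then be compared with the numbers implied by the place of birth data.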

Detailed description of method

Mathematical exposition

The indirect estimation of migration derives from the balance equation for two censuses n years apart, namely:

$$
{}_5N_{x+n}(t+n) = {}_5N_x(t) - {}_5D_x + {}_5I'_x - {}_5O'_x = {}_5N_x(t) - {}_5D_x + {}_5M'_x
$$

where

$$
{}_5M'_x = {}_5I'_x - {}_5O'_x
$$

is the net number of migrants (i.e. in-migrants less out-migrants), aged x to x+5 at the time of the first census, surviving to the second census, and ${}_5D_x$, ${}_5I'_x$ and ${}_5O'_x$ represent, respectively, the numbers of deaths, surviving in-migrants and surviving out-migrants among those aged x to x+5 at the time of the first census, who died or moved in the period between the censuses.

For those born after the first census the equation becomes:

$$
{}_nN_0(t+n) = B - D_B + M'_B
$$

and those in the open age interval:

$$
N_A(t+n) = N_{A-n}(t) - D_{A-n} + M'_{A-n}
$$

where $B$ represents the number of births in the population between the two censuses, $D_B$ the number of deaths of those births in the period between the censuses, $M'_B$ the net number of surviving migrants born outside the country in the period between the two censuses, $D_{A-n}$ the number of deaths in the intercensal period of those aged A-n and older at the time of the first census, and $M'_{A-n}$ the net number of surviving migrants aged A-n and older at the time of the first census.

Thus

$$
\begin{aligned}
{}_5M'_x &= {}_5N_{x+n}(t+n) - {}_5N_x(t) + {}_5D_x \\
M'_B &= {}_nN_0(t+n) - B + D_B \\
M'_{A-n} &= N_A(t+n) - N_{A-n}(t) + D_{A-n}
\end{aligned}
$$

or alternatively

$$
\begin{aligned}
{}_5M'_x &= {}_5N_{x+n}(t+n) - {}_5N_x(t) \cdot {}_5S_x \\
M'_B &= {}_nN_0(t+n) - B \cdot S_B \\
M'_{A-n} &= N_A(t+n) - N_{A-n}(t) \cdot S_{A-n}
\end{aligned}
$$

where ${}_5S_x$, $S_B$ and $S_{A-n}$ represent the proportions of the populations aged x to x+5 at the time of the first census, born between the censuses, and aged A-n and older at the time of the first census, respectively, surviving to the second census.

The net number of migrants can thus be estimated from the net number surviving to the second census as follows:

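$$
\begin{aligned}
{}_5M_x &= \frac{{}_5M'_x + {}_5M'_x / {}_5S_x}{2} = {}_5M'_x \, \frac{{}_5S_x + 1}{2\,{}_5S_x} \\
M_B &= M'_B \, \frac{S_B + 1}{2\,S_B} \\
M_{A-n} &= M'_{A-n} \, \frac{S_{A-n} + 1}{2\,S_{A-n}}.
\end{aligned}
$$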
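As a rough illustration of the two steps above (surviving net migrants from the balance equation, then the adjustment for migrants who died before the second census), the following Python sketch works through two cohorts. The function names and the numbers are hypothetical; the inputs are assumed to be already aligned by cohort, so that the i-th entry of each list refers to the same group of people at the two censuses.

```python
def surviving_net_migrants(pop_first, pop_second, survival):
    """M'_x = N_{x+n}(t+n) - N_x(t) * S_x for each cohort.

    pop_first[i]  : cohort size at the first census
    pop_second[i] : size of the same cohort at the second census
    survival[i]   : proportion of the cohort expected to survive the interval
    """
    return [n2 - n1 * s for n1, n2, s in zip(pop_first, pop_second, survival)]


def total_net_migrants(surviving, survival):
    """M_x = M'_x * (S_x + 1) / (2 * S_x).

    This averages the surviving number with the number that would have
    moved had all migration occurred at the start of the interval.
    """
    return [m * (s + 1) / (2 * s) for m, s in zip(surviving, survival)]


# Illustrative (made-up) cohorts
pop_first = [1000.0, 800.0]
pop_second = [1100.0, 700.0]
survival = [0.95, 0.80]

m_surviving = surviving_net_migrants(pop_first, pop_second, survival)
m_total = total_net_migrants(m_surviving, survival)
print([round(m, 1) for m in m_surviving])
print([round(m, 1) for m in m_total])
```

The adjustment is small when survival is high and grows at older ages, which is why it matters most for the open age interval.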

Unfortunately, since the net number of migrants is usually small relative to the size of the population, age misstatement or errors in either or both census counts can lead to very poor estimates being produced. Better estimates of the net number of immigrants into a country can be produced by confining one’s attention to the population of foreigners (defined as those born outside the country) and assuming that return migration of emigrants from the country of interest is insignificant. Thus one replaces each of the symbols above by equivalents specific to the foreign-born population in the country. Since it is unlikely that one has an accurate record of the number of deaths of the foreign-born, these need to be estimated in one of the following ways:

  • Option 1 (Life table survival ratios): Applying survival ratios derived from a suitable model life table, so that

$$
{}_5S_x = \frac{{}_5L_{x+n}}{{}_5L_x}, \quad S_B = \frac{{}_nL_0}{n \cdot l_0} \quad \text{and} \quad S_{A-n} = \frac{T_A}{T_{A-n}}.
$$

  • Option 2 (Census survival ratios): Assuming that emigration of the native-born population is insignificant and that the proportions surviving are the same as those in the native-born population, then

$$
{}_5S_x = \frac{{}_5N^{nb}_{x+n}(t+n)}{{}_5N^{nb}_x(t)}, \quad S_B = \frac{{}_nN^{nb}_0(t+n)}{B^{nb}} \quad \text{and} \quad S_{A-n} = \frac{N^{nb}_A(t+n)}{N^{nb}_{A-n}(t)},
$$
where the superscript “nb” designates native-born.

  • Option 3 (Vital registration): Where one has access to numbers of births and deaths from another source such as vital registration (which is only likely to be the case, if at all, with internal migration), one could work with deaths and births corresponding to the migrant population directly instead of survival ratios to estimate the net number of surviving in-migrants. Alternatively the net number of migrants can be derived as above by setting

$$
{}_5S_x = 1 - \frac{{}_5D_x}{{}_5N_x(t)}, \quad S_B = 1 - \frac{D_B}{B} \quad \text{and} \quad S_{A-n} = 1 - \frac{D_{A-n}}{N_{A-n}(t)}
$$

where the births and deaths are from the vital registration. 

However, for most developing countries, particularly those in Africa, vital registration systems are too incomplete to be used in this way.
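The first two options for deriving survival ratios can be sketched as follows. This is a minimal illustration, assuming five-year age groups, an intercensal interval n that is a multiple of five years, and a life table supplied as a list of ${}_5L_x$ values from age 0 up to the start of the open interval A. The function names and the life table values are hypothetical.

```python
def life_table_survival_ratios(L, T_A, l0=1.0, n=5):
    """Option 1: survival ratios from life table columns.

    L   : list of 5Lx values for ages 0-4, 5-9, ..., up to age A
    T_A : person-years lived above age A (start of the open interval)
    l0  : radix of the life table
    n   : years between the censuses (assumed to be a multiple of 5)
    """
    if n <= 0 or n % 5 != 0:
        raise ValueError("n must be a positive multiple of 5")
    k = n // 5
    # 5Sx = 5L(x+n) / 5Lx for cohorts that stay within closed age groups
    closed = [L[i + k] / L[i] for i in range(len(L) - k)]
    # S_B = nL0 / (n * l0), where nL0 is the sum of the first k values of 5Lx
    s_birth = sum(L[:k]) / (n * l0)
    # S_(A-n) = T_A / T_(A-n), where T_(A-n) = T_A plus the last k values of 5Lx
    s_open = T_A / (T_A + sum(L[-k:]))
    return closed, s_birth, s_open


def census_survival_ratios(native_first, native_second):
    """Option 2: ratio of native-born cohort sizes at the two censuses.

    Both lists must be aligned by cohort, not by age group.
    """
    return [n2 / n1 for n1, n2 in zip(native_first, native_second)]


# Illustrative (made-up) life table with radix 1, ages 0-14, open interval 15+
closed, s_birth, s_open = life_table_survival_ratios(
    L=[4.80, 4.75, 4.70], T_A=55.0, l0=1.0, n=5
)
census = census_survival_ratios([1000.0, 900.0], [960.0, 855.0])
print([round(s, 4) for s in closed], round(s_birth, 4), round(s_open, 4))
print(census)
```

Census survival ratios above one are a sign of differential coverage or age misstatement between the two censuses, or of native-born return migration, and would need investigating before use.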

Internal migration

When it comes to internal migration one can estimate net in-migration (i.e. in-migration of those born outside the region less out-migration of those born outside the region who had previously moved into the region) into each sub-national region of those born outside the region by making use of place of birth information to identify the change in numbers of those born outside the region, in the same way as described above. However, since one also has the place of residence of those born in the region who have moved out of the region since birth (but not emigrated) one can also estimate the net out-migration of those born in the region (i.e. out-migration of those born in the region less those born in the region who have returned after having previously moved out of the region) by applying the method described above to the population born in the region (as opposed to those born outside the region).

When estimating the survival of those born in the various regions, the census survival ratios could have an advantage over the life table survival ratios, in that any undercount or overcount of the population by region may well be matched by a similar distortion in the national population and hence in the survival ratios. This would result in a more accurate estimate of the number of migrants than would be produced by using life table survival ratios.

Apart from place of birth a census can ask of those who moved since the previous census (or some other suitable date) where they were at that census (or some other suitable date) which allows one to measure out-migration and hence (gross) in-migration separately for each sub-national region.

If the census asks for the year when the migrant moved (or how long the person has been living in the place where counted in the second census) one can get a sense of the timing of migration, and estimate yearly migration rates. This is a complicated process and is not covered here, but the interested reader is referred to the paper by Dorrington and Moultrie (2009).

Working with total numbers only

If age-specific numbers are not available, or the allocation to age is considered to be unreliable, one can still produce estimates by age by estimating the total number of migrants as described below, and then apportioning this total to the age groups using either an age distribution for the same population at a different time (since the age distribution of migration flows tends to be consistent over time) or, more likely, an appropriate standard model (Rogers and Castro 1981a; 1981b). The total number of net migrants is estimated as:

$$
\text{Net } M^F_0 = N^F_0(t+n) - N^F_0(t) + D^F_0
$$

where

$$
D^F_0 = \frac{n}{2} \left( N^F_0(t) + N^F_0(t+n) \right) m_0
$$

and m0 is an estimate of the crude mortality rate of the population in the country of the census.
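The calculation with total numbers, followed by apportionment to age groups, might be sketched as below. The function names, the population totals, the crude death rate and the age shares are all hypothetical; in practice the age shares would come from an earlier distribution for the same population or from a model migration schedule.

```python
def net_migrants_all_ages(n_first, n_second, n_years, crude_death_rate):
    """Net migrants over all ages from two census totals.

    Deaths are approximated by applying the crude death rate to the
    person-years lived, taken as n/2 times the sum of the two totals.
    """
    deaths = n_years / 2 * (n_first + n_second) * crude_death_rate
    return n_second - n_first + deaths, deaths


def apportion_by_age(total, age_shares):
    """Split a total across age groups in proportion to the given shares."""
    share_sum = sum(age_shares)
    return [total * share / share_sum for share in age_shares]


# Illustrative (made-up) totals for the population of interest
net_total, deaths = net_migrants_all_ages(
    n_first=100_000, n_second=130_000, n_years=5, crude_death_rate=0.01
)
by_age = apportion_by_age(net_total, age_shares=[0.2, 0.5, 0.3])
print(round(deaths), round(net_total), [round(v) for v in by_age])
```

With these made-up inputs the estimated deaths are about 5,750 and the net number of migrants about 35,750.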

Limitations

The primary limitation of using censuses to estimate immigration and net in-migration is the quality of the censuses, in particular the extent of undercount, both in general and, more significantly, in one census relative to the other. However, even if the census undercount is low, the census might not identify all the migrants. In general, recent migrants are often difficult to include in a census because they have yet to settle. More specifically, immigrants may not be keen to identify themselves as immigrants and either avoid being counted or do not admit to being foreign-born.

Apart from this, place of birth and/or place of residence at previous census, in the case of internal migrants, might be misreported due to boundary changes or ignorance (or even bias) on the part of the respondent.

The third drawback of census data is that it cannot be used to measure emigration from the country of the census. Emigration is particularly difficult to estimate for most countries, but one option is to apply the method for identifying net immigration of the foreigners described above to the censuses of the main countries of destination to which the emigrants move to estimate the change in the numbers of emigrants to those countries. Of course, this is only useful if the censuses of these countries identify the numbers of foreign-born by their countries of birth reasonably accurately.

Generally, statistics on immigrants and particularly emigrants that are collected at border posts provide quite poor estimates of the true numbers, unless the borders of the country are quite impenetrable and there are a few well-controlled ports of entry. Even then there may still be many ‘visitors’ who end up living in the country.

A final drawback occurs when working with data aggregated over all ages. In these cases one usually has to make use of the crude death rate for the population of the country of the census in order to estimate the number of deaths of the migrant population. However, since the distribution of the migrant population by age can differ from that of the population of the country of the census quite markedly, the estimated number of deaths can be quite inaccurate.

Extensions of the method

Some censuses ask additional questions which can be of use in interpreting the patterns of migration, if not in improving the estimate of the level of migration. Most common of these is probably a question asking when the migrant moved. These data allow one to estimate annual rates of migration. However, it is possible that there could be a tendency for respondents to report moves as occurring more recently than is actually the case (Dorrington and Moultrie 2009).

Where a census asks those who moved since the previous census where they moved from most recently and when they moved, rather than where they were at the time of the previous census (as the recent censuses in South Africa have done), it is possible to back-project the numbers of migrants by applying annual rates of migration between sub-national regions to estimate the number by place at the time of the previous census (Dorrington and Moultrie 2009). However, in the case of South Africa, at least, it appears that the assumption that most migrants moved only once in the past five years, and thus that the place of residence before the most recent move is the same as the place at the time of the previous census, is quite reasonable (Dorrington and Moultrie 2009).

Where one has data on both the sub-national region of birth and the place at the time of the previous census, one can cross-tabulate the place of residence data by the place of birth and thus be able to classify recent migrants into primary, secondary and return migrants.
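A cross-tabulation of this kind can be sketched as follows. The classification rules used here are an assumption based on common usage rather than definitions given in this chapter: a primary migrant leaves the region of birth, a return migrant moves back to the region of birth, and a secondary migrant moves between two regions, neither of which is the region of birth. The function name, the records and the region codes are hypothetical.

```python
from collections import Counter


def classify_migrant(birth, previous, current):
    """Classify one person from region of birth, region at the previous
    census and region at the current census (assumed definitions)."""
    if previous == current:
        return "non-migrant"   # no change of region between the censuses
    if previous == birth:
        return "primary"       # left the region of birth
    if current == birth:
        return "return"        # moved back to the region of birth
    return "secondary"         # moved on from a region that was not the birthplace


# Illustrative (made-up) records: (birth, previous, current)
records = [
    ("EC", "EC", "WC"),
    ("EC", "WC", "EC"),
    ("EC", "GP", "WC"),
    ("WC", "WC", "WC"),
    ("LP", "LP", "GP"),
]
counts = Counter(classify_migrant(*record) for record in records)
print(dict(counts))
```

Note that a person who moved away and back again within the intercensal period is counted as a non-migrant under this scheme, since only the two census dates are observed.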

Further reading and references

For general background to the topic of migration, definition of terms and detail on the analysis and interpretation of the data on internal migration, the interested reader is referred to the excellent UN manual on the topic, Manual VI (UN Population Division 1970). The textbook by Shryock and Siegel (1976), or its modern replacement by Siegel and Swanson (2004), also provides an introduction to the topic of migration and covers, in particular, the estimation of international migration.

Those interested in the estimation of annual migration rates and the back-projection of migration to estimate the numbers by place of residence at the time of the previous census from data on place of residence before the most recent move and year of move are referred to the paper by Dorrington and Moultrie (2009).

 

Dorrington RE and TA Moultrie. 2009. "Making use of the consistency of patterns to estimate age-specific rates of interprovincial migration in South Africa," Paper presented at Annual conference of the Population Association of America. Detroit, US, 30 April - 2 May.

Rogers A and LJ Castro. 1981a. "Age patterns of migration: Cause-specific profiles," in Rogers, A (ed). Advances in Multiregional Demography (RR-81-006). Laxenburg, Austria: International Institute for Applied Systems Analysis, pp. 125-159. http://webarchive.iiasa.ac.at/Admin/PUB/Documents/RR-81-006.pdf

Rogers A and LJ Castro. 1981b. Model Migration Schedules (RR-81-030). Laxenburg, Austria: International Institute for Applied Systems Analysis. http://webarchive.iiasa.ac.at/Admin/PUB/Documents/RR-81-030.pdf

Shryock HS and JS Siegel. 1976. The Methods and Materials of Demography (Condensed Edition). San Diego: Academic Press.

Siegel JS and D Swanson. 2004. The Methods and Materials of Demography. Amsterdam: Elsevier.

Timæus IM. 2007. "Impact of HIV on mortality in Southern Africa: Evidence from demographic surveillance", in Caraël M and JR Glynn (eds) HIV, Resurgent Infections and Population Change in Africa. Springer, pp 229–243. doi: https://dx.doi.org/10.1007/978-1-4020-6174-5_12

UN Population Division. 1970. Manual VI: Methods of Measuring Internal Migration. New York: United Nations, Department of Economic and Social Affairs, ST/SOA/Series A/47. http://www.un.org/esa/population/techcoop/IntMig/manual6/manual6.html

Author
RE Dorrington
Suggested citation
RE Dorrington. 2013. Estimation of migration from census data. In RE Dorrington (eds). Tools for Demographic Estimation. Paris: International Union for the Scientific Study of Population. https://demographicestimation.iussp.org/content/estimation-migration-census-data. Accessed 2024-07-22.