Loglinear models of migration flows
Introduction to model applications
The loglinear modelling framework provides several valuable techniques for studying and estimating migration flows within a network of regions. To date, these methods have been applied most often to internal migration systems where regions are defined as subnational administrative units. However, they need not be restricted to domestic migration and may be applied to international systems of migration as well (Raymer 2007).
A migration flow is defined as the number of migrations from one region to another over the course of a specified time frame. There are several different ways to count migrations and each one could yield a different result. For example, Rees and Willekens (1986) make the distinction between registration systems that count the number of interregional residential moves over a reference period and censuses that count persons who reside in a place at the time of the census that is different from the place of residence at the beginning of the reference period.
Regardless of the method used to count migration flows, it is conventional to present them in contingency tables. These are square tables that report the flow counts between origin and destination regions. The flows in the migration table can be perfectly reproduced by the multiplicative component model, which is a saturated loglinear model (i.e., one with as many estimated parameters as there are data points). It has been used by Willekens (1983), Rogers, Willekens, Little et al. (2002) and Rogers, Little and Raymer (2010) to represent the matrix of flows between regions, and by Raymer and Rogers (2007), Raymer, Bonaguidi and Valentini (2006) and Rogers, Little and Raymer (2010) to capture the structure of interregional flows within age categories. The multiplicative components are interpretable and offer a convenient way to define the structure of migration between the regions of interest (Rogers, Willekens, Little et al. 2002). When calculated for more than one set of interregional flows, defined, for example, for different time periods or for different age, sex or race categories, multiplicative components are useful for comparing migration regimes across these populations.
Loglinear methods may be used to justify simplified representations of migration structure that are more parsimonious than the saturated model. The appropriateness of a reduced model is determined by fitting the predicted flows to the observed flows and by using statistical methods to evaluate the goodness of fit. If the reduced form has merit, i.e., fits the data well, the model may be used to estimate the flows indirectly. The independence model, for example, assumes interregional flows are distributed according to the pattern that would be predicted from the marginal distributions of flows across origin and destination regions. If the independence model is confirmed, interregional flows are predictable and can be estimated indirectly, but accurately, if the total sending and receiving flows of each region are given.
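The indirect estimate implied by the independence model can be sketched in a few lines of Python. This is an illustrative sketch and not part of the original method description: the function name `independence_flows` is hypothetical, and the marginal totals are those of the 1973 table presented in Table 1 below. Each expected flow is the product of the origin total and the destination total divided by the grand total.

```python
# Marginal totals of the 1973 migration table (Table 1, panel A).
ORIGIN_TOTALS = [139859, 108314, 76045, 72625, 116162, 203483]
DESTINATION_TOTALS = [190998, 140634, 88025, 90819, 94490, 111522]


def independence_flows(origin_totals, destination_totals):
    """Expected flows n_i+ * n_+j / n_++ under origin-destination independence."""
    total = sum(origin_totals)
    if total != sum(destination_totals):
        raise ValueError("origin and destination totals must sum to the same value")
    return [[o * d / total for d in destination_totals] for o in origin_totals]


expected = independence_flows(ORIGIN_TOTALS, DESTINATION_TOTALS)

# Expected flow from Category 1 to Category 3 if origin and destination were independent.
print(round(expected[0][2]))
```

Whether such estimates are accurate depends on how well the independence model fits the observed flows.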
Sometimes the structure of migration is hypothesized to be invariant with respect to factors such as time, age, sex, and race. These hypotheses can be represented and tested with loglinear models. Allowing for changes in the level of migration, studies have documented remarkable stability in migration structures, in particular the rates of migration by age, over time (Mueser 1989; Nair 1985; Snickars and Weibull 1977). Other studies have shown consistency in the age patterns of interregional migration over time (Raymer and Rogers 2007). Moreover, the migration structure of the youngest ages, which can be inferred from birthplace-specific population stocks, has, in certain contexts, proven to be a “proxy” for the level of migration and allowed the estimation of migration of the older age groups (Raymer and Rogers 2007; Rogers, Little and Raymer 2010).
These studies have set the stage for establishing the method of offsets as a successful tool for indirectly estimating migration flows. It is a special application of loglinear modelling that forces a known migration structure on to a system that may have missing or unreliable interregional flow data. Using this method, the migration structure known for one time period can be borrowed for another period. In addition, when flows are disaggregated by age, the structure of age-specific interregional flows of one time period can be applied to another period. Furthermore, Raymer and Rogers (2007) showed that the level of infant lifetime migration can be applied, using the method of offsets, to estimate indirectly the migration flows of the older ages.
Applications of loglinear models, and the related assumptions, are detailed in the sections that follow, beginning with the two-variable case, i.e., origin and destination. In this section, the loglinear model is defined in the context of two-dimensional flow tables, and multiplicative forms as well as additive forms of the saturated model are derived and interpreted. The loglinear model of independence and the “migrants only” quasi-independence model are set out, including illustrations and a brief description of the methods for evaluating goodness of fit.
The section concludes with an illustration of the method of offsets for indirectly estimating the interregional flow data of one period based on the migration flow patterns of another. When flow data are available for two periods, the period-invariance assumption can be tested with a loglinear model and the method of offsets. Models that disaggregate the origin and destination of flows into age categories are considered. This is followed by an illustration of how the multiplicative model with age can be applied, using the method of offsets, to estimate indirectly the age-specific interregional flows for another period.
Applications of the two-variable model
To illustrate the two-variable loglinear model, consider the 1973 and 1976 migrations in the Netherlands between types of municipalities categorized into six different groups based on degree of urbanization. These were published by Willekens (1983) and are presented in Table 1. In this context, there are two variables, region of origin (O) and region of destination (D). Neither is identified as the dependent variable. The outcome variable may be either the interregional migration flow, denoted n_{ij}, in the multiplicative form of the model, or the natural logarithm of the flow, denoted ln(n_{ij}), in the additive form of the model.
Decompositions of the saturated model, each of which perfectly regenerates the observed data, are described in the subsections on the multiplicative component model and the additive linear model. Three indirect estimation techniques are then illustrated in the subsections on the independence model, the quasi-independence model and the method of offsets.
Table 1 Migration between municipalities by degree of urbanization,* the Netherlands, 1973 and 1976

A. 1973 Migration table

| Origin / Destination | 1 | 2 | 3 | 4 | 5 | 6 | Total |
|---|---|---|---|---|---|---|---|
| 1 | 50,498 | 23,829 | 8,566 | 21,846 | 16,264 | 18,856 | 139,859 |
| 2 | 25,005 | 27,536 | 6,953 | 14,326 | 16,212 | 18,282 | 108,314 |
| 3 | 15,675 | 10,710 | 13,874 | 6,266 | 9,819 | 19,701 | 76,045 |
| 4 | 23,457 | 14,169 | 4,431 | 10,209 | 9,386 | 10,973 | 72,625 |
| 5 | 29,548 | 25,267 | 11,802 | 13,160 | 15,979 | 20,406 | 116,162 |
| 6 | 46,815 | 39,123 | 42,399 | 25,012 | 26,830 | 23,304 | 203,483 |
| Total | 190,998 | 140,634 | 88,025 | 90,819 | 94,490 | 111,522 | 716,488 |

B. 1976 Migration table

| Origin / Destination | 1 | 2 | 3 | 4 | 5 | 6 | Total |
|---|---|---|---|---|---|---|---|
| 1 | 14,473 | 14,327 | 6,077 | 11,689 | 10,618 | 9,897 | 67,081 |
| 2 | 14,833 | 36,258 | 13,289 | 17,391 | 20,899 | 21,869 | 124,539 |
| 3 | 8,330 | 17,764 | 25,113 | 10,489 | 18,171 | 29,220 | 109,087 |
| 4 | 11,315 | 16,498 | 8,935 | 10,537 | 10,762 | 12,519 | 70,566 |
| 5 | 11,875 | 24,370 | 19,151 | 12,312 | 16,724 | 22,591 | 107,023 |
| 6 | 16,582 | 32,336 | 52,415 | 22,264 | 28,182 | 27,810 | 179,589 |
| Total | 77,408 | 141,553 | 124,980 | 84,682 | 105,356 | 123,906 | 657,885 |

* 1: rural municipalities
2: industrial rural municipalities
3: specific resident municipalities of commuters
4: rural towns and small towns
5: medium-sized towns
6: large towns of more than 100,000 inhabitants
Source: Central Bureau of Statistics, The Hague
Application 1: The multiplicative component model
The multiplicative expression of the saturated loglinear model, called the multiplicative component model, reproduces the elements of the flow table as follows:
$$n_{ij} = (T)(O_i)(D_j)(OD_{ij}).$$
Like all saturated models, it is, strictly speaking, not a model but a way of representing the data. n_{ij} is the observed flow of migration from region i to region j, and the effect parameters are T, O_{i}, D_{j} and OD_{ij}. Therefore, any i to j flow found in the interior 6 by 6 submatrices of Table 1 can be expressed by an equation of the same form as Equation 1 with the corresponding set of parameters. T gives the overall effect, O_{i} gives the effect of origin i, D_{j} gives the effect of destination j, and OD_{ij} gives the effect of the association between O_{i} and D_{j}. Taken together, the parameters of the saturated model represent the spatial structure of migration (Rogers, Willekens, Little et al. 2002).
Two different sets of parameters that satisfy the multiplicative component model have been used in migration studies and both are presented here. Each one offers a different way of representing and interpreting the migration structure. The first is called geometric mean effect coding (Knoke and Burke 1980; Willekens 1983) and the second is called total sum reference coding (Raymer and Rogers 2007; Rogers, Little and Raymer 2010). A third multiplicative component model is derived in the subsection presenting the loglinear additive model.
Application 2: Geometric mean effect coding
Geometric mean effect coding was the first decomposition of Equation 1 used for migration analysis. It was proposed by Birch (1963) and is formally equivalent to the gravity model of migration (Willekens 1983). Table 2 shows the multiplicative components resulting from geometric mean effect coding of the Netherlands data from Table 1. Note that the overall component (T) is set out in the grand-total locations of the table, the origin components (O_{i}) are set out in the row-total locations, the destination components (D_{j}) are set out in the column-total locations, and the origin-destination interaction components (OD_{ij}) are set out in the cells of the interior submatrices.
Table 2 Multiplicative components using geometric mean effect coding

A. 1973 Migration table

| Origin / Destination | 1 | 2 | 3 | 4 | 5 | 6 | Total |
|---|---|---|---|---|---|---|---|
| 1 | 1.457 | 0.940 | 0.656 | 1.352 | 0.933 | 0.882 | 1.180 |
| 2 | 0.885 | 1.332 | 0.653 | 1.087 | 1.140 | 1.048 | 0.962 |
| 3 | 0.771 | 0.720 | 1.811 | 0.661 | 0.959 | 1.570 | 0.692 |
| 4 | 1.275 | 1.052 | 0.639 | 1.190 | 1.014 | 0.966 | 0.627 |
| 5 | 0.943 | 1.102 | 1.000 | 0.901 | 1.013 | 1.055 | 1.067 |
| 6 | 0.838 | 0.957 | 2.015 | 0.960 | 0.954 | 0.676 | 1.903 |
| Total | 1.711 | 1.252 | 0.644 | 0.798 | 0.861 | 1.056 | 17,168.003 |

B. 1976 Migration table

| Origin / Destination | 1 | 2 | 3 | 4 | 5 | 6 | Total |
|---|---|---|---|---|---|---|---|
| 1 | 1.753 | 0.984 | 0.571 | 1.317 | 0.979 | 0.787 | 0.656 |
| 2 | 0.986 | 1.366 | 0.686 | 1.075 | 1.057 | 0.954 | 1.195 |
| 3 | 0.655 | 0.792 | 1.533 | 0.767 | 1.088 | 1.508 | 1.010 |
| 4 | 1.277 | 1.055 | 0.783 | 1.106 | 0.925 | 0.927 | 0.704 |
| 5 | 0.900 | 1.047 | 1.127 | 0.868 | 0.965 | 1.124 | 1.048 |
| 6 | 0.769 | 0.850 | 1.888 | 0.960 | 0.995 | 0.847 | 1.712 |
| Total | 0.768 | 1.354 | 0.989 | 0.825 | 1.008 | 1.169 | 16,401.919 |

The overall effect, T, is described as the constant of proportionality or the size main effect (Willekens 1983). It is the geometric mean of all interregional flow values:
$$T = \left[\prod_{ij} n_{ij}\right]^{\frac{1}{m \times m}},$$
where m is the number of origin regions (rows), which equals the number of destination regions (columns). T equals 17,168.003 for 1973 and 16,401.919 for 1976.
For a particular region i, the main effect of that region of origin is the ratio of the geometric mean of the flows originating from i to the overall geometric mean:
$$O_i = \frac{1}{T}\left[\prod_j n_{ij}\right]^{\frac{1}{m}}.$$
The main effect, O_{i}, shows the relative importance of region i as a source of migrations (Alonso 1986). For example, based on the 1973 data, the effect of originating in Category 4 is equal to:
$$O_4 = \frac{1}{17168.003}\left[23457 \times 14169 \times 4431 \times 10209 \times 9386 \times 10973\right]^{\frac{1}{6}} = 0.627.$$
This is the smallest of the origin (row) effects, which suggests that Category 4 was the least important source of migrations in 1973.
Similarly, the destination main effect, D_{j}, gives the relative importance of region j as an attractor of migrants. It is the ratio of the geometric mean of column j to the overall geometric mean, and the formula is:
$$D_j = \frac{1}{T}\left[\prod_i n_{ij}\right]^{\frac{1}{m}}.$$
For example, for municipalities in Category 4, the destination effect in 1973 is equal to:
$$D_4 = \frac{1}{17168.003}\left[21846 \times 14326 \times 6266 \times 10209 \times 13160 \times 25012\right]^{\frac{1}{6}} = 0.798.$$
All other row and column effects can be derived in the same way. Each is the geometric mean of the row (or column) elements divided by the overall geometric mean, and they are equivalent to the balancing factors in the gravity model (Willekens 1983).
They can be compared across regions and across time periods. For example, Category 6 was the most important source of migrations in 1973 (1.903 is greater than the other origin effects) and in 1976 (1.712 is greater than the other origin effects). Category 1 was less important as a destination in 1976 than in 1973 (0.768 is less than 1.711), and, in 1973, it was less important as a source of migrations than as a destination for migrations (1.180 is less than 1.711).
Panels A and B in Table 2 are sometimes called the spatial interaction matrices. The elements are the OD_{ij} interaction effects in Equation 1 and each one is equal to the observed flow between i and j divided by the expected flow, which is the product of the other three parameters. The formula is:
$$OD_{ij} = \frac{n_{ij}}{(T)(O_i)(D_j)}.$$
Each OD_{ij} expresses the departure of the observed flow, n_{ij}, from the expected flow based on the assumption of no association between the destination region j and the origin region i, i.e., (T)(O_{i})(D_{j}). They have been interpreted as indicators of the accessibility, the ease of interaction, or the attractiveness between two regions (Rogers, Willekens, Little et al. 2002).
Values equal to 1.0 indicate independence, i.e., no association between the origin and the destination. As implied by Equation 1, if an OD_{ij} parameter is equal to 1.0, n_{ij} is determined by the values of T, O_{i} and D_{j} alone. A departure from 1.0 in either direction is an indication of an association between the destination and the origin. Values greater than 1.0 indicate higher than expected levels of accessibility/attractiveness and values less than 1.0 indicate less than expected accessibility/attractiveness.
Since the 1973 diagonal effects are generally greater than 1.0, it appears migrants were unexpectedly attracted to destinations in the same category of municipality. Category 6 was an exception. Migrants from large towns of more than 100,000 inhabitants (i.e., Category 6) were more attracted to commuter municipalities (i.e., Category 3) than to other large towns (2.015 is greater than 0.676).
Table 2 shows all the parameters necessary for reproducing the 1973 and 1976 flows. To verify that any flow in Table 1 can be reproduced by the multiplicative components, take, for example, the 1973 flow from Category 2 to Category 3, which is recovered up to the rounding of the reported components:
$$n_{23} = 6953 \approx 17168.003 \times 0.962 \times 0.644 \times 0.653.$$
The parameter values, however, are not all independent of each other: some can be derived from the others. For one year of data, Table 2 reports 36 interaction effects, 6 origin main effects, 6 destination main effects and one overall effect. These 49 parameters were derived from only 36 observed flows, so 13 of them must be redundant, i.e., they can be calculated from the other 36. The relationships between the parameters are determined by the following constraints associated with geometric mean effect coding. The first set of constraints forces the product of the origin main effects, and the product of the destination main effects, to be equal to 1. This is expressed as
$$\prod_i O_i = 1 \text{ and } \prod_j D_j = 1.$$
The second set of constraints is imposed on the interaction elements of each row and column, making the product of the interior elements in each row (and in each column) equal to 1. In other words, if five of the interaction effects associated with a particular origin (or destination) are given, the sixth is implied.
This is expressed as
$$\prod_i OD_{ij} = 1 \text{ for every } j \text{ and } \prod_j OD_{ij} = 1 \text{ for every } i.$$
In general, if there are m regions there are m^{2} linearly independent parameters and 1+m+m+(m×m) multiplicative components. For all of the geometric mean effect coding computations, see Table 2 in the Multiplicative Components sheet of the accompanying workbook.
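The geometric mean effect coding formulas above can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions, not the computation used in the accompanying workbook: the function names `geometric_mean` and `geometric_mean_coding` are hypothetical, and the data are the 1973 flows from Table 1.

```python
import math

# 1973 flows from Table 1, panel A (rows = origin, columns = destination).
FLOWS_1973 = [
    [50498, 23829, 8566, 21846, 16264, 18856],
    [25005, 27536, 6953, 14326, 16212, 18282],
    [15675, 10710, 13874, 6266, 9819, 19701],
    [23457, 14169, 4431, 10209, 9386, 10973],
    [29548, 25267, 11802, 13160, 15979, 20406],
    [46815, 39123, 42399, 25012, 26830, 23304],
]


def geometric_mean(values):
    """Geometric mean, computed on the log scale to avoid overflow."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def geometric_mean_coding(flows):
    """Return (T, O, D, OD) such that n_ij = T * O_i * D_j * OD_ij."""
    m = len(flows)
    T = geometric_mean(n for row in flows for n in row)
    O = [geometric_mean(row) / T for row in flows]
    D = [geometric_mean(flows[i][j] for i in range(m)) / T for j in range(m)]
    OD = [[flows[i][j] / (T * O[i] * D[j]) for j in range(m)] for i in range(m)]
    return T, O, D, OD


T, O, D, OD = geometric_mean_coding(FLOWS_1973)
print("T   =", round(T, 3))
print("O_i =", [round(x, 3) for x in O])
print("D_j =", [round(x, 3) for x in D])
print("product of O_i =", round(math.prod(O), 6))  # constraint: equals 1
print("n_23 rebuilt   =", round(T * O[1] * D[2] * OD[1][2]))
```

Because the unrounded components are used, the rebuilt flow matches the observed count exactly, and the product constraints hold up to floating-point error.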
Application 3: Total sum reference coding
Geometric mean effect coding, which uses the geometric mean as the reference value, was the earliest loglinear decomposition used to describe migration (Rogers, Willekens, Little et al. 2002; Willekens 1983). Recently, however, total sum reference coding has become more standard (Raymer and Rogers 2007; Rogers, Little and Raymer 2010). While both decompositions satisfy Equation 1, the effects under total sum reference coding are more transparent. For example, the overall effect, T, is now the total number of migrants, denoted n_{++}. O_{i} is now the proportion of all migrants leaving from region i (i.e., n_{i+}/n_{++}), and D_{j} is the proportion of all migrants moving to region j (i.e., n_{+j}/n_{++}). The interaction component OD_{ij} is now defined as n_{ij}/[(T)(O_{i})(D_{j})], or the ratio of the observed number of migrants, n_{ij}, to the expected number, (T)(O_{i})(D_{j}). All effects taken together provide another way to represent the spatial structure of migration.
The multiplicative components derived from total sum reference coding are set out in Table 3. Consider, for example, the 8566 migrations from Category 1 to Category 3 in 1973 disaggregated into the four multiplicative components:
$$\begin{array}{l} n_{13} = (T)(O_1)(D_3)(OD_{13}) \\ = n_{++}\left(\frac{n_{1+}}{n_{++}}\right)\left(\frac{n_{+3}}{n_{++}}\right)\left[\frac{n_{13}}{\left(n_{++}\right)\left(\frac{n_{1+}}{n_{++}}\right)\left(\frac{n_{+3}}{n_{++}}\right)}\right] \\ = (716488)\left(\frac{139859}{716488}\right)\left(\frac{88025}{716488}\right)\left(\frac{8566}{17183}\right) \\ = 716488\,(0.195)(0.123)(0.499) \\ = 8566. \end{array}$$
The interpretations of these components are relatively straightforward. The overall component is the reported total number of migrations in 1973, i.e., 716,488. The origin component represents the share of all migrations from each region; for example, 19.5 per cent of all migrations originated in Category 1. The destination component represents the share of all migrations to each region; for example, 12.3 per cent of all migrations had Category 3 as the destination. Finally, the interaction component represents the ratio of observed migrations to expected migrations: there were roughly 50 observed migrations between regions 1 and 3 for every 100 expected. The expected flow is based on the marginal total information, i.e., (T)(O_{1})(D_{3}).
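The worked example for n_{13} can be reproduced with a short Python sketch. It is offered as an illustration under stated assumptions and not as the workbook's computation: the function name `total_sum_reference_coding` is hypothetical, and the input is the 1973 table from Table 1.

```python
# 1973 flows from Table 1, panel A (rows = origin, columns = destination).
FLOWS_1973 = [
    [50498, 23829, 8566, 21846, 16264, 18856],
    [25005, 27536, 6953, 14326, 16212, 18282],
    [15675, 10710, 13874, 6266, 9819, 19701],
    [23457, 14169, 4431, 10209, 9386, 10973],
    [29548, 25267, 11802, 13160, 15979, 20406],
    [46815, 39123, 42399, 25012, 26830, 23304],
]


def total_sum_reference_coding(flows):
    """Return (T, O, D, OD) with T = n_++, O_i = n_i+/n_++ and D_j = n_+j/n_++."""
    m = len(flows)
    row_totals = [sum(row) for row in flows]
    col_totals = [sum(flows[i][j] for i in range(m)) for j in range(m)]
    T = sum(row_totals)
    O = [r / T for r in row_totals]
    D = [c / T for c in col_totals]
    # Interaction: observed flow divided by the flow expected from the margins.
    OD = [[flows[i][j] / (T * O[i] * D[j]) for j in range(m)] for i in range(m)]
    return T, O, D, OD


T, O, D, OD = total_sum_reference_coding(FLOWS_1973)
print(T, round(O[0], 3), round(D[2], 3), round(OD[0][2], 3))  # 716488 0.195 0.123 0.499
print(round(T * O[0] * D[2] * OD[0][2]))  # 8566
```

The first line of output lists the four components of the Category 1 to Category 3 flow, and the second confirms that their product returns the observed count.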
Table 3 Multiplicative components using total sum reference coding

A. 1973 Migration table

| Origin / Destination | 1 | 2 | 3 | 4 | 5 | 6 | Total |
|---|---|---|---|---|---|---|---|
| 1 | 1.354 | 0.868 | 0.499 | 1.232 | 0.882 | 0.866 | 0.195 |
| 2 | 0.866 | 1.295 | 0.523 | 1.043 | 1.135 | 1.084 | 0.151 |
| 3 | 0.773 | 0.718 | 1.485 | 0.650 | 0.979 | 1.664 | 0.106 |
| 4 | 1.212 | 0.994 | 0.497 | 1.109 | 0.980 | 0.971 | 0.101 |
| 5 | 0.954 | 1.108 | 0.827 | 0.894 | 1.043 | 1.129 | 0.162 |
| 6 | 0.863 | 0.980 | 1.696 | 0.970 | 1.000 | 0.736 | 0.284 |
| Total | 0.267 | 0.196 | 0.123 | 0.127 | 0.132 | 0.156 | 716,488 |

B. 1976 Migration table

| Origin / Destination | 1 | 2 | 3 | 4 | 5 | 6 | Total |
|---|---|---|---|---|---|---|---|
| 1 | 1.834 | 0.993 | 0.477 | 1.354 | 0.988 | 0.783 | 0.102 |
| 2 | 1.012 | 1.353 | 0.562 | 1.085 | 1.048 | 0.932 | 0.189 |
| 3 | 0.649 | 0.757 | 1.212 | 0.747 | 1.040 | 1.422 | 0.166 |
| 4 | 1.363 | 1.087 | 0.667 | 1.160 | 0.952 | 0.942 | 0.107 |
| 5 | 0.943 | 1.058 | 0.942 | 0.894 | 0.976 | 1.121 | 0.163 |
| 6 | 0.785 | 0.837 | 1.536 | 0.963 | 0.980 | 0.822 | 0.273 |
| Total | 0.118 | 0.215 | 0.190 | 0.129 | 0.160 | 0.188 | 657,885 |

Like geometric mean effect coding, the decomposition based on total sum reference coding gives more parameters than original data points. The constraints that define the relationships between parameters, and thus allow the redundant parameters to be derived, are as follows:
$$\begin{array}{l} \sum_i O_i = 1; \quad \sum_j D_j = 1; \\ \frac{\sum_i O_i \sum_j OD_{ij}}{m} = 1, \text{ and} \\ \frac{\sum_j D_j \sum_i OD_{ij}}{m} = 1, \end{array}$$
where m is the number of regions (Raymer, Bonaguidi and Valentini 2006).
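As a brief check, added here as a sketch and not taken from the cited sources, the interaction constraints can be derived from the definitions O_{i} = n_{i+}/n_{++}, D_{j} = n_{+j}/n_{++} and OD_{ij} = n_{ij}n_{++}/(n_{i+}n_{+j}). Weighting the interaction components of destination j by the origin shares gives

$$\sum_i O_i\,OD_{ij} = \sum_i \frac{n_{i+}}{n_{++}} \cdot \frac{n_{ij}\,n_{++}}{n_{i+}\,n_{+j}} = \frac{1}{n_{+j}} \sum_i n_{ij} = 1 \quad \text{for every } j,$$

so summing over the m destinations and dividing by m also gives 1. The same argument with the destination shares D_{j} shows that the D_{j}-weighted sum of OD_{ij} over j equals 1 for every origin i.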
For all of the total sum reference coding computations, see Table 3 in the Multiplicative components sheet of the accompanying workbook.
Comparing two multiplicative component models
If the same decomposition scheme is applied to two sets of flow data from a given system of regions, all but the T parameter are scale free. This means that taking the ratios of two sets of components provides a simple method for examining stability in migration structure without confounding the effects of growth or decline in overall levels of migration (Rogers, Willekens, Little et al. 2002). Table 4 displays the ratios of the 1976 to the 1973 components from Table 3 (total sum reference coding). Several depart substantially from 1, indicating that the migration structure changed in the three years between 1973 and 1976. For example, the ratio of the components for OD_{11} is equal to 1.354, implying that migration within Category 1 was more attractive in 1976 than in 1973. In contrast, the ratio of the components for OD_{33} is equal to 0.816, suggesting migration within Category 3 was less attractive in 1976 than in 1973.
Table 4 Ratios of 1976 to 1973 multiplicative components

| Origin / Destination | 1 | 2 | 3 | 4 | 5 | 6 | Total |
|---|---|---|---|---|---|---|---|
| 1 | 1.354 | 1.144 | 0.957 | 1.099 | 1.121 | 0.904 | 0.522 |
| 2 | 1.169 | 1.045 | 1.075 | 1.040 | 0.923 | 0.860 | 1.252 |
| 3 | 0.839 | 1.055 | 0.816 | 1.149 | 1.062 | 0.854 | 1.562 |
| 4 | 1.125 | 1.093 | 1.342 | 1.046 | 0.972 | 0.970 | 1.058 |
| 5 | 0.988 | 0.955 | 1.139 | 1.000 | 0.936 | 0.993 | 1.003 |
| 6 | 0.909 | 0.854 | 0.906 | 0.993 | 0.980 | 1.117 | 0.961 |
| Total | 0.441 | 1.096 | 1.546 | 1.015 | 1.214 | 1.210 | 0.918 |

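The comparison of two sets of components can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the helper name `total_sum_components` is hypothetical, and the ratios are formed from total sum reference components computed directly from the 1973 and 1976 flows in Table 1.

```python
# Flows from Table 1 (rows = origin, columns = destination).
FLOWS_1973 = [
    [50498, 23829, 8566, 21846, 16264, 18856],
    [25005, 27536, 6953, 14326, 16212, 18282],
    [15675, 10710, 13874, 6266, 9819, 19701],
    [23457, 14169, 4431, 10209, 9386, 10973],
    [29548, 25267, 11802, 13160, 15979, 20406],
    [46815, 39123, 42399, 25012, 26830, 23304],
]
FLOWS_1976 = [
    [14473, 14327, 6077, 11689, 10618, 9897],
    [14833, 36258, 13289, 17391, 20899, 21869],
    [8330, 17764, 25113, 10489, 18171, 29220],
    [11315, 16498, 8935, 10537, 10762, 12519],
    [11875, 24370, 19151, 12312, 16724, 22591],
    [16582, 32336, 52415, 22264, 28182, 27810],
]


def total_sum_components(flows):
    """Total sum reference components (T, O, D, OD) of a square flow table."""
    m = len(flows)
    total = sum(sum(row) for row in flows)
    O = [sum(row) / total for row in flows]
    D = [sum(flows[i][j] for i in range(m)) / total for j in range(m)]
    OD = [[flows[i][j] / (total * O[i] * D[j]) for j in range(m)] for i in range(m)]
    return total, O, D, OD


t73, o73, d73, od73 = total_sum_components(FLOWS_1973)
t76, o76, d76, od76 = total_sum_components(FLOWS_1976)

# Ratios of 1976 to 1973 components; values near 1 indicate a stable structure.
level_ratio = t76 / t73
origin_ratio = [later / earlier for earlier, later in zip(o73, o76)]
destination_ratio = [later / earlier for earlier, later in zip(d73, d76)]
interaction_ratio = [[od76[i][j] / od73[i][j] for j in range(6)] for i in range(6)]

print(round(level_ratio, 3))              # 0.918
print(round(interaction_ratio[0][0], 3))  # 1.354
print(round(interaction_ratio[2][2], 3))  # 0.816
```

Only the level ratio reflects the change in the overall volume of migration; the other ratios compare shares and interaction components, which are scale free.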
Application 4: The loglinear additive model
Another form of the saturated loglinear model, which is an alternative to the multiplicative component model, is the linear additive model. Whether using the linear additive or the multiplicative form of the saturated loglinear model, the parameters represent the spatial structure of migration (Rogers, Willekens, Little et al. 2002) and each flow value can be fully reproduced by the parameters.
Because the multiplicative formulation is formally equivalent to the gravity model (Willekens 1983), it is considered to be more appropriate than the linear additive model for representing spatial migration structures. On the other hand, the linear additive form is often found in statistics, and when a standard statistical package (e.g., SPSS, Stata, R) is used to estimate a loglinear model, the parameters are always reported in the linear additive form. For that reason, the conventional calculations and interpretations of the parameters in the linear additive model are described in this subsection.
The additive formulation is a linear function of logarithms and it makes evident why the model came to be called the loglinear model (Knoke and Burke 1980). It is mathematically equivalent to the multiplicative component model and it results from taking logarithms of both sides of Equation 1 as follows:
$$\ln(n_{ij}) = \ln(T) + \ln(O_i) + \ln(D_j) + \ln(OD_{ij}),$$
which can be expressed more concisely as:
$$\ln(n_{ij}) = \lambda + \lambda_i^O + \lambda_j^D + \lambda_{ij}^{OD}.$$
The λ values are simply the natural logarithms of the parameters appearing in Equation 1. The O, D, and OD superscripts are parameter descriptors (not exponents) and the subscripts i and j refer to the categories of the origin and destination variables, respectively.
Applying natural logarithmic transformations to the parameters in Table 2 and Table 3 would result in sets of corresponding linear additive parameters. However, just as there are at least two decompositions of the multiplicative component model, i.e., geometric mean effect coding and total sum reference coding, there are multiple strategies for arriving at sets of parameters that satisfy the linear additive model (Powers and Xie 2008), and the approaches taken by the standard statistical packages are not simply logarithmic transformations of the multiplicative components derived earlier.
Recall that a migration system with m regions has m×m linearly independent parameters. The multiplicative component models described above give an interpretable value for each of 1+m+m+(m×m) parameters, though these are not linearly independent of each other. On the other hand, statistical routines in SPSS, Stata, and R calculate and report only linearly independent parameters, resulting in 1 value for $$\lambda^{T}$$, m-1 values for $$\lambda_{i}^{O}$$, m-1 values for $$\lambda_{j}^{D}$$, and (m-1)×(m-1) values for $$\lambda_{ij}^{OD}$$. The particular set of parameter values that is calculated and reported depends on the contrast coding scheme used by the software. Contrast coding blocks out one region by fixing all linear additive parameters for that region equal to 0. SPSS, for example, fixes the parameters for the last region, i.e., the region assigned the highest numeric value, m, in this case:
$$\lambda_m^O = \lambda_m^D = \lambda_{mj}^{OD} = \lambda_{im}^{OD} = 0.$$
The parameters of the Netherlands data reported by SPSS are displayed in Table 5. The SPSS commands that generate these results for the 1973 migration table, along with the SPSS output, are presented in Appendix 1. Table 5, with the Excel formulae for calculating the parameters, is available in the Contrast coding sheet of the accompanying workbook.
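Table 5 Additive linear parameters using "last region" contrast coding

A. 1973 Migration table

| Origin / Destination | 1 | 2 | 3 | 4 | 5 | 6 | Total |
|---|---|---|---|---|---|---|---|
| 1 | 0.288 | -0.284 | -1.388 | 0.076 | -0.289 | 0.000 | -0.212 |
| 2 | -0.384 | -0.109 | -1.565 | -0.315 | -0.261 | 0.000 | -0.243 |
| 3 | -0.926 | -1.128 | -0.949 | -1.216 | -0.837 | 0.000 | -0.168 |
| 4 | 0.062 | -0.262 | -1.505 | -0.143 | -0.297 | 0.000 | -0.753 |
| 5 | -0.327 | -0.304 | -1.146 | -0.509 | -0.385 | 0.000 | -0.133 |
| 6 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Total | 0.698 | 0.518 | 0.598 | 0.071 | 0.141 | 0.000 | 10.056 |

B. 1976 Migration table

| Origin / Destination | 1 | 2 | 3 | 4 | 5 | 6 | Total |
|---|---|---|---|---|---|---|---|
| 1 | 0.897 | 0.219 | -1.122 | 0.389 | 0.057 | 0.000 | -1.033 |
| 2 | 0.129 | 0.355 | -1.132 | -0.007 | -0.059 | 0.000 | -0.240 |
| 3 | -0.738 | -0.648 | -0.785 | -0.802 | -0.488 | 0.000 | 0.049 |
| 4 | 0.416 | 0.125 | -0.971 | 0.050 | -0.165 | 0.000 | -0.798 |
| 5 | -0.126 | -0.075 | -0.799 | -0.385 | -0.314 | 0.000 | -0.208 |
| 6 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Total | -0.517 | 0.151 | 0.634 | -0.222 | 0.013 | 0.000 | 10.233 |
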
Notice the parameters for the last region are equal to 0, and, therefore, make no contribution to Equation 2. Interpretation of the parameters in Table 5 is somewhat complicated since they are in logarithmic units. Conversion back to the multiplicative components by exponentiation gives yet another set of multiplicative components that satisfy Equation 1. These are presented in Table 6, and they are the multiplicative components associated with “last region” contrast coding. Generally, these are not used to describe the spatial structure of migration, but they are useful in describing migration systems because the interaction parameters, OD_{ij}, are equivalent to odds ratios.
Table 6 Multiplicative components using "last region" contrast coding

A. 1973 Migration table

| Origin / Destination | 1 | 2 | 3 | 4 | 5 | 6 | Total |
|---|---|---|---|---|---|---|---|
| 1 | 1.333 | 0.753 | 0.250 | 1.079 | 0.749 | 1.000 | 0.809 |
| 2 | 0.681 | 0.897 | 0.209 | 0.730 | 0.770 | 1.000 | 0.785 |
| 3 | 0.396 | 0.324 | 0.387 | 0.296 | 0.433 | 1.000 | 0.845 |
| 4 | 1.064 | 0.769 | 0.222 | 0.867 | 0.743 | 1.000 | 0.471 |
| 5 | 0.721 | 0.738 | 0.318 | 0.601 | 0.680 | 1.000 | 0.876 |
| 6 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| Total | 2.009 | 1.679 | 1.819 | 1.073 | 1.151 | 1.000 | 23,304 |

B. 1976 Migration table

| Origin / Destination | 1 | 2 | 3 | 4 | 5 | 6 | Total |
|---|---|---|---|---|---|---|---|
| 1 | 2.453 | 1.245 | 0.326 | 1.475 | 1.059 | 1.000 | 0.356 |
| 2 | 1.138 | 1.426 | 0.322 | 0.993 | 0.943 | 1.000 | 0.786 |
| 3 | 0.478 | 0.523 | 0.456 | 0.448 | 0.614 | 1.000 | 1.051 |
| 4 | 1.516 | 1.133 | 0.379 | 1.051 | 0.848 | 1.000 | 0.450 |
| 5 | 0.882 | 0.928 | 0.450 | 0.681 | 0.731 | 1.000 | 0.812 |
| 6 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| Total | 0.596 | 1.163 | 1.885 | 0.801 | 1.013 | 1.000 | 27,810 |

For example, the overall parameter from the 1973 migration data reported in Table 5, λ^{T}, gives the natural logarithm of the observed migrations for the reference region: ln(n_{66}) = 10.056. From Table 6, the companion parameter T gives the n_{66} migration flow: n_{66} = exp(10.056) = 23304.
Another illustration from the 1973 migration table in Table 5 shows how the origin main effects, $$\lambda_{i}^{O}$$, are added to the overall parameter to reproduce the migrations from Category 1 to the reference destination, Category 6, reported in Table 1. For example: ln(n_{16}) = 10.056 - 0.212 ≈ 9.845, and the corresponding multiplicative components from Table 6, T times O_{1}, give: n_{16} = 23304 × 0.809 ≈ 18856.
Using the same approach, all the observed migration flows can be reproduced by applying Equation 1 with the appropriate parameters from Table 6, or the logarithms of the flows can be reproduced by applying Equation 2 using the parameters in Table 5.
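For a saturated model, the contrast-coded parameters can be obtained directly from the observed flows, because fixing the reference-region parameters at 0 ties every other parameter to the cells in the reference row and column. The Python sketch below illustrates this relationship under that assumption. It is not the estimation routine used by any statistical package, and the function name `reference_coding` is hypothetical.

```python
import math

# 1973 flows from Table 1, panel A (rows = origin, columns = destination).
FLOWS_1973 = [
    [50498, 23829, 8566, 21846, 16264, 18856],
    [25005, 27536, 6953, 14326, 16212, 18282],
    [15675, 10710, 13874, 6266, 9819, 19701],
    [23457, 14169, 4431, 10209, 9386, 10973],
    [29548, 25267, 11802, 13160, 15979, 20406],
    [46815, 39123, 42399, 25012, 26830, 23304],
]


def reference_coding(flows, ref):
    """Additive parameters with all parameters of region `ref` fixed at 0."""
    m = len(flows)
    n_rr = flows[ref][ref]
    lam = math.log(n_rr)                                       # overall parameter
    lam_o = [math.log(flows[i][ref] / n_rr) for i in range(m)]  # origin effects
    lam_d = [math.log(flows[ref][j] / n_rr) for j in range(m)]  # destination effects
    lam_od = [                                                  # logged odds ratios
        [math.log(flows[i][j] * n_rr / (flows[i][ref] * flows[ref][j]))
         for j in range(m)]
        for i in range(m)
    ]
    return lam, lam_o, lam_d, lam_od


# "Last region" coding: Category 6 has index 5.
lam, lam_o, lam_d, lam_od = reference_coding(FLOWS_1973, ref=5)
print(round(lam, 3), round(lam_o[0], 3))  # 10.056 -0.212
print(round(math.exp(lam + lam_o[0])))    # 18856, the flow n_16
```

Calling the same function with `ref=0` gives the parameters under "first region" coding, and exponentiating any of the returned values gives the corresponding multiplicative component.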
The association parameters in the linear form, $$\lambda_{ij}^{OD}$$, are logged odds ratios (LORs), which are the logarithm of the ratio of two odds: 1) the odds of migration to region j rather than the reference region, conditional on originating in region i; and 2) the odds of migration to region j rather than the reference region, conditional on originating in the reference region. For example, from the 1973 submatrix in Table 5, $$\lambda_{23}^{OD}$$ = -1.565, which is calculated as:
$$\lambda_{23}^{OD} = \ln\left[\frac{\frac{n_{23}}{n_{26}}}{\frac{n_{63}}{n_{66}}}\right] = \ln\left[\frac{\frac{6,953}{18,282}}{\frac{42,399}{23,304}}\right] = -1.565.$$
In words, the parameter is described as the logged ratio of the odds of migration to Category 3, rather than to Category 6, between a migrant originating in Category 2 and one originating in Category 6.
Odds ratios measure the relative likelihood of one outcome to another, and because they are more standard than LOR, it may be easier to exponentiate the LORs and interpret the association parameters, presented in Table 6, as odds ratios. For example, the model parameter OD_{23}, for the 1973 data, is calculated as:
$$\text{\hspace{0.17em}}O{D}_{23}=\mathrm{exp}(1.565)=\left[\frac{\frac{{n}_{23}}{{n}_{26}}}{\frac{{n}_{63}}{{n}_{66}}}\right]=0.209\text{\hspace{0.17em}}.$$
In words, the odds that a migrant from Category 2 will choose Category 3 over Category 6 is approximately 1/5^{th }the odds that a migrant from Category 6 will choose Category 3 over Category 6. Oddsratios are always positive and always depend on the choice of reference category. An odds ratio equal to 1 means a null relationship, i.e., statistical independence. Values higher than 1 mean a positive association and values less than 1 indicate a negative association.
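The odds-ratio calculation for cell (2, 3) can be checked by hand or with a few lines of code. The sketch below is only an illustration in Python (the text itself works with SPSS, Stata and R); the function name `odds_ratio` is a hypothetical helper, and the four flows are the 1973 values quoted in the formula above.

```python
import math

# Observed 1973 flows quoted in the text (origin i -> destination j).
n_23 = 6_953    # region 2 -> region 3
n_26 = 18_282   # region 2 -> reference region 6
n_63 = 42_399   # reference region 6 -> region 3
n_66 = 23_304   # reference region 6 -> reference region 6


def odds_ratio(n_ij, n_im, n_mj, n_mm):
    """Odds ratio for cell (i, j) relative to the reference region m."""
    return (n_ij / n_im) / (n_mj / n_mm)


od_23 = odds_ratio(n_23, n_26, n_63, n_66)   # multiplicative parameter OD_23
lor_23 = math.log(od_23)                     # linear additive parameter (LOR)

print(f"OD_23 = {od_23:.3f}, lambda_23^OD = {lor_23:.3f}")
# prints: OD_23 = 0.209, lambda_23^OD = -1.565
```

Because the odds ratio is below 1, its logarithm is negative, which is why the linear additive parameter carries a minus sign.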
Stata and R use a different contrast coding scheme from SPSS. Both of these statistical packages use “first region” contrast coding, as opposed to the “last region” contrast coding used by SPSS. In these two programs, the parameters for the first region, i.e., the region assigned the lowest numeric value, are fixed to be equal to 0, i.e., $${\lambda}_{1}^{O}={\lambda}_{1}^{D}={\lambda}_{1j}^{OD}={\lambda}_{i1}^{OD}=0.$$ The Stata and R commands for generating the linear additive parameters, as well as the corresponding output, for the 1973 migration data can be downloaded from Appendix 1.
All forms of the saturated model and all statistical methods for estimating the interaction parameters are in agreement and provide substantively similar results. The formulae for the calculations of the parameters are available in the Linear Additive Parameters sheet of the accompanying workbook. Furthermore, tests that each linear additive interaction parameter is equal to 0 are done automatically by SPSS and Stata. These results are available from Appendix 1 and they show that each nonredundant interaction parameter is statistically significant. See Agresti and Finlay (2009) and Powers and Xie (2008) for descriptions of the standard errors of the estimates.
Application 5: The independence model
All the models presented to this point have been saturated and, therefore, perfectly represent the observed flows. Generally, the substantively interesting parameters are the interaction parameters because they indicate associations between pairs of regions. The independence model, however, hypothesizes that the interaction parameters are uninteresting and unnecessary because all multiplicative interaction parameters, OD_{ij}, are equal to 1, or, equivalently, all linear additive interaction parameters, $${\lambda}_{ij}^{OD}$$, are equal to 0. The independence model implies that the interaction terms should fall out of the model, reducing it to the most parsimonious form of a two-variable model, i.e., $${n}_{ij}=(T)({O}_{i})({D}_{j})$$, or, equivalently, $$\mathrm{ln}({n}_{ij})=\lambda +{\lambda}_{i}^{O}+{\lambda}_{j}^{D}.$$
Visual inspection of the interaction parameters in the saturated loglinear model is one strategy for investigating the independence hypothesis. Another method is to calculate row or column conditional distributions. If the conditional distributions within rows (origins) are identical, there is independence between origins and destinations. In addition, since independence is a symmetric property, if the conditional distributions within rows (origins) are identical, the distributions within columns (destinations) also will be identical (Agresti and Finlay 2009; Powers and Xie 2008). In the Independence sheet of the accompanying workbook, the percentages of the Netherlands migrations within columns (destinations) are calculated. The column percentages are quite varied, suggesting, like the interaction parameters, that statistical independence is unfounded in this example.
The independence hypothesis implies that each particular interregional flow can be determined by the sizes of the marginal flows. Let N_{ij} be the expected flow between regions i and j if the independence hypothesis is true. N_{ij} is then equal to the total number of flows in the migration system, n_{++}, multiplied by the proportion of all migrants leaving from region i, n_{i+}/n_{++}, times the proportion of all migrants moving to region j, n_{+j}/n_{++}, i.e., N_{ij} = n_{++}(n_{i+}/n_{++})(n_{+j}/n_{++}). If independence can be assumed, a good estimate of an interregional flow is N_{ij}, and the problem of estimating interregional migration flows is greatly simplified.
The differences between the observed flows, n_{ij}, and the expected flows, N_{ij}, form the basis of the goodness-of-fit evaluation and the Pearson Chi-Squared Statistic, denoted Χ^{2}, which is widely used to summarize these discrepancies. It is calculated as:
$${\chi}^{2}=\sum \frac{{({n}_{ij}-{N}_{ij})}^{2}}{{N}_{ij}},$$
where the summation is taken over all internal cells in the migration matrix. When there is perfect agreement between the observed and the expected flows, over all cells, the Χ^{2} equals 0, indicating the independence model fits the data perfectly. Larger differences between n_{ij} and N_{ij} produce larger Χ^{2} values and increasingly stronger evidence that the independence model is inadequate. In general, smaller values indicate a good fit and larger values a poor fit.
If the independence hypothesis is true, the Χ^{2} statistic is governed by the Χ^{2} probability distribution with (m−1)×(m−1) degrees of freedom, where m is the number of regions. This distribution provides the basis for testing the significance of the Χ^{2} statistic (Agresti 2007; Agresti and Finlay 2009). If the Χ^{2} statistic falls in the right-sided extremes of its distribution, it signifies a low probability, e.g., p<0.05, that the independence hypothesis is true, and the model is rejected. The Χ^{2} values associated with the independence model applied to the Netherlands data in Table 1 are calculated and reported in the Independence sheet of the accompanying workbook. See Appendix 2 for the SPSS, Stata and R commands for testing the independence model with the 1973 example data.
The Χ^{2} value associated with the 1973 example data is 47,623, and the degrees of freedom (df) are 25. The associated p-value is less than 0.001, and the hypothesis of independence is rejected. (However, see the comments below about the limitations of this test when the sample size is large.) This is not surprising given the three multiplicative decompositions of the Netherlands data, presented in Table 2, Table 3 and Table 6. The evidence consistently shows strong associations between regions and many of the multiplicative association parameters are not close to 1. Furthermore, the standard errors reported in Appendix 1 by SPSS and Stata indicate the linear additive interaction parameters are significantly different from 0.
One alternative to the Χ^{2} statistic is called either the likelihood ratio statistic, the deviance, or the G^{2} statistic. All are different names for the same test statistic, and which name is used is determined by the preferences of authors of textbooks and software packages. For simplicity, G^{2} will be adopted here. It is similar to the Χ^{2} in that values close to 0 indicate a well-fitting model and large values indicate a poor fit. If the hypothesized independence model holds, the G^{2} statistic also has a Χ^{2} distribution.
The G^{2} statistic has general utility that goes well beyond the independence model in loglinear analysis. It is widely used for comparing a simpler model to a more complex model. The G^{2} statistic is derived from the ratio of two likelihoods: 1) the likelihood that the constrained model (here the model of independence) fits the data; and 2) the likelihood that the unconstrained model (here the saturated model) fits the data. If the ratio is close to 1, the simpler, constrained, and more parsimonious model is preferred because it represents the data as well as the more complex model does.
The ratio of the two likelihoods does not have a Χ^{2} distribution. However, when the ratio is transformed into natural logarithm units and multiplied by −2, it becomes G^{2}, which is a Χ^{2} distributed variable with (m−1)×(m−1) degrees of freedom. If L_{c} is the likelihood associated with the constrained (i.e., independence) model, and L_{u} is the likelihood under the unconstrained (i.e., saturated) model, then G^{2} is calculated as:
$${G}^{2}=-2\mathrm{ln}\left(\frac{{L}_{c}}{{L}_{u}}\right)=-2\mathrm{ln}{L}_{c}+2\mathrm{ln}{L}_{u}.$$
Because the saturated model fits the data perfectly, i.e., L_{u} = 1, G^{2} = –2ln L_{c}. The values, based on the example and the statistical software, are reported in Appendix 2. The value is reported to be 46,477.63 and it is called “Deviance” by SPSS and Stata. It is rounded and reported to be equal to 46,480 by R, where it is called “Residual Deviance.” With 25 degrees of freedom the probability that the independence model holds is effectively 0.
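As a rough illustration of how the expected flows and the two fit statistics relate, the following Python sketch computes N_{ij}, Χ^{2} and G^{2} for a square flow table. It is not the SPSS, Stata or R code of Appendix 2. It assumes the usual cell-based form of the likelihood ratio statistic for count data, G^{2} = 2Σ n_{ij} ln(n_{ij}/N_{ij}), and the function name `independence_fit` is hypothetical. Since the Netherlands flows are not reproduced in this section, the example uses the full United States table from Panel A of Table 7.

```python
import numpy as np


def independence_fit(flows):
    """Expected flows N_ij = n_i+ * n_+j / n_++ and fit statistics."""
    n = np.asarray(flows, dtype=float)
    expected = np.outer(n.sum(axis=1), n.sum(axis=0)) / n.sum()
    pearson = ((n - expected) ** 2 / expected).sum()
    # Cells with n_ij = 0 contribute nothing to G^2 (0 * ln 0 is taken as 0).
    positive = n > 0
    g2 = 2.0 * (n[positive] * np.log(n[positive] / expected[positive])).sum()
    m = n.shape[0]                      # number of regions (square table)
    df = (m - 1) * (m - 1)
    return expected, pearson, g2, df


# Full United States native-born migration table, 1985-1990 (Table 7, Panel A).
us_full = np.array([
    [40_262_319, 336_091, 1_645_843, 479_819],
    [351_029, 50_677_007, 1_692_687, 958_696],
    [778_868, 1_197_134, 69_563_871, 1_150_649],
    [348_892, 668_979, 1_082_104, 37_872_893],
])

expected, pearson, g2, df = independence_fit(us_full)
print(np.round(expected).astype(np.int64))
print(f"X2 = {pearson:,.0f}, G2 = {g2:,.0f}, df = {df}")
```

The expected flows should agree with Panel A of Table 9, and the two statistics should be close to the values the text later reports for the independence model fitted to the full United States table (df = 9).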
The Χ^{2} and the G^{2} statistics are asymptotically equivalent (Powers and Xie 2008) and they form the bases of the Pearson Chi-square and the likelihood ratio tests, respectively. As with all inferential tests, effective use requires attention to underlying assumptions as well as limitations. Both tests rely on the assumption that each interregional flow count in the migration table follows an independent Poisson distribution (Powers and Xie 2008), and both tests have important limitations related to sample size. The Χ^{2} statistic is inflated by large samples. Therefore, the Pearson Chi-square test is not appropriate when the sample size is large. The G^{2} statistic and the likelihood ratio test are preferred in this situation (Powers and Xie 2008). The Pearson Chi-square test is preferred when the expected frequencies average between 1 and 10, but neither statistic works well if most of the expected frequencies are less than 5 (Agresti and Finlay 2009; Powers and Xie 2008).
Criticism has been made of the G^{2} statistic as well when samples are large (Raftery 1986, 1995) and there is growing consensus that information measures should be considered along with traditional significance tests in assessing model fit. The Bayesian Information Criterion (BIC) is closely related to G^{2}, and it is calculated by Stata as: BIC = G^{2} − df ln(m×m), and by SPSS as:
$$BIC=-2\mathrm{ln}{L}_{c}+p\mathrm{ln}(m\times m),$$
where p is the number of parameters estimated in the independence model, i.e., 2m−1. A low value suggests choosing the independence model over the saturated model (Powers and Xie 2008).
Akaike’s Information Criterion (AIC) is another alternative that takes on smaller values for better fitting models, since it judges how close the fitted values are to the expected values (Agresti 2007). In SPSS and R, it is calculated as $$AIC=-2(\mathrm{ln}{L}_{c}-p),$$ where p is the number of parameters estimated in the independence model, i.e., 2m−1. In Stata, it is calculated as:
$$AIC=\frac{-2(\mathrm{ln}{L}_{c}-p)}{m\times m}.$$
As shown in Appendix 2, SPSS and Stata report the BIC and AIC, and R reports only the rounded AIC. As previously stated, there are differences in the formulae used. The BIC reported by SPSS equals 46,934.237, and the BIC reported by Stata equals 46,388.04. R reports only the AIC, which is equal to 46,920, a rounded version of the value reported by SPSS, 46,916.818. Stata’s AIC value is substantially smaller and is equal to 1,303.245. All reported BIC and AIC values are large and add to the growing evidence that discredits the independence model for this example.
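The relationships among the reported information measures can be verified with simple arithmetic. The Python sketch below is an illustrative check, not package output: it takes the reported deviance and the SPSS AIC and BIC as given, backs out −2ln L_{c} from the SPSS AIC (an assumption made only for this check), and then recomputes the other quantities using the formulae stated above for m = 6 regions.

```python
import math

m = 6                      # regions in the Netherlands example
cells = m * m              # 36 cells in the migration table
p = 2 * m - 1              # parameters in the independence model (11)
df = (m - 1) * (m - 1)     # residual degrees of freedom (25)

g2 = 46_477.63             # deviance reported by SPSS and Stata
aic_spss = 46_916.818      # AIC reported by SPSS
bic_spss = 46_934.237      # BIC reported by SPSS

# Back out -2 ln L_c from the SPSS AIC, since AIC = -2 ln L_c + 2p.
minus2_loglik = aic_spss - 2 * p

bic_spss_check = minus2_loglik + p * math.log(cells)   # SPSS-style BIC
bic_stata = g2 - df * math.log(cells)                  # Stata-style BIC
aic_stata = (minus2_loglik + 2 * p) / cells            # Stata-style AIC

print(f"SPSS BIC (recomputed): {bic_spss_check:.3f}")
print(f"Stata BIC: {bic_stata:.2f}")
print(f"Stata AIC: {aic_stata:.3f}")
```

Under these assumptions the recomputed values should agree with the reported SPSS BIC and the Stata BIC and AIC to within rounding, which confirms that the packages differ in scaling and penalty terms rather than in the fitted model.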
The quasi-independence model
The independence model rarely provides an adequate fit to migration data. This is due, in part, to the overwhelming tendency to continue to reside in the same region. The quasi-independence model allows these “immobility” effects (Powers and Xie 2008) to be removed from the model, and this often results in improved predictions of interregional migration flows. The quasi-independence model has been applied effectively to migration data obtained from national censuses (Agresti 1990; Rogers, Little and Raymer 2010; Rogers, Willekens, Little et al. 2002), where persons who reported living in the same region at the time of the census as at the beginning of the reference period are represented in the diagonal elements of a migration table.
To illustrate, United States native-born migration data between 1985 and 1990 are reported in Panel A of Table 7. Clearly, the flows reported in the four diagonal elements of the interior submatrix are substantially larger than the off-diagonal elements, indicating that the propensity to maintain residence in the same region is much more typical than migration between regions.
The clustering along the diagonal cells contributes significantly to the poor fit of the independence model, and the dominating influence of the persons remaining in the region of origin has caused researchers to favour omitting them from the model. If migrants are defined as people changing their region of residence, this type of flow matrix is sometimes called a “migrants only” matrix. It is particularly useful for studying migration structure since it eliminates people who made no move or moved within the same region. Panel B of Table 7 displays the flow table with the diagonal elements set to 0, and the marginal totals adjusted accordingly.
Table 7 United States native-born migration flows, 1985–1990

A. Full migration table

| Origin \ Destination | Northeast | Midwest | South | West | Total |
| --- | --- | --- | --- | --- | --- |
| Northeast | 40,262,319 | 336,091 | 1,645,843 | 479,819 | 42,724,072 |
| Midwest | 351,029 | 50,677,007 | 1,692,687 | 958,696 | 53,679,419 |
| South | 778,868 | 1,197,134 | 69,563,871 | 1,150,649 | 72,690,522 |
| West | 348,892 | 668,979 | 1,082,104 | 37,872,893 | 39,972,868 |
| Total | 41,741,108 | 52,879,211 | 73,984,505 | 40,462,057 | 209,066,881 |

B. Migrants-only table

| Origin \ Destination | Northeast | Midwest | South | West | Total |
| --- | --- | --- | --- | --- | --- |
| Northeast | 0 | 336,091 | 1,645,843 | 479,819 | 2,461,753 |
| Midwest | 351,029 | 0 | 1,692,687 | 958,696 | 3,002,412 |
| South | 778,868 | 1,197,134 | 0 | 1,150,649 | 3,126,651 |
| West | 348,892 | 668,979 | 1,082,104 | 0 | 2,099,975 |
| Total | 1,478,789 | 2,202,204 | 4,420,634 | 2,589,164 | 10,690,791 |

The multiplicative components, using total sum reference coding, for the full migration table and the migrants-only table are reported in Table 8. The magnitude of the multiplicative component model parameters for the full data certainly departs from what is expected under the hypothesis of independence. They are substantially above 1.0 on the diagonal and the off-diagonal components are far below 1.0. In comparison, the diagonal multiplicative components for the migrants-only table are constrained to be equal to 0 in order to reproduce the structural zeros on the diagonal, and, as a result, the off-diagonal components are closer to 1.0.
Table 8 Multiplicative components* of United States native-born migration flows, 1985–1990

A. Full migration table

| Origin \ Destination | Northeast | Midwest | South | West | Total |
| --- | --- | --- | --- | --- | --- |
| Northeast | 4.720 | 0.031 | 0.109 | 0.058 | 0.204 |
| Midwest | 0.033 | 3.733 | 0.089 | 0.092 | 0.257 |
| South | 0.054 | 0.065 | 2.704 | 0.082 | 0.348 |
| West | 0.044 | 0.066 | 0.076 | 4.896 | 0.191 |
| Total | 0.200 | 0.253 | 0.354 | 0.194 | 209,066,881 |

B. Migrants-only table

| Origin \ Destination | Northeast | Midwest | South | West | Total |
| --- | --- | --- | --- | --- | --- |
| Northeast | 0.000 | 0.663 | 1.617 | 0.805 | 0.230 |
| Midwest | 0.845 | 0.000 | 1.363 | 1.318 | 0.281 |
| South | 1.801 | 1.859 | 0.000 | 1.520 | 0.292 |
| West | 1.201 | 1.547 | 1.246 | 0.000 | 0.196 |
| Total | 0.138 | 0.206 | 0.413 | 0.242 | 10,690,791 |

*Total sum reference coding

The quasi-independence model requires that only migrations between different regions satisfy the independence assumption. This is estimated in two different but equivalent ways. The first method takes the full migration table data as in Panel A of Table 8, and fixes the weights on the interactive effects, OD_{ij}, to be zero when the regions of origin and destination are the same, i.e., i=j, ensuring that n_{ij}=0. These are called structural zeros. When the origin and destination regions are different, i.e., $$i\ne j$$, the interaction effects are fixed at 1.0, which is the familiar independence model and gives the predicted off-diagonal flows under the quasi-independence hypothesis. Implementation of this method in SPSS, Stata and R is illustrated in Appendix 3.
The second method does not use the full migration data, but uses the migrants-only data as in Panel B of Table 7. It is best presented with the additive form:
$$\mathrm{ln}({n}_{ij})=\lambda +{\lambda}_{i}^{O}+{\lambda}_{j}^{D}+{\delta}_{i}I$$,
where I is an indicator variable taking on values of 1 for the diagonal flows, i.e., when i=j, and values of 0 for the off-diagonal flows, i.e., when $$i\ne j$$ (Agresti 2002). Therefore, an extra parameter, $${\delta}_{i}$$, is necessary to estimate each diagonal flow, and for the other interregional flows the $${\delta}_{i}I$$ term falls out and the quasi-independence model reduces to the independence model. Consequently, just like the independence model, the off-diagonal interaction terms are constrained to be equal to 0 in the additive form of the model (and equal to 1 in the multiplicative form). Application of this method in Stata is illustrated in Appendix 3.
In the first method, the quasi-independence model fixes m parameters, OD_{ii}, for i = 1 to m, to be equal to 0. In the second method, m additional parameters, $${\delta}_{i}$$, are estimated, and when exponentiated will be very close to 0. Using either method, the quasi-independence model has m more parameters than the full independence model and the degrees of freedom are reduced by m.
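One way to see what the structural zeros do is to fit the quasi-independence model by iterative proportional fitting rather than through a loglinear routine. The Python sketch below is an assumption-laden illustration and not the Appendix 3 code: the helper `ipf` is hypothetical, the starting table has 0 on the diagonal and 1 elsewhere, and the target margins are those of the migrants-only table in Panel B of Table 7. Rescaling rows and columns never changes a 0, so the diagonal stays empty while the off-diagonal cells take the form of an origin effect times a destination effect.

```python
import numpy as np


def ipf(seed, row_totals, col_totals, tol=1e-10, max_iter=10_000):
    """Iterative proportional fitting of `seed` to the given marginal totals."""
    fit = np.asarray(seed, dtype=float).copy()
    row_totals = np.asarray(row_totals, dtype=float)
    col_totals = np.asarray(col_totals, dtype=float)
    for _ in range(max_iter):
        fit *= (row_totals / fit.sum(axis=1))[:, None]   # match row totals
        fit *= (col_totals / fit.sum(axis=0))[None, :]   # match column totals
        # Columns now match exactly; stop once the rows match as well.
        if np.allclose(fit.sum(axis=1), row_totals, rtol=tol, atol=0.0):
            break
    return fit


# Migrants-only United States flows, 1985-1990 (Table 7, Panel B).
migrants = np.array([
    [0, 336_091, 1_645_843, 479_819],
    [351_029, 0, 1_692_687, 958_696],
    [778_868, 1_197_134, 0, 1_150_649],
    [348_892, 668_979, 1_082_104, 0],
], dtype=float)

seed = 1.0 - np.eye(4)   # structural zeros on the diagonal, 1 elsewhere
quasi = ipf(seed, migrants.sum(axis=1), migrants.sum(axis=0))
print(np.round(quasi).astype(np.int64))
```

The fitted off-diagonal flows should be close to the quasi-independence predictions reported in Panel B of Table 9.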
Appendix 3 illustrates how the quasi-independence model is estimated with the statistical software packages SPSS, Stata and R, using the United States native-born migration flow data, 1985–1990. When the independence model is estimated with the full data, as expected, all goodness-of-fit indicators are extremely large: Χ^{2} = 544,479,395 (df = 9); G^{2} = 461,411,576 (df = 9); Stata values for BIC and AIC are 461,000,000 and 28,800,000, respectively. When the quasi-independence model is estimated, all values are reduced substantially: Χ^{2} = 327,233 (df = 5); G^{2} = 330,220 (df = 5); Stata values for BIC and AIC equal 330,207 and 27,535, respectively.
The inferential tests remain significant, and the quasi-independence model must be rejected as the true migration model. The independence and the quasi-independence models should not be compared, inferentially, with the likelihood ratio test because they are not nested models. However, the information measures may be compared directly. Both the BIC and AIC are reduced substantially, favouring the quasi-independence model over the independence model.
In addition, the predicted flows from the independence model can be contrasted with those from the quasi-independence model in Table 9. Visually comparing the predicted flows in Table 9 with the observed data in Table 7 reveals how much closer the quasi-independence model comes to representing the data. Two additional summary statistics are reported: R^{2} and Mean Absolute Percent Error (MAPE). A comparison of the R^{2} values shows the independence model explains 10% of the variation in the observed data and the quasi-independence model explains 95%. Furthermore, the average percent error for the quasi-independence model (MAPE = 28) is dramatically reduced in comparison to the independence model (MAPE = 2,492).
Since the fit of the quasi-independence model is not close enough to the observed data, it must be rejected as the “true” model. However, without observed migration data, the quasi-independence model may still offer a reasonable, but coarse, method for estimating interregional flows.
Table 9 Predicted United States native-born migration flows, 1985–1990, under independence and quasi-independence

A. Independence

| Origin \ Destination | 1 | 2 | 3 | 4 |
| --- | --- | --- | --- | --- |
| 1 | 8,530,046 | 10,806,184 | 15,119,178 | 8,268,664 |
| 2 | 10,717,328 | 13,577,116 | 18,996,052 | 10,388,923 |
| 3 | 14,512,977 | 18,385,588 | 25,723,693 | 14,068,264 |
| 4 | 7,980,756 | 10,110,323 | 14,145,583 | 7,736,206 |

R^{2} = 0.104; MAPE = 2492.322

B. Quasi-independence

| Origin \ Destination | 1 | 2 | 3 | 4 |
| --- | --- | --- | --- | --- |
| 1 | 0 | 535,839 | 1,349,561 | 576,353 |
| 2 | 442,768 | 0 | 1,793,640 | 766,005 |
| 3 | 720,681 | 1,159,163 | 0 | 1,246,806 |
| 4 | 315,340 | 507,201 | 1,277,434 | 0 |

R^{2} = 0.945; MAPE = 27.575

Application 6: The method of offsets
The validity of the independence and quasi-independence models can be evaluated with the inferential test statistics that accompany the loglinear model output, and, even when the models are not supported with significance tests, these models may be applied, in some contexts, to produce meaningful estimates of migration flows. The method of offsets goes a step further by drawing on auxiliary data. It assumes the auxiliary data have an implied structure of interregional associations that resembles the unknown migration structure, and it borrows that structure to derive the estimates of the missing migration flow data.
In past research, the auxiliary information has typically been a table of migration flows from another period in history (Rogers, Little and Raymer 2010; Rogers, Willekens, Little et al. 2002; Rogers, Willekens and Raymer 2003; Willekens 1983), but it could be from another age group (Raymer and Rogers 2007), or from another sex or race group. It could also be from another data source altogether, such as tax return data or motor vehicle registration data.
Given the auxiliary flow data, $${n}_{ij}^{*}$$, the loglinear-with-offsets model is specified as:
$$\mathrm{ln}\left({\widehat{n}}_{ij}\right)=\lambda +{\lambda}_{i}^{O}+{\lambda}_{j}^{D}+\mathrm{ln}\left({n}_{ij}^{*}\right).$$
This model will estimate flows, $${\widehat{n}}_{ij}$$, that have a migration structure that comes as close as possible to that of the auxiliary flow data, and, at the same time, the estimated flows are adjusted to sum to the marginal totals prespecified by the researcher. In this way, the method of offsets is similar to the independence and quasi-independence models in that it provides an expected distribution of the flows such that the marginal row and column totals are equal to the a priori estimates.
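The logic of the offset term can be sketched with iterative proportional fitting: the auxiliary table is rescaled, row by row and column by column, until it matches the target margins. The Python example below is a minimal sketch and not the Appendix 4 code. The function names and the small 3×3 auxiliary table and target totals are hypothetical values invented for illustration only. It shows the two defining properties of the result: the fitted flows reproduce the prespecified margins, and their odds ratios are the same as those of the auxiliary data.

```python
import numpy as np


def offsets_fit(aux, row_totals, col_totals, n_iter=500):
    """Rescale auxiliary flows to target margins, keeping their odds ratios."""
    fit = np.asarray(aux, dtype=float).copy()
    row_totals = np.asarray(row_totals, dtype=float)
    col_totals = np.asarray(col_totals, dtype=float)
    for _ in range(n_iter):
        fit *= (row_totals / fit.sum(axis=1))[:, None]   # match row totals
        fit *= (col_totals / fit.sum(axis=0))[None, :]   # match column totals
    return fit


def cross_product_ratio(table, i, j, k, l):
    """Odds ratio formed by rows i, k and columns j, l."""
    return (table[i, j] * table[k, l]) / (table[i, l] * table[k, j])


# Hypothetical auxiliary flows (e.g., an earlier period) and target margins.
aux = np.array([[50.0, 20.0, 10.0],
                [15.0, 60.0, 25.0],
                [5.0, 30.0, 80.0]])
row_totals = np.array([100.0, 120.0, 130.0])   # prespecified origin totals
col_totals = np.array([90.0, 140.0, 120.0])    # prespecified destination totals

predicted = offsets_fit(aux, row_totals, col_totals)
print(np.round(predicted, 1))
print(cross_product_ratio(aux, 0, 0, 1, 1),
      cross_product_ratio(predicted, 0, 0, 1, 1))
```

Multiplying a row or a column by a constant leaves every cross-product ratio unchanged, which is why the interaction structure of the auxiliary table survives the adjustment while the main effects adapt to the new margins.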
To illustrate the workings of the method of offsets, consider the Netherlands 1976 migration flow matrix in Table 1. Suppose we wish to keep the numerical values of the row and column marginal totals, but, at the same time, wish to replace the migration interaction effects observed during that year by those observed during 1973, using the method of offsets. What would be the corresponding set of loglinear parameters? Table 10 sets out the predicted flow matrix obtained by the method of offsets in Panel A, and Panel B presents the associated multiplicative components derived using the total sum reference coding. Note that the T, O_{i} and D_{j} values of the predicted matrix, i.e., Panel B of Table 10, are identical to those reported for the observed 1976 flow matrix in Panel B of Table 3. However, the other terms (i.e., the interaction effects, OD_{ij}) reflect the influence of the migration structure of the observed 1973 data, Panel A of Table 3, as well as the row and column totals taken from the 1976 data. Therefore, the method of offsets applies the structure of the auxiliary data, the 1973 data in this case, to the interior flows, and at the same time, preserves the total number of flows observed in the 1976 data.
Table 10 Interregional migration flows in the Netherlands (1976), predicted with the method of offsets from the marginal totals (1976) and the migration flow table (1973)

Panel A. Predicted using method of offsets

| Origin \ Destination | 1 | 2 | 3 | 4 | 5 | 6 | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 12,344 | 13,769 | 6,890 | 12,199 | 10,361 | 11,518 | 67,081 |
| 2 | 13,329 | 34,695 | 12,195 | 17,445 | 22,522 | 24,353 | 124,539 |
| 3 | 9,728 | 15,711 | 28,330 | 8,883 | 15,881 | 30,553 | 109,087 |
| 4 | 11,281 | 16,107 | 7,011 | 11,216 | 11,764 | 13,187 | 70,566 |
| 5 | 12,609 | 25,486 | 16,570 | 12,828 | 17,770 | 21,760 | 107,023 |
| 6 | 18,116 | 35,786 | 53,984 | 22,110 | 27,058 | 22,535 | 179,589 |
| Total | 77,408 | 141,553 | 124,980 | 84,682 | 105,356 | 123,906 | 657,885 |

R^{2} = 0.966; MAPE = 8.364

Panel B. Multiplicative components using total sum reference coding

| Origin \ Destination | 1 | 2 | 3 | 4 | 5 | 6 | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 1.564 | 0.954 | 0.541 | 1.413 | 0.964 | 0.912 | 0.102 |
| 2 | 0.910 | 1.295 | 0.515 | 1.088 | 1.129 | 1.038 | 0.189 |
| 3 | 0.758 | 0.669 | 1.367 | 0.633 | 0.909 | 1.487 | 0.166 |
| 4 | 1.359 | 1.061 | 0.523 | 1.235 | 1.041 | 0.992 | 0.107 |
| 5 | 1.001 | 1.107 | 0.815 | 0.931 | 1.037 | 1.080 | 0.163 |
| 6 | 0.857 | 0.926 | 1.582 | 0.956 | 0.941 | 0.666 | 0.273 |
| Total | 0.118 | 0.215 | 0.190 | 0.129 | 0.160 | 0.188 | 657,885 |

The predicted results in Panel A of Table 10 were taken from the output of the SPSS, Stata, and R commands for implementing the method of offsets found in Appendix 4. See the Method of offsets sheet in the accompanying Excel spreadsheet for other calculations.
Since the flows were observed directly in 1976, there are several ways to evaluate the suitability of the method of offsets for predicting the data. One simple method is to inspect visually the ratios of the association multiplicative components, as demonstrated in Table 4. Another method is to use the inferential tests and information measures reported by the loglinear procedures. These would be testing the hypothesis that the structure of the migration flows, i.e., the interaction parameters, did not change from 1973 to 1976. In the example reported in Table 10, the corresponding G^{2} statistic is equal to 5,914 (df=25), and the hypothesis that the auxiliary data represent the same migration process as the observed data must be rejected. The final method, of those suggested here, relies on the standard R^{2} and MAPE statistics to assess the fit between the observed and the predicted flows. These are reported in Panel A of Table 10 and are equal to 0.97 and 8.36, respectively. These statistics, as well as the ratios in Table 4, suggest this application of the method of offsets offers a set of estimates for the migration flows in 1976 that may be quite satisfactory.
The importance placed on the goodness-of-fit statistics depends on the quality of the observed flows used as inputs to the method of offsets. If the method is to be useful in a practical situation, it must be applicable when the interregional flows are not directly observed. In the absence of flow data, the method would still require pre-estimates of the marginal totals. Furthermore, if the method is implemented as illustrated in Appendix 4, initial estimates of the interregional flows are required. Therefore, the pre-estimates of the row and column totals would need to be distributed across the internal cells of the flow matrix so they add up to the respective marginal totals. Table 11, Panel A, presents a typical scenario, albeit continuing to use the marginal totals from the Netherlands 1976 data, which were observed. A simple solution is to distribute the flows according to the independence model, i.e., N_{ij} = n_{++}(n_{i+}/n_{++})(n_{+j}/n_{++}), which results in the initial estimates of the flows displayed in Panel B of Table 11.
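The independence distribution of the marginal totals is a single outer product. As a small illustration in Python (an assumption of this sketch, since the accompanying materials use a workbook and SPSS, Stata and R), the following code takes the 1976 row and column totals shown in Panel A of Table 11 and spreads them over the interior cells.

```python
import numpy as np

# Netherlands 1976 marginal totals (Table 11, Panel A).
row_totals = np.array([67_081, 124_539, 109_087, 70_566, 107_023, 179_589],
                      dtype=float)   # origin totals n_i+
col_totals = np.array([77_408, 141_553, 124_980, 84_682, 105_356, 123_906],
                      dtype=float)   # destination totals n_+j
total = row_totals.sum()             # n_++

# Independence model: N_ij = n_i+ * n_+j / n_++
initial = np.outer(row_totals, col_totals) / total

print(np.round(initial).astype(np.int64))
```

Rounded to whole flows, the result should match the interior cells of Panel B of Table 11, and by construction its rows and columns add up to the prespecified totals.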
As long as the initial interregional flows add up to the marginal totals, the predicted flows are not affected by the method used to distribute the flows within the cells. This is true because the flows will be predicted, ultimately, from the auxiliary data through the method of offsets, using the iterative proportional fitting algorithm (Agresti 1990; Deming and Stephan 1940). In other words, the initial estimates of the 1976 Netherlands flows, used as input to the offsets loglinear model, could be the internal cells of Table 1, Panel B, or those in Table 11, Panel B. Either set of initial estimates would yield the predicted flows that are reported in Table 10, Panel A.
On the other hand, it is important to note that the associated inferential test statistics and the information measures that accompany the method of offsets must be interpreted with respect to the initial flow estimates. For example, if the initial flows were taken from Panel B of Table 11, the associated X^{2} and G^{2} test statistics would be testing the hypothesis that the predicted data are distributed in a manner that is consistent with the independence model.
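Table 11 The inputs to the method of offsets in the absence of observed flows

Panel A. Pre-estimation marginal totals from the Netherlands, 1976

| Origin \ Destination | 1 | 2 | 3 | 4 | 5 | 6 | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 |  |  |  |  |  |  | 67,081 |
| 2 |  |  |  |  |  |  | 124,539 |
| 3 |  |  |  |  |  |  | 109,087 |
| 4 |  |  |  |  |  |  | 70,566 |
| 5 |  |  |  |  |  |  | 107,023 |
| 6 |  |  |  |  |  |  | 179,589 |
| Total | 77,408 | 141,553 | 124,980 | 84,682 | 105,356 | 123,906 | 657,885 |

Panel B. Independence model distribution scheme for initial flow estimates

| Origin \ Destination | 1 | 2 | 3 | 4 | 5 | 6 | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 7,893 | 14,433 | 12,744 | 8,635 | 10,743 | 12,634 | 67,081 |
| 2 | 14,654 | 26,796 | 23,659 | 16,030 | 19,944 | 23,456 | 124,539 |
| 3 | 12,835 | 23,472 | 20,724 | 14,042 | 17,470 | 20,545 | 109,087 |
| 4 | 8,303 | 15,183 | 13,406 | 9,083 | 11,301 | 13,290 | 70,566 |
| 5 | 12,593 | 23,027 | 20,331 | 13,776 | 17,139 | 20,157 | 107,023 |
| 6 | 21,131 | 38,641 | 34,117 | 23,116 | 28,760 | 33,824 | 179,589 |
| Total | 77,408 | 141,553 | 124,980 | 84,682 | 105,356 | 123,906 | 657,885 |
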
It is a simple matter to modify the method of offsets to apply it to the problem of predicting a table of “migrants only.” The SPSS, Stata and R commands require minor modifications that are specified in comments in Appendix 4. A worked example is included in the Method of offsets, migrants only sheet of the accompanying workbook. It uses the observed U.S. flows, 1985–1990, to retrospectively estimate the 1975–80 migrant flows reported by Rogers, Willekens, Little et al. (2002).
References
Agresti A. 1990. Categorical Data Analysis. New York: Wiley.
Agresti A. 2002. Categorical Data Analysis. New York: WileyInterscience.
Agresti A. 2007. An Introduction to Categorical Data Analysis. Hoboken, NJ: WileyInterscience.
Agresti A and B Finlay. 2009. Statistical Methods for the Social Sciences. Upper Saddle River, NJ: Pearson Prentice Hall.
Alonso W. 1986. Systemic and loglinear models: From here to there, then to now, and this to that. Discussion paper 8610. Cambridge, MA: Harvard University, Center for Population Studies.
Birch MW. 1963. "Maximum likelihood in threeway contingency tables", Journal of the Royal Statistical Society Series BStatistical Methodology 25(1):220233.
Deming WE and FF Stephan. 1940. "On a least squares adjustment of a sampled frequency table when the expected marginal totals are known", Annals of Mathematical Statistics 11(4):427444. doi: https://dx.doi.org/10.1214/aoms/1177731829
Knoke D and PJ Burke. 1980. Loglinear Models. Beverly Hills, CA: Sage Publications.
Mueser P. 1989. "The spatial structure of migration: An analysis of flows between states in the USA over three decades", Regional Studies 23(3):185–200. doi: https://dx.doi.org/10.1080/00343408912331345412
Nair PS. 1985. "Estimation of period-specific gross migration flows from limited data: Biproportional adjustment approach", Demography 22(1):133–142. doi: https://dx.doi.org/10.2307/2060992
Powers DA and Y Xie. 2008. Statistical Methods for Categorical Data Analysis. Bingley, UK: Emerald.
Raftery AE. 1986. "Choosing models for cross-classifications", American Sociological Review 51(1):145–146. doi: https://dx.doi.org/10.2307/2095483
Raftery AE. 1995. "Bayesian model selection in social research", Sociological Methodology 25(1):111–163. doi: https://dx.doi.org/10.2307/271063
Raymer J. 2007. "The estimation of international migration flows: A general technique focused on the origin-destination association structure", Environment and Planning A 39(4):985–995. doi: https://dx.doi.org/10.1068/a38264
Raymer J, A Bonaguidi and A Valentini. 2006. "Describing and projecting the age and spatial structures of interregional migration in Italy", Population, Space and Place 12(5):371–388. doi: https://dx.doi.org/10.1002/psp.414
Raymer J and A Rogers. 2007. "Using age and spatial flow structures in the indirect estimation of migration streams", Demography 44(2):199–223. doi: https://dx.doi.org/10.1353/dem.2007.0016
Rees P and FJ Willekens. 1986. "Data and accounts", in Rogers, A and FJ Willekens (eds). Migration and Settlement: A Multiregional Comparative Study. Dordrecht: D. Reidel, pp. 19–58.
Rogers A, JS Little and J Raymer. 2010. The Indirect Estimation of Migration: Methods for Dealing with Irregular, Inadequate, and Missing Data. Dordrecht: Springer.
Rogers A, F Willekens, JS Little and J Raymer. 2002. "Describing migration spatial structure", Papers in Regional Science 81(1):29–48.
Rogers A, FJ Willekens and J Raymer. 2003. "Imposing age and spatial structures on inadequate migration-flow datasets", The Professional Geographer 55(1):56–69.
Snickars F and JW Weibull. 1977. "A minimum information principle: Theory and practice", Regional Science and Urban Economics 7(1–2):137–168. doi: https://dx.doi.org/10.1016/0166-0462(77)90021-7
Willekens F. 1983. "Loglinear modeling of spatial interaction", Papers of the Regional Science Association 52:187–205. doi: https://dx.doi.org/10.1007/BF01944102