Using Migration Microdata from the Samples of Anonymised Records and the Longitudinal Studies


Paul Norman (University of Leeds, United Kingdom) and Paul Boyle (University of St Andrews, United Kingdom)
DOI: 10.4018/978-1-61520-755-8.ch007


In this chapter we describe the Samples of Anonymised Records (SARs) and Longitudinal Studies (LSs). The SARs are cross-sectional data like the area and interaction data, but the LSs track people over time. These datasets differ from the United Kingdom’s other census outputs in being individual-level ‘microdata’ and population samples. The microdata files are very versatile, allowing multi-way crosstabulations and statistical techniques and enabling application-relevant re-coded variables and study populations to be defined. The SARs files offer UK coverage, although a UK-wide study is challenging because data for each country may be in separate files with different access arrangements, and variable detail may be country specific. The Office for National Statistics (ONS) Longitudinal Study for England and Wales has underpinned a wide range of research since the 1970s. This well-established source is now complemented by longitudinal data for Scotland and Northern Ireland. Largely because of the need to ensure respondent confidentiality, the SARs and LSs have some drawbacks for migration-related research. In addition to stringent access arrangements, the geographical areas in which individuals are located in the SARs tend to be coarse, and although the LS databases record the small area in which the LS member was living at each census, specific ‘place’ information is unlikely to be considered non-disclosive except for large geographies. However, generic, contextual information about the ‘space’ in which people live is useful even though actual places are not identified. Although the SARs and LSs are samples, they are very large in comparison with other national surveys and represent first-rate resources to complement other sources. In the course of this chapter, along with other references to SARs and LS-based migration research, we review work which utilised these sources to investigate inter-relationships between health, deprivation and migration.
The SARs data show that migration is health-selective by age and distance moved, and that people in public housing who move into or within deprived areas are the most likely to be ill. The role of migration in changing health inequalities between differently deprived areas can be explored using longitudinal data on both origins and destinations. The ONS LS reveals that migrants into and between the least deprived areas have better health than non-migrants, but migrants into and between the most deprived areas have the worst health. The effect of these changes has been to increase the inequality in health between differently deprived areas. A sorting, largely driven by selective migration, occurs.
Chapter Preview


As described in Chapter 1, internal migration is captured in the census from the question that asks what the respondent’s usual address was one year ago, with international migration also indicated by the question that asks for the respondent’s country of birth. These are the sources for the migration information provided in census data outputs, including the standard area tables and the Special Migration Statistics (SMS). Both of these data sources provide migration (and other) data, aggregated into different administrative or census-based geographies. The standard tables of data that are released generally include one variable, or a small number of cross-tabulated variables, allowing counts of certain types of migrants to be determined.

While the Census Offices carry out extensive consultations prior to each census to determine demand for the content and detail within these tables, the tables are limited in the amount of information they provide about migrants’ demographic and socio-economic characteristics. The pre-defined cross-tabulations may not provide the information needed for a particular analysis, and the types of statistical modelling which can be applied to aggregate data can be restrictive (Marsh, 1993; Norman, 2003). Although tables can be commissioned from the National Statistics Agencies, this route to data access is potentially time-consuming and costly, and the number of variables in the output would still be limited.

These restrictions do not apply to the Samples of Anonymised Records (SARs), the use of which is described here. These datasets include individual-level records for samples of either individuals or groups of individuals within households. Because the records include most of the census variables collected about each individual, they allow analysis of a wide variety of demographic and socio-economic factors at any one time; much is therefore known about the characteristics of migrants and non-migrants. Having individual-level detail allows the user to define custom multi-dimensional tabulations, to derive information using combinations of variables, to recode variable categories to be application-relevant, to extract custom study populations and to carry out sophisticated statistical analyses. Known as ‘microdata’, these individual records cannot be supplied for 100% of the population for confidentiality reasons, so samples are extracted which maintain the anonymity of respondents.
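
The kind of flexibility that individual-level records offer can be illustrated with a minimal sketch. The records, variable names (`age`, `tenure`, `migrant`, `ill`) and age bands below are invented for illustration and do not reflect actual SARs variables or coding; the point is only that a user can recode a variable, extract a study population and build a multi-way tabulation of their own choosing.

```python
from collections import Counter

# Hypothetical, made-up microdata records; real SARs variables and codes differ.
records = [
    {"age": 23, "tenure": "public", "migrant": True,  "ill": False},
    {"age": 27, "tenure": "owner",  "migrant": True,  "ill": False},
    {"age": 45, "tenure": "public", "migrant": True,  "ill": True},
    {"age": 52, "tenure": "owner",  "migrant": False, "ill": False},
    {"age": 67, "tenure": "public", "migrant": False, "ill": True},
    {"age": 71, "tenure": "owner",  "migrant": False, "ill": False},
    {"age": 12, "tenure": "owner",  "migrant": True,  "ill": False},
]


def age_band(age):
    """Recode single year of age into illustrative, application-relevant bands."""
    if age < 30:
        return "16-29"
    if age < 60:
        return "30-59"
    return "60+"


def crosstab(rows, *keys):
    """Count records for each combination of values returned by the key functions."""
    return Counter(tuple(key(row) for key in keys) for row in rows)


# Custom study population: people aged 16 and over.
adults = [row for row in records if row["age"] >= 16]

# Three-way tabulation: age band by migrant status by illness.
table = crosstab(
    adults,
    lambda row: age_band(row["age"]),
    lambda row: row["migrant"],
    lambda row: row["ill"],
)

for cell, count in sorted(table.items()):
    print(cell, count)
```

Because the tabulation is built from the records rather than taken from a pre-defined table, any other variable (here, `tenure`) could be added as a further dimension without commissioning new output.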

As we shall see below, while the SARs have the advantage of relatively complete individual characteristics, these files lack detailed geography so that only crude flow information can be derived. Like virtually all census outputs, they also tell us little about the characteristics of migrants prior to the move, because retrospective information is not collected in the census (except for the migration question itself). In contrast, the Longitudinal Studies (LSs) are census-based outputs which provide continuous, multi-cohort samples of the population, allowing people to be tracked over time. The principal aim of the LSs is to link people’s census records at successive censuses. These datasets have the additional advantage that migrants can also be identified by comparing the respondent’s address at the time of a census with their address at the time of the previous census ten years before.
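
The idea of identifying migrants from linked census records can be sketched as follows. This is an illustrative assumption rather than the procedure used by any of the LSs: the member identifiers and area codes are invented, and the comparison is made on area of residence rather than on full addresses, so moves within the same area would not be detected.

```python
# Hypothetical linked records: LS member identifier -> area code at each census.
previous_census = {"m1": "A01", "m2": "B07", "m3": "C12"}
current_census = {"m1": "A01", "m2": "D03", "m4": "A01"}


def classify_moves(earlier, later):
    """Label members present at both censuses by whether their area changed."""
    result = {}
    # Only members enumerated at both censuses can be compared.
    for member in sorted(earlier.keys() & later.keys()):
        if earlier[member] != later[member]:
            result[member] = "migrant"
        else:
            result[member] = "non-migrant"
    return result


moves = classify_moves(previous_census, current_census)
print(moves)
```

Members who appear at only one census (`m3` and `m4` in this made-up example) are left unclassified here; how such cases are treated in practice is not covered by this sketch.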

Below we describe the SARs and LSs and their usefulness for migration studies in more detail. Then, following a brief review of the relevant literature, we describe a variety of studies which have used these sources to investigate the relationships between health, migration and area deprivation.
