[Introduction] The MEASURE DHS project collects nationally representative data on health and population. Typically, multiple surveys have been conducted for each country. To date (October 2013), the DHS website provides data from over 300 surveys conducted in more than 90 countries. This breadth makes the DHS a valuable data source, and DHS data have been used in a wide variety of studies (Van Malderen et al. 2013; Van de Poel and Speybroeck 2009; Masanja et al. 2008). The MEASURE DHS project maintains a database of survey-based publications.

Analysing DHS data can be complicated because of the format in which the data are made available. The data for each survey are provided as a number of dataset types with records for different units of analysis, such as households, household members, women or children. The same variables are often included in several dataset types so that they do not have to be merged before analysis. For example, most variables for household characteristics are included in the women, men, and children dataset types. In spite of this, there are instances when researchers have to merge different dataset types to obtain a data set that includes all required variables and is ready for analysis.

This paper aims to lower the barrier for researchers to use the DHS. As an illustration, we use the inequality in child mortality and access to medical care among households with different wealth levels. We investigate whether, in the Democratic Republic of the Congo, poorer households (1) live further away from a health facility and (2) have a higher child mortality rate.
Using this example, we show how the open source software R can be used to perform the following tasks:

• Load DHS data and extract variables
• Merge data from different files
• Analyse the data using the survey package

While we use an example on health inequality and child mortality, exactly the same procedure can be used to analyse DHS data on HIV, malaria or any other topic on which the DHS collects data. We assume a basic knowledge of R, for which good tutorials are available online. Many introductory books are also available, including ones on using R to analyse survey data, e.g. Lumley (2011).
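The three tasks listed above can be sketched in a few lines of R. This is a minimal illustration under stated assumptions, not the procedure developed in the remainder of the paper. Simulated data frames stand in for real DHS files, which in practice would be read from disk (for instance with `foreign::read.dta` for Stata-format files). The column names follow common DHS recode conventions (`hv001`/`v001` for the cluster, `hv002`/`v002` for the household number, `hv005` for the household weight, `hv270` for the wealth quintile). The variables `children_born` and `children_died`, the single-stage design without strata, and the indicator computed at the end are hypothetical choices made only for this example.

```r
# Sketch only: simulated data stand in for real DHS recode files.
library(survey)

set.seed(1)

# 1. "Load" the data. Hypothetical household-level records:
#    hv001 = cluster, hv002 = household number within the cluster,
#    hv005 = household sample weight (stored multiplied by 1e6),
#    hv270 = wealth quintile (1 = poorest, 5 = richest).
n_clusters <- 20
hh_per_cluster <- 10
n_households <- n_clusters * hh_per_cluster
households <- data.frame(
  hv001 = rep(seq_len(n_clusters), each = hh_per_cluster),
  hv002 = rep(seq_len(hh_per_cluster), times = n_clusters),
  hv005 = round(runif(n_households, 0.5, 1.5) * 1e6),
  hv270 = sample(1:5, n_households, replace = TRUE)
)

# Hypothetical women-level records: one or two women per household,
# identified by cluster (v001) and household number (v002).
women_per_hh <- sample(1:2, n_households, replace = TRUE)
women <- households[rep(seq_len(n_households), times = women_per_hh),
                    c("hv001", "hv002")]
names(women) <- c("v001", "v002")
women$children_born <- rpois(nrow(women), lambda = 3)
women$children_died <- rbinom(nrow(women), size = women$children_born,
                              prob = 0.1)

# 2. Merge the two dataset types on cluster and household number.
merged <- merge(women, households,
                by.x = c("v001", "v002"),
                by.y = c("hv001", "hv002"))

# Rescale the weight and label the wealth quintiles.
merged$weight <- merged$hv005 / 1e6
merged$wealth <- factor(merged$hv270, levels = 1:5,
                        labels = c("poorest", "poorer", "middle",
                                   "richer", "richest"))

# 3. Analyse with the survey package: clusters as sampling units,
#    no stratification (a simplifying assumption for this sketch).
design <- svydesign(ids = ~v001, weights = ~weight, data = merged)

# Weighted mean number of deceased children per woman, by wealth level.
mortality_by_wealth <- svyby(~children_died, ~wealth, design, svymean)
print(mortality_by_wealth)
```

Because the data are simulated with the same mortality probability in every wealth group, the output shows no real gradient; the sketch only demonstrates the mechanics of loading, merging and design-based estimation. With real DHS files, the strata and the choice of weight would need to follow the survey documentation.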
Vanderelst, D., & Speybroeck, N. (2014). Loading, merging and analysing demographic and health surveys using R. International Journal of Public Health, 59(2), 415-422. https://doi.org/10.1007/s00038-013-0538-2