Inequality analysis using R: Disaggregated data from surveys

Household surveys are a key data source for health inequality monitoring. In this practical course, students will learn the main steps and considerations for preparing disaggregated data from household surveys using the statistical software R, and how to format those data for subsequent analysis of health inequalities using the WHO Health Equity Assessment Toolkit (HEAT Plus) software application.

Language: English
Not disease specific

Course information

Overview: For many countries, data from household health surveys are the main source of information for health inequality monitoring because they include data on both health indicators and dimensions of inequality. Yet several factors, including sampling design complexities and data quality, must be taken into account when analysing and reporting disaggregated survey data.

The aim of this course is to provide learners with a practical guide to the preparation of disaggregated data sourced from household surveys using the statistical software R and RStudio. These data can then be used to monitor and analyse health inequalities using the WHO Health Equity Assessment Toolkit (HEAT Plus) software. The course offers an overview of the main considerations for the analysis of complex survey data and introduces learners to a set of R functions for disaggregated data preparation. The use of R code is demonstrated through examples using a sample dataset. The target audience is monitoring and evaluation officers, data analysts and other technical officers with an interest in data analysis. The course may also be of interest to students and researchers.

Course duration: Approximately 2 hours.

Certificates: A Certificate of Achievement will be available to participants who score at least 80% of the total points available in the final assessment. Participants who receive a Certificate of Achievement can also download an Open Badge for this course.

What you'll learn

  • The main defining characteristics of household survey data and their implications for monitoring health inequalities
  • How to identify health variables and inequality dimensions available in household surveys
  • How to construct key coverage indicators, accounting for issues such as missing values
  • How to produce estimates of health indicators disaggregated by a variety of inequality dimensions
  • How to prepare disaggregated data for analysis using the WHO Health Equity Assessment Toolkit (HEAT Plus)
  • How to identify and use a range of R functions to prepare and manage disaggregated datasets

Course contents

  • Introduction:

    This introductory module gives a brief overview of household survey data and their use in health inequality monitoring, introduces R and RStudio for survey data analysis, and outlines the skills learners can expect to gain from the course.
  • Module 1: Survey design considerations:

    Module 1 provides an overview of key considerations for data analysis using survey data, including the main elements of survey design. By the end of this module, you will be able to identify aspects of survey sampling design that should be incorporated in data preparation and explain the importance of reviewing survey metadata.
  • Module 2: Preparing data for analysis using R:

    Module 2 covers how to perform initial data exploration and preparation tasks in R. By the end of this module, you will be able to read and save datasets using R; identify and describe different types of variables; and incorporate key considerations in the construction of health indicators and inequality dimensions.
  • Module 3: Calculating disaggregated estimates using R:

    Module 3 covers how to calculate a health indicator disaggregated by relevant inequality dimensions (such as age, education, economic status and place of residence), as well as other key estimates for inequality monitoring. By the end of this module, you will be able to specify survey design in R; calculate national averages; calculate disaggregated estimates for each population subgroup; perform double disaggregation; calculate weighted and unweighted population sizes by subgroup; and export a dataset with the results.
  • Module 4: Formatting disaggregated data for use in the Health Equity Assessment Toolkit (HEAT Plus):

    Module 4 covers the final steps to format a dataset of disaggregated data for exploration in the WHO Health Equity Assessment Toolkit (HEAT Plus) software. By the end of this module, you will be able to identify the key variables necessary for HEAT Plus and construct them, if missing; combine data from different datasets; structure the dataset in the required format; perform relevant quality checks; and export to Excel.
  • Final Assessment

Enroll me for this course

The course is free. Just register for an account on OpenWHO and take the course!
Learners enrolled: 2933

Certificate Requirements

  • Gain a Record of Achievement by earning at least 80% of the maximum number of points from all graded assignments.