Using data integration to meet the ambitions of the 2030 Agenda
The far-reaching and ambitious vision of the 2030 Agenda for Sustainable Development to leave no one behind requires quality, timely, reliable and disaggregated data and statistics. This has put official statistics under pressure to meet increasing and evolving demands for data and statistics, and no National Statistical Office (NSO) has yet fulfilled all the statistical requirements of the 2030 Agenda.
In the Asia-Pacific region, a collective vision and framework for action for advancing official statistics for the 2030 Agenda calls for collective actions in five areas, in order to strengthen statistical capacity. Amongst the five is integrated statistics for integrated analysis.
What are integrated statistics?
Integrated statistics can be defined as the ultimate fruit of an integrated system of producing and using statistics, where integration in all its dimensions is in place.
A group of national and international experts in Asia and the Pacific met in 2017 to discuss integrated statistics for integrated analysis, and agreed upon four different dimensions: a) process integration, b) data integration, c) conceptual integration, and d) disciplinary integration.
Process, conceptual and disciplinary integration call for fundamental actions to harmonize all factors and unite all actors involved in the production and use of statistics. Data integration, by contrast, seems more within reach as a way of meeting the demands for quality, timely, reliable and disaggregated statistics compiled from national statistical systems, as mandated by the 2030 Agenda.
Data integration is the combination of technical and business processes used to combine data from disparate sources into meaningful and valuable information. Statistics New Zealand, one of the region’s leading NSOs, defines data integration as linking data from separate data sources that were designed and collected primarily without the intention of being used together. Data integration helps NSOs produce more comprehensive and disaggregated statistics by pulling together available data from different sources, such as censuses, surveys, administrative data and other new sources.
However, in integrating data from different sources, NSOs face challenges including access to data, interoperability and technical capacity. Although there are common solutions for shared problems, each data integration exercise can introduce unique challenges that require tailored solutions. Thus, the best way to develop capacity in data integration is to practice with a variety of data sources and learn from case studies.
To support countries’ efforts to build technical capacity, ESCAP supported two case studies, presenting methods for improving the availability of disaggregated data for inequality and poverty indicators (Bangladesh case study) and women’s economic empowerment indicators (Sri Lanka case study). While the Bangladesh case study uses non-traditional data sources (geo-spatial data), the Sri Lanka study combines data from more traditional data sources (household surveys).
The Bangladesh case study aims to draw a more comprehensive picture of the variables related to poverty and inequality in order to help decision makers make more targeted decisions. Poverty and inequality may partially be explained by geographical differences, but other factors such as cultural differences and individual characteristics also play important roles. Therefore, this study combines both geographic and individual characteristics to show how different factors, such as access to infrastructure, natural resources and differences in climate, contribute to indicators related to poverty, living standards, education, health and attitudes towards women. To do so, the study integrates data from the 2014 Bangladesh Demographic and Health Survey with data on geo-covariates.
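As a rough illustration of this kind of integration, the sketch below attaches cluster-level geo-covariates to household survey records and then compares an indicator across groups defined by a covariate. It is not the case study's actual method or data: the field names (cluster_id, wealth_index, travel_time_to_city_min), the 60-minute threshold and all values are hypothetical, and the averages are unweighted for brevity, whereas a real analysis would apply the survey weights.

```python
from collections import defaultdict

# Hypothetical survey records: one row per household, keyed by survey cluster.
households = [
    {"cluster_id": 1, "wealth_index": 0.8},
    {"cluster_id": 1, "wealth_index": 0.6},
    {"cluster_id": 2, "wealth_index": 0.2},
    {"cluster_id": 3, "wealth_index": 0.4},  # no covariates available for cluster 3
]

# Hypothetical geo-covariates: one entry per cluster.
geo_covariates = {
    1: {"travel_time_to_city_min": 20, "annual_rainfall_mm": 1900},
    2: {"travel_time_to_city_min": 150, "annual_rainfall_mm": 2600},
}


def attach_geo_covariates(records, covariates, key="cluster_id"):
    """Join cluster-level covariates onto each record; report records with no match."""
    linked, unmatched = [], []
    for rec in records:
        cov = covariates.get(rec[key])
        if cov is None:
            unmatched.append(rec)
        else:
            linked.append({**rec, **cov})
    return linked, unmatched


def mean_by_group(records, value, group_fn):
    """Unweighted mean of `value` within each group returned by `group_fn`."""
    sums = defaultdict(lambda: [0.0, 0])
    for rec in records:
        group = group_fn(rec)
        sums[group][0] += rec[value]
        sums[group][1] += 1
    return {group: total / n for group, (total, n) in sums.items()}


def remoteness(rec):
    # Assumed threshold: more than 60 minutes from a city counts as remote.
    return "remote" if rec["travel_time_to_city_min"] > 60 else "accessible"


linked, unmatched = attach_geo_covariates(households, geo_covariates)
summary = mean_by_group(linked, "wealth_index", remoteness)
print(summary)
print(f"{len(unmatched)} household(s) could not be linked")
```

Keeping the unmatched records visible, rather than silently dropping them, makes it easier to judge whether gaps in covariate coverage could bias the resulting indicators.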
The Sri Lanka case study reviews data needs for studying poverty-related issues regarding women’s economic empowerment and combines data from two different survey sources (the Labour Force Survey and the Household Income and Expenditure Survey) to enhance data availability and data disaggregation. By combining data from these two surveys, sample sizes are increased to provide a stronger base for producing more disaggregated statistics. The study also shows how to merge new variables into available survey data to enhance data usability.
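The pooling idea can be sketched in a few steps: harmonize variable names and codes, stack the two samples, rescale the weights, and compute a disaggregated estimate. The example below is only a sketch under stated assumptions, not the procedure used in the case study. All variable names and values are hypothetical, and scaling each survey's weights by its share of the pooled records is one simple convention that presumes both surveys represent the same population and reference period.

```python
# Hypothetical records; variable names and codes differ between the two surveys.
lfs = [
    {"sex": "F", "district": "Colombo", "employed": 1, "wt": 150.0},
    {"sex": "F", "district": "Colombo", "employed": 0, "wt": 150.0},
    {"sex": "F", "district": "Kandy", "employed": 1, "wt": 200.0},
]
hies = [
    {"gender": "female", "dist": "Colombo", "works": 1, "weight": 300.0},
    {"gender": "female", "dist": "Kandy", "works": 0, "weight": 200.0},
]


def harmonize_lfs(rec):
    """Map hypothetical LFS fields onto a common schema."""
    return {"sex": rec["sex"], "district": rec["district"],
            "employed": rec["employed"], "weight": rec["wt"], "source": "LFS"}


def harmonize_hies(rec):
    """Map hypothetical HIES fields and codes onto the same schema."""
    return {"sex": "F" if rec["gender"] == "female" else "M",
            "district": rec["dist"], "employed": rec["works"],
            "weight": rec["weight"], "source": "HIES"}


def pool(sample_a, sample_b):
    """Stack two harmonized samples, scaling weights by each sample's share of records.

    Assumption: both samples represent the same population, so the pooled
    weights still sum to that population total.
    """
    total_n = len(sample_a) + len(sample_b)
    pooled = []
    for sample in (sample_a, sample_b):
        share = len(sample) / total_n
        pooled.extend({**rec, "weight": rec["weight"] * share} for rec in sample)
    return pooled


def weighted_rate(records, value, by):
    """Weighted mean of a 0/1 variable within each category of `by`."""
    totals = {}
    for rec in records:
        num, den = totals.get(rec[by], (0.0, 0.0))
        totals[rec[by]] = (num + rec["weight"] * rec[value], den + rec["weight"])
    return {group: num / den for group, (num, den) in totals.items()}


pooled = pool([harmonize_lfs(r) for r in lfs], [harmonize_hies(r) for r in hies])
women = [rec for rec in pooled if rec["sex"] == "F"]
rates = weighted_rate(women, "employed", by="district")
print(rates)
```

With more records behind each district, estimates for small groups become less noisy, which is the main gain from pooling. Whether a simple proportional rescaling of weights is appropriate depends on how comparable the two survey designs are.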
Countries face a significant challenge in meeting the data requirements for implementing the 2030 Agenda. Statistical systems need to find new and more cost-efficient approaches to producing statistics. Data integration, which attempts to use all possible sources of data, can be a suitable solution for obtaining more timely, frequent and granular data at lower cost and with less respondent burden. However, technical complexities, together with challenges related to access and coordination, continue to hinder data integration.