Estimating Poverty Using Satellite Imagery and Machine Learning: A Revolutionary Approach for Policymakers
Understanding the poverty situation is of utmost importance for policymakers in both government and non-government agencies, as poverty estimation is crucial for measuring the effectiveness of national programs and guiding the country's development strategy in an ever-changing economic landscape. Identifying the most vulnerable population is vital in deciding how to allocate resources for the Sustainable Development Goals (SDGs) and for disaster management. For this reason, the Bangladesh Bureau of Statistics (BBS), in consultation with the Prime Minister's Office's Principal Coordinator for Sustainable Development Goal Affairs (PMO) and senior officials from various line ministries, recognized the need for more frequent, geographically disaggregated poverty estimates that would enable the government to target its policies more effectively.
Under the technical supervision of the SDG Coordinator's Office, the BBS carried out a poverty estimation exercise using satellite image data under the Data4Now initiative, with support from the UN Statistics Division, A2i, the UN Data Group, and the University of Southampton, coordinated by the UN Resident Coordinator's Office, Bangladesh. The initiative developed a poverty estimation model using freely available satellite imagery and Nighttime Light (NTL) data. The model was validated against the 2016 Household Income and Expenditure Survey (HIES) data and achieved an accuracy of 84%. During the exercise, the team estimated poverty for 2022 on a square grid with cells of 3,890 m x 3,890 m, producing poverty estimates down to the union level, with an estimated national average poverty rate of 19.0 percent in 2022.
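To make the grid concrete, the following is a minimal sketch of how an extent could be tiled into 3,890 m square cells. It assumes coordinates in a projected reference system measured in metres; the function name `build_grid` and the toy extent are hypothetical and are not taken from the BBS workflow.

```python
import math

CELL_SIZE_M = 3890  # cell edge length reported in the text


def build_grid(xmin, ymin, xmax, ymax, cell=CELL_SIZE_M):
    """Return (xmin, ymin, xmax, ymax) square cells that cover the extent.

    Coordinates are assumed to be in a projected CRS measured in metres.
    Cells on the right and top edges may extend past the extent.
    """
    n_cols = math.ceil((xmax - xmin) / cell)
    n_rows = math.ceil((ymax - ymin) / cell)
    cells = []
    for row in range(n_rows):
        for col in range(n_cols):
            x0 = xmin + col * cell
            y0 = ymin + row * cell
            cells.append((x0, y0, x0 + cell, y0 + cell))
    return cells


# Hypothetical 20 km x 10 km extent, not the real Bangladesh bounding box.
grid = build_grid(0, 0, 20_000, 10_000)
cell_area_km2 = (CELL_SIZE_M / 1000) ** 2  # roughly 15.13 square km per cell
print(len(grid), round(cell_area_km2, 2))
```

Each cell then becomes one unit for which image features are extracted and a poverty rate is predicted; cell-level estimates can afterwards be averaged within administrative boundaries such as unions.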
BBS organized a knowledge-sharing event on 30 March 2023 to present the methodology and results of this innovative poverty estimation model, which uses big data and satellite images to assess poverty levels down to the union level. The model's accuracy can be further enhanced by incorporating additional data such as points of interest (POI), land cover maps, road maps, and division headquarters location data, among others.
During the event, it was highlighted that this new methodology will give policymakers access to more frequent and comprehensive poverty estimates, aiding decision-making and resource allocation for the SDGs and for disaster management. By adopting this methodology, the BBS can make significant strides in improving the accuracy and relevance of poverty estimation in Bangladesh, particularly in the periods between surveys.
Key initiatives under Data4Now: Two capacity development training workshops were organized in 2022 to carry out the poverty estimation using satellite images and a cloud-based platform. A total of 25 officials from BBS, Bangladesh Bank, General Economics Division, Finance Division, Bangladesh Telecommunication Regulatory Commission, and others received training.
Running the model required three types of data: raster, vector, and tabular. The VIIRS Cloud Mask-Outlier Removed (vcm-orm) annual composite NPP-VIIRS DNB data, collected from the United States National Oceanic and Atmospheric Administration's National Centers for Environmental Information (NOAA/NCEI), were used to represent NTL intensity in Bangladesh. A moderate-resolution Sentinel-2 satellite image (January 2016) was collected from the United States Geological Survey (USGS). Both raster datasets are freely available. District- and upazila-level administrative data for Bangladesh were also collected and used as input to the model, and the 2016 poverty data estimated from the HIES using the ELL method were incorporated as the tabular input.
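As an illustration of how these inputs can be brought together, the sketch below averages a toy NTL raster over fixed windows and joins the result to tabular poverty rates by cell id. All values, the `block_means` helper, and the cell ids are hypothetical. The vector layer is represented here only by the cell ids, whereas a real workflow would intersect cells with administrative boundaries.

```python
from statistics import mean

# Toy 4x4 raster of night-time light radiance (hypothetical values).
ntl = [
    [0.0, 0.2, 5.0, 6.0],
    [0.1, 0.3, 7.0, 8.0],
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.4],
]


def block_means(raster, block):
    """Average the raster over non-overlapping block x block windows."""
    n_rows, n_cols = len(raster), len(raster[0])
    out = {}
    for r0 in range(0, n_rows, block):
        for c0 in range(0, n_cols, block):
            values = [
                raster[r][c]
                for r in range(r0, min(r0 + block, n_rows))
                for c in range(c0, min(c0 + block, n_cols))
            ]
            out[(r0 // block, c0 // block)] = mean(values)
    return out


# Tabular input: hypothetical poverty rates (%) keyed by the same cell ids.
poverty = {(0, 0): 31.0, (0, 1): 12.5, (1, 0): 24.0, (1, 1): 35.5}

cell_ntl = block_means(ntl, block=2)

# One row per cell: (cell id, mean NTL, poverty rate), ready for model fitting.
training_rows = [(cell, cell_ntl[cell], poverty[cell]) for cell in sorted(poverty)]
for row in training_rows:
    print(row)
```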
Grid-based processing of approximately 30,000 images was carried out using the R programming language and Google Earth Engine (GEE), and machine learning techniques implemented in Python were used to estimate poverty in Bangladesh. Combining satellite imagery, machine learning algorithms, and poverty data, a Random Forest (RF) based Convolutional Neural Network (CNN) model was fitted using training samples from the HIES 2016 poverty data. For accuracy assessment, the ELL-based poverty estimates for 2016 were then compared with the predictions of the RF-based CNN model on the validation samples.
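The text does not state which metric was used for the accuracy assessment. As a hedged illustration only, the sketch below compares reference (ELL-based) and predicted poverty rates using two common agreement measures, the coefficient of determination and the mean absolute error. The sample values are hypothetical and are not results from the exercise.

```python
def r_squared(observed, predicted):
    """Coefficient of determination between reference and predicted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def mean_absolute_error(observed, predicted):
    """Average absolute gap, in the same units as the inputs."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical poverty rates (%) for five validation cells.
ell_rates = [10.0, 20.0, 30.0, 40.0, 50.0]    # reference, ELL-based
model_rates = [12.0, 18.0, 33.0, 37.0, 50.0]  # model predictions

r2 = r_squared(ell_rates, model_rates)
mae = mean_absolute_error(ell_rates, model_rates)
print(f"R^2 = {r2:.3f}, MAE = {mae:.1f} percentage points")
```

Whichever metric is chosen, it should be computed only on cells that were held out from training, as the validation step described above implies.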
Seven types of layers were used in feature extraction. Of the model's 63,617,923 parameters, 5,246,979 were trainable and 58,370,944 were non-trainable. The RF-based model was fitted over fifty epochs against the poverty estimates produced with the ELL method from the HIES 2016 data, and the accuracy of the feature extraction was 71%. The model accuracy was 84 percent when the code was run on the validation data. A trend line indicating how the values concentrate along the line, together with the distribution of the poverty rate, is shown here. Comparative statistics and maps for the satellite-based and HIES-based estimates are also shown in the report. According to this model, the estimated national average poverty rate for 2022 is 19.0 percent, with the lowest division mean recorded for Khulna (18.3) and the highest for Barishal (20.0). The model could therefore be used to estimate the poverty situation in any inter-survey period, when a new survey can take time to get under way.
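The parameter counts quoted above can be checked with a few lines of arithmetic. The sketch below uses only the three figures from the text; the variable names are illustrative.

```python
# Parameter counts as reported in the text.
TOTAL_PARAMS = 63_617_923
TRAINABLE = 5_246_979
NON_TRAINABLE = 58_370_944

# The two parts should add up to the reported total.
counts_consistent = (TRAINABLE + NON_TRAINABLE == TOTAL_PARAMS)

# Share of parameters that were actually updated during fitting.
trainable_share = TRAINABLE / TOTAL_PARAMS

print(counts_consistent, f"{trainable_share:.1%}")
```

The counts are consistent, and only about 8 percent of the parameters were trainable. A split like this is what one would expect when most of a network is kept fixed during fitting, although the text does not say how the non-trainable layers were obtained.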
Incorporating POI data, a land cover map, a road map, division headquarters location data, and other relevant ancillary data into the poverty estimation model can improve the accuracy and comprehensiveness of poverty estimates. Adopting this new model will enable BBS to provide policymakers with more frequent poverty estimates, which can assist in making informed decisions.