The digital revolution continues to generate an abundance of data, which provides new opportunities to capture information about socio-economic conditions at different levels of abstraction and to infer development progress. Data on poverty and inequality nevertheless remain limited, even though such data can be used to monitor changes in prosperity levels and to measure the impact of government programmes.
In collaboration with Knowledge Sector Initiative (KSI), Pulse Lab Jakarta is organising an upcoming research dive for development, with a view to addressing this gap by enhancing researchers' familiarity with several related datasets, including satellite imagery, e-commerce data, social media, and socio-economic data. The seventh research dive aims to generate insights on how to leverage new and emerging datasets and Artificial Intelligence (AI) for alleviating poverty across Indonesia. The results from measuring poverty with big data are intended to complement the national socioeconomic survey (SUSENAS) data that has been collected by the National Statistics Agency (BPS).
The participants will be grouped into four research teams, focusing on:
Group 1 - Estimating poverty at the provincial level with satellite data
Group 2 - Estimating poverty at the city level with e-commerce data
Group 3 - Estimating poverty at the district level with social media data
Group 4 - Estimating poverty at the household level with social media data and household survey results
For further information, please refer to the information package.
Research Dive Advisors and Participants
Advisors
Prof. Arief Anshory Yusuf | Universitas Padjajaran
Faizal Thamrin | DM Innovation
Prof. Dedi Rosadi | Universitas Gadjah Mada
Researchers
Group 1 – Estimating Poverty at the Provincial Level with Satellite Data
Benny Istanto | World Food Programme
I Wayan Gede Astawa Karang | Universitas Udayana
Nursida Arif | Universitas Muhammadiyah Gorontalo
Pamungkas Jutta Prahara | Pulse Lab Jakarta
Group 2 – Estimating Poverty at the City Level with E-Commerce Data
Ana Uluwiyah | Central Statistics Agency
Dedy Rahman Wijaya | Telkom University
Dwi Rani Puspa Artha | LPEM UI/Bappenas
Ni Luh Putu Satyaning Pradnya Paramita | Institut Teknologi Sepuluh Nopember
Anissa Zahara | Pulse Lab Jakarta
Muhammad Rheza | Pulse Lab Jakarta
Group 3 – Estimating Poverty at the District Level with Social Media Data
Lili Ayu Wulandhari | Bina Nusantara University
Sri Redjeki | STMIK AKAKOM Yogyakarta
Widaryatmo | Bappenas
Yunita Sari | Universitas Gadjah Mada
Muhammad Rizal Khaefi | Pulse Lab Jakarta
Group 4 – Estimating Poverty at the Household Level with Social Media Data and Household Survey Results
Eka Puspitawati | Pertamina University
Eko Fadilah | TNP2K
Hizkia H. D. Tasik | Sam Ratulangi University
Nurlatifah | Central Statistics Agency (BPS) Bogor
Rajius Idzalika | Pulse Lab Jakarta
Group 1: Estimating Poverty at the Provincial Level with Satellite Data
Abstract
Research suggests that satellite data can be used as a proxy for different parameters, including urbanisation, density, and economic growth. One of the parameters that can be extracted from satellite images is land use. In this research, land use was classified from Landsat 7 imagery and a topographic map using a visual interpretation method. Yogyakarta Province was chosen as the case study because of its diverse regions and historical poverty lines. In Kulon Progo, the regions with high poverty and vulnerability are generally areas with a small population and difficult physical land conditions, including disaster-prone areas.
Results
The land use analysis indicates that the poverty-prone areas in the Special Region of Yogyakarta are Kulon Progo and Gunung Kidul, since their built-up areas are smaller than those of other regions. In Kulon Progo, the regions with high poverty and vulnerability are generally areas with a small population and difficult physical land conditions, including disaster-prone areas such as Kecamatan Kokap.
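The comparison of built-up area across regions can be illustrated with a minimal sketch. It assumes each region has already been classified into a grid of land-use labels; the label names, the toy grids, and the region names are hypothetical and are not data from the study, which relied on visual interpretation rather than automated counting.

```python
from collections import Counter

BUILT_UP = "built"  # hypothetical label for built-up land


def built_up_share(class_grid):
    """Return the fraction of classified cells labelled as built-up."""
    counts = Counter(cell for row in class_grid for cell in row)
    total = sum(counts.values())
    return counts[BUILT_UP] / total if total else 0.0


# Toy land-use grids (illustrative only, not study data).
regions = {
    "region_a": [["built", "built", "crop"], ["built", "forest", "crop"]],
    "region_b": [["forest", "crop", "crop"], ["built", "forest", "crop"]],
}

shares = {name: built_up_share(grid) for name, grid in regions.items()}

# Regions ordered from the smallest built-up share to the largest.
ranked = sorted(shares, key=shares.get)
print(ranked)
```

A small built-up share on its own does not establish poverty; in this sketch it only flags regions for closer examination.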
Limitations and Recommendations
This study uses medium-resolution imagery analysed at the provincial level, so land use could not be classified in more detail. Poverty is also too complex to be assessed on the basis of land use alone. Poverty estimation using satellite imagery should instead be applied to smaller areas, at sub-district and village levels, with high-resolution imagery, so that a more detailed analysis can show roof types, settlement patterns, access roads and other attributes that support the identification of poverty levels. Future research could estimate poverty by integrating remote sensing images with other parameters, such as hazard vulnerability, and with social factors such as income and nutrient intake per household.
Group 2: Estimating Poverty at the City Level with E-Commerce Data
Abstract
Indonesia produces abundant big data from various sources, e.g. social media, financial transactions, transportation, call detail records, and e-commerce. These types of data are considered potential resources to complement periodic surveys, and even censuses, in monitoring development indicators, including the poverty rate. This research aims to estimate the poverty rate at the city level from e-commerce data using machine learning methods, namely Support Vector Regression (SVR) and Artificial Neural Network (ANN). Feature selection was performed with the Fast Correlation-Based Filter (FCBF). The results show that the ANN-based model predicts the city-level poverty rate well, with high accuracy, low error and low bias. This research suggests that e-commerce data has the potential to be used as a proxy for the city-level poverty rate.
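To make the feature selection step more concrete, the following is a minimal sketch of FCBF in plain Python. FCBF ranks features by their symmetrical uncertainty with the target and then drops any feature that is at least as strongly associated with an already selected feature as it is with the target. The sketch assumes that the features and the poverty rate have been discretised into bins; the feature names and values are hypothetical and are not taken from the study's dataset or code.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value between 0 and 1."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0
    h_xy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    info_gain = h_x + h_y - h_xy      # mutual information I(X; Y)
    return 2.0 * info_gain / (h_x + h_y)


def fcbf(features, target, threshold=0.0):
    """Return the feature names kept by FCBF, most relevant first."""
    # Step 1: keep features whose relevance to the target exceeds the threshold.
    relevance = {name: symmetrical_uncertainty(col, target)
                 for name, col in features.items()}
    ranked = sorted((n for n in relevance if relevance[n] > threshold),
                    key=lambda n: relevance[n], reverse=True)
    # Step 2: drop a feature if an already selected feature is at least as
    # strongly associated with it as the feature is with the target.
    selected = []
    for name in ranked:
        redundant = any(
            symmetrical_uncertainty(features[kept], features[name]) >= relevance[name]
            for kept in selected
        )
        if not redundant:
            selected.append(name)
    return selected


# Hypothetical binned data for eight cities (illustrative only).
target = ["high", "high", "high", "high", "low", "low", "low", "low"]
features = {
    "price_bin": ["low", "low", "low", "low", "high", "high", "high", "high"],
    "price_bin_copy": ["low", "low", "low", "low", "high", "high", "high", "high"],
    "noise": ["a", "b", "a", "b", "a", "b", "a", "b"],
}

print(fcbf(features, target))  # prints ['price_bin']
```

In this toy example the duplicate feature is removed as redundant and the uninformative feature is removed as irrelevant. The retained features would then be passed to a regression model such as SVR or ANN.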
Results
Based on the results discussed above, this research concludes and suggests that:
(1) e-commerce data has the potential to be used as a proxy for the city-level poverty rate,
(2) e-commerce data can be used to complement official data by estimating the poverty rate in between surveys and censuses,
(3) in the future, the method presented in this research has the potential to be replicated and scaled up to all cities in Indonesia or to other administrative levels (i.e. province and sub-district), to time series or panel data, and to data from different e-commerce platforms.