Title

World Bank Projects

Introduction

The World Bank, part of the World Bank Group, is an international organization affiliated with the United Nations (UN) and designed to finance projects that enhance the economic development of member states. Headquartered in Washington, D.C., the bank is the largest source of financial assistance to developing countries. It also provides technical assistance and policy advice and supervises, on behalf of international creditors, the implementation of free-market reforms. Together with the International Monetary Fund (IMF) and the World Trade Organization, it plays a central role in overseeing economic policy, reforming public institutions in developing countries, and defining the global macroeconomic agenda. The World Bank provides low-interest loans, zero- to low-interest credits, and grants to developing countries. These support a wide array of investments in such areas as education, health, public administration, infrastructure, financial and private sector development, agriculture, and environmental and natural resource management. Some of its projects are cofinanced with governments, other multilateral institutions, commercial banks, export credit agencies, and private sector investors.

Source: World Bank Projects

Tags: Multilateral, Government Announcement, Public Procurement, Infrastructure, Construction

Modules

Scraping:

The scraper uses the World Bank Projects page to get a list of all projects listed there, along with their geospatial data. It goes through all the pages for each project, collects the data, and stores it in a single CSV file. This CSV file is stored in the bucket.

Cleaning:

Currency conversion, null-value handling, duplicate removal, timestamp formatting, and sector and subsector cleaning are performed in this step.
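The cleaning operations listed above could look roughly like the following sketch. It is not the pipeline's actual code: the currency column, the usd_rates table, the list of null markers, and the input timestamp format are all assumptions made for illustration, while original_id, boardapprovaldate, and projectcost come from the attribute list.

```python
from datetime import datetime
from typing import Dict, List, Optional

# Strings treated as missing values (assumed list).
NULL_MARKERS = {"", "null", "none", "n/a", "nan"}


def normalize_null(value: Optional[str]) -> Optional[str]:
    """Map empty or placeholder strings to None and trim whitespace."""
    if value is None or value.strip().lower() in NULL_MARKERS:
        return None
    return value.strip()


def to_iso_date(value: Optional[str]) -> Optional[str]:
    """Assumes source timestamps look like 2021-03-15T00:00:00Z."""
    if value is None:
        return None
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").date().isoformat()


def clean_rows(rows: List[Dict[str, Optional[str]]],
               usd_rates: Dict[str, float]) -> List[dict]:
    """Null handling, de-duplication by ID, date formatting, USD conversion."""
    seen = set()
    cleaned = []
    for raw in rows:
        row: dict = {key: normalize_null(val) for key, val in raw.items()}
        project_id = row.get("original_id")
        if project_id is None or project_id in seen:
            continue  # drop rows without an ID and repeated IDs
        seen.add(project_id)
        row["boardapprovaldate"] = to_iso_date(row.get("boardapprovaldate"))
        cost = row.get("projectcost")
        # "currency" is a hypothetical column; a missing value is read as USD.
        rate = usd_rates.get(row.get("currency") or "USD")
        row["projectcost"] = (
            float(cost) * rate if cost is not None and rate is not None else None
        )
        cleaned.append(row)
    return cleaned
```

Keeping the first occurrence of each ID is one possible de-duplication rule; the real step may prefer the most recently updated record instead.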

Geocoding:

Relevant geocoding information, such as map coordinates and country and region codes, is extracted from the scraped geospatial data.
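A minimal sketch of the coordinate part of this step is shown below, assuming the scraped geospatial data is standard GeoJSON with Point features. The function name extract_points and the lat/lon output shape are illustrative choices; country and region code lookup is not covered here.

```python
import json
from typing import Dict, List


def extract_points(geojson_text: str) -> List[Dict[str, float]]:
    """Return lat/lon pairs for every Point feature in a GeoJSON document."""
    doc = json.loads(geojson_text)
    if doc.get("type") == "FeatureCollection":
        features = doc.get("features", [])
    else:
        features = [doc]  # treat a single Feature the same way
    points = []
    for feature in features:
        geometry = feature.get("geometry") or {}
        if geometry.get("type") == "Point":
            # GeoJSON stores positions as [longitude, latitude].
            lon, lat = geometry["coordinates"][:2]
            points.append({"lat": lat, "lon": lon})
    return points
```

Note the coordinate order: GeoJSON puts longitude first, so swapping to lat/lon here avoids a common source of misplaced map markers.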

Standardization:

Additional information such as sample frequency, units, source, and description is included in the metadata. Predefined sectors and subsectors are also added in this step.
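One way to attach predefined sectors and subsectors is a lookup from the source sector label, sketched below. The SECTOR_MAP entries, the fallback value, and the choice of mjsector as the lookup key are hypothetical; the real taxonomy and matching logic are not described in this document. The output field names come from the attribute list, and the content of the tuple field is an assumption.

```python
from typing import Dict, Tuple

# Hypothetical mapping from source sector labels to (sector, subsector).
SECTOR_MAP: Dict[str, Tuple[str, str]] = {
    "transportation": ("Infrastructure", "Transport"),
    "energy and extractives": ("Infrastructure", "Energy"),
}
UNMAPPED: Tuple[str, str] = ("Other", "Unclassified")  # assumed fallback


def standardize(row: dict) -> dict:
    """Return a copy of row with the identified sector fields filled in."""
    key = (row.get("mjsector") or "").strip().lower()
    sector, subsector = SECTOR_MAP.get(key, UNMAPPED)
    out = dict(row)
    out["identified_sector"] = sector
    out["identified_subsector"] = subsector
    out["identified_sector_subsector_tuple"] = (sector, subsector)
    return out
```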

MetaData:

The metadata contains the timestamp (approval date) range, sectors, subsectors, and country codes. It is also stored in the S3 bucket.
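The summary described above can be derived from the cleaned rows in a single pass, as in this sketch. The output key names (timestamp_range, sectors, and so on) are assumptions, and approval dates are assumed to already be ISO-formatted strings so that they sort chronologically.

```python
from typing import Dict, List


def build_metadata(rows: List[Dict[str, str]]) -> dict:
    """Summarize approval-date range, sectors, subsectors and country codes."""
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    dates = sorted(r["boardapprovaldate"] for r in rows if r.get("boardapprovaldate"))

    def distinct(field: str) -> List[str]:
        return sorted({r[field] for r in rows if r.get(field)})

    return {
        "timestamp_range": {"start": dates[0], "end": dates[-1]} if dates else None,
        "sectors": distinct("identified_sector"),
        "subsectors": distinct("identified_subsector"),
        "country_codes": distinct("country_code"),
    }
```

The resulting dictionary can be serialized with json.dumps before being written to the bucket.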

Ingest:

Metadata and project data are ingested into an Elasticsearch cluster. The index is created based on the sector, subsector, and map coordinates fields.
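To make the ingest step concrete, the sketch below builds an index mapping and a request body in the newline-delimited format that the Elasticsearch bulk API expects. It does not contact a cluster. The index name, the choice of aug_id as the document ID, and the mapping of the three fields to keyword and geo_point types are assumptions, not the pipeline's confirmed configuration.

```python
import json
from typing import Iterable

INDEX_NAME = "worldbank-projects"  # hypothetical index name

# Assumed mapping: exact-match fields as keyword, coordinates as geo_point.
INDEX_MAPPING = {
    "mappings": {
        "properties": {
            "identified_sector": {"type": "keyword"},
            "identified_subsector": {"type": "keyword"},
            "map_coordinates": {"type": "geo_point"},
        }
    }
}


def to_bulk_ndjson(docs: Iterable[dict], index: str = INDEX_NAME) -> str:
    """Build a bulk body: one action line followed by one source line per doc."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc["aug_id"]}}))
        lines.append(json.dumps(doc))
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""
```

Using a stable ID such as aug_id means that re-ingesting the same project overwrites the earlier document rather than creating a duplicate.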

Metadata

Metadata Attributes

Attributes Descriptions
access_level Access Level
api_url API Endpoint for JSON data
approvalfy Approval fiscal year
aug_id ID generated for unique identification of asset
board_approval_month Board Approval Month
board_approval_year Board Approval Year
boardapprovaldate Board Approval Date
borrower Borrower
borrowername Borrower Name
budget Budget
closingdate Closing Date
cmt_usd_amt Commitment Usd Amount
completion_riskdo Completion Riskdo
country_code Country Code in ISO 3-letter format
country_name Country Name
countrycode Country Code in ISO 2-letter format
countryid Country ID
countryiddesc Country ID description
source Data Source Abbreviation
disbursement Disbursement Amount
esrc_env_risk_rate_name Esrc Env Risk Rate Name
esrc_ovrl_risk_rate Esrc Ovrl Risk Rate
evaluation_riskdo Evaluation Riskdo
fincr_usd_amt Financier Usd Amount
fincrname Financier Name
fiscalyear Fiscal year
geojson Geojson
goal Goal
grant_usd_amt Grant Usd Amount
ibrd_cmt_usd_amt IBRD Commitment Usd Amt
icrdate ICR date
ida_cmt_usd_amt IDA Commitment Usd Amt
identified_sector Identified Sector
identified_sector_subsector_tuple Identified Sector Subsector Tuple
identified_subsector Identified Subsector
impagency Impagency
implementingname Implementing Agency Name
keywords Keywords
laststatusdate Last Status Date
lendinginstr Lending Instrument Abbreviation
lendinginstrumenttypename Lending instrument type name
lendprojectcost Lending Project Cost
locations Locations
map_coordinates Map Coordinates
mjsector Major Sector
original_id Original ID used by source
overall_comments Overall Comments
overall_currentrating Overall Current Rating
overall_prevrating Overall Prev Rating
overall_templatename Overall Template Name
p2a_updated_date P2A Updated Date
parentprojid Parent Project ID
performance_comments Performance Comments
performance_currentrating Performance Current Rating
performance_prevrating Performance Prev Rating
performance_templatename Performance Template Name
prodlinetext Product Line Text
productlinetypename Product Line Type Name
proj_last_upd_date Project Last Update Date
project_abstract Project Abstract
project_development_objective Project Development Objective
project_or_tender Project Or Tender
projectcost Project Cost
projectfinancialtype Project Financial Type
region_code Region Code
region_name Region Name
regionabbr Region abbreviation from source
regionlongname Region Longname
regionname Region Name
sector Sector
sector1 Sector1
sector1_name Sector 1 Name
sector1_percent Sector 1 Percent
sector2 Sector 2
sector2_name Sector 2 Name
sector2_percent Sector 2 Percent
sector3 Sector 3
sector3_name Sector 3 Name
sector3_percent Sector 3 Percent
sector4 Sector 4
sector4_name Sector 4 Name
sector4_percent Sector 4 Percent
sector5 Sector 5
sector5_name Sector 5 Name
sector5_percent Sector 5 Percent
WBPROJ_data_source Source from which WB Projects has gathered the data
status Status
statusdate Status Date
teamleadname Team Lead Name
teammemfullname Team Members Full Name
theme1 Theme 1
theme1_name Theme 1 Name
theme1_percent Theme 1 Percent
theme2 Theme 2
theme2_name Theme 2 Name
theme2_percent Theme 2 Percent
theme3 Theme 3
theme3_name Theme 3 Name
theme3_percent Theme 3 Percent
theme4 Theme 4
theme4_name Theme 4 Name
theme4_percent Theme 4 Percent
theme5 Theme 5
theme5_name Theme 5 Name
theme5_percent Theme 5 Percent
theme_list Theme List
totalcommamt Total Commitment Amount
url Url to official page

Data Flow

The data pipeline above runs on Argo and is executed daily, except on Sundays.

DAGs:

  • WorldBank: 1 DAG file
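The daily-except-Sunday schedule can be expressed as a standard five-field cron expression or checked directly in code, as sketched here. The run time of midnight in CRON_SCHEDULE is an assumption; the actual schedule string used in Argo is not given in this document.

```python
from datetime import date, timedelta

# Monday to Saturday; the hour and minute are assumed for illustration.
CRON_SCHEDULE = "0 0 * * 1-6"


def is_run_day(day: date) -> bool:
    """True on every day except Sunday (weekday(): Monday=0 ... Sunday=6)."""
    return day.weekday() != 6


def next_run_day(after: date) -> date:
    """First scheduled day strictly after the given date."""
    candidate = after + timedelta(days=1)
    while not is_run_day(candidate):
        candidate += timedelta(days=1)
    return candidate
```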

Scope for Improvement

The following can be improved in the next version of the data product:

  • In a future version, the scraper could be improved to scrape only the data that is not already stored.
  • https://projects.worldbank.org/en/projects-operations/projects-home
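The incremental scraping idea in the first bullet could be approached by comparing the listed project IDs against those already present in the stored CSV, as in this sketch. The function names and the use of original_id as the comparison key are assumptions.

```python
import csv
import io
from typing import Iterable, List, Set


def load_known_ids(csv_file, id_column: str = "original_id") -> Set[str]:
    """Read the IDs already present in a previously stored CSV."""
    return {row[id_column] for row in csv.DictReader(csv_file) if row.get(id_column)}


def select_new(listed_ids: Iterable[str], known_ids: Set[str]) -> List[str]:
    """Keep listing order, dropping IDs that are stored or repeated."""
    seen = set(known_ids)
    fresh = []
    for project_id in listed_ids:
        if project_id not in seen:
            seen.add(project_id)
            fresh.append(project_id)
    return fresh


stored = io.StringIO("original_id,country_name\nP001,Kenya\nP002,India\n")
to_scrape = select_new(["P001", "P003", "P003", "P004"], load_known_ids(stored))
```

A filter on IDs alone would miss updates to projects that are already stored; comparing a field such as proj_last_upd_date as well would be one way to catch those.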