IMF - CPI

Introduction

The IMF publishes a range of time series data on IMF lending, exchange rates, and other economic and financial indicators. The current data product is divided into two sub-products, IMF-one and IMF-two, because some IMF datasets carry additional parameters, for example Country Counterpart (for DOTS and CDIS) and Reference Sector (for Fiscal Decentralization). We are currently focusing on G20 countries as the country counterpart.

Consumer price indexes (CPIs) are index numbers that measure changes in the prices of goods and services purchased or otherwise acquired by households, which households use directly, or indirectly, to satisfy their own needs and wants. In practice, most CPIs are calculated as weighted averages of the percentage price changes for a specified set, or "basket", of consumer products, the weights reflecting their relative importance in household consumption in some period. CPIs are widely used to index pensions and social security benefits. CPIs are also used to index other payments, such as interest payments or rents, or the prices of bonds. CPIs are also commonly used as a proxy for the general rate of inflation, even though they measure only consumer inflation. They are used by some governments or central banks to set inflation targets for purposes of monetary policy. The price data collected for CPI purposes can also be used to compile other indices, such as the price indices used to deflate household consumption expenditures in national accounts, or the purchasing power parities used to compare real levels of consumption in different countries.
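The "weighted average of price changes" idea can be illustrated with a small sketch. The basket, weights and price relatives below are made up for illustration only; real CPI compilation involves far more detail than this.

```python
def weighted_price_index(price_relatives, weights):
    """Weighted average of price relatives, scaled so the base period equals 100."""
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[item] * price_relatives[item] for item in weights)
    return 100.0 * weighted_sum / total_weight


# Hypothetical three-item basket: weights are expenditure shares,
# price relatives are current price divided by base-period price.
weights = {"food": 0.5, "housing": 0.3, "transport": 0.2}
price_relatives = {"food": 1.10, "housing": 1.00, "transport": 1.05}

index = weighted_price_index(price_relatives, weights)
print(round(index, 2))
```

With these made-up numbers the index comes out at about 106, meaning the basket costs roughly 6% more than in the base period.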

Source: IMF

Tags: Time-series, Risk, Daily

Modules

Scraping:

IMF provides a JSON RESTful web service that we call to get the required data from each of these datasets. Below are the API endpoint and the parameters that we need to pass to get the data.

http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/{Series}/{Frequency}.{Area}.{Indicator}.{Date Range}

Series: The broad group of indicators, for example IFS for International Financial Statistics

Frequency: For example, monthly M, quarterly Q, or annually A

Area: The country, region, or set of countries, for example GB for the U.K., or GB+US for the U.K. and the U.S.

Indicator: The indicator code within the dataset; the codes differ from dataset to dataset.

Date Range: Use this to limit the date range returned, for example ?startPeriod=2010&endPeriod=2017

Additional Params (optional): For CDIS, DOTS and Fiscal Decentralization we might have to pass these additional parameters.
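The parameters above can be assembled into a request URL as in the following sketch. The helper name `build_compact_data_url`, the dataset code `CPI` and the indicator code `PCPI_IX` are assumptions used for illustration; the date range is appended as a query string, following the Date Range example above.

```python
from urllib.parse import urlencode

BASE_URL = "http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData"


def build_compact_data_url(series, frequency, areas, indicators, start=None, end=None):
    """Build a CompactData URL; several areas or indicators are joined with '+'."""
    key = ".".join([frequency, "+".join(areas), "+".join(indicators)])
    url = f"{BASE_URL}/{series}/{key}"
    params = {}
    if start is not None:
        params["startPeriod"] = start
    if end is not None:
        params["endPeriod"] = end
    if params:
        url += "?" + urlencode(params)
    return url


# Assumed dataset and indicator codes, for illustration only.
url = build_compact_data_url("CPI", "M", ["GB", "US"], ["PCPI_IX"], start=2010, end=2017)
print(url)
```

The resulting URL can then be fetched with any HTTP client and decoded as JSON.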

Note: The IMF API returns a maximum of 3000 time series per request. This is why in some cases we have to partition the countries into groups and query each group separately to get the JSON responses.
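A minimal sketch of the partitioning step is shown below. It assumes that one request yields roughly one series per (country, indicator) pair, so the group size is derived from the 3000-series limit; both helper names and the country codes in the example are hypothetical.

```python
def max_areas_per_request(n_indicators, limit=3000):
    """Largest number of countries per request, assuming one series per country and indicator."""
    return max(1, limit // n_indicators)


def partition(items, size):
    """Split a list into consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


# Hypothetical country codes, grouped three at a time for readability.
countries = ["AA", "BB", "CC", "DD", "EE", "FF", "GG"]
groups = partition(countries, 3)
print(groups)
```

Each group can then be joined with `+` and sent as the Area part of a separate request.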

To obtain the list of CPI indicators and their respective codes, use the following request:

http://dataservices.imf.org/REST/SDMX_JSON.svc/CodeList/CL_INDICATOR_CPI
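The code list response can be turned into a code-to-description mapping along the following lines. The JSON layout used here (`Structure` / `CodeLists` / `CodeList` / `Code`, with `@value` and `Description.#text`) is an assumption about the response shape and should be checked against a live response; the sample payload is made up.

```python
def parse_code_list(payload):
    """Return {indicator_code: description} from a CodeList response (assumed layout)."""
    codes = payload["Structure"]["CodeLists"]["CodeList"]["Code"]
    if isinstance(codes, dict):  # a single entry may not be wrapped in a list
        codes = [codes]
    return {code["@value"]: code["Description"]["#text"] for code in codes}


# Made-up sample payload that mimics the assumed layout.
sample_payload = {
    "Structure": {
        "CodeLists": {
            "CodeList": {
                "Code": [
                    {"@value": "EXAMPLE_CODE_1", "Description": {"#text": "Example description 1"}},
                    {"@value": "EXAMPLE_CODE_2", "Description": {"#text": "Example description 2"}},
                ]
            }
        }
    }
}

indicators = parse_code_list(sample_payload)
print(indicators)
```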

Cleaning:

Duplicate and additional columns are removed from the data. Location names are rectified and country names are formatted correctly.
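As a rough illustration of this step, the sketch below keeps only the wanted fields, normalizes country names, and drops duplicate rows. The field names, the name-fix mapping and the sample rows are hypothetical; the actual pipeline may use different tooling.

```python
def clean_records(records, keep_fields, name_fixes):
    """Keep only `keep_fields`, fix country names, and drop exact duplicates."""
    seen = set()
    cleaned = []
    for record in records:
        row = {field: record.get(field) for field in keep_fields}
        name = (row.get("country") or "").strip()
        row["country"] = name_fixes.get(name, name)
        key = tuple(row[field] for field in keep_fields)
        if key not in seen:
            seen.add(key)
            cleaned.append(row)
    return cleaned


# Hypothetical raw rows with an extra column, a duplicate, and an untidy name.
raw = [
    {"country": " Exampleland, Rep. of ", "period": "2010-01", "value": 100.0, "extra": "x"},
    {"country": "Exampleland", "period": "2010-01", "value": 100.0, "extra": "y"},
    {"country": "Otherland", "period": "2010-01", "value": 98.5, "extra": "z"},
]
fixes = {"Exampleland, Rep. of": "Exampleland"}

cleaned = clean_records(raw, ["country", "period", "value"], fixes)
print(cleaned)
```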

Geocoder:

Coordinates for the country are added to the metadata. Region and region code are also appended. The Geocoder library is used to get the coordinates. We also keep a separate JSON file of country coordinates so that we can avoid calling the third-party library, which makes the geocoding process faster.
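The "local file first, library second" lookup could look like the sketch below. The file layout (`{"country": [latitude, longitude]}`), the function name and the `fallback` callable are assumptions; `fallback` stands in for whatever third-party geocoding call the pipeline uses. GeoJSON points store coordinates as longitude first, then latitude.

```python
import json


def geocode_country(country, local_coords, fallback=None):
    """Return a GeoJSON Point for `country`, preferring the local lookup table."""
    if country in local_coords:
        lat, lon = local_coords[country]
    elif fallback is not None:
        lat, lon = fallback(country)  # hypothetical third-party lookup
    else:
        return None
    return {"type": "Point", "coordinates": [lon, lat]}


# Made-up content of the local coordinates file.
local_coords = json.loads('{"Exampleland": [10.0, 20.0]}')

point = geocode_country("Exampleland", local_coords)
missing = geocode_country("Nowhere", local_coords)
via_fallback = geocode_country("Nowhere", local_coords, fallback=lambda name: (1.5, 2.5))
print(point, missing, via_fallback)
```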

Standardization:

Additional information such as sample frequency, units, source and description is included in the metadata. The standardization step also contains the function that fetches the ISO country code and appends it. The predefined domain and subdomain are added in this step.
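A minimal sketch of this step follows. The lookup table, the default values and the function name are placeholders chosen for illustration; the real domain, subdomain and units values are defined elsewhere in the pipeline.

```python
def standardize(metadata, iso3_lookup, defaults):
    """Return a copy of `metadata` with the ISO code and default fields filled in."""
    out = dict(metadata)
    out["country_code"] = iso3_lookup.get(out.get("country"))
    for field, value in defaults.items():
        out.setdefault(field, value)  # never overwrite a value that is already set
    return out


# Small excerpt of an ISO 3166-1 alpha-3 lookup; a full table would be used in practice.
iso3_lookup = {"India": "IND", "Japan": "JPN"}

# Placeholder defaults, not the pipeline's actual values.
defaults = {
    "sample_frequency": "Monthly",
    "units": "Index",
    "source": "IMF",
    "domain": "example-domain",
    "sub_domain": "example-sub-domain",
}

standardized = standardize({"country": "India", "units": "Percent"}, iso3_lookup, defaults)
print(standardized)
```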

MetaData:

The timeseries reference ID (ts_ref_id) is added to the timeseries data, and the final timeseries is stored in the S3 bucket. The metadata format is finalized and the metadata is also stored in the S3 bucket.
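How ts_ref_id is generated is not described here, so the sketch below simply assumes a deterministic ID derived from original_id with `uuid.uuid5`. It shows how the same ID ends up on both the metadata and every timeseries record; the function names and sample values are hypothetical.

```python
import uuid


def make_ts_ref_id(original_id):
    """Derive a stable ID from original_id (an assumption, not the pipeline's known scheme)."""
    return str(uuid.uuid5(uuid.NAMESPACE_URL, original_id))


def attach_ts_ref_id(metadata, observations):
    """Return copies of the metadata and observations that share one ts_ref_id."""
    ts_ref_id = make_ts_ref_id(metadata["original_id"])
    meta = {**metadata, "ts_ref_id": ts_ref_id}
    series = [{**obs, "ts_ref_id": ts_ref_id} for obs in observations]
    return meta, series


# Made-up metadata and observations.
meta, series = attach_ts_ref_id(
    {"original_id": "CPI_XX", "country": "Exampleland"},
    [{"timestamp": "2010-01", "value": 100.0}, {"timestamp": "2010-02", "value": 100.4}],
)
print(meta["ts_ref_id"], len(series))
```

A deterministic ID has the advantage that re-running the pipeline produces the same link between metadata and timeseries.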

Ingest:

Metadata and timeseries data are ingested into MongoDB, and the latest timestamp ID (the MongoDB ID of the latest timestamp) is appended to the metadata to speed up the search for the latest data point.
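The ingest step might be sketched as follows. The function takes two collection objects that are expected to behave like PyMongo collections (`insert_many` returning a result with `inserted_ids`, and `replace_one` with `upsert`); the function name and the collection names in the usage note are assumptions.

```python
def ingest(metadata, series, ts_collection, meta_collection):
    """Insert the timeseries, then store metadata with the ID of the latest data point."""
    if not series:
        raise ValueError("series must contain at least one record")
    result = ts_collection.insert_many(series)
    # inserted_ids follows the order of `series`, so the index of the newest
    # record also locates its database ID.
    latest_index = max(range(len(series)), key=lambda i: series[i]["timestamp"])
    metadata = {**metadata, "latest_timestamp_id": result.inserted_ids[latest_index]}
    meta_collection.replace_one({"ts_ref_id": metadata["ts_ref_id"]}, metadata, upsert=True)
    return metadata
```

With PyMongo this could be called as `ingest(meta, series, db["timeseries"], db["metadata"])`, where `db` is a database handle obtained from `MongoClient` and the collection names are placeholders. Timestamps need to be mutually comparable, for example all `datetime` objects.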

Data Format:

Timeseries Attributes

| Attribute | Description |
| --- | --- |
| ts_ref_id | ID used to connect the timeseries data to the metadata. |
| value | Timeseries value stored for the IMF datasets. |
| timestamp | Standard timestamp used for the timeseries. |
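The sketch below shows one way a single series from a CompactData response could be converted into records with these three attributes. The key names `Obs`, `@TIME_PERIOD` and `@OBS_VALUE` are assumptions about the response layout, the period formats handled are `YYYY`, `YYYY-MM` and `YYYY-Qn`, and the sample series is made up.

```python
from datetime import datetime, timezone


def period_to_timestamp(period):
    """Convert 'YYYY', 'YYYY-MM' or 'YYYY-Qn' to the start of that period in UTC."""
    if "-Q" in period:
        year, quarter = period.split("-Q")
        return datetime(int(year), 3 * (int(quarter) - 1) + 1, 1, tzinfo=timezone.utc)
    if "-" in period:
        year, month = period.split("-")
        return datetime(int(year), int(month), 1, tzinfo=timezone.utc)
    return datetime(int(period), 1, 1, tzinfo=timezone.utc)


def series_to_records(series, ts_ref_id):
    """Build timeseries records, skipping observations that carry no value."""
    observations = series.get("Obs", [])
    if isinstance(observations, dict):  # a single observation may not be in a list
        observations = [observations]
    return [
        {
            "ts_ref_id": ts_ref_id,
            "value": float(obs["@OBS_VALUE"]),
            "timestamp": period_to_timestamp(obs["@TIME_PERIOD"]),
        }
        for obs in observations
        if "@OBS_VALUE" in obs
    ]


# Made-up series in the assumed layout; the second observation has no value.
sample_series = {
    "Obs": [
        {"@TIME_PERIOD": "2010-01", "@OBS_VALUE": "100.0"},
        {"@TIME_PERIOD": "2010-02"},
    ]
}
records = series_to_records(sample_series, "example-ts-ref-id")
print(records)
```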

Metadata Attributes

| Attribute | Description |
| --- | --- |
| ts_ref_id | ID used to connect the metadata to the timeseries. |
| map_coordinates | Latitude and longitude of the location (GeoJSON format). |
| country | Country of the timeseries data. |
| country_code | ISO 3-letter country code. |
| description | Description of the indicators. |
| domain | Domain predefined by Taiyo. |
| Indicator | Indicator code of the IMF datasets. |
| name | Name of the IMF datasets. |
| original_id | Original ID defined by IMF (in this case, it is only {IMF_datasets}_{country iso2 code}). |
| region | Region of the country according to World Bank standards. |
| region_code | Region code of the region according to World Bank standards. |
| sample_frequency | Frequency at which the data gets updated at the source. |
| sub_domain | Subdomain predefined by Taiyo. |
| time_of_sampling | Time of data collection. |
| date_of_sampling | Date of data collection. |
| timezone | Timezone for the time and date. |
| units | Type of value stored in the timeseries. |
| url | URL for each of the datasets under IMF. |
| latest_timestamp_id | MongoDB ID of the latest timestamp in the timeseries. |

Data Flow

The data pipeline above runs on Argo and is executed periodically.

DAGs:

  • imf-cpi: 1 DAG file in total

Taiyo Data Format

Entity: IMF CPI
Frequency: Monthly/Quarterly/Yearly
Updated On: 16-04-2022 UTC 06:47:50 PM
Coverage: 13 indicators in total, covering 250 countries and regions
Uncertainties: For some countries, older data might not be available.

## Scope for Improvement

The following can be improved in the next version of the data product:

  • Every time the Argo Workflow runs, we overwrite the existing data in the S3 bucket.
  • In future, we might want to improve it to scrape only the data that we don't already have.

  • For IMF-two, we are only collecting data for G20 countries as the country counterpart; we might want to add other countries in future as well.
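One possible shape for the incremental scraping idea is sketched below: request data starting from the year of the latest stored data point, then keep only the records that are newer than it. This is a suggestion under stated assumptions, not the current pipeline behaviour; both helper names and the `default_year` value are hypothetical.

```python
from datetime import datetime, timezone


def incremental_start_year(latest_timestamp, default_year):
    """Year to pass as startPeriod: the year of the latest stored point, else a default."""
    return default_year if latest_timestamp is None else latest_timestamp.year


def only_new(records, latest_timestamp):
    """Keep records strictly newer than the latest stored timestamp."""
    if latest_timestamp is None:
        return list(records)
    return [record for record in records if record["timestamp"] > latest_timestamp]


# Made-up state: the store already holds data up to February 2022.
latest = datetime(2022, 2, 1, tzinfo=timezone.utc)
fetched = [
    {"timestamp": datetime(2022, 1, 1, tzinfo=timezone.utc), "value": 101.0},
    {"timestamp": datetime(2022, 2, 1, tzinfo=timezone.utc), "value": 101.3},
    {"timestamp": datetime(2022, 3, 1, tzinfo=timezone.utc), "value": 101.9},
]

start_year = incremental_start_year(latest, default_year=2010)
new_records = only_new(fetched, latest)
print(start_year, new_records)
```

Using a year-level startPeriod keeps the request in the same form as the Date Range example, while the client-side filter removes the overlap within that year.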