USGS: Earthquake

Introduction

The USGS monitors and reports on earthquakes, assesses earthquake impacts and hazards, and conducts targeted research on the causes and effects of earthquakes. The USGS undertakes these activities as part of the larger National Earthquake Hazards Reduction Program (NEHRP), a four-agency partnership established by Congress.

Source: USGS Earthquake

Tags: Climate and Environment, Disasters, Earthquake, Risk, Event records, Daily

Modules

Scraping:

We use the USGS API to fetch the data. The “updatedafter” parameter limits the response to events updated after a given date, which ensures that we do not fetch duplicate data.

API URL: https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson&starttime=2014-01-01&endtime=2014-01-02

Below are the parameters that can be passed to the API endpoint to get the data:

1) format - Specify the output format.
2) starttime - Limit to events on or after the specified start time.
3) endtime - Limit to events on or before the specified end time.
4) updatedafter - Limit to events updated after the specified time.
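The parameters above can be combined into a query URL. The following is a minimal sketch using only the Python standard library; the helper name `build_query_url` is hypothetical and not taken from the pipeline. Note that the USGS FDSN event service spells the last parameter `updatedafter`.

```python
from urllib.parse import urlencode

# Endpoint taken from the API URL given in this document.
BASE_URL = "https://earthquake.usgs.gov/fdsnws/event/1/query"


def build_query_url(starttime, endtime, updatedafter=None, fmt="geojson"):
    """Build a query URL for the USGS event service (hypothetical helper)."""
    params = {"format": fmt, "starttime": starttime, "endtime": endtime}
    # Only add updatedafter when an incremental fetch is wanted.
    if updatedafter is not None:
        params["updatedafter"] = updatedafter
    return f"{BASE_URL}?{urlencode(params)}"


url = build_query_url("2014-01-01", "2014-01-02")
print(url)
```

The resulting URL can then be fetched with `urllib.request.urlopen(url)` and parsed with `json.load`.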

After getting the response, we store the events as timeseries values for each country in a CSV file.
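As an illustration of this step, the sketch below flattens GeoJSON features into rows and serializes them as CSV. It assumes the standard USGS GeoJSON layout (`id`, `properties.mag`, `properties.magType`, `properties.place`, `properties.time` in epoch milliseconds, and `geometry.coordinates` as longitude, latitude, depth). The function names and the mapping onto the column names are assumptions, and the sample feature is made up for the example.

```python
import csv
import io
from datetime import datetime, timezone


def feature_to_row(feature):
    """Flatten one GeoJSON feature into a flat dict (assumed column mapping)."""
    props = feature["properties"]
    lon, lat, depth = feature["geometry"]["coordinates"]
    return {
        "original_id": feature["id"],
        # USGS reports event time as milliseconds since the Unix epoch.
        "timestamp": datetime.fromtimestamp(
            props["time"] / 1000, tz=timezone.utc
        ).isoformat(),
        "magnitude": props["mag"],
        "mag_type": props["magType"],
        "location": props["place"],
        "longitude": lon,
        "latitude": lat,
        "depth": depth,
    }


def rows_to_csv(rows):
    """Serialize a non-empty list of row dicts to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative feature, not real USGS data.
sample_feature = {
    "id": "example0001",
    "properties": {
        "mag": 4.5,
        "magType": "mb",
        "place": "10 km N of Sampletown, Exampleland",
        "time": 1388534400000,
    },
    "geometry": {"type": "Point", "coordinates": [139.0, 35.0, 10.0]},
}

rows = [feature_to_row(sample_feature)]
csv_text = rows_to_csv(rows)
```

Grouping rows by country before writing one file per country would follow the same pattern once the country has been extracted from the location string.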

Cleaning:

Duplicate and additional columns are removed from the data. Location names are rectified and country names are formatted correctly.
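A cleaning pass along these lines might look like the sketch below. It assumes location strings in the common USGS form "94 km SSE of Hihifo, Tonga" and treats the text after the last comma as the country; both the regular expression and the function names are assumptions, not the pipeline's actual code. For events in the United States the trailing part is often a state rather than a country, so a real implementation would need an extra mapping.

```python
import re

# Matches strings such as "94 km SSE of Hihifo, Tonga" (assumed format).
PLACE_PATTERN = re.compile(
    r"^(?P<km>\d+(?:\.\d+)?)\s*km\s+(?P<bearing>[A-Z]+)\s+of\s+(?P<city>.+)$"
)


def parse_place(place):
    """Split a place string into distance, location and country."""
    text = place.strip()
    match = PLACE_PATTERN.match(text)
    location = match.group("city") if match else text
    distance = float(match.group("km")) if match else None
    # Treat the text after the last comma as the country, if there is one.
    country = location.rsplit(",", 1)[-1].strip() if "," in location else None
    return {
        "distance_from_city_km": distance,
        "location": location,
        "country": country,
    }


def drop_duplicates(rows, key="original_id"):
    """Keep the first row seen for each event id, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


parsed = parse_place("94 km SSE of Hihifo, Tonga")
```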

Geocoder:

Coordinates are added to the metadata for each country. Region and region code are also appended. The Geocoder library is used to get coordinates. We also keep a separate JSON file of country coordinates so that we can avoid calling the third-party library, which makes geocoding faster and more efficient.
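The local-file-first lookup described here can be sketched as follows. The JSON layout (`{"Country": [latitude, longitude]}`), the function names, and the `fallback` callable are all assumptions; in the pipeline the fallback would wrap the third-party geocoding call. The coordinates in the example are illustrative.

```python
import json


def load_country_coordinates(json_text):
    """Parse the local lookup file; assumed layout {"Country": [lat, lon]}."""
    return {name: tuple(coords) for name, coords in json.loads(json_text).items()}


def get_coordinates(country, local_lookup, fallback=None):
    """Return (lat, lon) from the local lookup, calling fallback only on a miss."""
    if country in local_lookup:
        return local_lookup[country]
    if fallback is None:
        return None
    coords = fallback(country)
    if coords is not None:
        # Cache the result so the third-party service is called at most once.
        local_lookup[country] = coords
    return coords


lookup = load_country_coordinates('{"Japan": [36.2, 138.25]}')
japan = get_coordinates("Japan", lookup)
```

Caching fallback results in the same dictionary means repeated runs over the same countries never reach the external service.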

Standardization:

Additional information such as sample frequency, units, source and description is included in the metadata. The standardization step contains a function that fetches the ISO country code and appends it. The predefined domain and subdomain are added in this step. We create a single metadata file that includes all the countries and their respective keywords.

MetaData:

The timeseries reference id (ts_ref_id) is added to the timeseries data, and the final timeseries is stored in the S3 bucket. The metadata format is finalized and the metadata is also stored in the S3 bucket.

Ingest:

Metadata and timeseries data are ingested into MongoDB, and the latest timestamp id (the MongoDB id of the latest timestamp) is appended to the metadata to speed up lookups of the latest data point.

Metadata

GeoJson Data:

  • _id: MongoDB unique document id

  • type: “Features” (geojson standard attribute)

| Attribute | Description |
|---|---|
| url | URL of the event page with more details |
| source | source of the dataset |
| region_name | region of the country according to World Bank standards |
| region_code | region code of the region according to World Bank standards |
| country | country name of the data |
| country_code | ISO 3-letter country code |
| description | URL with more details about the event |
| location | location of the place |
| distance_from_city_km | distance of the origin from the city, in kilometres |
| original_id | ID of the event as given by the source |
| magnitude | magnitude of the earthquake |
| mag_type | type of magnitude |
| depth | depth of the origin |
| types | network that originally authored the reported magnitude for this event |
| units | unit of the earthquake measurement |
| rms | root-mean-square (RMS) travel-time residual, in seconds |
| dmin | horizontal distance from the epicenter to the nearest station |
| felt | total number of felt reports submitted to the DYFI (Did You Feel It?) system |
| other_ids | other IDs associated with this event |
| usgs_source | source |
| sample_frequency | frequency of the data |
| timestamp | date and time the event occurred |
| updated | date and time the event was last updated |
| gap | largest azimuthal gap between azimuthally adjacent stations |
| significance | number describing how significant the event is; larger numbers indicate a more significant event |
| net | ID of a data contributor; identifies the network considered to be the preferred source of information for this event |
| nst | number of seismic stations that reported P- and S-arrival times |
| timezone | timezone for the time and date |
| domain | predefined domain by Taiyo |
| subdomain | predefined subdomain by Taiyo |
| time_of_sampling | time of data collection |
| date_of_sampling | date of data collection |
| community_determined_intensity | maximum reported intensity for the event, as determined from DYFI reports |
| modified_mercalli_intensity | maximum estimated instrumental intensity for the event |
| alert | alert level issued for the event |
| status | either automatic or reviewed |
| tsunami | whether the event is flagged for a tsunami (0 for no, 1 for yes) |
| commmunity_determined_intensity_description | description of community_determined_intensity |
| modified_mercalli_intensity_description | description of modified_mercalli_intensity |
| dmin_description | description of dmin |
| felt_description | description of felt |
| rms_description | description of rms |
| nst_description | description of nst |
| net_description | description of net |
| significance_description | description of significance |
| type_description | description of type |
| status_description | description of status |
| gap_description | description of gap |

geometry:

  • type: “Point”

  • coordinates: Latitude and Longitude of the event

Data Flow

The above data pipeline runs on Argo and is executed periodically.

DAGs:

  • USGS-Earthquake: 1 DAG file in total

Taiyō Data Format

Entity: USGS-Earthquake
Frequency: Daily / Event-based
Updated On: 29-04-2022 UTC 01:27:19 PM
Coverage: Around the world
Uncertainties: For some keywords, older data might not be available.

Scope for Improvement

Every time the Argo Workflow runs, we overwrite the existing data in the S3 bucket. In the future, we might want to improve it to scrape only the data that we don’t already have.
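One possible shape for this improvement is to merge newly fetched events into the stored ones instead of overwriting them, and to remember the most recent update time for the next request. The sketch below assumes each row carries `original_id`, `timestamp` and `updated` fields with comparable values (for example, epoch milliseconds); the function names are hypothetical and the sample rows are made up.

```python
def merge_events(existing, incoming):
    """Merge incoming rows into existing ones, keeping the newest version per id."""
    merged = {row["original_id"]: row for row in existing}
    for row in incoming:
        current = merged.get(row["original_id"])
        # Replace a stored event only if the incoming copy was updated later.
        if current is None or row["updated"] > current["updated"]:
            merged[row["original_id"]] = row
    return sorted(merged.values(), key=lambda r: r["timestamp"])


def next_updated_after(rows):
    """Return the latest update time seen, to use as the next query bound."""
    return max(row["updated"] for row in rows) if rows else None


# Illustrative rows, not real USGS data.
stored = [
    {"original_id": "a", "timestamp": 100, "updated": 110, "magnitude": 4.0},
    {"original_id": "b", "timestamp": 200, "updated": 210, "magnitude": 5.1},
]
fetched = [
    {"original_id": "a", "timestamp": 100, "updated": 300, "magnitude": 4.2},
    {"original_id": "c", "timestamp": 150, "updated": 160, "magnitude": 3.3},
]

combined = merge_events(stored, fetched)
bound = next_updated_after(combined)
```

Here the revised event "a" replaces its stored version, "c" is added, and `bound` becomes the value to pass as the update-time filter on the next run.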

  • https://earthquake.usgs.gov/fdsnws/event/1/
  • https://www.usgs.gov/programs/earthquake-hazards